The Promise and Peril of RDF for Formalizing the Humanities

Creel, James Silas; Potvin, Sarah

Date

2015-04-10

Abstract

The Resource Description Framework (RDF) defines structures for describing entities identifiable by Uniform Resource Identifiers (URIs). RDF exists at the top of the stack of technology standards proposed by the World Wide Web Consortium (W3C) that dominate the ecosystem of the Web today, and is purported to enable a Semantic Web on which machines can interpret webpages to perform information seeking and processing tasks on our behalf. RDF describes entities by means of triples, each consisting of a subject, predicate, and object. A triple expresses that the subject stands in a certain relation (the predicate) to the object. Subjects are URIs, and objects are either URIs or literals (such as strings or numbers); predicates are always URIs referring to abstract relations present in an RDF schema (sometimes informally referred to as an ontology). Since anyone can define schemata and assign new URIs to entities, RDF offers a flexibility of expression approaching natural language. Institutions have adopted RDF to describe humanities works and projects as well as scholars and humanists themselves. Using RDF, digital repositories like Fedora and DSpace expose works, and the VIVO semantic networking tool describes researchers, their affiliations, and works. To the degree that the same URIs occur in such different contexts, our data are linked, forming a graph. Yet to what degree can RDF graphs bear meaning like expressions of natural language? The answer does not depend simply upon how linked the data are. More telling is how the data are used and by whom. A related question is what it means for our RDF graphs to be machine readable. Strictly speaking, ontologies do not enable machines to understand each other's data. Rather, ontologies can help humans understand other humans' expressions in a digital medium. A predicate such as <http://purl.org/dc/elements/1.1/author> (i.e., dc:author) signifies the authorship relation only insofar as humans using the data believe and act as though it does. Our concern, then, is a hermeneutic one: how do humans interpret RDF, and how can machines facilitate this interpretation? Our view is that representational systems take on meaning only by consensus. No predicate or URI can bear communicable meaning by fiat of a single agent. Naturally, as systems scale in scope and ambition, ever more stakeholders must reach consensus. When a large ontology is developed without continual feedback from a community, the complexity of the finished product will prove a barrier to its adoption. In this talk, we will briefly consider some of the historic lessons learned in formal knowledge representation in the computer science space, where decades of work have yielded mixed results. We will also look at two current projects that have successfully applied RDF in the humanities space: the Pelagios (http://pelagios-project.blogspot.co.uk) initiative to annotate historic documents with references to places in gazetteers, and the Pleiades (http://pleiades.stoa.org) project that provides such a gazetteer of the ancient world. These systems have enjoyed success by employing RDF predicates in common use and by cultivating community involvement from their inception.
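The triple structure described in the abstract, and the way a shared URI links two otherwise separate descriptions, can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not code from the talk or from any RDF library: the types `URI`, `Literal`, and `Triple`, the helper functions, and every `http://example.org/...` identifier are hypothetical. Only the `dc:author` predicate URI is copied from the abstract.

```python
from typing import NamedTuple, Set, Union


class URI(str):
    """A URI naming an entity or a relation."""


class Literal(NamedTuple):
    """A plain value such as a string or a number."""
    value: Union[str, int, float]


class Triple(NamedTuple):
    """One statement: the subject stands in the predicate relation to the object."""
    subject: URI
    predicate: URI
    obj: Union[URI, Literal]


# Predicate URI quoted in the abstract.
DC_AUTHOR = URI("http://purl.org/dc/elements/1.1/author")

# Hypothetical URIs, invented for this illustration only.
EX_TITLE = URI("http://example.org/terms/title")
EX_AFFILIATION = URI("http://example.org/terms/affiliation")
WORK = URI("http://example.org/works/1")
PERSON = URI("http://example.org/people/1")
ORG = URI("http://example.org/orgs/1")

# A repository describes a work...
repository = [
    Triple(WORK, EX_TITLE, Literal("An Example Work")),
    Triple(WORK, DC_AUTHOR, PERSON),
]

# ...and a separate profile system describes a researcher.
profiles = [
    Triple(PERSON, EX_AFFILIATION, ORG),
]


def entity_uris(triples) -> Set[URI]:
    """Collect every URI used as a subject or as a (non-literal) object."""
    found = set()
    for t in triples:
        found.add(t.subject)
        if isinstance(t.obj, URI):
            found.add(t.obj)
    return found


def shared_uris(a, b) -> Set[URI]:
    """URIs that occur in both datasets; these are the points where the data link."""
    return entity_uris(a) & entity_uris(b)


if __name__ == "__main__":
    for uri in sorted(shared_uris(repository, profiles)):
        print(uri)
```

The overlap computed here shows only that the same identifier was reused in two contexts. Nothing in the code establishes that `DC_AUTHOR` denotes authorship; that reading comes from the people who publish and consume the data, which is the point the abstract goes on to make.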

URI

http://hdl.handle.net/10106/25724

Collections

TXDHC 2015 Presenter Abstracts