Reification is a mechanism to make RDF statements about other RDF statements. This document introduces a SHACL constraint component based on the property dash:reifiableBy that can be used to instruct processors about the shape that reified statements should conform to. dash:reifiableBy links a SHACL property shape with a SHACL node shape that may be used to drive user input or to validate reified statements.

This document uses the prefix dash which represents the DASH Data Shapes namespace http://datashapes.org/dash# which is accessible via its URL http://datashapes.org/dash.

Goals

This document introduces a general-purpose vocabulary that can easily be supported by APIs and tools from various vendors. The approach is, for example, implemented as part of TopBraid 6.3.

Like most features from the DASH namespace, the specifications here may serve as input to future iterations of the official SHACL standards.

Background: Implementation Approaches to Reification in RDF

Reification is the ability to make statements about statements. For example, you may want to track the date a statement was made and who made it:

ex:Bob ex:age 23 .
    # ex:date "2019-12-05"^^xsd:date ;
    # ex:author ex:Claire ;

In its current 1.1 version, the RDF data model is based solely on subject-predicate-object triples and has no built-in mechanism to represent such reified statements efficiently. The following sections discuss a few options.

Reification based on rdf:Statement

The only "official" reification vocabulary for RDF is based on the class rdf:Statement. Instances of that class can be used to represent statements and the statements about them:

ex:Bob ex:age 23 .

ex:BobAge23Reification
	a rdf:Statement ;
	rdf:subject ex:Bob ;
	rdf:predicate ex:age ;
	rdf:object 23 ;
	ex:date "2019-12-05"^^xsd:date ;
	ex:author ex:Claire .
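To illustrate the bookkeeping this vocabulary implies, the following is a minimal Python sketch that represents triples as plain tuples of strings and generates the four rdf:Statement triples for a base triple. The helper name reify_as_statement and the tuple representation are assumptions made for illustration; they are not part of any RDF library.

```python
def reify_as_statement(node, triple):
    """Return the four triples that describe `triple` as an rdf:Statement."""
    s, p, o = triple
    return [
        (node, "rdf:type", "rdf:Statement"),
        (node, "rdf:subject", s),
        (node, "rdf:predicate", p),
        (node, "rdf:object", o),
    ]


# The base triple is asserted separately from its reification.
base = ("ex:Bob", "ex:age", "23")
graph = [base]
graph += reify_as_statement("ex:BobAge23Reification", base)

# Metadata is attached to the statement node, not to the base triple.
graph.append(("ex:BobAge23Reification", "ex:author", "ex:Claire"))

for t in graph:
    print(t)
```

In this sketch, one base triple costs four additional triples before any metadata is attached.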

The rdf:Statement approach has the following characteristics:

Reification based on RDF*

RDF* introduces a new type of RDF node that represents a triple itself. Such triples can then appear as the subject or object of other triples. RDF* then uses this mechanism to define extensions to Turtle and SPARQL.

ex:Bob ex:age 23 .

<<ex:Bob ex:age 23>>
	ex:date "2019-12-05"^^xsd:date ;
	ex:author ex:Claire .

Note that in the example above, the triple ex:Bob ex:age 23 is both asserted in the graph and then also the subject of two reified statements. There are ongoing discussions about whether reified statements alone would also imply that the real triple exists too.

Various syntactic variations of the above (in Turtle) have been suggested, for example the following, which would both assert the triple and allow reified statements to be made about it:

ex:Bob ex:age 23 [[
	ex:date "2019-12-05"^^xsd:date ;
	ex:author ex:Claire ;
]] .

RDF* has the following characteristics:

Reification based on URI-Encoded Triples

This approach is introduced by TopBraid 6.3. It may not be written up anywhere else, but a similar approach has been described by Jerven Bolleman.

Given a triple (S, P, O) and an optional system-wide set of prefix mappings (to shorten URIs), a reification URI can be generated of the form

<urn:triple:${enc(S)}:${enc(P)}:${enc(O)}>

where enc(N) is the URI-encoded Turtle serialization of a given node N using the provided prefix mappings. For example, assuming a suitable prefix for ex:

ex:Bob ex:age 23 .

<urn:triple:ex%3ABob:ex%3Aage:23>
	ex:date "2019-12-05"^^xsd:date ;
	ex:author ex:Claire .
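The construction above can be sketched in a few lines of Python. This is a simplified illustration, not TopBraid's implementation: the prefix table, the function names enc and reification_uri, and the namespace http://example.org/ are assumptions, and nodes are passed as strings (full IRIs, or the Turtle form of a literal such as 23).

```python
from urllib.parse import quote

# Hypothetical prefix table; a real system would use its own mappings.
PREFIXES = {"ex": "http://example.org/"}


def enc(node):
    """Percent-encode a node, shortening IRIs with PREFIXES where possible."""
    for prefix, namespace in PREFIXES.items():
        if node.startswith(namespace):
            node = prefix + ":" + node[len(namespace):]
            break
    # safe="" also encodes ":" and "/", so ":" can serve as the separator.
    return quote(node, safe="")


def reification_uri(s, p, o):
    """Build a urn:triple: URI from subject, predicate and object."""
    return "urn:triple:" + ":".join(enc(n) for n in (s, p, o))


uri = reification_uri("http://example.org/Bob", "http://example.org/age", "23")
print(uri)  # urn:triple:ex%3ABob:ex%3Aage:23
```

Because every ":" inside a component is percent-encoded, the three components remain unambiguous in the resulting URI.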

URI-based reification has the following characteristics:

This approach can be combined with rdf:Statements. For example, these reification URIs could have values for rdf:subject, rdf:predicate and rdf:object to remove the look-up problems mentioned above. Furthermore, TopBraid EDG will convert rdf:Statements into reification URIs when files are imported, and can restore the rdf:Statement instances when exporting using the sorted Turtle writer. In that case, the property dash:uri is used to remember the original URIs of the rdf:Statements for round-tripping. As a result, TopBraid can efficiently query reified triples at run-time without sacrificing interoperability with third-party systems that require rdf:Statements.

This approach can also be combined with RDF* and provides a practical low-cost implementation strategy that doesn't require introducing a new RDF node type for triples. Future versions of TopBraid will probably support Turtle* and SPARQL* as syntax while using triple URIs for the implementation.

dash:reifiableBy

The property dash:reifiableBy can be used to link a SHACL property shape with one or more node shapes. Any reified statement must conform to these node shapes.

The following example states that all reified values of ex:age at the class ex:Person must conform to the (provenance) shape that defines date and author properties:

ex:ProvenanceShape
	a sh:NodeShape ;
	sh:property [
		a sh:PropertyShape ;
		sh:path ex:date ;
		sh:datatype xsd:date ;
		sh:maxCount 1 ;
		sh:order "0"^^xsd:decimal ;
	] ;
	sh:property [
		a sh:PropertyShape ;
		sh:path ex:author ;
		sh:nodeKind sh:IRI ;
		sh:maxCount 1 ;
		sh:order "1"^^xsd:decimal ;
	] .

ex:PersonShape
	a sh:NodeShape ;
	sh:targetClass ex:Person ;
	sh:property ex:PersonShape-age .

ex:PersonShape-age
	a sh:PropertyShape ;
	sh:path ex:age ;
	sh:datatype xsd:integer ;
	sh:maxCount 1 ;
	dash:reifiableBy ex:ProvenanceShape .

Regardless of which specific reification implementation is chosen, the information above can be exploited by tools to drive and validate user input. For example, in TopBraid 6.3, the edit forms will display a "nested" form section that can be opened below each value of a reifiable property.

Similar reification shapes can be defined to attach metadata about SKOS labels, covering some of the use cases of SKOS-XL.

Tools can use dash:reifiableBy triples to check for the presence of reified triples and then highlight them in the user interface. Then, as the user enters details, the shape definition can be used to drive the input forms.

The SHACL validation component for dash:reifiableBy is called dash:ReifiableByConstraintComponent. Depending on the implemented reification approach, it may produce constraint violations on the focus node, path and value node of the base triple, and then use sh:detail to list the problems that were found on the reification shape.
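As a rough illustration of what checking a reified statement against a reification shape involves, the following Python sketch validates the values attached to one reified triple against a simplified stand-in for ex:ProvenanceShape. It only covers sh:maxCount; the dictionary layout and the function name check_reified_values are assumptions for illustration and do not reflect how dash:ReifiableByConstraintComponent is implemented.

```python
# Simplified stand-in for ex:ProvenanceShape: only sh:maxCount is modelled.
PROVENANCE_SHAPE = {
    "ex:date": {"max_count": 1},
    "ex:author": {"max_count": 1},
}


def check_reified_values(reified_values, shape):
    """Return a list of problem messages for one reified triple.

    `reified_values` maps each property to the list of values that are
    attached to the reified triple.
    """
    problems = []
    for prop, constraints in shape.items():
        count = len(reified_values.get(prop, []))
        limit = constraints["max_count"]
        if count > limit:
            problems.append(f"{prop}: {count} values, at most {limit} allowed")
    return problems


ok = {"ex:date": ["2019-12-05"], "ex:author": ["ex:Claire"]}
bad = {"ex:date": ["2019-12-05"], "ex:author": ["ex:Claire", "ex:Dave"]}

print(check_reified_values(ok, PROVENANCE_SHAPE))
print(check_reified_values(bad, PROVENANCE_SHAPE))
```

A full implementation would delegate to a SHACL engine and report each problem via sh:detail rather than as plain strings.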

Reification SPARQL Functions supported by TopBraid

This appendix lists some SPARQL functions that can be used within TopBraid to translate between reification URIs and triples. These functions may change and will hopefully become obsolete in future versions, once a more user-friendly SPARQL syntax similar to SPARQL-star is implemented. Meanwhile, this section is provided as a reference for TopBraid users.

tosh:reificationURI

The SPARQL function tosh:reificationURI constructs a URI that is used to represent a reified triple. The input is a subject, a predicate and an object node, and the output is a URI node that can then be used, for example, to add reified values or to query existing reifications.

The following query produces reification URIs for each triple that has owl:Thing as its subject and then fetches those reification URIs that have a value for ex:creator:

SELECT *
WHERE {
	BIND (owl:Thing AS ?s) .
	?s ?p ?o .
	BIND (tosh:reificationURI(?s, ?p, ?o) AS ?uri) .
	?uri ex:creator ?creator .
}

tosh:reificationURIOf

The SPARQL property function (magic property) tosh:reificationURIOf can be used to convert a reification URI (e.g., produced by tosh:reificationURI) back into subject, predicate and object components. It requires a reification URI on the left-hand side and three unbound variables for subject, predicate and object on the right-hand side.

In the following example query, we first iterate over all reification URIs that have a value for ex:creator and then disassemble them into subject, predicate and object components to learn the original triples that have been reified.

SELECT *
WHERE {
	?uri ex:creator ?creator .
	?uri tosh:reificationURIOf ( ?s ?p ?o ) .
}

tosh:reificationSubject/Predicate/Object

The SPARQL functions tosh:reificationSubject, tosh:reificationPredicate and tosh:reificationObject can be used to convert a reification URI (e.g., produced by tosh:reificationURI) back into subject, predicate and object components.

In the following example query, we first iterate over all reification URIs that have a value for ex:creator and then disassemble them into subject, predicate and object components to learn the original triples that have been reified.

SELECT *
WHERE {
	?uri ex:creator ?creator .
	BIND (tosh:reificationSubject(?uri) AS ?s) .
	BIND (tosh:reificationPredicate(?uri) AS ?p) .
	BIND (tosh:reificationObject(?uri) AS ?o) .
}
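Outside of SPARQL, the same disassembly can be sketched in Python, assuming the urn:triple: scheme described earlier in which each component is percent-encoded and the components are separated by colons. The function name split_reification_uri is hypothetical, and this is not how the tosh functions are implemented.

```python
from urllib.parse import unquote

PREFIX = "urn:triple:"


def split_reification_uri(uri):
    """Split a urn:triple: URI into decoded (subject, predicate, object)."""
    if not uri.startswith(PREFIX):
        raise ValueError("not a triple URI: " + uri)
    # Colons inside components are percent-encoded, so splitting is safe.
    parts = uri[len(PREFIX):].split(":")
    if len(parts) != 3:
        raise ValueError("expected three components: " + uri)
    return tuple(unquote(part) for part in parts)


s, p, o = split_reification_uri("urn:triple:ex%3ABob:ex%3Aage:23")
print(s, p, o)  # ex:Bob ex:age 23
```

The decoded components are still in prefixed Turtle form; resolving them to full nodes would require the same prefix mappings that were used for encoding.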

tosh:reifiedValue

The SPARQL function tosh:reifiedValue provides direct access to a value of a reified triple, e.g. the timestamp. It is primarily a convenience function to look up values if you already know subject, predicate and object.

The following example fetches the creator of the triple owl:Thing rdf:type rdfs:Class, if any exists.

SELECT ?creator
WHERE {
	BIND (tosh:reifiedValue(owl:Thing, rdf:type, rdfs:Class, ex:creator) AS ?creator) .
}