DASH Data Shapes Vocabulary

This document introduces the DASH Data Shapes Vocabulary, a collection of reusable extensions to SHACL for a wide range of use cases. In addition to a library of new SHACL constraint and target types, DASH also includes components for representing test cases, suggestions to fix constraint violations and an extended validation results vocabulary. Finally, DASH serves as a reference implementation of SHACL in SPARQL by providing default validators. DASH is intended to evolve as a standards-compliant open source vocabulary.

Motivation and Design Goals

SHACL [[shacl]] is a W3C standard for the representation of structural data constraints. Being based on RDF and designed with its own extension mechanisms, the main SHACL vocabulary has been intended to be the starting point of an evolving linked data ecosystem. The DASH namespace presented in this document extends SHACL in a standards-compliant way, following the design patterns established by SHACL.

Some aspects of the DASH namespace are purely declarative extensions of the SHACL data model, e.g. new types of validation results, subclassing sh:AbstractResult. But many components in DASH also provide executable instructions based on SPARQL [[!sparql11-query]]. Standards-compliant SHACL engines with full support for the SPARQL-based extension mechanisms will understand those extensions automatically, without requiring changes to the underlying engine implementation. Examples of this category include the new constraint components and target types. Finally, since the SHACL namespace itself does not include any executable SPARQL queries, DASH serves as a reference implementation of SHACL in SPARQL by providing default validators.

While the initial versions of DASH are maintained by TopQuadrant personnel, and TopBraid tools will provide optimized support for some of the DASH components, the explicit goal of DASH is to establish itself as a de facto standard that is completely vendor-neutral. We want to help the SHACL community grow. Contributions and suggestions for new features are more than welcome; please contact the author directly or join the discussion on the mailing list. As new use cases are established, DASH will be extended continuously but carefully, without bloating the library. Some of the components in DASH are features that "almost" made it into the official SHACL standard but were not included due to lack of time or technical disagreements within the W3C working group.

In order to use DASH in your SHACL file, add the following triple. The DASH namespace already imports the SHACL namespace, so no additional import is needed.

    <http://example.org/myShapesGraph> owl:imports <http://datashapes.org/dash> .
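
The examples in this document use prefixed names without declaring them. A shapes file would typically start with declarations along the following lines. This is a minimal sketch: the ex: namespace IRI is a placeholder assumption, while the other IRIs are the usual namespaces of the respective vocabularies.

@prefix dash: <http://datashapes.org/dash#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
# The ex: namespace is a hypothetical placeholder for your own vocabulary
@prefix ex: <http://example.org/ns#> .

<http://example.org/myShapesGraph>
	a owl:Ontology ;
	owl:imports <http://datashapes.org/dash> .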

Shapes

The DASH namespace includes some shape declarations that may be of general use. These shapes have no targets, but they can be referenced by other graphs through sh:node and similar means.

dash:ListShape

This shape can be used to verify that a given node is a well-formed RDF list. The node must either be rdf:nil and have neither rdf:first nor rdf:rest, or be different from rdf:nil and have exactly one rdf:first and exactly one rdf:rest. All nodes in the list must also fulfill the same conditions and there must not be cycles.
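
A property whose values are expected to be RDF lists could reference this shape through sh:node, roughly as follows. The class ex:Playlist and the property ex:tracks are hypothetical names chosen for illustration.

ex:PlaylistShape
	a sh:NodeShape ;
	sh:targetClass ex:Playlist ;
	sh:property [
		sh:path ex:tracks ;
		# Each value of ex:tracks must be a well-formed RDF list
		sh:node dash:ListShape ;
	] .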

Targets

The advanced features of SHACL-SPARQL include an extension mechanism to define new types of target. These user-defined targets can be used in conjunction with the sh:target property.

dash:AllObjectsTarget

A variation of this feature (and its companion AllSubjectsTarget) had been part of earlier SHACL drafts but was taken out to simplify the core language by reducing the need to rely on the sh:target keyword.

The target type dash:AllObjectsTarget represents the set of all objects in the data graph. It can be used in cases where a shape is expected to apply to all objects, regardless of their subject or predicate. For example, it can be used to verify that a graph contains no literals with a language tag.

Since dash:AllObjectsTarget does not take any parameters, it is in principle sufficient to just have a single instance of that class that can be reused everywhere. For that purpose, the DASH namespace includes the instance dash:AllObjects.

The following example uses dash:AllObjects to verify that the data graph contains no literals.

ex:NoLiteralsShape
	a sh:NodeShape ;
	sh:target dash:AllObjects ;
	sh:nodeKind sh:BlankNodeOrIRI .
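
The language tag use case mentioned earlier could be sketched in a similar way. This assumes that language-tagged literals are identified by the datatype rdf:langString, as defined by RDF 1.1. The shape name is hypothetical.

ex:NoLangStringsShape
	a sh:NodeShape ;
	sh:target dash:AllObjects ;
	# Reject any object that is a language-tagged literal
	sh:not [ sh:datatype rdf:langString ] .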

dash:AllSubjectsTarget

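The target type dash:AllSubjectsTarget works like dash:AllObjectsTarget, except that it represents the set of all subjects in the data graph instead of all objects.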
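
As a sketch, the following shape requires every subject in the data graph to be an IRI, which rules out blank nodes in subject position. It declares an anonymous instance of dash:AllSubjectsTarget directly. The shape name is hypothetical.

ex:NoBlankSubjectsShape
	a sh:NodeShape ;
	sh:target [ a dash:AllSubjectsTarget ] ;
	sh:nodeKind sh:IRI .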

dash:HasValueTarget

The target type dash:HasValueTarget represents the set of all subjects that have a certain object value for a certain predicate.

The following example uses dash:HasValueTarget to constrain the length of the postal code of addresses where the country is Australia.

ex:AustralianAddressShape
	a sh:NodeShape ;
	sh:target [
		a dash:HasValueTarget ;
		dash:predicate ex:country ;
		dash:object ex:Australia ;
	] ;
	sh:property [
		a sh:PropertyShape ;
		sh:path ex:postalCode ;
		sh:minLength 4 ;
		sh:maxLength 4 ;
	] .
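
To illustrate, here is some hypothetical instance data for this shape. Only the first two addresses are in the target, because only they have the value ex:Australia for ex:country.

# Targeted and conforming: the postal code has exactly four characters
ex:Address1
	ex:country ex:Australia ;
	ex:postalCode "4000" .

# Targeted and violating sh:maxLength: the postal code has five characters
ex:Address2
	ex:country ex:Australia ;
	ex:postalCode "40001" .

# Not targeted: the country is not ex:Australia, so the shape does not apply
ex:Address3
	ex:country ex:NewZealand ;
	ex:postalCode "12345" .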

Validation Results

DASH introduces various new subclasses of sh:AbstractResult that can be used to represent different kinds of validation results.

dash:SuccessResult

The SHACL standard itself has no result type to represent successful validation steps. Only violations (or warnings or info items) are produced by default. However, in some cases it is informative for a user to see which shapes have been validated at all, e.g. to record the date of previous successful runs. The class dash:SuccessResult can be used in such cases.

The following example represents the successful execution of one constraint component of a given shape. If no sh:focusNode is provided, as is the case here, then the assumption is that it validated OK for all target nodes. If no sh:sourceConstraintComponent is provided, then the assumption is that the shape validated OK for all its components.

[
	a dash:SuccessResult ;
	sh:sourceShape ex:PersonShape ;
	sh:sourceConstraintComponent sh:MinCountConstraintComponent ;
] .
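
By contrast, a result with neither sh:focusNode nor sh:sourceConstraintComponent would, following the assumptions stated above, express that the whole shape validated OK for all of its target nodes. A minimal sketch:

[
	a dash:SuccessResult ;
	# No focus node and no constraint component: applies to the shape as a whole
	sh:sourceShape ex:PersonShape ;
] .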

dash:FailureResult

The class dash:FailureResult can be used to represent failures during the execution of a SHACL validation process. Examples of failures include unsupported recursion or invalid shapes graphs. SHACL itself does not include vocabulary to represent those, but rather leaves reporting of failures as an implementation detail to engines. Many engines will simply throw an error ("exception") to signal a failure. However, in some cases it is useful to record such failures as part of the validation results graph.

The following example represents a failure due to an unsupported recursion within a nested sh:node constraint.

[
	a dash:FailureResult ;
	sh:focusNode ex:JoeDoe ;
	sh:sourceShape ex:MyRecursiveShape ;
	sh:sourceConstraintComponent sh:NodeConstraintComponent ;
] .
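
A failure caused by an invalid shapes graph might be recorded in a similar way. This sketch assumes that sh:resultMessage is used to carry a human-readable explanation. The shape name and the message text are hypothetical.

[
	a dash:FailureResult ;
	sh:sourceShape ex:MalformedShape ;
	# Hypothetical explanation of why the shape could not be processed
	sh:resultMessage "Shape is ill-formed: sh:minCount must be an integer" ;
] .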

Linking Data with Shapes

The properties in this section have been introduced to fill perceived gaps in the practical use of the official SHACL namespace.

dash:applicableToClass

The property dash:applicableToClass is a softer version of sh:targetClass. If a shape is linked to a class via sh:targetClass then validation will be triggered, meaning "all instances of the class must conform to the shape". The property dash:applicableToClass points from a shape to a class and means "some instances of the class may conform to the shape". This loose association is often helpful to enumerate candidate shapes to categorize a collection of instances. Another example use case is for user interfaces to display a list of possible views that are available for a class.

ex:AdultPersonShape
	a sh:NodeShape ;
	dash:applicableToClass ex:Person ;
	sh:property [
		sh:path ex:age ;
		sh:datatype xsd:integer ;
		sh:minInclusive 18 ;
	] .
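
Several shapes can be applicable to the same class. In the following sketch, a second, hypothetical shape ex:MinorPersonShape is also associated with ex:Person, so that a tool could offer both shapes as candidates when categorizing instances of ex:Person.

ex:MinorPersonShape
	a sh:NodeShape ;
	dash:applicableToClass ex:Person ;
	sh:property [
		sh:path ex:age ;
		sh:datatype xsd:integer ;
		sh:maxExclusive 18 ;
	] .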

dash:shape

The property dash:shape can be used to state that a subject resource has a given shape. This property can, for example, be used to capture results of SHACL validation on static data. The property is similar to sh:targetNode, with three differences: dash:shape does not automatically trigger validation, the dash:shape triples are located in the data graph (not the shapes graph), and the direction is inverted, i.e. dash:shape points from the node to the shape whereas sh:targetNode points from the shape to the node.

ex:JeanMichel
	a ex:Person ;
	ex:age 20 ;
	dash:shape ex:AdultPersonShape .
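
For comparison, the closest statement using sh:targetNode would live in the shapes graph and point from the shape to the node. Unlike the dash:shape triple above, it would trigger validation of ex:JeanMichel against the shape.

ex:AdultPersonShape
	sh:targetNode ex:JeanMichel .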

Abstract Classes

Abstract classes are well established in object-oriented systems. A class is called abstract if it cannot have direct instances: only non-abstract subclasses of an abstract class can be instantiated directly. Although abstract classes have a variety of application areas in the RDF world, there is no established standard to express that an RDFS class is abstract. This is in part due to the fact that RDF Schema and OWL have a different notion of inheritance as well as type inference, which may be regarded as contradicting the traditional notion of abstractness. However, at a minimum, a flag to mark classes as abstract would be useful for user input tools to prevent the user from creating instances of certain classes. The SHACL vocabulary itself has examples of abstract classes, such as sh:AbstractResult.

DASH introduces a simple property dash:abstract that can be attached to any class in a data model (i.e. the property has rdfs:domain rdfs:Class). If this property is set to true then the associated class is considered abstract. No constraint validation is currently associated with this property, although it would be easy to formulate a corresponding constraint component in the future.

ex:GeoEntity
	a rdfs:Class ;
	dash:abstract true .

ex:Country
	a rdfs:Class ;
	rdfs:subClassOf ex:GeoEntity .
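
Under this model, an editing tool would be expected to offer ex:Country but not ex:GeoEntity when creating new instances. The instance names below are hypothetical.

# Fine: ex:Country is a non-abstract subclass of ex:GeoEntity
ex:Germany
	a ex:Country .

# Discouraged: a direct instance of the abstract class ex:GeoEntity
ex:SomePlace
	a ex:GeoEntity .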

Union Datatypes

In many use cases, a datatype property can take values of one of several datatypes, for example either xsd:date or xsd:dateTime. Representing such cases in pure SHACL can become repetitive, as it would look like:

ex:DateOrDateTimeShape
	a sh:NodeShape ;
	sh:property [
		sh:path ex:timeStamp ;
		sh:or (
			[ sh:datatype xsd:date ]
			[ sh:datatype xsd:dateTime ]
		)
	] .

In order to help with these design patterns, DASH includes some URIs that can be used in conjunction with sh:or.

dash:DateOrDateTime

The URI dash:DateOrDateTime is defined to be an rdf:List consisting of two sh:datatype constraints, for xsd:date and xsd:dateTime, respectively.

The following example is equivalent to the shape definition from above.

ex:DateOrDateTimeShape
	a sh:NodeShape ;
	sh:property [
		sh:path ex:timeStamp ;
		sh:or dash:DateOrDateTime ;
	] .
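
A list of this kind could be declared along the following lines. This is a sketch of one possible declaration based on the description above, not necessarily the exact definition in the DASH namespace.

dash:DateOrDateTime
	a rdf:List ;
	# First member: values may be xsd:date
	rdf:first [ sh:datatype xsd:date ] ;
	# Remaining members: values may alternatively be xsd:dateTime
	rdf:rest ( [ sh:datatype xsd:dateTime ] ) .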

dash:StringOrLangString

The URI dash:StringOrLangString is similar to dash:DateOrDateTime, only for xsd:string or rdf:langString. This can be used to represent that a property can either take plain strings or strings with a language tag.
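
Following the same pattern as for dash:DateOrDateTime, a shape might use it as shown below. The shape name and the property ex:label are hypothetical.

ex:LabelShape
	a sh:NodeShape ;
	sh:property [
		sh:path ex:label ;
		# Values may be plain strings or strings with a language tag
		sh:or dash:StringOrLangString ;
	] .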

Composites

The DASH namespace includes support for the recurring design pattern of resources that form parent-child relationships, where the life cycle of the children depends on the parents. An example of this is shown below:

ex:DatabaseShape
	a sh:NodeShape ;
	sh:targetClass ex:Database ;
	sh:property [
		sh:path ex:column ;
		sh:class ex:Column ;
		dash:composite true ;
	] .

In this example, instances of the class ex:Database may have values for a property ex:column, which must be instances of ex:Column. The property is also marked with dash:composite true which indicates that whenever a database gets deleted, then all columns should be deleted, too. This information can be queried by user interface tools and other algorithms to perform cascading deletes, but also for other use cases such as tree visualizations. (TopBraid supports this property as of version 5.2.1 both in Composer and the web products.) Note that dash:composite does not include any constraint validation semantics, i.e. it is purely an "annotation" property.

If the relationship points in the inverse direction, e.g. in the well-known cases of skos:broader or rdfs:subClassOf, then dash:composite can be applied to a property constraint that includes an inverse path, using something like sh:path [ sh:inversePath skos:broader ].
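
A sketch of such an inverse composite, assuming the skos: prefix is bound to the standard SKOS namespace and using a hypothetical shape name, might look as follows. It marks the narrower concepts of a concept as its children.

ex:ConceptShape
	a sh:NodeShape ;
	sh:targetClass skos:Concept ;
	sh:property [
		# Values are the concepts that point to the focus node via skos:broader
		sh:path [ sh:inversePath skos:broader ] ;
		dash:composite true ;
	] .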

Reification Support for SHACL

Reification is a mechanism to make RDF statements about other RDF statements. This document introduces a SHACL constraint component based on the property dash:reifiableBy that can be used to instruct processors about the shape that reified statements should conform to. dash:reifiableBy links a SHACL property shape with a SHACL node shape that may be used to drive user input or to validate reified statements. See the separate document on Reification Support for SHACL.
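
As a rough sketch of this linkage, the property shape below points to a node shape that describes statements made about ex:nickName triples. All names in the ex: namespace are hypothetical, and the separate document remains the authoritative description of how reified statements are represented and validated.

ex:PersonShape
	a sh:NodeShape ;
	sh:targetClass ex:Person ;
	sh:property [
		sh:path ex:nickName ;
		# Statements about ex:nickName triples should conform to ex:ProvenanceShape
		dash:reifiableBy ex:ProvenanceShape ;
	] .

ex:ProvenanceShape
	a sh:NodeShape ;
	sh:property [
		sh:path ex:date ;
		sh:datatype xsd:date ;
		sh:maxCount 1 ;
	] .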