Dynamic SHACL

This document introduces an extension to SHACL called Dynamic SHACL. In standard SHACL, the values of constraint parameters such as sh:maxCount are constants, such as the RDF literal 1. In Dynamic SHACL, these values may instead be SHACL Node Expressions that are evaluated before validation starts. This added flexibility significantly expands the expressivity of the SHACL language, in particular for constraints whose values depend on the context.

Motivation

In general, SHACL constraints apply to all target nodes of a shape. For example, a sh:in constraint at a property ex:state applies to all instances of the following class:


ex:Address
    a rdfs:Class, sh:NodeShape ;
    sh:property ex:Address-state ;
.
ex:Address-state
    a sh:PropertyShape ;
    sh:path ex:state ;
    sh:in ( "AL" "AK" "AZ" "AR" ... ) ;   # Here, the US states
.

Imagine instance data such as this:


ex:ArizonaAddress1
    a ex:Address ;
    ex:street "123 John Muir Ave" ;
    ex:country ex:USA ;
    ex:state "AZ" ;
.
ex:QueenslandAddress1
    a ex:Address ;
    ex:street "123 Bob Katter Cl" ;
    ex:country ex:Australia ;
    ex:state "QLD" ;
.

In the example above, the Queensland address violates the sh:in constraint because the list of permitted values is limited to the US states.

A recurring requirement for SHACL ontologies is to define constraints that apply only to certain instances of a class, or only under certain circumstances. For example, we may want to express that if the address is in Australia, then the valid state codes are ACT, NSW, NT, QLD, SA, TAS, VIC and WA.

The following techniques can be used with standard SHACL to express this.

Using Distinct Subclasses

One technique is to define distinct subclasses such as ex:USAddress and ex:AUAddress and to declare a separate sh:in constraint on each. However, this requires changes to the instance data and would likely lead to an artificial explosion of types, one for each combination of distinguishing conditions. Also, an instance of Address would need to change its rdf:type dynamically depending on the value of ex:country.
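
As a rough sketch of what this approach might look like: the class names come from the paragraph above, while the property shape names ex:USAddress-state and ex:AUAddress-state are hypothetical, and the US list stays abbreviated as in the earlier example.

```turtle
ex:USAddress
    a rdfs:Class, sh:NodeShape ;
    rdfs:subClassOf ex:Address ;
    sh:property ex:USAddress-state ;      # hypothetical shape name
.
ex:USAddress-state
    a sh:PropertyShape ;
    sh:path ex:state ;
    sh:in ( "AL" "AK" "AZ" "AR" ... ) ;   # abbreviated, as in the example above
.
ex:AUAddress
    a rdfs:Class, sh:NodeShape ;
    rdfs:subClassOf ex:Address ;
    sh:property ex:AUAddress-state ;      # hypothetical shape name
.
ex:AUAddress-state
    a sh:PropertyShape ;
    sh:path ex:state ;
    sh:in ( "ACT" "NSW" "NT" "QLD" "SA" "TAS" "VIC" "WA" ) ;
.
```

For this to work, the sh:in constraint on ex:Address itself would have to be removed, because the constraints of a class also apply to instances of its subclasses. The instance ex:QueenslandAddress1 would also have to be retyped as ex:AUAddress, which is the change to the instance data mentioned above.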

Using SPARQL-based Targets

Another technique for defining conditional constraints is to define a shape that uses a SPARQL-based Target. The SPARQL query can target exactly the addresses that have the country ex:Australia. However, this is not efficient, because an engine would need to evaluate every such SPARQL query to find out which shapes apply. Furthermore, this approach is not really declarative, and a lot of business logic ends up hidden in SPARQL strings.
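
A minimal sketch of such a shape, assuming the sh:SPARQLTarget construct from SHACL Advanced Features. The shape name ex:AustralianAddressShape is hypothetical, and the prefix declarations are left out.

```turtle
ex:AustralianAddressShape                  # hypothetical shape name
    a sh:NodeShape ;
    sh:target [
        a sh:SPARQLTarget ;
        sh:prefixes ... ; # omitted
        sh:select """
            SELECT ?this
            WHERE {
                ?this a ex:Address .
                ?this ex:country ex:Australia .
            }
            """ ;
    ] ;
    sh:property [
        sh:path ex:state ;
        sh:in ( "ACT" "NSW" "NT" "QLD" "SA" "TAS" "VIC" "WA" ) ;
    ] ;
.
```

A separate shape of this kind would be needed for every country, and the condition that selects the country is only visible inside the query string.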

Using sh:or

We could also express all cases through sh:or and sh:hasValue, for example:


ex:Address
	a rdfs:Class, sh:NodeShape ;
	sh:property ex:Address-state ;
	sh:or (
		[
			sh:property [
				sh:path ex:country ;
				sh:hasValue ex:USA ;
			] ;
			sh:property [
				sh:path ex:state ;
				sh:in ( "AL" "AK" "AZ" "AR" ... ) ;
			]
		]
		[
			sh:property [
				sh:path ex:country ;
				sh:hasValue ex:Australia ;
			] ;
			sh:property [
				sh:path ex:state ;
				sh:in ( "ACT" "NSW" "NT" "QLD" "SA" "TAS" "VIC" "WA" ) ;
			]
		]
	) ;
.
ex:Address-state
	a sh:PropertyShape ;
	sh:path ex:state ;
.

This solution is quite convoluted. Although it would work for validation, it would be next to impossible for non-validation use cases, such as user interface generators, to make use of this information. In particular, a common technique in form builders is to display a drop-down list whenever a property declares a sh:in constraint (see Enum Select Editor). A static analysis of the constraint would have trouble recognizing the intent here, because the sh:in is hidden deep within the shape definition. Human readers would also struggle to work out the meaning of this ontology. Finally, this structure is neither modular nor model-driven, and it is not extensible.

Using SPARQL and Helper Objects

This technique works best if the valid state codes are attached to the country instances:


ex:Australia
    a ex:Country ;
    ex:stateCode "ACT", "NSW", "NT", "QLD", "SA", "TAS", "VIC", "WA" ;
.
ex:USA
    a ex:Country ;
    ex:stateCode "AL", "AK", "AZ", "AR", ... ;
.

This is a declarative and extensible way to record the valid codes. Using this background data, the constraint can be expressed as follows:


ex:Address-state
    a sh:PropertyShape ;
    sh:path ex:state ;
    sh:sparql [
        a sh:SPARQLConstraint ;
        sh:message "State is not among those declared for the country" ;
        sh:prefixes ... ; # omitted
        sh:select """
            SELECT $this ?value
            WHERE {
                $this ex:state ?value .
                FILTER NOT EXISTS {
                    $this ex:country ?country .
                    ?country ex:stateCode ?value .
                }
            }
        """ ;
    ] .

The query above will return all instances of Address ($this) that have a state that is not listed as ex:stateCode for the ex:country of the address. Such instances will be flagged as constraint violations.

This is a reasonable solution, assuming you are ready to use SPARQL, but it suffers from the same drawback as all other solutions in this section: the information can only be used for constraint validation and hardly for any other purpose, such as user interface generation.

Example Using Dynamic SHACL

In the proposed Dynamic SHACL extension, the values of a constraint parameter such as sh:in can be computed dynamically using SHACL AF Node Expressions. In general, a Node Expression takes a focus node as input and produces a list of result nodes. These nodes may be computed by looking up property values elsewhere, by filtering values, by applying operations such as union and minus, or even by executing a SPARQL query.

Using Dynamic SHACL, the country and state example above could be expressed using the following techniques.

Using sh:path expressions

This Dynamic SHACL solution relies on the same helper structure as the section Using SPARQL and Helper Objects. It computes the values of sh:in by fetching all values of the SHACL path expression ex:country / ex:stateCode.


ex:Address-state
    a sh:PropertyShape ;
    sh:path ex:state ;
    sh:in [
        sh:path ( ex:country ex:stateCode )
    ] .

This is elegant because the use of sh:in clearly expresses that the valid values for ex:state come from an enumeration, which tells a user interface builder to pick something like a drop-down list. It also gives any engine enough information to compute the valid values beforehand from the node expression.
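
To illustrate how the node expression might be read, consider the focus node ex:QueenslandAddress1 from the instance data above. Following ex:country leads to ex:Australia, and following ex:stateCode from there yields the eight Australian codes. The block below is only a sketch of the effective constraint for that focus node, written as if it had been declared statically.

```turtle
# Effective constraint for the focus node ex:QueenslandAddress1
# (ex:country ex:Australia), written out for illustration only.
# The order of the values is arbitrary, because the ex:stateCode
# values of ex:Australia form an unordered set.
ex:Address-state
    a sh:PropertyShape ;
    sh:path ex:state ;
    sh:in ( "ACT" "NSW" "NT" "QLD" "SA" "TAS" "VIC" "WA" ) ;
.
```

Under this reading, the value "QLD" conforms. For ex:ArizonaAddress1 the same expression would instead yield the codes declared for ex:USA.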

Using a sh:select expression

In this solution, the value of sh:in is a SHACL Node Expression of type sh:select, which is essentially a SPARQL SELECT query that returns a collection of nodes as the bindings of its result variable.


ex:Address
    a rdfs:Class, sh:NodeShape ;
    sh:property ex:Address-state ;
.
ex:Address-state
    a sh:PropertyShape ;
    sh:path ex:state ;
    sh:in [
        sh:select """
            SELECT ?stateCode
            WHERE {
                $this ex:country/ex:stateCode ?stateCode .
            }
        """
    ]
.

This is more general than the solution from the section Using sh:path expressions, because you can use any SPARQL feature, for example to perform joins, look up values, convert literal datatypes, or even build new string values.
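
As a hypothetical variation, suppose the helper data stored the state codes in lower case while addresses use upper case. A sh:select expression could then normalize the values before they are used by sh:in. The variable name ?rawCode and the lower-case assumption are illustrative only.

```turtle
ex:Address-state
    a sh:PropertyShape ;
    sh:path ex:state ;
    sh:in [
        sh:select """
            SELECT ?stateCode
            WHERE {
                $this ex:country/ex:stateCode ?rawCode .
                # Build a new upper-case string value from the stored code
                BIND (UCASE(STR(?rawCode)) AS ?stateCode)
            }
        """
    ]
.
```

A plain path expression cannot express this kind of value transformation, which is where the SPARQL-based form earns its extra complexity.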