Source data for this project is converted from SAS Transport (XPT) row-by-column format to Resource Description Framework (RDF). In RDF, a Subject node is linked to an Object node by a Predicate. The Predicate gives the relation between the two nodes its meaning.
Figure 1: Building Blocks of RDF: Subject, Predicate, Object
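The row-to-triple idea can be sketched in a few lines of plain Python. This is only an illustration, not the project's conversion code: the `ex:` prefix, the `row_to_triples` helper, and the sample row values are all hypothetical, and literals are kept as plain strings rather than typed RDF terms.

```python
from typing import Dict, List, NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement: Subject --Predicate--> Object."""
    subject: str
    predicate: str
    obj: str


def row_to_triples(row: Dict[str, str], key: str, base: str = "ex:") -> List[Triple]:
    """Turn one tabular row into triples (hypothetical helper).

    The value in the `key` column names the Subject node; every other
    column becomes a Predicate pointing at that column's value.
    """
    subject = f"{base}{row[key]}"
    return [
        Triple(subject, f"{base}{column}", str(value))
        for column, value in row.items()
        if column != key
    ]


# A made-up row, standing in for one record read from an XPT file.
row = {"USUBJID": "SUBJ-001", "SPECIES": "RAT", "ARMCD": "1"}
triples = row_to_triples(row, key="USUBJID")

for triple in triples:
    print(triple.subject, triple.predicate, triple.obj)
```

Each printed line is one Subject, Predicate, Object statement. Rows that share a subject or an object value end up attached to the same node, which is how separate rows join into a graph.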
Nodes and their relationships join together to create a graph network. The graph for a specific clinical trial has a shape that is defined by the entities and relationships within it. Individual entities like an Animal Subject or a Treatment Arm have their shapes defined by the data and relationships attached to them. The attributes of an individual node can be validated (node constraints), as can the node's incoming and outgoing relations and the values associated with those connections.
Validation using SHACL
When data has shape, so can the validation rules that act upon it. This is accomplished in RDF using the W3C standard Shapes Constraint Language (SHACL). SHACL is itself a graph, written as familiar Subject–Predicate–Object triples, as are the resulting validation reports. A detailed description of SHACL syntax is beyond the scope of this project. Please refer to the References and Resources page to learn more about SHACL.
Figure 2: SHACL Shapes Concept for Data Validation
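To make the shape concept concrete, the sketch below mimics a single cardinality constraint (the idea behind SHACL's `sh:maxCount`) in plain Python. It is not a SHACL engine and does not reflect the project's actual shapes: the `ex:` identifiers, the `validate` function, and the sample values are assumptions for illustration. Only the `sh:` predicate names in the report are taken from the SHACL vocabulary.

```python
from collections import defaultdict

# Instance data as (subject, predicate, object) triples; all values are made up.
data = [
    ("ex:Animal_1", "ex:usubjid", "SUBJ-001"),
    ("ex:Animal_1", "ex:usubjid", "SUBJ-002"),  # second value for the same animal
    ("ex:Animal_2", "ex:usubjid", "SUBJ-003"),
]

# A shape reduced to one property constraint: at most one value on this path.
shape = {"id": "ex:UsubjidShape", "path": "ex:usubjid", "max_count": 1}


def validate(data, shape):
    """Return report triples for every focus node that exceeds max_count."""
    values = defaultdict(list)
    for subject, predicate, obj in data:
        if predicate == shape["path"]:
            values[subject].append(obj)

    violations = sorted(
        node for node, found in values.items() if len(found) > shape["max_count"]
    )

    # The report is itself a list of triples, like the data and the shape.
    report = [("ex:report", "sh:conforms", str(not violations).lower())]
    for index, node in enumerate(violations, start=1):
        result = f"ex:result{index}"
        report += [
            ("ex:report", "sh:result", result),
            (result, "sh:focusNode", node),
            (result, "sh:resultPath", shape["path"]),
            (result, "sh:sourceShape", shape["id"]),
        ]
    return report


report = validate(data, shape)
for triple in report:
    print(triple)
```

Because the report comes back as triples that name the offending node and the shape that flagged it, it can be merged with the data and the constraints instead of living in a separate log file.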
Interconnected Data, Constraints, and Reports
Validation reports from SHACL are also graph data, which makes it easy to link the data, the validation constraints, and the report. The figure below illustrates this for a violation of FDA rule SD0083. Click on the image to explore the connections in a 3-D visualization (opens in a new window).
Figure 3: Violation of FDA Rule SD0083.
|Grey||Node||Predicate. Predicates are shown as nodes because SHACL and report values are attached to them.|
|Blue||Node, Edge||Instance data|
|Red||Node||Data errors. In this case, two USUBJID values for one animal subject|
|Orange||Node||Class. (A Type of thing, defined in an ontology)|
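The linking shown in the figure can be approximated with a small traversal over one merged list of triples. This is a hedged sketch rather than the project's graph: the `ex:` node names, the `objects` and `explain` helpers, and the sample values are hypothetical, and the constraint is simplified to a single shape node carrying `sh:path` and `sh:maxCount`.

```python
# Data, constraint, and report triples held together in one graph.
graph = [
    # Instance data (made-up values)
    ("ex:Animal_1", "ex:usubjid", "SUBJ-001"),
    ("ex:Animal_1", "ex:usubjid", "SUBJ-002"),
    # Constraint
    ("ex:UsubjidShape", "sh:path", "ex:usubjid"),
    ("ex:UsubjidShape", "sh:maxCount", "1"),
    # Validation report
    ("ex:result1", "sh:focusNode", "ex:Animal_1"),
    ("ex:result1", "sh:sourceShape", "ex:UsubjidShape"),
]


def objects(graph, subject, predicate):
    """All objects reachable from `subject` through `predicate`."""
    return [o for s, p, o in graph if s == subject and p == predicate]


def explain(graph, result):
    """Follow a report result to its shape and to the offending data."""
    focus = objects(graph, result, "sh:focusNode")[0]
    shape = objects(graph, result, "sh:sourceShape")[0]
    path = objects(graph, shape, "sh:path")[0]
    return {
        "focus": focus,
        "path": path,
        "limit": objects(graph, shape, "sh:maxCount")[0],
        "values": objects(graph, focus, path),
    }


summary = explain(graph, "ex:result1")
print(summary)
```

Starting from a single result node, the traversal reaches both the rule that was broken and the values that broke it, which is the practical benefit of keeping all three layers in the same graph.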