The information on this page is under active revision.

Animal Subject Shape

The project defines a set of re-usable, hierarchical shapes for use in a variety of data validation scenarios spanning the study data lifecycle. The first shapes are constructed for the Demographics (DM) domain, where each Animal Subject is represented as a row in the source data. This makes the AnimalSubjectShape a natural choice for a shape containing additional constraints. The Animal Subject Shape definitions are located in this file:

The SHACL shapes structure and naming conventions are specified with reuse in mind. Where practical, shapes are named using a description of their function plus the word Shape followed by a dash and then an abbreviated name of the class or entity they act upon. Examples:

  • hasMin1Max1Shape-USubjID - validates that each Animal Subject has a minimum of one and maximum of one USUBJID value.
  • isUniqueShape-USubjID - validates the uniqueness of USUBJID values. A USUBJID cannot be assigned to more than one Animal Subject.
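The naming convention above can be sketched as a small parser. This is only an illustration: the helper name `parse_shape_name` and the regular expression are assumptions, not part of the project, and names that do not contain `Shape-` are simply reported as not following the pattern.

```python
import re

# Hypothetical helper (not part of the project): splits a shape's local name
# of the form <function>Shape-<target> into its two parts.
_SHAPE_NAME = re.compile(r"^(?P<function>[A-Za-z0-9]+)Shape-(?P<target>[A-Za-z0-9]+)$")


def parse_shape_name(local_name):
    """Return (function, target) when the name follows the convention, else None."""
    match = _SHAPE_NAME.match(local_name)
    if match is None:
        return None
    return match.group("function"), match.group("target")


if __name__ == "__main__":
    for name in ("hasMin1Max1Shape-USubjID", "isUniqueShape-USubjID"):
        print(name, "->", parse_shape_name(name))
```

Because the convention applies only "where practical", a `None` result signals a name worth reviewing rather than an error.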

Shapes may include additional constraints such as data type, length, and other restrictions not explicitly stated in the original FDA rules.

The sh:message property provides meaningful messages about violations when they are detected. Where applicable, the related FDA Rule ID number is provided in square brackets at the end of the message text. In cases where a shape may be applied to more than one Rule, all rules covered by that shape are listed.

Example: sh:message "Subject --> USUBJID violation. [SD0083]" ;
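A report consumer could recover the rule IDs from the bracketed suffix of a message. The sketch below is a minimal illustration; the function name `extract_rule_ids` is hypothetical, and the comma separator for messages that cover several rules is an assumption, since the page does not state how multiple IDs are delimited.

```python
import re

# Matches a trailing "[...]" group at the end of a message.
_RULE_SUFFIX = re.compile(r"\[([^\]]+)\]\s*$")


def extract_rule_ids(message):
    """Return the rule IDs listed in the trailing square brackets of a message.

    Assumption: several rule IDs, if present, are separated by commas.
    """
    match = _RULE_SUFFIX.search(message)
    if match is None:
        return []
    return [rule.strip() for rule in match.group(1).split(",") if rule.strip()]


if __name__ == "__main__":
    print(extract_rule_ids("Subject --> USUBJID violation. [SD0083]"))
```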

A shape is created to define the constraints attached to the Animal Subject IRI. Each individual constraint is described in the sections that follow.

Rule Statement
One sh:property for each type of predicate ----> object relation attached directly to the AnimalSubject IRI.
Description
Each type of predicate ----> object relation for the AnimalSubject class, with the exception of predicates like `rdf:type`, `skos:prefLabel`, etc., has a `sh:property` definition for a shape that validates that type of entity.

The Node Shape study:AnimalSubjectShape describes nodes of the class study:AnimalSubject . FDA Rule numbers are added as comments to facilitate referencing back to the original FDA requirements.

study:AnimalSubjectShape
  a              sh:NodeShape ;
  sh:targetClass study:AnimalSubject ;
  sh:property    study:hasMin1Max1Shape-USubjID ;        # Rule SD0083
  sh:property    study:isUniqueShape-USubjID ;           # Rule SD0083
  sh:property    study:hasMin1Max1Shape-SubjID ;         # Rule SD1001
  sh:property    study:isUniqueShape-SubjID ;            # Rule SD1001
  sh:property    study:hasTypeXsdDate-Date ;             # Rule SD1002
  sh:property    study:hasMin1Max1Shape-Interval ;       # Rule SD1002
  sh:property    study:hasMin1Max1Shape-StartEndDates ;  # Rule SD1002
  sh:property    study:hasStartLEEndShape-Interval ;     # Rule SD1002
  sh:property    study:hasMinInclusive0Shape-Age ;       # Rule SD0084


  ... more property shapes will be added as they are developed

If an ontology defines study:AnimalSubject as a subclass of study:Subject, then shapes could use the sh:targetClass study:Subject (assuming common constraints for both classes). If a clinical trial on human study subjects were to define a study:HumanStudySubject as a subclass of study:Subject, the same constraints could be used for both pre-clinical (non-human animals) and clinical (human) data validation. This work focuses on SEND, so the target class study:AnimalSubject is specified for simplicity.

study:Subject
  rdf:type owl:Class ;
  rdfs:subClassOf study:Party ;
  skos:prefLabel "Subject" ;
.
study:Animal
  rdf:type owl:Class ;
  rdfs:subClassOf study:BiologicEntity ;
  skos:prefLabel "Animal" ;
.
study:AnimalSubject
  rdf:type owl:Class ;
  rdfs:subClassOf study:Animal ;
  rdfs:subClassOf study:Subject ;
  skos:prefLabel "Animal subject" ;
.
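The effect of the subclass axioms on class-based targeting can be sketched in plain Python. This is a simplified illustration, not a SHACL engine: the dictionary mirrors only the three class definitions above, and the function names are hypothetical. It assumes the subclass statements are available alongside the data being validated.

```python
# rdfs:subClassOf statements copied from the class definitions above.
SUBCLASS_OF = {
    "study:AnimalSubject": {"study:Animal", "study:Subject"},
    "study:Animal": {"study:BiologicEntity"},
    "study:Subject": {"study:Party"},
}


def superclasses(cls, subclass_of=SUBCLASS_OF):
    """Return every class reachable from cls by following subclass links."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_targeted(instance_class, target_class, subclass_of=SUBCLASS_OF):
    """True when an instance of instance_class falls under a class-based target."""
    return instance_class == target_class or target_class in superclasses(
        instance_class, subclass_of
    )


if __name__ == "__main__":
    print(is_targeted("study:AnimalSubject", "study:Subject"))
```

Without the subclass statements, only the exact class match would apply, which is why the shapes on this page name study:AnimalSubject directly.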

Data

It is important to understand how the data is restructured from row-by-column XPT format by the Data Conversion process. Here is a subset of the original data for Animal Subject 00M01:

SubjectIRI       subjid  usubjid        setcd  species
Animal_a6d09184  00M01   CJ16050_00M01  00     Rat


And its translation into RDF triples:

cj16050:Animal_a6d09184
    a                          study:AnimalSubject ;
    skos:prefLabel             "Animal 00M01"^^xsd:string ;
    study:hasUniqueSubjectID   cj16050:UniqueSubjectIdentifier_CJ16050_00M01 ;
    study:hasSubjectID         cj16050:SubjectIdentifier_00M01 ;
    study:hasReferenceInterval cj16050:Interval_Animal_a6d09184 ;
    study:memberOf             cjprot:Set_00,
                               code:Species_Rat ;
    study:participatesIn       cj16050:SexDataCollection_Animal_a6d09184 ,
                               cj16050:AgeDataCollection_Animal_a6d09184,
                               cj16050:Randomization_Animal_a6d09184 .
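The row-to-triples restructuring can be sketched as follows. This is not the project's Data Conversion process: the function `row_to_triples` is hypothetical, the IRI patterns are inferred from the single example above, the subject IRI is taken from the row rather than derived, and the reference interval and study:participatesIn triples are left out.

```python
def row_to_triples(row, study_prefix="cj16050"):
    """Turn one DM row (a dict) into a list of (subject, predicate, object) strings.

    Assumption: IRI patterns follow the single example shown on this page.
    """
    subjid = row["subjid"]
    usubjid = row["usubjid"]
    subject = study_prefix + ":" + row["SubjectIRI"]
    return [
        (subject, "rdf:type", "study:AnimalSubject"),
        (subject, "skos:prefLabel", '"Animal ' + subjid + '"^^xsd:string'),
        (subject, "study:hasUniqueSubjectID",
         study_prefix + ":UniqueSubjectIdentifier_" + usubjid),
        (subject, "study:hasSubjectID",
         study_prefix + ":SubjectIdentifier_" + subjid),
        (subject, "study:memberOf", "cjprot:Set_" + row["setcd"]),
        (subject, "study:memberOf", "code:Species_" + row["species"]),
    ]


if __name__ == "__main__":
    example_row = {
        "SubjectIRI": "Animal_a6d09184",
        "subjid": "00M01",
        "usubjid": "CJ16050_00M01",
        "setcd": "00",
        "species": "Rat",
    }
    for triple in row_to_triples(example_row):
        print(triple)
```

The point of the sketch is that one row fans out into several triples that share the same subject IRI, which is the node the shapes on this page target.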


The first validation shapes will be formed around the AnimalSubject Class study:AnimalSubject as described above. The first rules in our project center around the subject identifiers usubjid and subjid. The figure below shows the connections from the Animal Subject IRI to the USUBJID and SUBJID IRI values. Become familiar with both the figure and the RDF triples (above) before proceeding to the next steps.

Animal Subject Node to ID Values


FDA Rules as SHACL Shapes

Rule SD0083 : USUBJID

The spreadsheet FDA-Validator-Rules.xlsx defines the rule for USUBJID in the DM Domain as:

FDA Validator Rule ID: SD0083
FDA Validator Message: Duplicate USUBJID
Business or Conformance Rule Validated: Identifier used to uniquely identify a subject across all studies
FDA Validator Rule: The value of Unique Subject Identifier (USUBJID) variable must be unique for each subject across all trials* in the submission.

* Because the prototype is based on data from a single trial, Rule SD0083 is evaluated within the context of a single study.

The Rule is deconstructed into the following components based on knowledge of the study data requirements, RDF data model (schema), and SD0083 rule statement:

RC1. An Animal Subject cannot have more than one USUBJID.

RC2. An Animal Subject cannot have a missing USUBJID.

RC3. A USUBJID cannot be assigned to more than one Animal Subject.

Translation of Rule Components into SHACL and evaluation of test data is described below. Rule Components RC1 and RC2 are satisfied by a single SHACL Shape, while a second shape evaluates the third component. Additional information for the traceability between rules, shapes, and the data used to evaluate them is available in the file TestCases.xlsx.


SD0083-RC1, RC2 : A single, non-missing USUBJID per Animal Subject.

Rule Statement
:AnimalSubject has a sh:minCount and sh:maxCount of 1 USUBJID.
Description
An Animal Subject must be assigned one and only one USUBJID. Missing and multiple USUBJID values are not allowed for an AnimalSubject.

Test Data

A copy of the DM TTL file for SHACL testing is available at SHACL/CJ16050Constraints/DM-CJ16050-R.TTL

studyid  domain  usubjid        subjid  SubjectIRI       Rule Violated
CJ16050  DM      CJ16050_00M01  00M01   Animal_a6d09184  None
CJ16050  DM      CJ16050_99T1   99T1    Animal_2a836191  SD0083-RC1
CJ16050  DM      CJ16050_99T2   99T2    Animal_2a836191  SD0083-RC1
CJ16050  DM      NA             NA      Animal_69fa85ac  SD0083-RC2


Referring to the AnimalSubject IRIs (column SubjectIRI):

  • Animal_a6d09184 is compliant, with single values for usubjid, subjid.
  • Animal_2a836191 has two usubjid values and two subjid values.
  • Animal_69fa85ac is missing values for both usubjid and subjid.

Triples for the compliant AnimalSubject appear as:

  cj16050:Animal_a6d09184
    a                        study:AnimalSubject ;
    skos:prefLabel           "Animal 00M01"^^xsd:string ;
    study:hasUniqueSubjectID cj16050:UniqueSubjectIdentifier_CJ16050_00M01  ;
  ...


Each AnimalSubject must be evaluated to ensure it has a minimum of one usubjid and no more than one usubjid. The shape is therefore named study:hasMin1Max1Shape-USubjID and it is a sh:property of the study:AnimalSubjectShape.

study:AnimalSubjectShape
  a              sh:NodeShape ;
  sh:targetClass study:AnimalSubject ;
  sh:property    study:hasMin1Max1Shape-USubjID ;

 ...


From the TTL data we see that usubjid is attached to the AnimalSubject IRI by the study:hasUniqueSubjectID predicate. This predicate is specified as the object (target) of sh:path as shown in the shape definition, below. The shape further specifies there should be a minimum of one and maximum of one path through study:hasUniqueSubjectID for a given AnimalSubject.

study:AnimalSubjectShape
  a              sh:NodeShape ;
  sh:targetClass study:AnimalSubject ;
  sh:property    study:hasMin1Max1Shape-USubjID ;

 ...

#--- Unique Subject ID (USUBJID) ----
study:hasMin1Max1Shape-USubjID 
  a              sh:PropertyShape ;
  sh:name        "minmaxUniqueSubjid" ;
  sh:description "A single, exclusive USUBJID must be assigned to a Subject." ;
  sh:message     "Subject --> USUBJID violation. [SD0083]" ;
  sh:path        study:hasUniqueSubjectID ;
  sh:minCount    1 ;
  sh:maxCount    1 .
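The counting logic behind sh:minCount 1 and sh:maxCount 1 can be sketched in plain Python. This is a simplified stand-in, not a SHACL processor: triples are modelled as tuples of strings, the function name `check_min1_max1` is hypothetical, and only the count along a single predicate is evaluated.

```python
from collections import Counter

USUBJID_PATH = "study:hasUniqueSubjectID"


def check_min1_max1(triples, subjects, path=USUBJID_PATH):
    """Map each non-conforming subject to the constraint component it violates."""
    # A set removes repeated triples, so each value node is counted once.
    counts = Counter(s for s, p, _ in set(triples) if p == path)
    violations = {}
    for subject in subjects:
        n = counts.get(subject, 0)
        if n < 1:
            violations[subject] = "sh:MinCountConstraintComponent"
        elif n > 1:
            violations[subject] = "sh:MaxCountConstraintComponent"
    return violations


if __name__ == "__main__":
    # Values taken from the test data table for SD0083-RC1 and RC2.
    triples = [
        ("cj16050:Animal_a6d09184", USUBJID_PATH,
         "cj16050:UniqueSubjectIdentifier_CJ16050_00M01"),
        ("cj16050:Animal_2a836191", USUBJID_PATH,
         "cj16050:UniqueSubjectIdentifier_CJ16050_99T1"),
        ("cj16050:Animal_2a836191", USUBJID_PATH,
         "cj16050:UniqueSubjectIdentifier_CJ16050_99T2"),
    ]
    subjects = [
        "cj16050:Animal_a6d09184",
        "cj16050:Animal_2a836191",
        "cj16050:Animal_69fa85ac",
    ]
    print(check_min1_max1(triples, subjects))
```

The list of subjects is passed separately because a subject with no USUBJID contributes no triple along the path and would otherwise go unnoticed.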


The SHACL file is located at /SENDConform/SHACL/CJ16050Constraints/SHACL-AnimalSubject.TTL


SD0083-Test Case 1 : Animal Subject Assigned Two USUBJID values

In the test data, Animal Subject IRI Animal_2a836191 is assigned two USUBJID values, violating SD0083-RC1.

  cj16050:Animal_2a836191
    a                        study:AnimalSubject ;
    skos:prefLabel           "Animal 99T1"^^xsd:string,
                             "Animal 99T2"^^xsd:string ;
    study:hasUniqueSubjectID cj16050:UniqueSubjectIdentifier_CJ16050_99T1,
                             cj16050:UniqueSubjectIdentifier_CJ16050_99T2 ;
  ...

Violation of Rule Component 1 is detected by the sh:maxCount constraint:

  ...
  sh:path study:hasUniqueSubjectID ;
  sh:minCount  1 ;
  sh:maxCount  1 
  ...

See the instructions on the Validation page for how to apply the SHACL to the data and generate the Validation Report.

The Report correctly confirms AnimalSubject Animal_2a836191 has more than one USUBJID value, violating the sh:MaxCountConstraintComponent of the shape for FDA Rule SD0083.

  a sh:ValidationResult ;
    sh:resultSeverity            sh:Violation ;
    sh:sourceShape               study:hasMin1Max1Shape-USubjID ;
    sh:focusNode                 cj16050:Animal_2a836191 ;
    sh:resultMessage             "Subject --> USUBJID violation. [SD0083]" ;
    sh:resultPath                study:hasUniqueSubjectID ;
    sh:sourceConstraintComponent sh:MaxCountConstraintComponent

The AnimalSubject IRI in the Report can be used to identify the USUBJID values that violate the constraint. File: /SPARQL/SD0083-TC1-Info.rq

  SELECT ?animalIRI ?usubjidLabel
  WHERE{
    cj16050:Animal_2a836191 study:hasUniqueSubjectID ?usubjidIRI .
    ?usubjidIRI             skos:prefLabel           ?usubjidLabel .
    BIND(IRI(cj16050:Animal_2a836191) AS ?animalIRI )
  }

The query result shows Animal_2a836191 is assigned two usubjid values, in violation of the rule.

  animalIRI                 usubjidLabel
  cj16050:Animal_2a836191  "CJ16050_99T1"
  cj16050:Animal_2a836191  "CJ16050_99T2"

Verify

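SPARQL independently verifies that Animal_2a836191 has two USUBJID values by querying the graph for all AnimalSubject IRIs whose usubjidIRI count is not exactly one. Subjects with no USUBJID do not match the triple pattern in this query and are covered by Test Case 2. File: /SPARQL/SD0083-TC1-Verify.rq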

  SELECT ?animalSubjectIRI (COUNT(?usubjidIRI) AS ?total)
  WHERE{
   ?animalSubjectIRI a                        study:AnimalSubject ;
                     study:hasUniqueSubjectID ?usubjidIRI .
   ?usubjidIRI       skos:prefLabel           ?usubjidLabel .
  } GROUP BY ?animalSubjectIRI
    # The aggregate is repeated in HAVING because not every engine
    # accepts the projected alias ?total there.
    HAVING (COUNT(?usubjidIRI) != 1)

  animalSubjectIRI          total
  cj16050:Animal_2a836191    2


Putting it all together:


3-D Visualization



SD0083-Test Case 2 : Animal Subject has no USUBJID value

In the test data, AnimalSubject IRI Animal_69fa85ac has no USUBJID value, violating SD0083-RC2.

  cj16050:Animal_69fa85ac
    a study:AnimalSubject ;
    study:hasReferenceInterval cj16050:Interval_Animal_69fa85ac ;
    study:memberOf cjprot:Set_00, code:Species_Rat ;
    study:participatesIn cj16050:AgeDataCollection_Animal_69fa85ac, cj16050:SexDataCollection_Animal_69fa85ac .

The SHACL is identical to that of SD0083-Test Case 1.

The Report correctly identifies AnimalSubject IRI Animal_69fa85ac as violating the constraint that USUBJID must be non-missing.

  a sh:ValidationResult ;
    sh:resultSeverity            sh:Violation ;
    sh:sourceShape               study:hasMin1Max1Shape-USubjID ;
    sh:focusNode                 cj16050:Animal_69fa85ac ;
    sh:resultMessage             "Subject --> USUBJID violation. [SD0083]" ;
    sh:resultPath                study:hasUniqueSubjectID ;
    sh:sourceConstraintComponent sh:MinCountConstraintComponent

Because a missing USUBJID means no skos:prefLabel is available, the AnimalSubject IRI in the Report can be used to retrieve the predicates and objects attached to that IRI, which helps identify the problematic record. File: /SPARQL/SD0083-TC2-Info.rq

  SELECT ?animalIRI ?p ?o
  WHERE{
    cj16050:Animal_69fa85ac ?p ?o .
    BIND(IRI(cj16050:Animal_69fa85ac) AS ?animalIRI )
  }

animalIRI                 p                            o
cj16050:Animal_69fa85ac   rdf:type                     study:AnimalSubject
cj16050:Animal_69fa85ac   study:hasReferenceInterval   cj16050:Interval_Animal_69fa85ac
cj16050:Animal_69fa85ac   study:memberOf               cjprot:#Set_00
cj16050:Animal_69fa85ac   study:memberOf               code:Species_Rat
cj16050:Animal_69fa85ac   study:participatesIn         cj16050:AgeDataCollection_Animal_69fa85ac
cj16050:Animal_69fa85ac   study:participatesIn         cj16050:SexDataCollection_Animal_69fa85ac

Verify

SPARQL independently confirms Animal_69fa85ac has no USUBJID. Because usubjid is used as the skos:prefLabel for AnimalSubject, there is no label to return when usubjid is missing. File: /SPARQL/SD0083-TC2-Verify.rq

  SELECT ?animalIRI
  WHERE{
    ?animalIRI a study:AnimalSubject .
    # NOT EXISTS alone is sufficient; no OPTIONAL pattern is needed.
    FILTER(NOT EXISTS { ?animalIRI study:hasUniqueSubjectID ?usubjid . })
  }

animalIRI
cj16050:Animal_69fa85ac



SD0083-RC3: A USUBJID cannot be assigned to more than one Animal Subject

Test Data

In the test data, Animal Subjects Animal_5dba5b4b and Animal_1a2751f1 have the same USUBJID value.

studyid  domain  usubjid        subjid  SubjectIRI       Rule Violated
CJ16050  DM      CJ16050_00M01  00M01   Animal_a6d09184  None
CJ16050  DM      CJ16050_99T4   99T4    Animal_5dba5b4b  SD0083-RC3
CJ16050  DM      CJ16050_99T4   99T4    Animal_1a2751f1  SD0083-RC3


In RDF:

cj16050:Animal_5dba5b4b
    a study:AnimalSubject ;
    skos:prefLabel "Animal 99T4"^^xsd:string ;
    study:hasUniqueSubjectID cj16050:UniqueSubjectIdentifier_CJ16050_99T4 ;

cj16050:Animal_1a2751f1
    a study:AnimalSubject ;
    skos:prefLabel "Animal 99T4"^^xsd:string ;
    study:hasUniqueSubjectID cj16050:UniqueSubjectIdentifier_CJ16050_99T4 ;


There are multiple ways to assess the unique assignment of a USUBJID value to an AnimalSubject. Both SHACL-Core and SHACL-SPARQL are viable alternatives. Two SHACL-Core alternatives are discussed here.

Method 1: Identify USUBJIDs assigned to multiple AnimalSubjects

  • Identify duplicated USUBJID values.

Rule Statement
The target Object of the sh:inversePath for the predicate study:hasUniqueSubjectID must have a sh:maxCount of 1 .
Description
USUBJID is the Object associated with the study:hasUniqueSubjectID predicate. This method starts at the USUBJID Object and travels in the reverse direction through study:hasUniqueSubjectID using sh:inversePath to determine if a USUBJID Object is attached to more than one AnimalSubject Subject. This test is the most informative when trying to quickly identify duplicate USUBJID values without immediately identifying the AnimalSubject IRIs associated with those USUBJID values.

SHACL Shape for Method 1: Identify duplicate USUBJID values. This shape is applied to all uses of the predicate study:hasUniqueSubjectID, allowing its use for both SEND and SDTM when that predicate is present.

# Declared as a node shape: it has targets and a nested sh:property,
# but no sh:path of its own.
study:isUniqueShape-USubjID a sh:NodeShape ;
  sh:targetObjectsOf study:hasUniqueSubjectID ;
  sh:name            "uniqueUSubjid" ;
  sh:description     "A USUBJID must only be assigned to one Subject." ;
  sh:message         "USUBJID assigned to more than one Subject. [SD0083]" ;
  sh:property [
    sh:path     [ sh:inversePath study:hasUniqueSubjectID ] ;
    sh:maxCount 1
  ] .


The Validation Report from Method 1 is not shown because Method 2 was chosen for the project.


Method 2: Identify the AnimalSubjects that have the same USUBJID

  • Identify AnimalSubject IRIs assigned to the same USUBJID value.

Rule Statement
The target Class study:AnimalSubject of the sh:inversePath of the predicate study:hasUniqueSubjectID must have a sh:maxCount of 1 .
Description
The subtle difference in Method 2 is that it identifies the AnimalSubject IRIs that have the same USUBJID and does not list the offending USUBJID values. As in Method 1, the predicate study:hasUniqueSubjectID is evaluated in the reverse direction, from the USUBJID value to the AnimalSubject IRI using sh:inversePath .

  # Animal Subject Shape
  study:AnimalSubjectShape
    a              sh:NodeShape ;
    sh:targetClass study:AnimalSubject ;
    sh:property    study:hasMin1Max1Shape-USubjID ;
    sh:property    study:isUniqueShape-USubjID ;
    ...

study:isUniqueShape-USubjID 
  a  sh:PropertyShape ;
    sh:name            "uniqueUSubjid" ;
    sh:description     "A USUBJID must only be assigned to one Subject." ;
    sh:message         "USUBJID assigned to more than one Subject. [SD0083]" ;
    sh:path (study:hasUniqueSubjectID [sh:inversePath study:hasUniqueSubjectID]) ;
    sh:maxCount 1 ;
   .

Method 2 was chosen for consistency with the other checks in this section that focus on the identification of AnimalSubjects that fail constraints.
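The difference between the two methods comes down to which nodes are reported. The sketch below illustrates this in plain Python over tuples of strings; it is not SHACL, and the function names are hypothetical. The first function returns the shared USUBJID nodes (the Method 1 view), the second returns the AnimalSubject nodes that share one (the Method 2 view).

```python
from collections import defaultdict

USUBJID_PATH = "study:hasUniqueSubjectID"


def _subjects_by_usubjid(triples, path=USUBJID_PATH):
    """Index the graph in the inverse direction: USUBJID node -> set of subjects."""
    index = defaultdict(set)
    for s, p, o in triples:
        if p == path:
            index[o].add(s)
    return index


def duplicated_usubjids(triples):
    """Method 1 view: USUBJID nodes attached to more than one subject."""
    index = _subjects_by_usubjid(triples)
    return sorted(u for u, subjects in index.items() if len(subjects) > 1)


def subjects_sharing_usubjid(triples):
    """Method 2 view: subjects whose USUBJID is also attached to another subject."""
    index = _subjects_by_usubjid(triples)
    shared = set()
    for subjects in index.values():
        if len(subjects) > 1:
            shared.update(subjects)
    return sorted(shared)


if __name__ == "__main__":
    # Values taken from the SD0083-RC3 test data table.
    triples = [
        ("cj16050:Animal_a6d09184", USUBJID_PATH,
         "cj16050:UniqueSubjectIdentifier_CJ16050_00M01"),
        ("cj16050:Animal_5dba5b4b", USUBJID_PATH,
         "cj16050:UniqueSubjectIdentifier_CJ16050_99T4"),
        ("cj16050:Animal_1a2751f1", USUBJID_PATH,
         "cj16050:UniqueSubjectIdentifier_CJ16050_99T4"),
    ]
    print(duplicated_usubjids(triples))
    print(subjects_sharing_usubjid(triples))
```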

The report from Method 2 correctly identifies the Animal Subjects Animal_5dba5b4b and Animal_1a2751f1 as sharing the same USUBJID.

a sh:ValidationResult ;
  sh:sourceConstraintComponent sh:MaxCountConstraintComponent ;
  sh:focusNode cj16050:Animal_5dba5b4b ;
  sh:sourceShape _:bnode_dc2d5e41_a650_456a_87ee_944f84cffae6_826 ;
  sh:resultPath ( study:hasUniqueSubjectID [
    sh:inversePath study:hasUniqueSubjectID
  ] ) ;
  sh:resultMessage "USUBJID assigned to more than one Subject. [SD0083]" ;
  sh:resultSeverity sh:Violation

  ...

a sh:ValidationResult ;
  sh:sourceConstraintComponent sh:MaxCountConstraintComponent ;
  sh:focusNode cj16050:Animal_1a2751f1 ;
  sh:sourceShape _:bnode_dc2d5e41_a650_456a_87ee_944f84cffae6_826 ;
  sh:resultPath ( study:hasUniqueSubjectID [
    sh:inversePath study:hasUniqueSubjectID
  ] ) ;
  sh:resultMessage "USUBJID assigned to more than one Subject. [SD0083]" ;
  sh:resultSeverity sh:Violation ;

  ...


Use the AnimalSubject IRI values to identify the offending usubjid value. File: /SPARQL/SD0083-TC3-Info.rq

  SELECT ?animalIRI ?animalLabel ?usubjid
  WHERE{
    {
      cj16050:Animal_5dba5b4b study:hasUniqueSubjectID ?usubjidIRI ;
                            skos:prefLabel           ?animalLabel .
      ?usubjidIRI             skos:prefLabel           ?usubjid .
      BIND(IRI(cj16050:Animal_5dba5b4b) AS ?animalIRI )
    }
    UNION
    {
      cj16050:Animal_1a2751f1 study:hasUniqueSubjectID ?usubjidIRI ;
                              skos:prefLabel           ?animalLabel .
      ?usubjidIRI             skos:prefLabel           ?usubjid .
      BIND(IRI(cj16050:Animal_1a2751f1) AS ?animalIRI )
    }
  }

  animalIRI                 animalLabel     usubjid
  cj16050:Animal_5dba5b4b   "Animal 99T4"   "CJ16050_99T4"
  cj16050:Animal_1a2751f1   "Animal 99T4"   "CJ16050_99T4"

Verify

Independently verify Animal_5dba5b4b and Animal_1a2751f1 share the same USUBJID (and consequently the same label for the AnimalSubject and USUBJID). File: /SPARQL/SD0083-TC3-Verify.rq

  # The label returned is the AnimalSubject's skos:prefLabel,
  # so the variable is named ?animalLabel.
  SELECT ?animalIRI ?animalLabel
  WHERE{
    ?animalIRI   study:hasUniqueSubjectID ?usubjidIRI ;
                 skos:prefLabel           ?animalLabel .
    ?animalIRI2  study:hasUniqueSubjectID ?usubjidIRI .
    FILTER(?animalIRI != ?animalIRI2)
  }

  animalIRI                 animalLabel
  cj16050:Animal_1a2751f1   "Animal 99T4"
  cj16050:Animal_5dba5b4b   "Animal 99T4"



The following information is outdated as of 2020-02-13.

Rule SD1001 : SUBJID

The spreadsheet FDA-Validator-Rules.xlsx defines the rule for SUBJID in the DM Domain as:

FDA Validator Rule ID: SD1001
FDA Validator Message: Duplicate SUBJID
Business or Conformance Rule Validated: Subject identifier, which must be unique within the study.
FDA Validator Rule: The value of Subject Identifier for the Study (SUBJID) variable must be unique for each subject within the study.

The Rule Components and corresponding SHACL shapes for SD1001 are similar to those defined for USUBJID in SD0083, with the exception of the predicate changing to study:hasSubjectID and result messages specific to SUBJID instead of USUBJID. Details for SD1001 are therefore not provided here. The SHACL is available in the file SHACL-AnimalSubject.TTL.

Rule SD1002 : RFSTDTC, RFENDTC (Reference Interval)

The spreadsheet FDA-Validator-Rules.xlsx defines Rule SD1002 for Reference Start Date (RFSTDTC) and Reference End Date (RFENDTC) as:

FDA Validator Rule ID: SD1002
FDA Validator Message: RFSTDTC is after RFENDTC
Business or Conformance Rule Validated: Study Start and End Dates must be submitted and complete.
FDA Validator Rule: Subject Reference Start Date/Time (RFSTDTC) must be less than or equal to Subject Reference End Date/Time (RFENDTC)

This project models RFSTDTC and RFENDTC as part of a Reference Interval (see next figure) connected to the Animal Subject IRI. This leads to the deconstruction of the FDA rule into the following Rule Components:

RC1. Reference Start Date and End Date must be in xsd:date format.

RC2. An Animal Subject has one Reference Interval.

RC3. A Reference Interval has one Start Date and one End Date.

RC4. Start Date must be on or before End Date.
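Rule Component 4 reduces to a date comparison once both values are valid dates. The following is a minimal sketch in plain Python, assuming both inputs are already in YYYY-MM-DD form (that is, RC1 has passed); the function name is hypothetical and the example dates are borrowed from the test data tables on this page.

```python
from datetime import date


def start_on_or_before_end(rfstdtc, rfendtc):
    """True when the reference start date is on or before the reference end date.

    Assumes both arguments are YYYY-MM-DD strings; other forms raise ValueError.
    """
    return date.fromisoformat(rfstdtc) <= date.fromisoformat(rfendtc)


if __name__ == "__main__":
    print(start_on_or_before_end("2016-12-07", "2016-12-07"))
    print(start_on_or_before_end("2016-12-28", "2016-12-25"))
```

Equal start and end dates conform, since the rule requires "less than or equal to".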

Data Structure

Familiarity with the data structure is necessary to explain the constraints and test cases. The figure below illustrates a partial set of data for an AnimalSubject where the Reference Interval end date precedes the start date, thus violating Rule Component 4 of SD1002.

AnimalSubject with incorrect Reference Interval dates

SD1002-RC1. Reference Start Date and End Date in xsd:date format
Rule Statement
rfstdtc and rfendtc in xsd:date format.
Description
Reference Start Date (RFSTDTC) and End Date (RFENDTC) must be in date format. The study in this example requires xsd:date. Other studies may use xsd:dateTime or a combination of xsd:date and xsd:dateTime.
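A lexical check for the date form can be sketched as below. It is a simplification and an assumption about what "date format" means here: it accepts only YYYY-MM-DD, ignores the optional timezone suffix that xsd:date permits, and inspects the text of the value rather than the datatype attached to an RDF literal. The function name is hypothetical.

```python
import re
from datetime import date

# YYYY-MM-DD only; the optional xsd:date timezone suffix is deliberately not handled.
_DATE_FORM = re.compile(r"\d{4}-\d{2}-\d{2}")


def looks_like_xsd_date(value):
    """True when value is a real calendar date written as YYYY-MM-DD."""
    if _DATE_FORM.fullmatch(value) is None:
        return False
    try:
        date.fromisoformat(value)
    except ValueError:
        # Right shape but not a calendar date, for example 2016-02-30.
        return False
    return True


if __name__ == "__main__":
    for candidate in ("2016-12-07", "5-DEC-16", "6-DEC-16"):
        print(candidate, looks_like_xsd_date(candidate))
```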

Test Data

In the test data, two Animal Subjects have string values for dates instead of the required xsd:date.

usubjid  SubjectIRI       rfstdtc     rfendtc     Rule Violated
00M01    Animal_a6d09184  2016-12-07  2016-12-07  None
99T6     Animal_aa573a5d  5-DEC-16    2016-12-07  SD1002-RC1
99T7     Animal_cdd31fb6  2016-12-07  6-DEC-16    SD1002-RC1


OUTDATED TEXT BEGINS Here

Refer back to previous sections to compare the data to the SHACL, below. The shape :DateFmtShape uses sh:targetObjectsOf to begin evaluation at the object of the predicates time:hasBeginning and time:hasEnd. These objects must be of type study:ReferenceBegin or study:ReferenceEnd and have the predicate time:inXSDDate that leads to the date value that must be in xsd:date format.

Interval IRI - - - time:hasBeginning  - - > Date IRI - - > time:inXSDDate - - > Date value

Interval IRI - - - time:hasEnd  - - > Date IRI - - > time:inXSDDate - - > Date value

Test data for Animal Subject 99T4 contains a string value for rfendtc. Not shown: Subject 99T10 with a string value for rfstdtc.


cj16050:Animal_68bab561
  a                          study:AnimalSubject ;
  skos:prefLabel             "Animal 99T4"^^xsd:string ;
  study:hasReferenceInterval cj16050:Interval_Animal_68bab561 ;
  ...

cj16050:Interval_Animal_68bab561
  a                 study:ReferenceInterval ;
  time:hasBeginning cj16050:Date_2016-12-08 ;
  time:hasEnd       cj16050:Date_7-DEC-16  .

cj16050:Date_7-DEC-16
      a study:ReferenceEnd ;
      time:inXSDDate "7-DEC-16"^^xsd:string .


The shape tests the following conditions:

  • A Reference Start Date must be in xsd:date format.
  • A Reference End Date must be in xsd:date format.

Additional dates can be assessed by adding additional predicates as sh:targetObjectsOf if the date follows through the path time:inXSDDate.

study:hasTypeXsdDateShape-Date a sh:NodeShape ;
  sh:targetObjectsOf time:hasBeginning ;
  sh:targetObjectsOf time:hasEnd ;
  sh:or (
    [ sh:class study:ReferenceBegin ]
    [ sh:class study:ReferenceEnd ]
  ) ;
  sh:property [
    sh:path        time:inXSDDate ;
    sh:datatype    xsd:date ;
    sh:name        "xsd:date format";
    sh:description "Date format as xsd:date.";
    sh:message     "Date not in xsd:date format. [SD1002]"
  ] .


The report correctly identifies the value ‘7-DEC-16’ as a string, violating the xsd:date requirement.

  a sh:ValidationReport ;
    sh:conforms false ;
    sh:result [
      a sh:ValidationResult ;
        sh:resultPath time:inXSDDate ;
        sh:resultSeverity sh:Violation ;
        sh:resultMessage "Date not in xsd:date format. [SD1002]" ;
        sh:value "7-DEC-16" ;
        sh:sourceShape _:bnode_3c9cf811_13d4_43cb_b212_b7097d00b1ed_221 ;
        sh:sourceConstraintComponent sh:DatatypeConstraintComponent ;
        sh:focusNode cj16050:Date_7-DEC-16 ;
    ]


The Report identifies the dates “7-DEC-16” and “6-DEC-16” (not shown above). Execute the following SPARQL to find the corresponding Animal Subject IRIs and values (Animal 99T4 for date “7-DEC-16” and Animal 99T10 for date “6-DEC-16”). Source file: /SPARQL/Animal-RefInterval.rq

  # RC1 : Find Subject with incorrect date format
  SELECT ?animalSubjectIRI ?animalLabel ?date
  WHERE{
    ?animalSubjectIRI a                          study:AnimalSubject ;
                      skos:prefLabel             ?animalLabel ;
                      study:hasReferenceInterval ?intervalIRI .

    ?intervalIRI ?beginOrEnd     ?dateIRI .
    ?dateIRI     time:inXSDDate  ?date .
    FILTER (?dateIRI IN (cj16050:Date_6-DEC-16, cj16050:Date_7-DEC-16))
  }

Verify

SPARQL independently verifies the test case by finding the two dates that are incorrectly typed as strings. Source file: /SPARQL/Animal-RefInterval.rq

  # RC1 : Independently verify with SPARQL
  SELECT ?refIntervalIRI ?dateIRI ?date ?dateDType
  WHERE{
    ?refIntervalIRI a              study:ReferenceInterval ;
                    ?beginOrEnd    ?dateIRI .
    ?dateIRI        time:inXSDDate ?date .                
    FILTER (datatype(?date)  != xsd:date)
}



SD1002-RC2: Subject has one Reference Interval
Rule Statement
:AnimalSubject :hasReferenceInterval with sh:minCount and sh:maxCount of 1
Description
Animal Subjects should have one and only one Reference Interval IRI.

Test Data

In the test data, subject 99T8 has no Reference Interval. While it is possible to have a reference interval without start and end dates (see Data Conversion), subject 99T8 has no interval created during the data conversion process (see RDF, below), specifically to test this constraint. Subject 99T9 has two reference intervals.

usubjid  SubjectIRI       rfstdtc     rfendtc     Rule Violated
00M01    Animal_a6d09184  2016-12-07  2016-12-07  None
99T8     Animal_d9209e97  NA          NA          SD1002-RC2-TC1
99T9     Animal_cdd31fb6  2016-11-11  2016-11-25  SD1002-RC2-TC2
99T9     Animal_cdd31fb6  2016-12-20  2016-12-28  SD1002-RC2-TC2


OUTDATED TEXT BEGINS Here

This check determines if the Animal Subject has one and only one Reference Interval IRI. While it is possible to have an Interval IRI with no start date and no end date (see Data Conversion), this rule component only evaluates the case of missing Reference Interval IRIs. Multiple start/end dates for a single subject are evaluated in Rule Component 3.

Test data for Animal Subject 99T11 has no study:hasReferenceInterval .

Not tested:

AnimalSubject with more than one Reference Interval.

cj16050:Animal_2a836191
    a                        study:AnimalSubject ;
    skos:prefLabel           "Animal 99T11"^^xsd:string ;
    study:hasSubjectID       cj16050:SubjectIdentifier_6204e90c ;
    study:hasUniqueSubjectID cj16050:UniqueSubjectIdentifier_6204e90c ;
    study:memberOf           cjprot:Set_00, code:Species_Rat ;
    study:participatesIn     cj16050:AgeDataCollection_Animal_6204e90c, cj16050:SexDataCollection_Animal_6204e90c .


The study ontology defines study:AnimalSubject as a subclass of both study:Subject and study:Animal. Study subjects, be they animal or person, have a Reference Interval documenting their participation in a trial. Therefore, when the ontology is loaded into the database, the same constraint can be used for both pre-clinical (SEND) and clinical (SDTM) studies. This same ontological approach is taken for USUBJID and SUBJID.

study:Subject
  rdf:type owl:Class ;
  rdfs:subClassOf study:Party ;
  skos:prefLabel "Subject" ;
.
study:Animal
  rdf:type owl:Class ;
  rdfs:subClassOf study:BiologicEntity ;
  skos:prefLabel "Animal" ;
.
study:AnimalSubject
  rdf:type owl:Class ;
  rdfs:subClassOf study:Animal ;
  rdfs:subClassOf study:Subject ;
  skos:prefLabel "Animal subject" ;
.


The SHACL shape evaluates the path study:hasReferenceInterval from the targetClass to determine if one and only one Reference Interval IRI is present. When the ontology is loaded, the more general study:Subject can be leveraged as the targetClass, assuming other study:Subjects like study:HumanStudySubject use the same predicate. The commented-out alternative is also provided for when the ontology is not loaded, or for cases where the constraint should only apply to study:AnimalSubject and not other classes like study:HumanStudySubject.

# Declared as a property shape because it carries sh:path.
study:hasMin1Max1Shape-Interval a sh:PropertyShape ;
  sh:targetClass study:Subject ;  # Ontology
  # sh:targetClass study:AnimalSubject ; # No Ontology
  sh:path        study:hasReferenceInterval ;
  sh:name        "reference interval present";
  sh:description "Animal Subject must have one and only one reference interval IRI.";
  sh:message     "Animal Subject does not have one Reference Interval IRI. [SD1002]" ;
  sh:minCount    1 ;
  sh:maxCount    1 .


The report identifies the IRI cj16050:Animal_2a836191 , corresponding to Animal Subject 99T11.

a sh:ValidationReport ;                                                                  
  sh:conforms false ;                                                                  
  sh:result [                                                                          
    a sh:ValidationResult ;                                                          
      sh:sourceShape :SubjectOneRefIntervalShape ;                                 
      sh:resultPath study:hasReferenceInterval ;                                   
      sh:resultSeverity sh:Violation ;                                             
      sh:focusNode cj16050:Animal_2a836191 ;                                       
      sh:resultMessage "Animal Subject does not have one Reference Interval IRI. [SD1002]" ;
      sh:sourceConstraintComponent sh:MinCountConstraintComponent                  
  ]                                                                                    

SPARQL identifies the reported IRI as belonging to AnimalSubject 99T11, also confirming there is no study:hasReferenceInterval predicate.

  # RC 2 : Information : predicates and objects for the IRI in the report
  SELECT ?s ?p ?o
  WHERE {
    cj16050:Animal_2a836191 ?p ?o .
    BIND( IRI(cj16050:Animal_2a836191) as ?s)
  }  ORDER BY ?p

Verify

Verification identifies Animal Subject 99T11 with no Reference Interval.

# RC 2 : Verify: Number of reference intervals per subject
SELECT ?animalSubjectIRI ?animalLabel (COUNT(?intervalIRI) AS ?numIntervals)
WHERE {
  ?animalSubjectIRI a              study:AnimalSubject ;
                    skos:prefLabel ?animalLabel .
  OPTIONAL {
    ?animalSubjectIRI study:hasReferenceInterval ?intervalIRI .
  }
}
GROUP BY ?animalSubjectIRI ?animalLabel
HAVING (COUNT(?intervalIRI) != 1)
# ORDER BY ?animalLabel
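The counting logic behind this verification can also be sketched outside the triple store. The following is a minimal illustration in plain Python, assuming the triples have already been exported as (subject, predicate, object) tuples. The ex: IRIs and the function name are hypothetical and are not taken from the study data.

```python
from collections import Counter

# Hypothetical in-memory triples (subject, predicate, object); the ex: IRIs
# are illustrative stand-ins, not values from the study data.
TRIPLES = [
    ("ex:Animal_A", "rdf:type", "study:AnimalSubject"),
    ("ex:Animal_A", "study:hasReferenceInterval", "ex:Interval_A"),
    ("ex:Animal_B", "rdf:type", "study:AnimalSubject"),
    ("ex:Animal_C", "rdf:type", "study:AnimalSubject"),
    ("ex:Animal_C", "study:hasReferenceInterval", "ex:Interval_C1"),
    ("ex:Animal_C", "study:hasReferenceInterval", "ex:Interval_C2"),
]


def interval_count_violations(triples):
    """Return {subject: count} for subjects whose interval count is not 1."""
    subjects = [s for s, p, o in triples
                if p == "rdf:type" and o == "study:AnimalSubject"]
    counts = Counter(s for s, p, _ in triples
                     if p == "study:hasReferenceInterval")
    # A Counter yields 0 for subjects with no interval, which plays the same
    # role as OPTIONAL + COUNT in the SPARQL query.
    return {s: counts[s] for s in subjects if counts[s] != 1}


violations = interval_count_violations(TRIPLES)
print(violations)
```

Subjects with zero intervals and subjects with more than one interval are both returned, matching the sh:minCount 1 and sh:maxCount 1 pair in the shape.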



SD1002-RC3 : Reference Interval has one Start Date and one End Date
Rule Statement
study:ReferenceInterval time:hasBeginning with sh:minCount and sh:maxCount of 1, sh:and time:hasEnd with sh:minCount and sh:maxCount of 1
Description
Each Reference Interval must have exactly one start date and exactly one end date.

Test Data

In the test data, subject 99T8 has no Reference Interval. While it is possible to have a reference interval without start and end dates (see Data Conversion), no interval was created for 99T8 during the data conversion process (see RDF, below), specifically to test this constraint. Subject 99T9 has two reference intervals.

usubjid  SubjectIRI       rfstdtc     rfendtc     Rule Violated
00M01    Animal_a6d09184  2016-12-07  2016-12-07  None
99T11    Animal_c5e105c3  NA          2016-12-07  SD1002-RC3-TC1
99T12    Animal_664e018b  2016-12-07  NA          SD1002-RC3-TC2
99T8     Animal_d9209e97  NA          NA          SD1002-RC3-TC3
99T13    Animal_8196e3ec  2016-12-28  2016-12-25  SD1002-RC3-TC4


OLD TEXT AFTER HERE. Inaccurate information follows. Turn back now...

Reference interval IRIs are connected to their date values through the paths time:hasBeginning and time:hasEnd. A correctly formed interval has both start and end dates.

Test data provides the following violations:

  • 99T11 missing rfstdtc (TC1)
  • 99T12 missing rfendtc (TC2)
  • 99T8 missing both rfstdtc and rfendtc (TC3)
  • 99T13 >1 rfstdtc, >1 rfendtc (TC4)

Only the data and report for 99T5 are shown here; its Reference Interval has a start date but no end date.

cj16050:Animal_db3c6403
  a                          study:AnimalSubject ;
  skos:prefLabel             "Animal 99T5"^^xsd:string ;
  study:hasReferenceInterval cj16050:Interval_Animal_db3c6403  ;
  study:hasSubjectID         cj16050:SubjectIdentifier_db3c6403 ;
  study:hasUniqueSubjectID   cj16050:UniqueSubjectIdentifier_db3c6403 ;
  study:memberOf             cjprot:Set_00, code:Species_Rat ;
  study:participatesIn       cj16050:AgeDataCollection_Animal_db3c6403, cj16050:SexDataCollection_Animal_db3c6403 .

cj16050:Interval_Animal_db3c6403
  a                 study:ReferenceInterval ;
  skos:prefLabel    "Interval 2016-12-07 NA"^^xsd:string ;
  time:hasBeginning cj16050:Date_2016-12-07 .


The study ontology defines study:ReferenceInterval as a subclass of study:EntityInterval.

study:EntityInterval
  rdf:type owl:Class ;
  rdfs:subClassOf time:Interval ;
  skos:prefLabel "Entity interval" ;
.
study:ReferenceInterval
  rdf:type owl:Class ;
  rdfs:subClassOf study:EntityInterval ;
  skos:prefLabel "Reference interval" ;
.
study:Lifespan
  rdf:type owl:Class ;
  rdfs:subClassOf study:EntityInterval ;
  skos:prefLabel "Lifespan" ;
.
study:MedicalConditionInterval
  rdf:type owl:Class ;
  rdfs:subClassOf study:EntityInterval ;
  skos:prefLabel "Medical event interval" ;
.
study:StudyParticipationInterval
  rdf:type owl:Class ;
  rdfs:subClassOf study:EntityInterval ;
  skos:prefLabel "Study participation interval" ;
.


All subclasses of study:EntityInterval must have a time:hasBeginning and a time:hasEnd. When the ontology is loaded into the database, a single shape can therefore evaluate all of the following types of intervals:

  • study:ReferenceInterval
  • study:Lifespan
  • study:MedicalConditionInterval
  • study:StudyParticipationInterval

The ontology facilitates the use of the shape in both pre-clinical (SEND) and clinical (SDTM) studies.

The shape tests the following conditions:

  • Interval Start date for an Animal Subject has one and only one value.
  • Interval End date for an Animal Subject has one and only one value.

study:hasMin1Max1Shape-StartEndDates a sh:NodeShape ;
  sh:targetClass study:EntityInterval ; # Ontology
  # sh:targetClass study:ReferenceInterval ; # No Ontology
  sh:name        "intervalDateCount" ;
  sh:description "Interval has one and only one start and end date." ;
  sh:message     "Problem with Interval date. [SD1002]" ;
  sh:and (
    [ sh:path time:hasBeginning ;
      sh:minCount 1;
      sh:maxCount 1
    ]
    [
      sh:path time:hasEnd ;
      sh:minCount 1;
      sh:maxCount 1
    ]
 )
.
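Because the two conditions are combined in one sh:and with a single generic message, the report does not say which side failed. As a rough illustration of the same check that does name the failing side, here is a minimal Python sketch. It assumes the intervals have been exported to a dictionary; the function name, the dictionary layout, and the ex: entries are hypothetical.

```python
def date_count_problems(intervals):
    """Map each failing interval IRI to a list of human-readable problems.

    intervals: {interval_iri: {"begin": [date IRIs], "end": [date IRIs]}}
    """
    problems = {}
    for iri, dates in intervals.items():
        found = []
        for label, key in (("start", "begin"), ("end", "end")):
            n = len(set(dates.get(key, [])))  # count distinct value nodes
            if n != 1:
                found.append(f"{n} {label} dates")
        if found:
            problems[iri] = found
    return problems


# cj16050:Interval_Animal_db3c6403 is the 99T5 interval shown above (start
# date only); the two ex: entries are hypothetical.
INTERVALS = {
    "cj16050:Interval_Animal_db3c6403": {"begin": ["cj16050:Date_2016-12-07"]},
    "ex:Interval_ok": {"begin": ["ex:Date_1"], "end": ["ex:Date_2"]},
    "ex:Interval_two_starts": {"begin": ["ex:Date_1", "ex:Date_3"],
                               "end": ["ex:Date_2"]},
}

problems = date_count_problems(INTERVALS)
print(problems)
```

In SHACL itself, a comparable effect could be had by splitting the sh:and into two separate property shapes, each with its own sh:message; that is a design option, not something this project is known to do.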


The report identifies the interval for Animal Subject 99T5 (cj16050:Interval_Animal_db3c6403) as violating the constraint.

  a sh:ValidationResult ;
    sh:sourceConstraintComponent sh:AndConstraintComponent ;
    sh:focusNode cj16050:Interval_Animal_db3c6403 ;
    sh:resultMessage "Problem with Interval date. [SD1002]" ;
    sh:value cj16050:Interval_Animal_db3c6403  ;
    sh:sourceShape :RefIntervalDateShape ;
    sh:resultSeverity sh:Violation ;

SPARQL can trace the reference interval from the report back to AnimalSubject 99T5, showing this individual is missing rfendtc.

SELECT ?animalLabel  ?beginDate ?endDate
WHERE{
  ?animalSubjectIRI study:hasReferenceInterval cj16050:Interval_Animal_db3c6403 ;
                    skos:prefLabel    ?animalLabel .

   OPTIONAL{
     cj16050:Interval_Animal_db3c6403 time:hasBeginning ?beginIRI .
     ?beginIRI time:inXSDDate  ?beginDate .
   }
   OPTIONAL{
     cj16050:Interval_Animal_db3c6403 time:hasEnd ?endIRI .
     ?endIRI time:inXSDDate  ?endDate .
  }
}

Verify

The query below correctly lists the AnimalSubjects with start and end date data issues as 99T2, 99T5, 99T8, 99T9.

#--- RC 3: Verify : Pull all subject IDs that do not have one start and one End date
SELECT ?animalLabel (COUNT(DISTINCT ?beginDate) AS ?numBeginDate)
       (COUNT(DISTINCT ?endDate) AS ?numEndDate)
WHERE{
  ?animalSubjectIRI study:hasReferenceInterval ?intervalIRI ;
                    skos:prefLabel             ?animalLabel .
   OPTIONAL{
     ?intervalIRI time:hasBeginning ?beginIRI .
     ?beginIRI    time:inXSDDate    ?beginDate .
   }

   OPTIONAL{
     ?intervalIRI time:hasEnd     ?endIRI .
     ?endIRI      time:inXSDDate  ?endDate .
  }
} GROUP BY ?animalSubjectIRI ?animalLabel
  HAVING ((COUNT(DISTINCT ?beginDate) != 1) || (COUNT(DISTINCT ?endDate) != 1))

Rule Component 4. Start Date on or before End Date
Rule Statement
For interval, ! (?endDate >= ?beginDate )
Description
Interval start date must be on or before end date. When the constraint is violated, the report must display the FDA Validator Message "RFSTDTC is after RFENDTC".

Referring back to previous sections, the reference start and end dates are not directly attached to either an Animal Subject or that Subject’s Reference Interval IRI. This indirect connection makes the comparison of the two date values more complex, so SHACL-SPARQL is used in place of SHACL-Core. The SPARQL query is written to find cases where the end date is NOT greater than or equal to the start date. Test data provides the following violations:

  • 99T1 start date is after end date
  • 99T2 multiple start/end date values, one start date is before one end date value
  • 99T10 String value for rfstdtc results in a violation when comparing to rfendtc

Only the data and report for 99T1 are shown below.

cj16050:Animal_184f16eb
    a study:AnimalSubject ;
    skos:prefLabel "Animal 99T1"^^xsd:string ;
    study:hasReferenceInterval cj16050:Interval_Animal_184f16eb ;
    study:hasSubjectID cj16050:SubjectIdentifier_184f16eb ;
    study:hasUniqueSubjectID cj16050:UniqueSubjectIdentifier_184f16eb ;
    study:memberOf cjprot:Set_00, code:Species_Rat ;
    study:participatesIn cj16050:AgeDataCollection_Animal_184f16eb, cj16050:SexDataCollection_Animal_184f16eb .

cj16050:Interval_Animal_184f16eb
    a study:ReferenceInterval ;
    skos:prefLabel "Interval 2016-12-07 2016-12-06"^^xsd:string ;
    time:hasBeginning cj16050:Date_2016-12-07 ;
    time:hasEnd cj16050:Date_2016-12-06 .


cj16050:Date_2016-12-07
    a study:ReferenceBegin ;
    skos:prefLabel "Date 2016-12-07"^^xsd:string ;
    time:inXSDDate "2016-12-07"^^xsd:date ;
    study:dateTimeInXSDString "2016-12-07"^^xsd:string .

cj16050:Date_2016-12-06
    a study:ReferenceEnd ;
    skos:prefLabel "Date 2016-12-06"^^xsd:string ;
    time:inXSDDate "2016-12-06"^^xsd:date ;
    study:dateTimeInXSDString "2016-12-06"^^xsd:string .


The shape tests the following condition:

  • Reference Interval Start Date must be on or before End Date
  • The shape will also pick up cases where a date is in xsd:string format.

As described for Rule Component 3, this shape is applied at sh:targetClass study:EntityInterval for intervals that contain time:hasBeginning and time:hasEnd predicates. The commented-out sh:targetClass study:ReferenceInterval is the alternative for when the ontology is not loaded.

study:hasStartLEEndShape-Interval a sh:NodeShape ;
  sh:targetClass study:EntityInterval ; # Ontology
  # sh:targetClass study:ReferenceInterval ; # No Ontology
  sh:sparql [
    a              sh:SPARQLConstraint ;
    sh:name        "sd1002" ;
    sh:description "Interval start date on or before end date." ;
    sh:message     "Interval Start Date on or before End Date" ;
    sh:prefixes [
      sh:declare [
        sh:prefix    "time" ;
        sh:namespace "http://www.w3.org/2006/time#"^^xsd:anyURI ;
      ],
      [
        sh:prefix    "study" ;
        sh:namespace "https://w3id.org/phuse/study#"^^xsd:anyURI ;
      ]
    ] ;
    sh:select
      """SELECT $this (?beginDate AS ?intervalStart) (?endDate AS ?intervalEnd)
        WHERE {
          $this     time:hasBeginning  ?beginIRI ;
                    time:hasEnd        ?endIRI .
          ?beginIRI time:inXSDDate     ?beginDate .
          ?endIRI   time:inXSDDate     ?endDate .
          FILTER  (! (?endDate >= ?beginDate ))
        }""" ;
  ] .
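The comparison in the FILTER can be illustrated with a small Python sketch using values from the test data. It assumes, as the text on this page describes, that a start value that is not a valid date is reported as a violation; how a given SPARQL engine treats a comparison between a date and a string may differ, so that branch is an assumption. The function names are hypothetical.

```python
from datetime import date


def parse_xsd_date(value):
    """Return a date for an ISO 8601 'YYYY-MM-DD' string, else None."""
    try:
        return date.fromisoformat(value)
    except (TypeError, ValueError):
        return None


def start_after_end(begin, end):
    """True when the pair should be reported, i.e. not (end >= begin)."""
    b, e = parse_xsd_date(begin), parse_xsd_date(end)
    if b is None or e is None:
        # Assumption: values that cannot be read as dates are reported,
        # as described for subject 99T10.
        return True
    return not (e >= b)


r_99t1 = start_after_end("2016-12-07", "2016-12-06")   # start after end
r_00m01 = start_after_end("2016-12-07", "2016-12-07")  # same day
r_99t10 = start_after_end("6-DEC-16", "2016-12-07")    # string start value
print(r_99t1, r_00m01, r_99t10)
```

Writing the test as not (end >= begin), rather than begin > end, keeps the sketch aligned with the Rule Statement for this component.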


The report identifies the interval for Animal Subject 99T1 where End Date precedes Start Date.

  a sh:ValidationResult ;
  sh:sourceConstraint _:bnode_cacffc33_62e3_4c8b_bdba_e71e398a23dc_29 ;
  sh:sourceShape :SD1002RuleShape ;
  sh:resultMessage "Interval Start Date on or before End Date. [SD1002]" ;
  sh:value cj16050:Interval_Animal_184f16eb ;
  sh:sourceConstraintComponent sh:SPARQLConstraintComponent ;
  sh:resultSeverity sh:Violation ;
  sh:focusNode cj16050:Interval_Animal_184f16eb

SPARQL traces the interval back to the AnimalSubject and date values.

SELECT ?animalLabel (?beginDate AS ?intervalStart) (?endDate AS ?intervalEnd)
WHERE {
  ?animalSubjectIRI study:hasReferenceInterval ?intervalIRI ;
                    skos:prefLabel             ?animalLabel .

  ?intervalIRI time:hasBeginning  ?beginIRI .
  ?beginIRI    time:inXSDDate     ?beginDate .

  ?intervalIRI time:hasEnd        ?endIRI .
  ?endIRI    time:inXSDDate       ?endDate .
  FILTER  (! (?endDate >= ?beginDate ))
}

Verify

Verification confirms Animal Subjects 99T1 and 99T2 with End Date preceding Start Date. Note that when the start date is a string, the query also flags AnimalSubject 99T10 as a violator. The SPARQL statement is very similar to the query used in the SHACL-SPARQL constraint.

  # RC4 : Verify :
  SELECT ?animalLabel ?intervalIRI (?beginDate AS ?intervalStart) (?endDate AS ?intervalEnd)
  WHERE {
    ?animalSubjectIRI study:hasReferenceInterval ?intervalIRI ;
                      skos:prefLabel             ?animalLabel .

    ?intervalIRI time:hasBeginning  ?beginIRI .
    ?beginIRI    time:inXSDDate     ?beginDate .

    ?intervalIRI time:hasEnd        ?endIRI .
    ?endIRI    time:inXSDDate       ?endDate .
    FILTER  (! (?endDate >= ?beginDate ))
  }

  animalLabel     intervalIRI                        intervalStart  intervalEnd
  "Animal 99T1"   cj16050:Interval_Animal_184f16eb   2016-12-07     2016-12-06
  "Animal 99T10"  cj16050:Interval_Animal_56cbc8c2   "6-DEC-16"     2016-12-07
  "Animal 99T2"   cj16050:Interval_Animal_21316392   2016-12-08     2016-12-07

Animal Subject Shape - Demographics Domain

Age

The following figure shows the connection from the Animal Subject IRI to its Age value.

Animal Subject Data Structure for Age

The spreadsheet FDA-Validator-Rules.xlsx defines numerous rules associated with Age in the DM domain. This project defines only a subset of these rules as SHACL Shapes. For example, the rule SD2019 “Invalid value for AGETXT” is not applicable because the example study collects AGE (numeric) and not AGETXT (age range as a string).

FDA Rule SD0084

FDA Validator Rule ID: SD0084
FDA Validator Message: Negative value for age
Business or Conformance Rule Validated: Values for age variables cannot be negative.
FDA Validator Rule: The value of Age (AGE) cannot be less than 0.


Rule Component

1. AGE must be greater than or equal to 0.

Data Structure

Refer back to previous sections to see how age is indirectly associated with an AnimalSubject: a study:participatesIn predicate leads to a data collection IRI, whose code:outcome is an outcome IRI that holds the age value and units. Most subjects in the study are the same age (8 Weeks), so the check runs against a small number of shared outcome IRIs rather than once per age value for each Animal Subject.

Rule Component 1. AGE must be greater than or equal to 0.
Rule Statement
age sh:minInclusive 0.
Description
The Age value must be greater than or equal to 0. While this study has no subjects with age=0, the check is constructed to satisfy the FDA rule.

The age for Animal Subject 99T1 was set to -10 for testing.

  cj16050:Animal_184f16eb
    a                    study:AnimalSubject ;
    study:participatesIn cj16050:AgeDataCollection_Animal_184f16eb,
  ...

  cj16050:AgeDataCollection_Animal_184f16eb
    a            code:AgeDataCollection ;
    code:outcome cj16050:Age_-10_WEEKS .

  cj16050:Age_-10_WEEKS
    a                     study:Age ;
    time:hasXSDDuration  "P56D"^^xsd:duration ;
    time:numericDuration -10 ;
    time:unitType        time:unitWeek .


The shape tests the following condition:

  • An Age value must be >= 0

  study:hasminInclusive0Shape-Age a sh:NodeShape ;
    sh:targetClass study:Age ;
    sh:name        "agGTE0" ;
    sh:description "Age must be greater than or equal to 0." ;
    sh:message     "Negative value for AGE.  [SD0084]" ;
    sh:property [
      sh:path         time:numericDuration ;
      sh:minInclusive 0 ;
    ] .


The report correctly identifies the value ‘-10’.

  a sh:ValidationReport ;
    sh:result [
      a sh:ValidationResult ;
        sh:sourceConstraintComponent sh:MinInclusiveConstraintComponent ;
        sh:sourceShape [] ;
        sh:focusNode cj16050:Age_-10_WEEKS ;
        sh:resultPath time:numericDuration ;
        sh:value -10 ;
        sh:resultSeverity sh:Violation
    ] ;
    sh:conforms false


The report lists the Age value and Age outcome IRI, but not the AnimalSubject associated with the offending value. SPARQL can be used to identify the Animal Subject using the Age Outcome IRI identified in the report. Source file: /SPARQL/Animal-Age-LT0.rq

  SELECT ?animalLabel
  WHERE{
    ?ageDataCollIRI   code:outcome cj16050:Age_-10_WEEKS .

    ?AnimalSubjectIRI study:participatesIn ?ageDataCollIRI ;
                      skos:prefLabel       ?animalLabel .
  }

SPARQL independently verifies the Animal Subject with age < 0. Source file: /SPARQL/Animal-Age-LT0.rq

  SELECT ?animalIRI ?animalLabel ?age
  WHERE{
    ?animalIRI       study:participatesIn ?ageDataCollIRI ;
                     skos:prefLabel       ?animalLabel .
    ?ageDataCollIRI  code:outcome         ?ageIRI .
    ?ageIRI          time:numericDuration ?age .
    FILTER (?age < 0)
  }
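The two-hop path that the query follows (subject to data collection to age outcome) can be sketched in Python as a pair of dictionary lookups. This is only an illustration under the assumption that the triples have been exported to dictionaries; the function name, the dictionary layout, and the ex: entries are hypothetical, while the cj16050: IRIs are the ones shown above for 99T1.

```python
def subjects_with_negative_age(participates_in, outcome, numeric_duration):
    """Follow subject -> data collection -> age outcome and keep ages < 0."""
    hits = []
    for animal, collections in participates_in.items():
        for coll in collections:
            age_iri = outcome.get(coll)            # None if no outcome
            age = numeric_duration.get(age_iri)    # None if no numeric value
            if age is not None and age < 0:
                hits.append((animal, age_iri, age))
    return hits


# study:participatesIn
PARTICIPATES_IN = {
    "cj16050:Animal_184f16eb": ["cj16050:AgeDataCollection_Animal_184f16eb"],
    "ex:Animal_B": ["ex:AgeDataCollection_Animal_B"],  # hypothetical subject
}
# code:outcome
OUTCOME = {
    "cj16050:AgeDataCollection_Animal_184f16eb": "cj16050:Age_-10_WEEKS",
    "ex:AgeDataCollection_Animal_B": "cj16050:Age_8_WEEKS",
}
# time:numericDuration
NUMERIC_DURATION = {"cj16050:Age_-10_WEEKS": -10, "cj16050:Age_8_WEEKS": 8}

negative = subjects_with_negative_age(PARTICIPATES_IN, OUTCOME, NUMERIC_DURATION)
print(negative)
```

Because many subjects share one outcome IRI, a single offending outcome can map back to several subjects; the sketch returns one tuple per subject for that reason.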


Rule SD1121 : Age

FDA Validator Rule ID: SD1121
FDA Validator Message: Age or age range must be provided for all subjects, except for Screen Failures.
Business or Conformance Rule Validated: Age or age range must be provided for all subjects, except for Screen Failures.
FDA Validator Rule: Value for Age (AGE) or Age Range (AGETXT) variables should be populated for all subjects with only exception for Screen Failures (ARMCD=SCRNFAIL) and Not Assigned (ARMCD=NOTASSGN) subjects.


Rule Component

1. AGE value must be present for all subjects except screen failures and subjects not assigned to a treatment arm.

Source Data Structure

The rule states the age can be missing in cases where the subject is either:

  1. A screen failure (armcd=SCRNFL)
  2. Not assigned to a treatment arm (armcd=NOTASSGN)

Here we encounter another problem in the SDTM data model. SEND data identifies screen failures as armcd='SCRNFL' and subjects not assigned to a treatment arm as armcd='NOTASSGN'. armcd implies it contains values indicating the treatment arm, yet values SCRNFL and NOTASSGN identify subjects who were never assigned to a treatment arm! The data model must change to model these cases without ambiguity.
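At the source-data level, before the modelling question is resolved, the exemption logic of the rule can be sketched as follows. This is a minimal illustration only: the function name is hypothetical, the armcd value "1" is a made-up example of an assigned arm, and the exempt codes are the ones listed on this page (note that the FDA rule text spells the screen failure code as SCRNFAIL).

```python
# Exempt armcd values as listed on this page for the SEND data.
EXEMPT_ARMCD = {"SCRNFL", "NOTASSGN"}


def violates_sd1121(armcd, age):
    """True when age is missing for a subject that is not exempt."""
    return age is None and armcd not in EXEMPT_ARMCD


assigned_missing = violates_sd1121("1", None)        # hypothetical arm code
screen_fail_missing = violates_sd1121("SCRNFL", None)
not_assigned_missing = violates_sd1121("NOTASSGN", None)
assigned_present = violates_sd1121("1", 8)
print(assigned_missing, screen_fail_missing,
      not_assigned_missing, assigned_present)
```

A SHACL implementation would need to express the same exemption against the RDF representation of eligibility and randomization outcomes rather than against armcd directly.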

SCRNFL

Screen Failures had an eligibility assessment with an outcome that deemed the subject ineligible for the study (eligibility = FALSE).

Triples would appear similar to:

cj16050:EligibilityDetermination_XXXXXX
  rdf:type     study:EligibilityDetermination ;
  code:outcome code:RuleOutcome_FALSE .

cj16050:Randomization_Animal_2
  rdf:type study:Randomization ;
  skos:prefLabel "Randomization 2" ;
  code:outcome  

Missing Age values are represented in the data as having no object associated with the time:numericDuration predicate.

Non-missing value of ‘8’:

cj16050:Age_8_WEEKS
  a                    study:Age ;
  skos:prefLabel       "Age 8 WEEKS"^^xsd:string ;
  time:hasXSDDuration  "P56D"^^xsd:duration ;
  time:numericDuration 8 ;
  time:unitType        time:unitWeek .

Missing age value:

cj16050:Age_
  skos:prefLabel      "Age WEEKS"^^xsd:string ;
  time:hasXSDDuration "P56D"^^xsd:duration ;
  time:unitType       time:unitWeek .

TEMPLATE FOR NEW SECTIONS
Rule Statement
age ADD HERE 0.
Description
ADD

The age for Animal Subjects 99TXX, 99TXX, 99TXX was set to missing.


   ADD data for one test case here.


The shape tests the following condition:

  • ADD CONDITION
   ADD SHACL


The report correctly identifies XXXX.

  ADD REPORT EXCERPT


The report ADD DETAILS ABOUT REPORT VALIDATION USING SPARQL. SPARQL can be used to identify Add Source file: Add/SPARQL/foo

  ADD


NOTASSGN

armcd has the value NOTASSGN when there is no randomization outcome. The data conversion process attempts to create the randomization triple as:

Randomization_Animal_######## code:outcome NO OBJECT

The triple is not created due to the missing data for the Object.

The triples for a NOTASSGN subject appear as:

ADD TRIPLES



Add: Additional Rules...