How to validate an OBO ontology using the OBO Metadata Ontology schema


Step 1: Obtain the sqlite version of the ontology

Currently, SQLite versions of ontologies are not distributed alongside the ontologies themselves in OBO.

You can either

The second option is likely easiest.

For example:


Step 2: Install oaklib

pip install oaklib

Check your install works:

runoak --help

Step 3: Validate an individual ontology

runoak -i sqlite:uberon.db validate

This streams YAML output. Each result is a LinkML object that uses the SHACL validation vocabulary:

severity: ERROR
subject: CARO:0000003
predicate: rdfs:label
info: Missing slot (label) for CARO:0000003

severity: ERROR
subject: CARO:0000006
predicate: rdfs:label
info: Missing slot (label) for CARO:0000006
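To summarize a long stream of results, the records can be tallied after the fact. The following is a minimal sketch, not part of oaklib: it assumes the output has been captured as text, that each record is a flat set of `key: value` lines as in the sample above, and that records are separated by blank lines or `---` lines. The function name `parse_records` is hypothetical. For nested output, a real YAML parser would be the safer choice.

```python
from collections import Counter
from typing import Dict, Iterator


def parse_records(text: str) -> Iterator[Dict[str, str]]:
    """Yield one dict per validation result.

    Assumption: records are flat 'key: value' lines, separated by
    blank lines or '---' lines (no nested YAML).
    """
    record: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "---":
            # A separator closes the current record, if any.
            if record:
                yield record
                record = {}
            continue
        # Split on the first ': ' only, so CURIEs such as
        # 'CARO:0000003' in the value are left intact.
        key, sep, value = line.partition(": ")
        if sep:
            record[key] = value
    if record:
        yield record


# Sample copied from the output shown above.
SAMPLE = """\
severity: ERROR
subject: CARO:0000003
predicate: rdfs:label
info: Missing slot (label) for CARO:0000003

severity: ERROR
subject: CARO:0000006
predicate: rdfs:label
info: Missing slot (label) for CARO:0000006
"""

results = list(parse_records(SAMPLE))
by_severity = Counter(r["severity"] for r in results)
subjects = [r["subject"] for r in results]
print(by_severity)
print(subjects)
```

Counting by `predicate` instead of `severity` gives a quick view of which slots fail most often.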

Step 3 (alternative): Validate multiple ontologies

runoak validate-multiple db/*.db -o obo-validation.tsv


Currently only the following checks are implemented:

  • MinCountConstraintComponent checks (required or recommended)

  • MaxCountConstraintComponent checks

  • DeprecatedPropertyComponent

  • DatatypeConstraintComponent: basic type checks (literal vs. object); the specific literal type is not yet checked

  • ClosedConstraintComponent
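To make the first two checks concrete, here is a rough sketch of how minimum and maximum cardinality can be tested over a set of triples. It is an illustration only and does not reproduce oaklib's code: the `SlotRule` type, the `RULES` table, and the rule that `rdfs:label` must occur exactly once are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Optional, Tuple

Triple = Tuple[str, str, str]


class SlotRule(NamedTuple):
    """Hypothetical cardinality rule for one predicate."""
    min_count: int = 0
    max_count: Optional[int] = None  # None means unbounded


# Illustrative rule only; the real rules come from the LinkML schema.
RULES: Dict[str, SlotRule] = {
    "rdfs:label": SlotRule(min_count=1, max_count=1),
}


def check_cardinality(
    subjects: Iterable[str],
    triples: Iterable[Triple],
    rules: Dict[str, SlotRule],
) -> List[Tuple[str, str, str]]:
    """Return (subject, predicate, component) for each violation."""
    counts: Dict[Tuple[str, str], int] = defaultdict(int)
    for s, p, _ in triples:
        counts[(s, p)] += 1
    problems = []
    for s in subjects:
        for p, rule in rules.items():
            n = counts[(s, p)]
            if n < rule.min_count:
                problems.append((s, p, "MinCountConstraintComponent"))
            elif rule.max_count is not None and n > rule.max_count:
                problems.append((s, p, "MaxCountConstraintComponent"))
    return problems


# Toy data: one valid term, one with no label, one with two labels.
triples = [
    ("EX:1", "rdfs:label", "one"),
    ("EX:3", "rdfs:label", "three"),
    ("EX:3", "rdfs:label", "drei"),
]
problems = check_cardinality(["EX:1", "EX:2", "EX:3"], triples, RULES)
for problem in problems:
    print(problem)
```

Here `EX:2` is reported under MinCountConstraintComponent and `EX:3` under MaxCountConstraintComponent, while `EX:1` passes.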

Using your own schema

TODO: add an option to pass in your own yaml file

How this works

The Python API is described here:

Currently there is only one implementation, the SqlDatabase implementation.

The validation is driven entirely by a LinkML schema.

Currently this schema lives within this repo, but the goal is to move it outside the repo and import it.

Different implementations are free to use this schema in different ways.

The SqlDatabase implementation attempts to do this efficiently by running whole-database, predicate-based queries rather than checking each term one at a time.
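The idea of a whole-database, predicate-based query can be sketched with Python's built-in sqlite3 module. This is a simplified illustration, not the query oaklib actually runs: it assumes a `statements` table with `subject`, `predicate`, `object`, and `value` columns, builds a toy in-memory database, and uses a hypothetical helper named `subjects_missing`. A single query finds every typed subject that lacks a given predicate.

```python
import sqlite3

# Toy in-memory database; the table layout is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE statements "
    "(subject TEXT, predicate TEXT, object TEXT, value TEXT)"
)
conn.executemany(
    "INSERT INTO statements VALUES (?, ?, ?, ?)",
    [
        ("CARO:0000000", "rdf:type", "owl:Class", None),
        ("CARO:0000000", "rdfs:label", None, "anatomical entity"),
        ("CARO:0000003", "rdf:type", "owl:Class", None),
        ("CARO:0000006", "rdf:type", "owl:Class", None),
    ],
)


def subjects_missing(conn: sqlite3.Connection, predicate: str) -> list:
    """Return all typed subjects with no statement for `predicate`.

    One query covers the whole database, instead of one lookup
    per subject.
    """
    query = """
        SELECT DISTINCT s.subject
        FROM statements AS s
        WHERE s.predicate = 'rdf:type'
          AND NOT EXISTS (
              SELECT 1 FROM statements AS p
              WHERE p.subject = s.subject AND p.predicate = ?
          )
        ORDER BY s.subject
    """
    return [row[0] for row in conn.execute(query, (predicate,))]


missing = subjects_missing(conn, "rdfs:label")
for subject in missing:
    print(f"Missing slot (label) for {subject}")
```

The same pattern extends to other required predicates by changing the argument, which is what makes a predicate-at-a-time strategy attractive on large databases.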

Validation results use the Validator Datamodel, which reuses many URIs from SHACL.


See notebooks folder in