Identifying entities: CURIEs and URIs

Prefix maps

Every Entity in OAK has a unique Identifier. OAK is consistent with semantic web formalisms where everything is identified by an IRI, but in OAK these are typically compressed into a CURIE, using a Prefix Map.

For example, to reference the concept of “heart” in the Uberon ontology, we use the CURIE UBERON:0000948. This is a compressed form of the URL, using the OBO Foundry prefix map.
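The relationship between a CURIE and its URI can be sketched in a few lines of plain Python. This is only an illustration of the idea, not OAK's implementation: the helper names `expand_curie` and `contract_uri` are hypothetical, and the namespace used for UBERON is an assumption based on the conventional OBO PURL form.

```python
def expand_curie(curie: str, prefix_map: dict[str, str]) -> str:
    """Expand a CURIE such as 'UBERON:0000948' into a full URI."""
    prefix, sep, local_id = curie.partition(":")
    if not sep or prefix not in prefix_map:
        raise ValueError(f"Cannot expand {curie!r}: unknown or missing prefix")
    return prefix_map[prefix] + local_id


def contract_uri(uri: str, prefix_map: dict[str, str]) -> str:
    """Compress a URI into a CURIE, preferring the longest matching namespace."""
    matches = [(ns, prefix) for prefix, ns in prefix_map.items() if uri.startswith(ns)]
    if not matches:
        raise ValueError(f"Cannot contract {uri!r}: no matching namespace")
    namespace, prefix = max(matches, key=lambda m: len(m[0]))
    return f"{prefix}:{uri[len(namespace):]}"


# Assumed namespace, following the usual OBO PURL convention.
PREFIX_MAP = {"UBERON": "http://purl.obolibrary.org/obo/UBERON_"}

uri = expand_curie("UBERON:0000948", PREFIX_MAP)
print(uri)
print(contract_uri(uri, PREFIX_MAP))
```

Expansion is a simple lookup plus concatenation. Contraction has to choose among candidate namespaces, which is why the sketch picks the longest match.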

On the command line, most OAK commands take CURIEs or lists of CURIEs as inputs (although primary labels and queries can also be supplied). For example:

$ runoak -i ubergraph info UBERON:0000948
UBERON:0000948 ! heart

Under the hood, the backend (in this case, the Ubergraph Adapter) stores the concept as a full URI.

Similarly, in Python:

>>> from oaklib import get_adapter
>>> ont = get_adapter('ubergraph:')
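>>> curie = "UBERON:0000948"  # avoid shadowing the built-in id()
>>> # look up the primary label for the CURIE
>>> print(ont.label(curie))
>>> # expand the CURIE to the full URI used by the backend
>>> print(ont.curie_to_uri(curie))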

OAK uses the prefixmaps package to manage CURIEs and URIs. By default it uses a set of standard prefixmaps, including the OBO one and a linked data prefix map, which provides standard prefixes for non-OBO resources.
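As a rough illustration of how several prefix maps can be layered into one, the sketch below merges two small dictionaries so that earlier maps take precedence. The function name `merge_prefix_maps`, the "first map wins" rule, and the sample entries are assumptions made for this example. They are not taken from the prefixmaps package or from OAK's actual defaults.

```python
def merge_prefix_maps(*prefix_maps: dict[str, str]) -> dict[str, str]:
    """Combine prefix maps; on a clash, the earliest map's namespace is kept."""
    merged: dict[str, str] = {}
    for prefix_map in prefix_maps:
        for prefix, namespace in prefix_map.items():
            # setdefault only stores the value if the prefix is not yet present
            merged.setdefault(prefix, namespace)
    return merged


# Illustrative entries only; real prefix maps are far larger.
OBO_MAP = {
    "UBERON": "http://purl.obolibrary.org/obo/UBERON_",
    "CL": "http://purl.obolibrary.org/obo/CL_",
}
LINKED_DATA_MAP = {
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}

combined = merge_prefix_maps(OBO_MAP, LINKED_DATA_MAP)
for prefix, namespace in sorted(combined.items()):
    print(f"{prefix}\t{namespace}")
```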

Querying prefixmaps

You can get a list of all prefixes known to OAK using the prefixes command:

$ runoak prefixes

You can also query the prefixmap for a particular prefix or set of prefixes:

$ runoak prefixes UBERON CL oio skos schema

This will return a table:

Example prefixes

See the prefixes command for more details.

Non-default prefixmaps

For most purposes, the default prefixmap should suffice.

Where it does not, you can override the default prefixmap with your own, using the --prefix or --named-prefix-map options.

In Python this can be done by accessing the prefixmap directly:

>>> from oaklib import get_adapter
>>> soil_oi = get_adapter("tests/input/soil-profile.skos.nt")
>>> soil_oi.prefix_map()["soilprofile"] = ""
>>> # trivial example: show all CURIEs and labels
>>> for entity, label in soil_oi.labels(soil_oi.entities()):
...     print(f"{entity} ! {label}")

Structure of identifiers

OAK doesn’t impose any expectations on the structure of identifiers.

For OBO ontologies, all identifiers should conform to the OBO identifier pattern, which is the prefix (typically all uppercase) followed by a local identifier which is all numeric (typically zero-padded with 7 digits). However, this is not a requirement for OAK.
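The OBO pattern described above can be checked with a regular expression. The sketch below is a loose reading of that description: the pattern, the names `OBO_CURIE_PATTERN`, `is_obo_style`, and `has_typical_padding`, and the decision to accept mixed-case prefixes are all assumptions for illustration. OAK itself does not enforce any such check.

```python
import re

# A prefix (letters, digits, underscores), a colon, then an all-numeric local ID.
OBO_CURIE_PATTERN = re.compile(r"(?P<prefix>[A-Za-z][A-Za-z0-9_]*):(?P<local_id>\d+)")


def is_obo_style(curie: str) -> bool:
    """Return True if the CURIE has a prefix followed by a numeric local ID."""
    return OBO_CURIE_PATTERN.fullmatch(curie) is not None


def has_typical_padding(curie: str) -> bool:
    """Return True if the numeric local ID has the typical 7 digits."""
    match = OBO_CURIE_PATTERN.fullmatch(curie)
    return match is not None and len(match.group("local_id")) == 7


for candidate in ["UBERON:0000948", "schema:Person"]:
    print(candidate, is_obo_style(candidate), has_typical_padding(candidate))
```

Identifiers that fail this check, such as schema:Person, remain valid in OAK. The check only tells you whether an identifier follows the OBO convention.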

Many semantic web ontologies use “semantic” URIs that a human can understand. These can be used in the same way:

$ wget -O tests/output/schema.rdf
$ runoak --prefix schema= -i tests/output/schema.rdf relationships schema:Person
subject     subject_label   predicate       predicate_label object  object_label
schema:Person       Person  rdfs:subClassOf None    schema:Thing    Thing

Further reading