Part 3: Triplestore Backends
Previously we worked with local ontology files in SQLite format, following the Semantic SQL schema. We will now show how OAK can be used to access information in a remote triplestore. The same approach can be used to query local files in any RDF format, such as Turtle.
Triplestore Concepts
There are a number of triplestores that include ontology content queryable using SPARQL.
OAK includes built-in support for a number of triplestores, including:
Ubergraph, an enriched triplestore containing a core set of biological ontologies
Ontobee, a triplestore including all OBO Foundry ontologies
Wikidata, a triplestore containing a broad set of knowledge encoded as triples
LOV (Linked Open Vocabularies), a triplestore containing generic semantic web ontologies and ontology-like schemas
Additionally, OAK is capable of treating any local RDF file on disk as if it were a triplestore.
Triplestores are fairly standardized in that they all conform to the SPARQL standard. However, they differ in how they store ontologies, and different ontologies conform to different metadata standards. This can make it challenging to write code that behaves uniformly across triplestores. OAK attempts to bridge these differences as far as possible: OAK interfaces specify the logical operation, and behind the scenes OAK emits the most appropriate SPARQL query.
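To make the last point concrete, here is a minimal sketch of how one logical operation ("get the label of a term") could be turned into a different SPARQL query for each endpoint. It is not OAK's actual implementation. The `EndpointProfile` class, the two profiles, the `label_query` helper and the `http://example.org/graph/b` named graph are all hypothetical names introduced for illustration. The OBO prefix expansion and the `rdfs:label` and `skos:prefLabel` URIs are standard.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class EndpointProfile:
    """Hypothetical per-endpoint settings; not part of OAK."""
    name: str
    label_predicate: str                # URI of the property holding labels
    named_graph: Optional[str] = None   # restrict to this graph if set


# Standard OBO PURL expansion for UBERON identifiers.
PREFIX_MAP: Dict[str, str] = {
    "UBERON": "http://purl.obolibrary.org/obo/UBERON_",
}

# Two invented endpoints that store labels differently.
PROFILE_A = EndpointProfile(
    name="endpoint_a",
    label_predicate="http://www.w3.org/2000/01/rdf-schema#label",
)
PROFILE_B = EndpointProfile(
    name="endpoint_b",
    label_predicate="http://www.w3.org/2004/02/skos/core#prefLabel",
    named_graph="http://example.org/graph/b",  # assumed graph name
)


def expand_curie(curie: str, prefix_map: Dict[str, str]) -> str:
    """Expand a CURIE such as 'UBERON:0002544' to a full URI."""
    prefix, local_id = curie.split(":", 1)
    return prefix_map[prefix] + local_id


def label_query(curie: str, profile: EndpointProfile,
                prefix_map: Dict[str, str]) -> str:
    """Build the SPARQL text for one logical operation: label lookup."""
    uri = expand_curie(curie, prefix_map)
    pattern = f"<{uri}> <{profile.label_predicate}> ?label"
    if profile.named_graph:
        # Wrap the triple pattern so only the given graph is searched.
        pattern = f"GRAPH <{profile.named_graph}> {{ {pattern} }}"
    return f"SELECT ?label WHERE {{ {pattern} }}"


if __name__ == "__main__":
    for profile in (PROFILE_A, PROFILE_B):
        print(profile.name)
        print(label_query("UBERON:0002544", profile, PREFIX_MAP))
```

The caller only ever asks for a label. Which predicate to use, and whether to restrict to a named graph, is decided by the profile. In OAK, that division of labour sits between the interface and the adapter.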
Ubergraph
For full documentation, see Ubergraph Adapter.
Connecting
In this example we will use the ontologies() method of the Basic Ontology Interface to list all ontologies the adapter knows about.
>>> from oaklib import get_adapter
>>> adapter = get_adapter("ubergraph:")
>>> for ont in adapter.ontologies():
...     print(ont)
...
bspo.owl
chebi.owl
...
Basic Operations
>>> term_id = "UBERON:0002544"
>>> print(adapter.label(term_id))
digit
>>> print(adapter.definition(term_id))
A subdivision of the autopod that has as part a...
Relationships
We can query for Asserted Relationships:
>>> for rel in adapter.relationships([term_id]):
...     print(rel)
...
('UBERON:0002544', 'RO:0002160', 'NCBITaxon:32523')
...
('UBERON:0002544', 'rdfs:subClassOf', 'UBERON:0005881')
We can also query for Entailed Relationships, this time restricting the results to the IS_A predicate:
>>> from oaklib.datamodels.vocabulary import IS_A
>>> for rel in adapter.relationships([term_id], predicates=[IS_A], include_entailed=True):
...     print(rel)
...
('UBERON:0002544', 'rdfs:subClassOf', 'UBERON:0001062')
...
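The difference between asserted and entailed is-a relationships can be illustrated without a triplestore. The sketch below computes entailed is-a ancestors as the transitive closure of a small set of asserted edges. The `EX:` identifiers and the edges between them are invented for illustration and are not taken from UBERON. The `entailed_ancestors` helper is hypothetical and only approximates what a backend such as Ubergraph precomputes. As a simplification, it leaves the starting term out of the result.

```python
from collections import deque
from typing import Dict, List, Set

# Toy asserted is-a edges (child -> direct parents). All identifiers
# here are invented for illustration.
ASSERTED_IS_A: Dict[str, List[str]] = {
    "EX:digit": ["EX:autopod_subdivision"],
    "EX:autopod_subdivision": ["EX:limb_part"],
    "EX:limb_part": ["EX:anatomical_entity"],
}


def entailed_ancestors(term: str, asserted: Dict[str, List[str]]) -> Set[str]:
    """Return every term reachable from `term` via one or more is-a edges.

    Breadth-first traversal; the start term is not included in the result.
    """
    seen: Set[str] = set()
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for parent in asserted.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


if __name__ == "__main__":
    print("asserted:", ASSERTED_IS_A["EX:digit"])
    print("entailed:", sorted(entailed_ancestors("EX:digit", ASSERTED_IS_A)))
```

In this toy graph the asserted query returns only the direct parent, while the entailed query also returns every ancestor further up the hierarchy.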