Basic Ontology Interface Demo
This notebook demonstrates the BasicOntologyInterface, which provides simplified access to local or remote ontologies.
We demonstrate several backends, but in practice you will likely use only one, depending on your use case:
pronto or sqlite for ontologies that you have a local copy of; these trade a slower startup for generally faster operations
ubergraph, ontobee, ols, or bioportal for an ontology that has been loaded into a remote server
Loading from obo files using Pronto
[1]:
from oaklib.implementations.pronto.pronto_implementation import ProntoImplementation
from oaklib.resource import OntologyResource
Local files
First we demonstrate loading from a file on the local filesystem.
[2]:
resource = OntologyResource(slug='go-nucleus.obo', directory='input', local=True)
oi = ProntoImplementation(resource)
[4]:
rels = oi.outgoing_relationship_map('GO:0005773')
for rel, parents in rels.items():
    print(f' {rel} ! {oi.label(rel)}')
    for parent in parents:
        print(f' {parent} ! {oi.label(parent)}')
rdfs:subClassOf ! subClassOf
GO:0043231 ! intracellular membrane-bounded organelle
BFO:0000050 ! part_of
GO:0005737 ! cytoplasm
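To make the shape of the returned value concrete, here is a minimal sketch that does not use oaklib at all. It builds the same kind of predicate-to-parents mapping from a toy edge list copied from the outputs in this notebook. The function name `toy_outgoing_relationship_map` and the `EDGES` list are hypothetical and exist only for illustration; the real method may build its result differently.

```python
from collections import defaultdict

# Toy edge list (subject, predicate, object), copied from outputs in this notebook.
EDGES = [
    ("GO:0005773", "rdfs:subClassOf", "GO:0043231"),
    ("GO:0005773", "BFO:0000050", "GO:0005737"),
    ("GO:0005737", "rdfs:subClassOf", "GO:0110165"),
]


def toy_outgoing_relationship_map(curie, edges=EDGES):
    """Group the objects of all edges leaving `curie` by predicate (illustrative only)."""
    rel_map = defaultdict(list)
    for subject, predicate, obj in edges:
        if subject == curie:
            rel_map[predicate].append(obj)
    return dict(rel_map)


rels = toy_outgoing_relationship_map("GO:0005773")
for rel, parents in rels.items():
    print(f"{rel}: {parents}")
```

Each key is a predicate CURIE and each value is the list of parent CURIEs reached through that predicate, which is why the notebook cell iterates with a nested loop.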
Remote (downloading from OBO)
Next we use pronto’s ability to load an ontology directly from the OBO Library.
[5]:
oi = ProntoImplementation(OntologyResource(local=False, slug='go.obo'))
Note the slight lag when executing the command above. This method removes the need to manage and synchronize files locally, but it incurs an initial network startup penalty.
[6]:
rels = oi.outgoing_relationship_map('GO:0005773')
for rel, parents in rels.items():
    print(f' {rel} ! {oi.label(rel)}')
    for parent in parents:
        print(f' {parent} ! {oi.label(parent)}')
rdfs:subClassOf ! subClassOf
GO:0043231 ! intracellular membrane-bounded organelle
BFO:0000050 ! part of
GO:0005737 ! cytoplasm
SQL Database access
We can load from a SQL database that follows Semantic SQL patterns (see the docs for how to download ready-made SQLite databases for all of OBO).
[7]:
from oaklib.implementations.sqldb.sql_implementation import SqlImplementation
[7]:
oi = SqlImplementation(OntologyResource(slug='sqlite:///input/go-nucleus.db'))
[9]:
rels = oi.outgoing_relationship_map('GO:0005773')
for rel, parents in rels.items():
    print(f' {rel} ! {oi.label(rel)}')
    for parent in parents:
        print(f' {parent} ! {oi.label(parent)}')
rdfs:subClassOf ! subClassOf
GO:0043231 ! intracellular membrane-bounded organelle
BFO:0000050 ! part of
GO:0005737 ! cytoplasm
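For intuition about what a SQL backend has to do for the lookup above, here is a rough sketch using only the standard library `sqlite3` module and an in-memory database. The table and column names (`edge`, `rdfs_label_statement`) are simplified assumptions loosely modelled on Semantic SQL, and `toy_parents` is a hypothetical helper; the real schema and the queries issued by SqlImplementation may differ.

```python
import sqlite3

# In-memory toy database; table layout is an assumption, not the real schema.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE edge (subject TEXT, predicate TEXT, object TEXT);
    CREATE TABLE rdfs_label_statement (subject TEXT, value TEXT);
    INSERT INTO edge VALUES
      ('GO:0005773', 'rdfs:subClassOf', 'GO:0043231'),
      ('GO:0005773', 'BFO:0000050', 'GO:0005737');
    INSERT INTO rdfs_label_statement VALUES
      ('GO:0043231', 'intracellular membrane-bounded organelle'),
      ('GO:0005737', 'cytoplasm');
    """
)


def toy_parents(conn, curie):
    """Return (predicate, parent, parent label) rows for edges leaving `curie`."""
    rows = conn.execute(
        """
        SELECT e.predicate, e.object, l.value
        FROM edge AS e
        LEFT JOIN rdfs_label_statement AS l ON l.subject = e.object
        WHERE e.subject = ?
        ORDER BY e.predicate, e.object
        """,
        (curie,),
    )
    return rows.fetchall()


parent_rows = toy_parents(conn, "GO:0005773")
for predicate, parent, label in parent_rows:
    print(f"{predicate} -> {parent} ! {label}")
```

A single join returns the parents together with their labels, which avoids one label lookup per parent.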
[10]:
for curie in oi.basic_search('intracellular'):
    print(f' MATCH: {curie} ! {oi.label(curie)} ')
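The cell above shows no output, so as a rough illustration of what a basic label search could return, here is a self-contained sketch over a handful of labels taken from this notebook. `toy_basic_search` is a hypothetical function that does a case-insensitive substring match; the matching rules of the real `basic_search` may be different.

```python
# A few labels copied from outputs elsewhere in this notebook.
LABELS = {
    "GO:0005622": "intracellular anatomical structure",
    "GO:0043229": "intracellular organelle",
    "GO:0043231": "intracellular membrane-bounded organelle",
    "GO:0005737": "cytoplasm",
}


def toy_basic_search(term, labels=LABELS):
    """Yield CURIEs whose label contains `term`, ignoring case (illustrative only)."""
    needle = term.casefold()
    for curie, label in labels.items():
        if needle in label.casefold():
            yield curie


matches = list(toy_basic_search("intracellular"))
for curie in matches:
    print(f"MATCH: {curie} ! {LABELS[curie]}")
```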
Determining which interfaces an implementation implements
[11]:
oi.interfaces_implemented()
[11]:
[oaklib.interfaces.validator_interface.ValidatorInterface,
oaklib.interfaces.rdf_interface.RdfInterface,
oaklib.interfaces.obograph_interface.OboGraphInterface,
oaklib.interfaces.search_interface.SearchInterface,
oaklib.interfaces.mapping_provider_interface.MappingProviderInterface,
oaklib.interfaces.patcher_interface.PatcherInterface]
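A list like the one above is useful for guarding optional functionality. The sketch below shows the general pattern with toy classes only; every class name here is hypothetical, and it is an assumption that a method like `interfaces_implemented` could work by inspecting the class hierarchy in this way.

```python
class ToySearchInterface:
    """Hypothetical stand-in for a search capability."""

    def basic_search(self, term):
        raise NotImplementedError


class ToyPatcherInterface:
    """Hypothetical stand-in for an editing capability."""

    def apply_patch(self, patch):
        raise NotImplementedError


ALL_TOY_INTERFACES = (ToySearchInterface, ToyPatcherInterface)


class ToyImplementation(ToySearchInterface):
    """Implements searching only, not patching."""

    def basic_search(self, term):
        return []

    def interfaces_implemented(self):
        # Report every known toy interface found in this class's ancestry.
        return [cls for cls in type(self).__mro__ if cls in ALL_TOY_INTERFACES]


impl = ToyImplementation()
print(impl.interfaces_implemented())
if isinstance(impl, ToyPatcherInterface):
    print("patching supported")
else:
    print("patching not supported by this implementation")
```

Checking with `isinstance` before calling an optional method lets the same code run against backends with different capabilities.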
Loading from OWL ontologies using owlfun
TODO
Wrapping remote ontology portals
OLS
TODO
BioPortal
TODO
Wrapping SPARQL Endpoints
Ubergraph
[12]:
from oaklib.implementations.ubergraph.ubergraph_implementation import UbergraphImplementation
oi = UbergraphImplementation()
[13]:
rels = oi.outgoing_relationship_map('GO:0005773')
for rel, parents in rels.items():
    print(f' {rel} ! {oi.label(rel)}')
    for parent in parents:
        print(f' {parent} ! {oi.label(parent)}')
WARNING:root:Multiple labels for BFO:0000050 = ['part of', 'part_of']
BFO:0000050 ! part of
GO:0005737 ! cytoplasm
COB:0000072 ! part of
GO:0005737 ! cytoplasm
rdfs:subClassOf ! None
GO:0043231 ! intracellular membrane-bounded organelle
Notes
Notice some differences from the other mechanisms:
ubergraph includes multiple ontologies, one of which ‘injects’ a legacy caro#part_of relationship
similarly, different ontologies inject different labels for the part-of relation
the ubergraph implementation uses the actual predicate CURIE, whereas pronto currently uses the shortname
The loop above also makes multiple iterative calls to the API, which is inefficient.
In future there will be an interface for ‘bigger’ operations that can be implemented more efficiently.
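Until such an interface exists, one possible way to reduce repeated remote calls is to memoize label lookups on the client side. The sketch below is only an illustration of that idea: `CountingLabelSource` is a hypothetical stand-in for a remote endpoint, and nothing here is part of oaklib.

```python
from functools import lru_cache


class CountingLabelSource:
    """Hypothetical stand-in for a remote endpoint; counts how often it is queried."""

    def __init__(self, labels):
        self.labels = labels
        self.calls = 0

    def label(self, curie):
        self.calls += 1
        return self.labels.get(curie)


source = CountingLabelSource(
    {
        "BFO:0000050": "part of",
        "COB:0000072": "part of",
        "GO:0005737": "cytoplasm",
        "GO:0043231": "intracellular membrane-bounded organelle",
    }
)

# Wrap the lookup so that each distinct CURIE is fetched at most once.
cached_label = lru_cache(maxsize=None)(source.label)

# The same sequence of lookups that the printing loop above performs.
lookups = [
    "BFO:0000050", "GO:0005737",
    "COB:0000072", "GO:0005737",
    "rdfs:subClassOf", "GO:0043231",
]
results = [cached_label(curie) for curie in lookups]
print(f"{len(lookups)} lookups, {source.calls} calls to the source")
```

The saving is small here because only one CURIE repeats, but it grows quickly when many terms share the same parents and predicates.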
Ontobee
Currently the ontobee implementation doesn’t allow the selection of a specific ontology within the triplestore; instead, the whole store is treated as if it were one giant ontology with everything merged together.
This can be confusing when one ontology contains a stale part of another, for example if an ontology used to have a parent term that has since been obsoleted.
[14]:
from oaklib.implementations.ontobee.ontobee_implementation import OntobeeImplementation
oi = OntobeeImplementation()
[26]:
rels = oi.outgoing_relationship_map('GO:0005773')
for rel, parents in rels.items():
    print(f' {rel} ! {oi.label(rel)}')
    for parent in parents:
        print(f' {parent} ! {oi.label(parent)}')
rdfs:subClassOf ! subClassOf
GO:0043231 ! intracellular membrane-bounded organelle
BFO:0000050 ! part of
GO:0005737 ! cytoplasm
Graph Operations
[16]:
oi = ProntoImplementation(OntologyResource(local=False, slug='go.obo'))
[17]:
ancs = oi.ancestors('GO:0005773')
for anc in list(ancs):
    print(f'{anc} ! {oi.label(anc)}')
GO:0016020 ! membrane
GO:0005773 ! vacuole
GO:0005622 ! intracellular anatomical structure
GO:0043227 ! membrane-bounded organelle
GO:0043229 ! intracellular organelle
GO:0043226 ! organelle
GO:0043231 ! intracellular membrane-bounded organelle
GO:0005737 ! cytoplasm
GO:0110165 ! cellular anatomical entity
GO:0005575 ! cellular_component
[19]:
from oaklib.datamodels.vocabulary import IS_A, PART_OF
ancs = oi.ancestors('GO:0005773', predicates=[IS_A, PART_OF])
for anc in list(ancs):
    print(f'{anc} ! {oi.label(anc)}')
GO:0005773 ! vacuole
GO:0005622 ! intracellular anatomical structure
GO:0043227 ! membrane-bounded organelle
GO:0043229 ! intracellular organelle
GO:0043226 ! organelle
GO:0043231 ! intracellular membrane-bounded organelle
GO:0005737 ! cytoplasm
GO:0110165 ! cellular anatomical entity
GO:0005575 ! cellular_component
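To see how restricting the predicates changes the result, here is a small pure-Python sketch of a reflexive ancestor traversal. The edge list is taken from the walk_up_relationship_graph output shown later in this notebook. `toy_ancestors` is a hypothetical function, and the `IS_A` and `PART_OF` constants are local stand-ins defined here rather than imported from oaklib.

```python
from collections import deque

# Local stand-ins for the predicate constants used in the cell above.
IS_A = "rdfs:subClassOf"
PART_OF = "BFO:0000050"

# (subject, predicate, object) edges copied from this notebook's output.
EDGES = [
    ("GO:0005773", IS_A, "GO:0043231"),
    ("GO:0005773", PART_OF, "GO:0005737"),
    ("GO:0005737", IS_A, "GO:0110165"),
    ("GO:0005737", PART_OF, "GO:0005622"),
    ("GO:0005622", IS_A, "GO:0110165"),
    ("GO:0110165", IS_A, "GO:0005575"),
    ("GO:0043231", IS_A, "GO:0043227"),
    ("GO:0043231", IS_A, "GO:0043229"),
    ("GO:0043229", IS_A, "GO:0043226"),
    ("GO:0043229", PART_OF, "GO:0005622"),
    ("GO:0043226", IS_A, "GO:0110165"),
    ("GO:0043227", IS_A, "GO:0043226"),
]


def toy_ancestors(start, predicates=None, edges=EDGES):
    """Breadth-first search upwards; the start node is included (reflexive)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for subject, predicate, obj in edges:
            if subject != node:
                continue
            if predicates is not None and predicate not in predicates:
                continue
            if obj not in seen:
                seen.add(obj)
                queue.append(obj)
    return seen


both = toy_ancestors("GO:0005773", predicates=[IS_A, PART_OF])
is_a_only = toy_ancestors("GO:0005773", predicates=[IS_A])
print(sorted(both - is_a_only))
```

With both predicates the toy traversal reaches the same nine terms listed above; restricting it to `IS_A` drops the two terms that are only reachable through a part-of edge.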
[20]:
def render(curie):
    return f'{curie} "{oi.label(curie)}"'

for s, p, o in oi.walk_up_relationship_graph('GO:0005773', predicates=[IS_A, PART_OF]):
    print(f'{render(s)} -{render(p)}-> {render(o)}')
GO:0005773 "vacuole" -rdfs:subClassOf "subClassOf"-> GO:0043231 "intracellular membrane-bounded organelle"
GO:0005773 "vacuole" -BFO:0000050 "part of"-> GO:0005737 "cytoplasm"
GO:0005737 "cytoplasm" -rdfs:subClassOf "subClassOf"-> GO:0110165 "cellular anatomical entity"
GO:0005737 "cytoplasm" -BFO:0000050 "part of"-> GO:0005622 "intracellular anatomical structure"
GO:0005622 "intracellular anatomical structure" -rdfs:subClassOf "subClassOf"-> GO:0110165 "cellular anatomical entity"
GO:0110165 "cellular anatomical entity" -rdfs:subClassOf "subClassOf"-> GO:0005575 "cellular_component"
GO:0043231 "intracellular membrane-bounded organelle" -rdfs:subClassOf "subClassOf"-> GO:0043227 "membrane-bounded organelle"
GO:0043231 "intracellular membrane-bounded organelle" -rdfs:subClassOf "subClassOf"-> GO:0043229 "intracellular organelle"
GO:0043229 "intracellular organelle" -rdfs:subClassOf "subClassOf"-> GO:0043226 "organelle"
GO:0043229 "intracellular organelle" -BFO:0000050 "part of"-> GO:0005622 "intracellular anatomical structure"
GO:0043226 "organelle" -rdfs:subClassOf "subClassOf"-> GO:0110165 "cellular anatomical entity"
GO:0043227 "membrane-bounded organelle" -rdfs:subClassOf "subClassOf"-> GO:0043226 "organelle"
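The output above lists each relationship once, even though several terms are reachable by more than one path. A minimal sketch of a traversal with that property follows; `toy_walk_up` is a hypothetical generator over a small subset of the edges above, and it is not a claim about how oaklib implements the method or in which order it yields results.

```python
IS_A = "rdfs:subClassOf"    # local stand-in, not imported from oaklib
PART_OF = "BFO:0000050"     # local stand-in, not imported from oaklib

# A small subset of the relationships printed above.
TOY_EDGES = [
    ("GO:0005773", IS_A, "GO:0043231"),
    ("GO:0005773", PART_OF, "GO:0005737"),
    ("GO:0005737", IS_A, "GO:0110165"),
    ("GO:0110165", IS_A, "GO:0005575"),
]


def toy_walk_up(start, edges, predicates=None):
    """Yield (subject, predicate, object) triples reachable by walking up from `start`."""
    visited = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for subject, predicate, obj in edges:
            if subject != node:
                continue
            if predicates is not None and predicate not in predicates:
                continue
            yield subject, predicate, obj
            # Each node is expanded only once, so each edge is yielded only once.
            if obj not in visited:
                visited.add(obj)
                stack.append(obj)


all_triples = list(toy_walk_up("GO:0005773", TOY_EDGES, predicates=[IS_A, PART_OF]))
is_a_triples = list(toy_walk_up("GO:0005773", TOY_EDGES, predicates=[IS_A]))
for s, p, o in all_triples:
    print(f"{s} -{p}-> {o}")
```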
Rendering using GraphViz
[21]:
from oaklib.utilities.obograph_utils import graph_to_image, default_stylemap_path
stylemap = default_stylemap_path()
[22]:
graph = oi.ancestor_graph('GO:0005773', predicates=[IS_A, PART_OF])
[23]:
graph_to_image(graph, ['GO:0005773'], stylemap=stylemap, imgfile='output/vacuole.png')
ERROR:root:No og2dot
You need to install a node package to be able to visualize results
npm install -g obographviz
Then set your path to include og2dot
---------------------------------------------------------------------------
Exception Traceback (most recent call last)
Input In [23], in <cell line: 1>()
----> 1 graph_to_image(graph, ['GO:0005773'], stylemap=stylemap, imgfile='output/vacuole.png')
File ~/Documents/src/ontology-access-kit/src/oaklib/utilities/obograph_utils.py:102, in graph_to_image(graph, seeds, configure, stylemap, imgfile)
100 print("npm install -g obographviz")
101 print("Then set your path to include og2dot")
--> 102 raise Exception(
103 f"Cannot find {exec} on path. Install from https://github.com/INCATools/obographviz"
104 )
105 with tempfile.NamedTemporaryFile(dir="/tmp", mode="w") as tmpfile:
106 style = {}
Exception: Cannot find og2dot on path. Install from https://github.com/INCATools/obographviz
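The exception above is raised because the og2dot executable is not on the path. One way to avoid the traceback is to check for the executable before attempting to render. The sketch below uses only the standard library; `og2dot_available` is a hypothetical helper name.

```python
import shutil


def og2dot_available(executable="og2dot"):
    """Return True if the named executable can be found on PATH."""
    return shutil.which(executable) is not None


if og2dot_available():
    print("og2dot found; rendering should be possible")
else:
    print("og2dot not found; install it with: npm install -g obographviz")
```

In the notebook, the `graph_to_image` call could then be placed inside the first branch so that the cell degrades to a message instead of failing.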
Output