Basic Ontology Interface Demo

This notebook demonstrates the BasicOntologyInterface, which provides simplified access to local or remote ontologies.

We demonstrate the use of different backends, but in practice you will likely only use one depending on your use case.

  • pronto or sqlite for ontologies you have a local copy of; these trade a longer startup time for generally faster operations

  • ubergraph, ontobee, ols, or bioportal for an ontology that has been loaded into a remote server

Loading from obo files using Pronto

[1]:
from oaklib.implementations.pronto.pronto_implementation import ProntoImplementation
from oaklib.resource import OntologyResource

Local files

First we demonstrate loading from a file on the local filesystem.

[2]:
resource = OntologyResource(slug='go-nucleus.obo', directory='input', local=True)
oi = ProntoImplementation(resource)
[4]:
rels = oi.outgoing_relationship_map('GO:0005773')
for rel, parents in rels.items():
    print(f'  {rel} ! {oi.label(rel)}')
    for parent in parents:
        print(f'    {parent} ! {oi.label(parent)}')
  rdfs:subClassOf ! subClassOf
    GO:0043231 ! intracellular membrane-bounded organelle
  BFO:0000050 ! part_of
    GO:0005737 ! cytoplasm
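
Judging from the loop above, the relationship map is a mapping from each predicate to a list of parent terms. The sketch below rebuilds that shape from a hand-written edge list in plain Python, without oaklib. The `EDGES` list and the `relationship_map` helper are illustrative names invented here, not part of the library.

```python
from collections import defaultdict

# Edges hand-copied from outputs in this notebook; a toy stand-in for an ontology.
EDGES = [
    ("GO:0005773", "rdfs:subClassOf", "GO:0043231"),
    ("GO:0005773", "BFO:0000050", "GO:0005737"),
    ("GO:0005737", "BFO:0000050", "GO:0005622"),
]


def relationship_map(edges, subject):
    """Group the objects of all edges leaving `subject` by predicate."""
    rels = defaultdict(list)
    for s, p, o in edges:
        if s == subject:
            rels[p].append(o)
    return dict(rels)


rels = relationship_map(EDGES, "GO:0005773")
for rel, parents in rels.items():
    print(rel, parents)
```

Only direct (one-hop) edges are grouped here, which matches the output shown above: the vacuole's own parents, but not its grandparents.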

Remote (downloading from OBO)

Next we use pronto’s feature for loading an ontology directly from the OBO library.

[5]:
oi = ProntoImplementation(OntologyResource(local=False, slug='go.obo'))

Note the slight lag when executing the command above. While this method removes the need to manage and synchronize files locally, there is an initial network startup penalty.
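
One way to keep that penalty to a single occurrence per session is to memoize the loader. This is only a sketch of the general pattern: `slow_load` is a hypothetical stand-in that sleeps to simulate the download, and `get_ontology` is a name invented here. Neither is an oaklib function.

```python
import time
from functools import lru_cache


def slow_load(slug):
    """Hypothetical stand-in for a constructor that downloads an ontology."""
    time.sleep(0.05)  # simulate network latency
    return {"slug": slug}


@lru_cache(maxsize=None)
def get_ontology(slug):
    """Return a cached object, so the load cost is paid once per slug."""
    return slow_load(slug)


first = get_ontology("go.obo")   # pays the simulated download cost
second = get_ontology("go.obo")  # served from the cache
print(first is second)
```

The cache lives only as long as the Python process, so a fresh kernel would pay the cost again.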

[6]:
rels = oi.outgoing_relationship_map('GO:0005773')
for rel, parents in rels.items():
    print(f'  {rel} ! {oi.label(rel)}')
    for parent in parents:
        print(f'    {parent} ! {oi.label(parent)}')
  rdfs:subClassOf ! subClassOf
    GO:0043231 ! intracellular membrane-bounded organelle
  BFO:0000050 ! part of
    GO:0005737 ! cytoplasm

SQL Database access

We can load from a SQL database that follows Semantic SQL patterns (see the docs for how to download ready-made sqlite dbs for all of OBO).

[7]:
from oaklib.implementations.sqldb.sql_implementation import SqlImplementation

[7]:
oi = SqlImplementation(OntologyResource(slug='sqlite:///input/go-nucleus.db'))
[9]:
rels = oi.outgoing_relationship_map('GO:0005773')
for rel, parents in rels.items():
    print(f'  {rel} ! {oi.label(rel)}')
    for parent in parents:
        print(f'    {parent} ! {oi.label(parent)}')
  rdfs:subClassOf ! subClassOf
    GO:0043231 ! intracellular membrane-bounded organelle
  BFO:0000050 ! part of
    GO:0005737 ! cytoplasm
[10]:
for curie in oi.basic_search('intracellular'):
    print(f' MATCH: {curie} ! {oi.label(curie)} ')

Determining which interfaces an implementation implements

[11]:
oi.interfaces_implemented()
[11]:
[oaklib.interfaces.validator_interface.ValidatorInterface,
 oaklib.interfaces.rdf_interface.RdfInterface,
 oaklib.interfaces.obograph_interface.OboGraphInterface,
 oaklib.interfaces.search_interface.SearchInterface,
 oaklib.interfaces.mapping_provider_interface.MappingProviderInterface,
 oaklib.interfaces.patcher_interface.PatcherInterface]
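
Because the result is a list of classes, a natural use is to check for a capability before calling a method that depends on it. The sketch below shows that pattern with toy classes defined inline so it runs without oaklib. The class names mirror those in the list above but are hypothetical stand-ins, and `supports` is a helper invented for this example.

```python
class SearchInterface:
    """Toy marker interface (hypothetical, not the oaklib class)."""

    def basic_search(self, term):
        raise NotImplementedError


class PatcherInterface:
    """Toy marker interface (hypothetical, not the oaklib class)."""


class ToyImplementation(SearchInterface):
    """Toy backend that supports search but not patching."""

    LABELS = {
        "GO:0005773": "vacuole",
        "GO:0043231": "intracellular membrane-bounded organelle",
        "GO:0005622": "intracellular anatomical structure",
    }

    def basic_search(self, term):
        # Naive substring match over labels.
        return [curie for curie, label in self.LABELS.items() if term in label]


def supports(impl, interface):
    """Return True if `impl` is an instance of `interface`."""
    return isinstance(impl, interface)


toy = ToyImplementation()
if supports(toy, SearchInterface):
    print(toy.basic_search("intracellular"))
print(supports(toy, PatcherInterface))
```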

Loading from OWL ontologies using funowl

TODO

Wrapping remote ontology portals

OLS

TODO

BioPortal

TODO

Wrapping SPARQL Endpoints

Ubergraph

[12]:
from oaklib.implementations.ubergraph.ubergraph_implementation import UbergraphImplementation
oi = UbergraphImplementation()
[13]:
rels = oi.outgoing_relationship_map('GO:0005773')
for rel, parents in rels.items():
    print(f'  {rel} ! {oi.label(rel)}')
    for parent in parents:
        print(f'    {parent} ! {oi.label(parent)}')
WARNING:root:Multiple labels for BFO:0000050 = ['part of', 'part_of']
  BFO:0000050 ! part of
    GO:0005737 ! cytoplasm
  COB:0000072 ! part of
    GO:0005737 ! cytoplasm
  rdfs:subClassOf ! None
    GO:0043231 ! intracellular membrane-bounded organelle

Notes

Notice some differences from the other mechanisms:

  • ubergraph includes multiple ontologies, one of which ‘injects’ a legacy caro#part_of relationship

  • similarly, there are different injected labels for the part-of relation

  • the ubergraph implementation uses the actual predicate CURIE, whereas pronto currently uses the shortname

The loop above also makes multiple iterative calls to the API, one per label lookup, which is inefficient.

In future there will be an interface for ‘bigger’ operations that can be implemented more efficiently.
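
To make the cost of per-label lookups concrete, the sketch below counts round trips against a toy store, first one identifier at a time and then in a single batch. `CountingLabelStore` and its methods are hypothetical and defined inline. This is not the planned interface, only an illustration of why batching helps.

```python
class CountingLabelStore:
    """Toy remote store that counts round trips (hypothetical)."""

    def __init__(self, labels):
        self._labels = labels
        self.calls = 0

    def label(self, curie):
        """Look up one label; costs one round trip."""
        self.calls += 1
        return self._labels.get(curie)

    def labels(self, curies):
        """Look up many labels; costs one round trip in total."""
        self.calls += 1
        return {c: self._labels.get(c) for c in curies}


rels = {"rdfs:subClassOf": ["GO:0043231"], "BFO:0000050": ["GO:0005737"]}
store = CountingLabelStore({
    "rdfs:subClassOf": "subClassOf",
    "BFO:0000050": "part of",
    "GO:0043231": "intracellular membrane-bounded organelle",
    "GO:0005737": "cytoplasm",
})

# Iterative: one call per predicate and one per parent.
for rel, parents in rels.items():
    store.label(rel)
    for parent in parents:
        store.label(parent)
iterative_calls = store.calls

# Batched: collect every identifier first, then ask once.
store.calls = 0
wanted = set(rels) | {p for ps in rels.values() for p in ps}
label_map = store.labels(wanted)
batched_calls = store.calls

print(iterative_calls, batched_calls)
```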

Ontobee

Currently the ontobee implementation doesn’t allow the selection of a specific ontology within the triplestore – instead the whole store is treated as if it were one giant ontology with everything merged together.

This can be confusing when one ontology contains a stale part of another - e.g. if an ontology used to have a parent term but it has since been obsoleted.

[14]:
from oaklib.implementations.ontobee.ontobee_implementation import OntobeeImplementation
oi = OntobeeImplementation()
[26]:
rels = oi.outgoing_relationship_map('GO:0005773')
for rel, parents in rels.items():
    print(f'  {rel} ! {oi.label(rel)}')
    for parent in parents:
        print(f'    {parent} ! {oi.label(parent)}')
  rdfs:subClassOf ! subClassOf
    GO:0043231 ! intracellular membrane-bounded organelle
  BFO:0000050 ! part of
    GO:0005737 ! cytoplasm

Graph Operations

[16]:
oi = ProntoImplementation(OntologyResource(local=False, slug='go.obo'))
[17]:
ancs = oi.ancestors('GO:0005773')
for anc in list(ancs):
    print(f'{anc} ! {oi.label(anc)}')
GO:0016020 ! membrane
GO:0005773 ! vacuole
GO:0005622 ! intracellular anatomical structure
GO:0043227 ! membrane-bounded organelle
GO:0043229 ! intracellular organelle
GO:0043226 ! organelle
GO:0043231 ! intracellular membrane-bounded organelle
GO:0005737 ! cytoplasm
GO:0110165 ! cellular anatomical entity
GO:0005575 ! cellular_component
[19]:
from oaklib.datamodels.vocabulary import IS_A, PART_OF

ancs = oi.ancestors('GO:0005773', predicates=[IS_A, PART_OF])
for anc in list(ancs):
    print(f'{anc} ! {oi.label(anc)}')
GO:0005773 ! vacuole
GO:0005622 ! intracellular anatomical structure
GO:0043227 ! membrane-bounded organelle
GO:0043229 ! intracellular organelle
GO:0043226 ! organelle
GO:0043231 ! intracellular membrane-bounded organelle
GO:0005737 ! cytoplasm
GO:0110165 ! cellular anatomical entity
GO:0005575 ! cellular_component
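
The two listings differ only in GO:0016020 (membrane), which drops out once traversal is restricted to is-a and part-of edges. The toy traversal below illustrates the filtering idea without oaklib. The edge list is a small hand-written subset, the has-part edge to membrane is an assumption made for illustration, and `ancestors` here is a local function, not the library method.

```python
from collections import deque

# Local constants for this sketch only.
IS_A = "rdfs:subClassOf"
PART_OF = "BFO:0000050"
HAS_PART = "BFO:0000051"

EDGES = [
    ("GO:0005773", IS_A, "GO:0043231"),
    ("GO:0005773", PART_OF, "GO:0005737"),
    ("GO:0043231", IS_A, "GO:0043227"),
    ("GO:0043227", HAS_PART, "GO:0016020"),  # assumed edge, for illustration
]


def ancestors(start, edges, predicates=None):
    """Reflexive ancestors of `start`, optionally following only `predicates`."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for s, p, o in edges:
            allowed = predicates is None or p in predicates
            if s == node and allowed and o not in seen:
                seen.add(o)
                queue.append(o)
    return seen


print(sorted(ancestors("GO:0005773", EDGES)))
print(sorted(ancestors("GO:0005773", EDGES, predicates=[IS_A, PART_OF])))
```

As in the notebook output, the start term itself is included in the result.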
[20]:
def render(curie):
    return f'{curie} "{oi.label(curie)}"'
for s,p,o in oi.walk_up_relationship_graph('GO:0005773', predicates=[IS_A, PART_OF]):
    print(f'{render(s)} -{render(p)}-> {render(o)}')
GO:0005773 "vacuole" -rdfs:subClassOf "subClassOf"-> GO:0043231 "intracellular membrane-bounded organelle"
GO:0005773 "vacuole" -BFO:0000050 "part of"-> GO:0005737 "cytoplasm"
GO:0005737 "cytoplasm" -rdfs:subClassOf "subClassOf"-> GO:0110165 "cellular anatomical entity"
GO:0005737 "cytoplasm" -BFO:0000050 "part of"-> GO:0005622 "intracellular anatomical structure"
GO:0005622 "intracellular anatomical structure" -rdfs:subClassOf "subClassOf"-> GO:0110165 "cellular anatomical entity"
GO:0110165 "cellular anatomical entity" -rdfs:subClassOf "subClassOf"-> GO:0005575 "cellular_component"
GO:0043231 "intracellular membrane-bounded organelle" -rdfs:subClassOf "subClassOf"-> GO:0043227 "membrane-bounded organelle"
GO:0043231 "intracellular membrane-bounded organelle" -rdfs:subClassOf "subClassOf"-> GO:0043229 "intracellular organelle"
GO:0043229 "intracellular organelle" -rdfs:subClassOf "subClassOf"-> GO:0043226 "organelle"
GO:0043229 "intracellular organelle" -BFO:0000050 "part of"-> GO:0005622 "intracellular anatomical structure"
GO:0043226 "organelle" -rdfs:subClassOf "subClassOf"-> GO:0110165 "cellular anatomical entity"
GO:0043227 "membrane-bounded organelle" -rdfs:subClassOf "subClassOf"-> GO:0043226 "organelle"

Rendering using GraphViz

[21]:
from oaklib.utilities.obograph_utils import graph_to_image, default_stylemap_path
stylemap = default_stylemap_path()
[22]:
graph = oi.ancestor_graph('GO:0005773', predicates=[IS_A, PART_OF])
[23]:
graph_to_image(graph, ['GO:0005773'], stylemap=stylemap, imgfile='output/vacuole.png')
ERROR:root:No og2dot
You need to install a node package to be able to visualize results

npm install -g obographviz
Then set your path to include og2dot
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
Input In [23], in <cell line: 1>()
----> 1 graph_to_image(graph, ['GO:0005773'], stylemap=stylemap, imgfile='output/vacuole.png')

File ~/Documents/src/ontology-access-kit/src/oaklib/utilities/obograph_utils.py:102, in graph_to_image(graph, seeds, configure, stylemap, imgfile)
    100     print("npm install -g obographviz")
    101     print("Then set your path to include og2dot")
--> 102     raise Exception(
    103         f"Cannot find {exec} on path. Install from https://github.com/INCATools/obographviz"
    104     )
    105 with tempfile.NamedTemporaryFile(dir="/tmp", mode="w") as tmpfile:
    106     style = {}

Exception: Cannot find og2dot on path. Install from https://github.com/INCATools/obographviz
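
The failure above comes from og2dot not being on the path. A quick check before calling `graph_to_image` can be sketched with the standard library alone; `og2dot_available` is a helper name invented here.

```python
import shutil


def og2dot_available(executable="og2dot"):
    """Return True if the executable can be found on the current PATH."""
    return shutil.which(executable) is not None


if og2dot_available():
    print("og2dot found; graph_to_image should be able to run")
else:
    print("og2dot missing; try: npm install -g obographviz")
```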

Output

img