Ubergraph Tutorial

Ubergraph is a SPARQL endpoint serving multiple OBO ontologies, pre-processed with:

  • relation-graph

  • information content scores

  • biolink categories

This notebook demonstrates the use of OAK with the ubergraph adapter. Many of the same operations can be applied with other adapters. Advantages of ubergraph include:

  • multiple ontologies all joined together in one graph

  • complex SPARQL querying of multiple pieces of information, including entailed relationships

First we set up a Jupyter alias for convenience:

[1]:
alias ubergraph runoak -i ubergraph:

Next, let’s list the ontologies in Ubergraph. This list keeps growing, so if you try this you may see more entries than are shown here:

[2]:
ubergraph ontologies
aism/aism-base.owl
bspo.owl
chebi.owl
cl/cl-base.owl
cob/cob-base.owl
cob/components/cob-to-external.owl
colao/colao-base.owl
dpo/dpo-base.owl
eco/eco-base.owl
iao.owl
ecto/ecto-base.owl
emapa.owl
envo/envo-base.owl
fbbt/fbbt-base.owl
fbcv/fbcv-base.owl
fbdv/fbdv-base.owl
fypo/fypo-base.owl
go/extensions/go-bfo-bridge.owl
go/go-base.owl
hp/hp-base.owl
lepao/lepao-base.owl
ma.owl
maxo/maxo-base.owl
mi.owl
mmo.owl
mondo/mondo-base.owl
ncbitaxon.owl
ncit.owl
mp/mp-base.owl
nbo/nbo-base.owl
oba/oba-base.owl
obi/obi-base.owl
so.owl
pato/pato-base.owl
pcl/pcl-base.owl
pco/pco-base.owl
ro/ro-base.owl
uberon/bridge/cl-bridge-to-caro.owl
uberon/bridge/cl-bridge-to-fbbt.owl
uberon/bridge/uberon-bridge-to-caro.owl
uberon/bridge/uberon-bridge-to-fbbt.owl
uberon/uberon-base.owl
wbbt/wbbt-base.owl
wbls/wbls-base.owl
wbphenotype/wbphenotype-base.owl
zfa/zfa-base.owl
https://w3id.org/orcidio/orcidio.owl
pr/pr-asserted.owl
po.owl
https://raw.githubusercontent.com/PlantPhenoOntology/ppo/master/ppo.owl
apo.owl
mmusdv.owl
foodon.owl
to.owl
peco.owl
mro.owl
hao.owl
clao.owl
oarcs.owl
http://translator.renci.org/ubergraph-axioms.ofn

Basic lookup

We can use the OAK info command to look up various IDs:

[24]:
ubergraph info CL:0000540 CL:0000679 RO:0002100 UBERON:0002771
CL:0000540 ! neuron
CL:0000679 ! glutamatergic neuron
WARNING:root:Multiple labels for RO:0002100 = has_soma_location != has soma location
WARNING:root:Multiple labels for RO:0002100 = has_soma_location != has soma location
RO:0002100 ! has_soma_location
UBERON:0002771 ! middle temporal gyrus

Note the annoying warning messages. OAK is telling us it found more than one label for RO:0002100 and couldn’t determine the “definitive” one. One challenge for a merged graph like Ubergraph is that some sources may include stale imports with older labels and metadata. Worse, some may “inject” triples onto objects that don’t belong to them!

In future, OAK may include more advanced ways of retrieving labels such that trusted graphs are relied on over secondary ones; and it’s likely that some of these will be resolved in ubergraph.

But for now you can just use --quiet to silence the warnings.

Next let’s try querying for relationships:

[22]:
ubergraph --quiet relationships CL:0000540
subject subject_label   predicate       predicate_label object  object_label
CL:0000540      neuron  RO:0002215      capable of      GO:0019226      transmission of nerve impulse
CL:0000540      neuron  RO:0002216      capable of part of      GO:0007154      cell communication
CL:0000540      neuron  rdfs:subClassOf None    CL:0000393      electrically responsive cell
CL:0000540      neuron  rdfs:subClassOf None    CL:0000404      electrically signaling cell
CL:0000540      neuron  rdfs:subClassOf None    CL:0002319      neural cell

By default, only asserted relationships are shown. Use --include-entailed to also show entailed relationships (calculated ahead of time using relation-graph):

[25]:
ubergraph --quiet relationships CL:0000540 -p p --include-entailed | head -30

Note that there are a lot of trivial (but true) relationships there.
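
To make the asserted/entailed distinction concrete, here is a minimal Python sketch that derives entailed is-a edges from asserted ones by following chains of parents. It is only an illustration of transitivity, not how relation-graph is implemented, and it ignores every relation other than is-a. The two edges are copied from the relationships output in the final section of this tutorial; the function name entailed_edges is made up for this example.

```python
from collections import defaultdict

# Asserted is-a edges (child, parent), taken from output shown in this tutorial.
ASSERTED = [
    ("PCL:0023046", "PCL:0023127"),
    ("PCL:0023127", "CL:4023008"),
]


def entailed_edges(asserted):
    """Return every (term, ancestor) pair reachable over one or more asserted edges."""
    parents = defaultdict(set)
    for child, parent in asserted:
        parents[child].add(parent)

    closure = set()
    for start in list(parents):
        stack = list(parents[start])
        while stack:
            node = stack.pop()
            if (start, node) in closure:
                continue  # already visited from this start term
            closure.add((start, node))
            stack.extend(parents.get(node, ()))
    return closure


entailed = entailed_edges(ASSERTED)
# Edges that are entailed but were never asserted directly.
only_entailed = sorted(entailed - set(ASSERTED))
print(only_entailed)
```

With these two asserted edges, the only extra edge is the one that skips the intermediate class. In a large graph the number of such edges grows quickly, which is why the entailed output is so much longer than the asserted output.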

We can also use viz to visualize a subgraph, in this case around “neuron” (CL:0000540), following is-a and part-of relationships:

[4]:
ubergraph viz -p i,p CL:0000540 -o output/ubergraph-neuron.png

[Image: output/ubergraph-neuron.png]

If you have used the viz command on individual ontologies before, you may notice a few differences with ubergraph.

Note that some pairs of terms have cyclic is-a edges pointing at one another. This is not an error! It reflects the fact that the two concepts are equivalent. In future we may add something at the visual layer that compacts these down into a single bidirectional equivalence edge.
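
As a rough sketch of what such compaction could look like, the Python snippet below finds pairs of terms with is-a edges in both directions and reports them as equivalences. This is not part of OAK; the function name and the EX: identifiers are hypothetical placeholders chosen for the example.

```python
def collapse_equivalences(edges):
    """Split is-a edges into mutual pairs (equivalences) and ordinary edges."""
    edge_set = set(edges)
    equivalent = set()
    remaining = []
    for sub, sup in edges:
        if sub != sup and (sup, sub) in edge_set:
            # A is-a B and B is-a A: record the pair once, in sorted order.
            equivalent.add(tuple(sorted((sub, sup))))
        else:
            remaining.append((sub, sup))
    return sorted(equivalent), remaining


# Hypothetical edges: EX:A and EX:B point at each other, EX:A is-a EX:C.
edges = [("EX:A", "EX:B"), ("EX:B", "EX:A"), ("EX:A", "EX:C")]
equivalent, remaining = collapse_equivalences(edges)
print(equivalent)
print(remaining)
```

The two opposing edges between EX:A and EX:B collapse into one equivalence pair, and the ordinary edge to EX:C is left alone.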

Making SPARQL queries

You can make SPARQL queries using the query command.

Here we will do a complex query that involves two relationship constraints:

  • all glutamatergic neurons (CL:0000679) that have

  • a soma location (RO:0002100) in the middle temporal gyrus (UBERON:0002771)

[16]:
ubergraph query -q "?s rdfs:subClassOf CL:0000679 ; RO:0002100 UBERON:0002771" -P CL,RO,UBERON
WARNING:root:Auto-adding limit
Query has no LIMIT clause: SELECT * WHERE {?s rdfs:subClassOf CL:0000679 ; RO:0002100 UBERON:0002771} LIMIT 100
s       s_label
PCL:0023046     Exc L2 LAMP5 LTK middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023047     Exc L2-4 LINC00507 GLP2R middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023048     Exc L2-3 LINC00507 FREM3 middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023049     Exc L5-6 THEMIS C1QL3 middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023050     Exc L3-4 RORB CARM1P1 middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023051     Exc L3-5 RORB ESR1 middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023052     Exc L3-5 RORB COL22A1 middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023053     Exc L3-5 RORB FILIP1L middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023054     Exc L3-5 RORB TWIST2 middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023055     Exc L4-5 RORB FOLH1B middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023056     Exc L4-6 RORB SEMA3E middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023057     Exc L4-5 RORB DAPK2 middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023058     Exc L5-6 RORB TTC12 middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023059     Exc L4-6 RORB C1R middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023060     Exc L4-5 FEZF2 SCN4B middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023061     Exc L5-6 THEMIS DCSTAMP middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023062     Exc L5-6 THEMIS CRABP1 middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023063     Exc L5-6 THEMIS FGF10 middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023064     Exc L4-6 FEZF2 IL26 middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023065     Exc L5-6 FEZF2 ABO middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023066     Exc L6 FEZF2 SCUBE1 middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023067     Exc L5-6 FEZF2 IL15 middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023068     Exc L6 FEZF2 OR2T8 middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023069     Exc L5-6 FEZF2 EFTUD1P1 middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023127     L2/3 IT middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023130     L4 IT middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023134     L5 IT middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023138     L6 IT middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023142     L6b middle temporal gyrus glutamatergic neuron (Hsap)

Note that, for your convenience, a surrounding SELECT * WHERE {...} is added, along with a LIMIT clause if the query has none. The -P option adds prefix declarations for all specified prefixes using standard sources.
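
The Python sketch below shows roughly what the expanded query could look like once the wrapper and prefix declarations are in place. It is an illustration rather than OAK’s actual code: the function name expand_query is made up, and it assumes each prefix maps to the standard OBO PURL base (http://purl.obolibrary.org/obo/ followed by the prefix and an underscore).

```python
OBO_BASE = "http://purl.obolibrary.org/obo/"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"


def expand_query(pattern, prefixes, limit=100):
    """Wrap a bare graph pattern in PREFIX declarations, SELECT and LIMIT."""
    lines = [f"PREFIX rdfs: <{RDFS}>"]
    for prefix in prefixes:
        # Assumption: OBO-style expansion, e.g. CL: -> .../obo/CL_
        lines.append(f"PREFIX {prefix}: <{OBO_BASE}{prefix}_>")
    lines.append(f"SELECT * WHERE {{{pattern}}} LIMIT {limit}")
    return "\n".join(lines)


sparql = expand_query(
    "?s rdfs:subClassOf CL:0000679 ; RO:0002100 UBERON:0002771",
    ["CL", "RO", "UBERON"],
)
print(sparql)
```

The last line of the result matches the query echoed in the warning above. The PREFIX lines are what allow CURIEs such as CL:0000679 to be used directly inside the pattern.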

By default, labels are also fetched for all results (OAK tries to do this as efficiently as possible, avoiding iterative queries over the network). To suppress this, pass --no-autolabel:

[17]:
ubergraph query -q "?s rdfs:subClassOf CL:0000679 ; RO:0002100 UBERON:0002771" -P CL,RO,UBERON --no-autolabel
WARNING:root:Auto-adding limit
Query has no LIMIT clause: SELECT * WHERE {?s rdfs:subClassOf CL:0000679 ; RO:0002100 UBERON:0002771} LIMIT 100
s
PCL:0023046
PCL:0023047
PCL:0023048
PCL:0023049
PCL:0023050
PCL:0023051
PCL:0023052
PCL:0023053
PCL:0023054
PCL:0023055
PCL:0023056
PCL:0023057
PCL:0023058
PCL:0023059
PCL:0023060
PCL:0023061
PCL:0023062
PCL:0023063
PCL:0023064
PCL:0023065
PCL:0023066
PCL:0023067
PCL:0023068
PCL:0023069
PCL:0023127
PCL:0023130
PCL:0023134
PCL:0023138
PCL:0023142

Advanced: feeding query results to other commands

So far we have seen an example of using the query command. You can actually use the results of a query in many other commands.

In OAK, most commands accept a query term list. These are typically lists of IDs or labels - but they can also be query expressions that are evaluated on the fly. One such expression is .query.

Here we run the same query as above and feed the results into the relationships command, which lets us see the direct superclasses of all glutamatergic neurons in the middle temporal gyrus (MTG):

[30]:
ubergraph --quiet relationships -p i .query//prefixes=CL,RO,UBERON "?s rdfs:subClassOf CL:0000679 ; RO:0002100 UBERON:0002771"
subject subject_label   predicate       predicate_label object  object_label
PCL:0023046     Exc L2 LAMP5 LTK middle temporal gyrus glutamatergic neuron (Hsap)      rdfs:subClassOf None    PCL:0023127     L2/3 IT middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023047     Exc L2-4 LINC00507 GLP2R middle temporal gyrus glutamatergic neuron (Hsap)      rdfs:subClassOf None    PCL:0023127     L2/3 IT middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023048     Exc L2-3 LINC00507 FREM3 middle temporal gyrus glutamatergic neuron (Hsap)      rdfs:subClassOf None    PCL:0023127     L2/3 IT middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023049     Exc L5-6 THEMIS C1QL3 middle temporal gyrus glutamatergic neuron (Hsap) rdfs:subClassOf None    CL:0000679      glutamatergic neuron
PCL:0023049     Exc L5-6 THEMIS C1QL3 middle temporal gyrus glutamatergic neuron (Hsap) rdfs:subClassOf None    CL:0010012      cerebral cortex neuron
PCL:0023050     Exc L3-4 RORB CARM1P1 middle temporal gyrus glutamatergic neuron (Hsap) rdfs:subClassOf None    PCL:0023130     L4 IT middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023051     Exc L3-5 RORB ESR1 middle temporal gyrus glutamatergic neuron (Hsap)    rdfs:subClassOf None    PCL:0023130     L4 IT middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023052     Exc L3-5 RORB COL22A1 middle temporal gyrus glutamatergic neuron (Hsap) rdfs:subClassOf None    PCL:0023130     L4 IT middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023053     Exc L3-5 RORB FILIP1L middle temporal gyrus glutamatergic neuron (Hsap) rdfs:subClassOf None    PCL:0023130     L4 IT middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023054     Exc L3-5 RORB TWIST2 middle temporal gyrus glutamatergic neuron (Hsap)  rdfs:subClassOf None    PCL:0023130     L4 IT middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023055     Exc L4-5 RORB FOLH1B middle temporal gyrus glutamatergic neuron (Hsap)  rdfs:subClassOf None    PCL:0023134     L5 IT middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023056     Exc L4-6 RORB SEMA3E middle temporal gyrus glutamatergic neuron (Hsap)  rdfs:subClassOf None    PCL:0023134     L5 IT middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023057     Exc L4-5 RORB DAPK2 middle temporal gyrus glutamatergic neuron (Hsap)   rdfs:subClassOf None    PCL:0023134     L5 IT middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023058     Exc L5-6 RORB TTC12 middle temporal gyrus glutamatergic neuron (Hsap)   rdfs:subClassOf None    PCL:0023134     L5 IT middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023059     Exc L4-6 RORB C1R middle temporal gyrus glutamatergic neuron (Hsap)     rdfs:subClassOf None    PCL:0023134     L5 IT middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023060     Exc L4-5 FEZF2 SCN4B middle temporal gyrus glutamatergic neuron (Hsap)  rdfs:subClassOf None    CL:4023041      L5 extratelencephalic projecting glutamatergic cortical neuron
PCL:0023061     Exc L5-6 THEMIS DCSTAMP middle temporal gyrus glutamatergic neuron (Hsap)       rdfs:subClassOf None    PCL:0023138     L6 IT middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023062     Exc L5-6 THEMIS CRABP1 middle temporal gyrus glutamatergic neuron (Hsap)        rdfs:subClassOf None    PCL:0023138     L6 IT middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023063     Exc L5-6 THEMIS FGF10 middle temporal gyrus glutamatergic neuron (Hsap) rdfs:subClassOf None    PCL:0023138     L6 IT middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023064     Exc L4-6 FEZF2 IL26 middle temporal gyrus glutamatergic neuron (Hsap)   rdfs:subClassOf None    CL:4023012      near-projecting glutamatergic cortical neuron
PCL:0023065     Exc L5-6 FEZF2 ABO middle temporal gyrus glutamatergic neuron (Hsap)    rdfs:subClassOf None    CL:4023042      L6 corticothalamic-projecting glutamatergic cortical neuron
PCL:0023066     Exc L6 FEZF2 SCUBE1 middle temporal gyrus glutamatergic neuron (Hsap)   rdfs:subClassOf None    PCL:0023142     L6b middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023067     Exc L5-6 FEZF2 IL15 middle temporal gyrus glutamatergic neuron (Hsap)   rdfs:subClassOf None    PCL:0023142     L6b middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023068     Exc L6 FEZF2 OR2T8 middle temporal gyrus glutamatergic neuron (Hsap)    rdfs:subClassOf None    PCL:0023142     L6b middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023069     Exc L5-6 FEZF2 EFTUD1P1 middle temporal gyrus glutamatergic neuron (Hsap)       rdfs:subClassOf None    PCL:0023142     L6b middle temporal gyrus glutamatergic neuron (Hsap)
PCL:0023127     L2/3 IT middle temporal gyrus glutamatergic neuron (Hsap)       rdfs:subClassOf None    CL:4023008      intratelencephalic-projecting glutamatergic cortical neuron
PCL:0023130     L4 IT middle temporal gyrus glutamatergic neuron (Hsap) rdfs:subClassOf None    CL:4023008      intratelencephalic-projecting glutamatergic cortical neuron
PCL:0023134     L5 IT middle temporal gyrus glutamatergic neuron (Hsap) rdfs:subClassOf None    CL:4023008      intratelencephalic-projecting glutamatergic cortical neuron
PCL:0023138     L6 IT middle temporal gyrus glutamatergic neuron (Hsap) rdfs:subClassOf None    CL:4023008      intratelencephalic-projecting glutamatergic cortical neuron
PCL:0023142     L6b middle temporal gyrus glutamatergic neuron (Hsap)   rdfs:subClassOf None    CL:4023038      L6b glutamatergic cortical neuron
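
If you save this output to a file, it is easy to post-process. The Python sketch below groups subjects by their direct superclass, assuming the output is tab-separated with the header row shown above. Only three of the rows are embedded here to keep the example short; the variable names are arbitrary.

```python
import csv
import io
from collections import defaultdict

HEADER = ["subject", "subject_label", "predicate", "predicate_label", "object", "object_label"]
ROWS = [
    ["PCL:0023066", "Exc L6 FEZF2 SCUBE1 middle temporal gyrus glutamatergic neuron (Hsap)",
     "rdfs:subClassOf", "None", "PCL:0023142", "L6b middle temporal gyrus glutamatergic neuron (Hsap)"],
    ["PCL:0023067", "Exc L5-6 FEZF2 IL15 middle temporal gyrus glutamatergic neuron (Hsap)",
     "rdfs:subClassOf", "None", "PCL:0023142", "L6b middle temporal gyrus glutamatergic neuron (Hsap)"],
    ["PCL:0023142", "L6b middle temporal gyrus glutamatergic neuron (Hsap)",
     "rdfs:subClassOf", "None", "CL:4023038", "L6b glutamatergic cortical neuron"],
]
# Rebuild the text as it would appear in a saved TSV file (assumed format).
TSV = "\n".join("\t".join(row) for row in [HEADER] + ROWS)

children_by_parent = defaultdict(list)
for row in csv.DictReader(io.StringIO(TSV), delimiter="\t"):
    children_by_parent[row["object"]].append(row["subject"])

for parent, children in children_by_parent.items():
    print(parent, children)
```

Grouping this way makes the two-level structure visible: the transcriptomic types sit under the L6b MTG class, which in turn sits under the general CL class.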