OAK disjoints command
This notebook is intended as a supplement to the main OAK CLI docs.
This notebook provides examples for the disjoints command, which can be used to look up and summarize disjointness axioms.
For more on disjointness, see The OBook.
Help Option
You can get help on any OAK command using --help
[2]:
!runoak disjoints --help
Usage: runoak disjoints [OPTIONS] [TERMS]...
Show all disjoints for a set of terms, or whole ontology.
Leave off all arguments for defaults - all terms, YAML OboGraph model
serialization:
Example:
runoak -i sqlite:obo:uberon disjoints
Note that this will include pairwise disjoints, setwise disjoints, disjoint
unions, and disjoints involving simple class expressions.
A tabular format can be easier to browse, and includes labels by default:
Example:
runoak -i sqlite:obo:uberon disjoints --autolabel -O csv
To perform this on a subset:
Example:
runoak -i sqlite:obo:cl disjoints --autolabel -O csv .desc//p=i "immune
cell"
Data model:
https://w3id.org/oak/obograph
Options:
-p, --predicates TEXT A comma-separated list of predicates. This
may be a shorthand (i, p) or CURIE
--autolabel / --no-autolabel If set, results will automatically have
labels assigned [default: autolabel]
-O, --output-type TEXT Desired output type
--named-classes-only / --no-namde-classes-only
Only show disjointness axioms between two
named classes. [default: no-namde-classes-
only]
-o, --output FILENAME Output file, e.g. obo file
--help Show this message and exit.
Set up an alias
For convenience we will set up an alias for use in this notebook
[3]:
alias cl runoak -i sqlite:obo:cl
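The alias above is an IPython convenience and only exists inside the notebook session. As a rough sketch, the same invocations could be assembled from a plain Python script. The helper name disjoints_argv is hypothetical, and it uses only the flags listed in the help text above.

```python
import shlex


def disjoints_argv(ontology, terms=(), named_classes_only=False,
                   predicates=None, output_type=None, output=None):
    """Build the argument list for `runoak -i sqlite:obo:<ontology> disjoints`.

    Hypothetical helper: it only assembles arguments and runs nothing.
    """
    argv = ["runoak", "-i", f"sqlite:obo:{ontology}", "disjoints"]
    if named_classes_only:
        argv.append("--named-classes-only")
    if predicates:
        argv += ["-p", ",".join(predicates)]
    if output_type:
        argv += ["-O", output_type]
    if output:
        argv += ["-o", output]
    argv += list(terms)
    return argv


argv = disjoints_argv("cl", named_classes_only=True, output_type="csv")
# Show the command line as it would be typed in a shell.
print(shlex.join(argv))
```

If runoak is installed, the list can be handed to subprocess.run(argv, check=True) to execute it.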
All simple disjointness axioms
Let’s first look at all simple disjointness axioms in the ontology, i.e. those between named classes.
[6]:
cl disjoints --named-classes-only > output/cl-disjoints.yaml
[7]:
!head -40 output/cl-disjoints.yaml
classIds:
- BFO:0000002
- BFO:0000003
---
classIds:
- BFO:0000004
- BFO:0000031
---
classIds:
- BFO:0000004
- BFO:0000020
---
classIds:
- BFO:0000016
- BFO:0000023
---
classIds:
- BFO:0000017
- BFO:0000019
---
classIds:
- BFO:0000020
- BFO:0000031
---
classIds:
- BFO:0000040
- BFO:0000141
---
classIds:
- CARO:0000006
- CARO:0000007
---
The YAML here is conformant with OboGraphs. However, it’s not very convenient for viewing, so let’s get a flattened view, first in OBO format and then as a TSV.
[19]:
cl disjoints --named-classes-only -O obo > output/cl-disjoints.obo
WARNING:root:Skipping DisjointClassExpressionsAxiom with only one class: DisjointClassExpressionsAxiom(meta=None, classIds=['_:riog00151338'], classExpressions=[], unionEquivalentTo=None, unionEquivalentToExpression=None)
[21]:
!head -20 output/cl-disjoints.obo
[Term]
id: BFO:0000002 ! continuant
disjoint_from: BFO:0000003 ! occurrent
[Term]
id: BFO:0000004 ! independent continuant
disjoint_from: BFO:0000031 ! generically dependent continuant
[Term]
id: BFO:0000004 ! independent continuant
disjoint_from: BFO:0000020 ! specifically dependent continuant
[Term]
id: BFO:0000016 ! disposition
disjoint_from: BFO:0000023 ! role
[8]:
cl disjoints --named-classes-only -O csv > output/cl-disjoints.tsv
[9]:
import pandas as pd
df = pd.read_csv("output/cl-disjoints.tsv", sep="\t")
df
[9]:
 | classIds | classIds_label | unionEquivalentTo | unionEquivalentToExpression | classExpressionPropertyIds | classExpressionFillerIds |
---|---|---|---|---|---|---|
0 | BFO:0000002|BFO:0000003 | continuant|occurrent | NaN | NaN | NaN | NaN |
1 | BFO:0000004|BFO:0000031 | independent continuant|generically dependent c... | NaN | NaN | NaN | NaN |
2 | BFO:0000004|BFO:0000020 | independent continuant|specifically dependent ... | NaN | NaN | NaN | NaN |
3 | BFO:0000016|BFO:0000023 | disposition|role | NaN | NaN | NaN | NaN |
4 | BFO:0000017|BFO:0000019 | realizable entity|quality | NaN | NaN | NaN | NaN |
... | ... | ... | ... | ... | ... | ... |
309 | UBERON:0035165|UBERON:0035523 | posterior surface of prostate|anterior surface... | NaN | NaN | NaN | NaN |
310 | UBERON:2001156|UBERON:2001316 | posterior lateral line placode|anterior latera... | NaN | NaN | NaN | NaN |
311 | UBERON:2001314|UBERON:2001391 | posterior lateral line ganglion|anterior later... | NaN | NaN | NaN | NaN |
312 | UBERON:2001468|UBERON:2001471 | anterior lateral line system|posterior lateral... | NaN | NaN | NaN | NaN |
313 | _:riog00151338 | NaN | NaN | NaN | NaN | NaN |
314 rows × 6 columns
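In the flattened table, multivalued fields such as classIds and classIds_label pack several values into one cell, separated by a pipe character. A minimal sketch of unpacking them with the standard library follows. The sample rows are copied from the output above, the helper name split_multivalued is made up for illustration, and the approach assumes that no label itself contains a pipe.

```python
import csv
import io

# A few rows copied from the table above, in the tab-separated layout
# that this notebook reads back with sep="\t".
SAMPLE = (
    "classIds\tclassIds_label\n"
    "BFO:0000002|BFO:0000003\tcontinuant|occurrent\n"
    "BFO:0000016|BFO:0000023\tdisposition|role\n"
    "_:riog00151338\t\n"
)


def split_multivalued(cell):
    """Split a pipe-delimited cell into a list; an empty cell gives []."""
    return cell.split("|") if cell else []


rows = []
for record in csv.DictReader(io.StringIO(SAMPLE), delimiter="\t"):
    ids = split_multivalued(record["classIds"])
    labels = split_multivalued(record["classIds_label"])
    rows.append((ids, labels))

for ids, labels in rows:
    print(ids, labels)
```

The last row shows the blank-node entry: it has a single ID and no labels, so it does not describe a pair of named classes.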
Note that many of the columns will never be filled so long as we are querying simple (named classes only) disjoints.
The results also include axioms from the many ontologies that are merged into CL, such as BFO, CARO, and Uberon.
We can restrict the results by ID prefix using an i^ (identifier starts with) query:
[14]:
cl disjoints --named-classes-only -O csv i^CL: > output/cl-disjoints-cell-types.tsv
[15]:
df = pd.read_csv("output/cl-disjoints-cell-types.tsv", sep="\t")
df
[15]:
 | classIds | classIds_label | unionEquivalentTo | unionEquivalentToExpression | classExpressionPropertyIds | classExpressionFillerIds |
---|---|---|---|---|---|---|
0 | CL:0000000|GO:0043226 | cell|organelle | NaN | NaN | NaN | NaN |
1 | CL:0000000|GO:0032991 | cell|protein-containing complex | NaN | NaN | NaN | NaN |
2 | CL:0000000|GO:0031012 | cell|extracellular matrix | NaN | NaN | NaN | NaN |
3 | CL:0000039|CL:0002371 | germ line cell|somatic cell | NaN | NaN | NaN | NaN |
4 | CL:0000049|CL:0000557 | common myeloid progenitor|granulocyte monocyte... | NaN | NaN | NaN | NaN |
5 | CL:0000049|CL:0000051 | common myeloid progenitor|common lymphoid prog... | NaN | NaN | NaN | NaN |
6 | CL:0000049|CL:0000050 | common myeloid progenitor|megakaryocyte-erythr... | NaN | NaN | NaN | NaN |
7 | CL:0000050|CL:0002009 | megakaryocyte-erythroid progenitor cell|macrop... | NaN | NaN | NaN | NaN |
8 | CL:0000050|CL:0000557 | megakaryocyte-erythroid progenitor cell|granul... | NaN | NaN | NaN | NaN |
9 | CL:0000066|CL:0000738 | epithelial cell|leukocyte | NaN | NaN | NaN | NaN |
10 | CL:0000084|CL:0000945 | T cell|lymphocyte of B lineage | NaN | NaN | NaN | NaN |
11 | CL:0000225|CL:0002242 | anucleate cell|nucleate cell | NaN | NaN | NaN | NaN |
12 | CL:0000255|CL:0000520 | eukaryotic cell|prokaryotic cell | NaN | NaN | NaN | NaN |
13 | CL:0000451|CL:0000542 | dendritic cell|lymphocyte | NaN | NaN | NaN | NaN |
14 | CL:0000521|CL:0000548 | fungal cell|animal cell | NaN | NaN | NaN | NaN |
15 | CL:0000542|CL:0000766 | lymphocyte|myeloid leukocyte | NaN | NaN | NaN | NaN |
16 | CL:0000556|CL:0000764 | megakaryocyte|erythroid lineage cell | NaN | NaN | NaN | NaN |
17 | CL:0000624|CL:0000625 | CD4-positive, alpha-beta T cell|CD8-positive, ... | NaN | NaN | NaN | NaN |
18 | CL:0000737|CL:0008000 | striated muscle cell|non-striated muscle cell | NaN | NaN | NaN | NaN |
19 | CL:0000785|CL:0000818 | mature B cell|transitional stage B cell | NaN | NaN | NaN | NaN |
20 | CL:0000785|CL:0000817 | mature B cell|precursor B cell | NaN | NaN | NaN | NaN |
21 | CL:0000785|CL:0000816 | mature B cell|immature B cell | NaN | NaN | NaN | NaN |
22 | CL:0000789|CL:0000798 | alpha-beta T cell|gamma-delta T cell | NaN | NaN | NaN | NaN |
23 | CL:0000813|CL:0000898 | memory T cell|naive T cell | NaN | NaN | NaN | NaN |
24 | CL:0000817|CL:0000826 | precursor B cell|pro-B cell | NaN | NaN | NaN | NaN |
25 | CL:0000823|CL:0000937 | immature natural killer cell|pre-natural kille... | NaN | NaN | NaN | NaN |
26 | CL:0000823|CL:0000824 | immature natural killer cell|mature natural ki... | NaN | NaN | NaN | NaN |
27 | CL:0000837|CL:0002032 | hematopoietic multipotent progenitor cell|hema... | NaN | NaN | NaN | NaN |
28 | CL:0000838|CL:0000839 | lymphoid lineage restricted progenitor cell|my... | NaN | NaN | NaN | NaN |
29 | CL:0000851|CL:0000855 | neuromast mantle cell|sensory hair cell | NaN | NaN | NaN | NaN |
30 | CL:0000852|CL:0000855 | neuromast supporting cell|sensory hair cell | NaN | NaN | NaN | NaN |
31 | CL:0000955|CL:0000956 | pre-B-II cell|pre-B-I cell | NaN | NaN | NaN | NaN |
32 | CL:0001008|CL:0001024 | Kit and Sca1-positive hematopoietic stem cell|... | NaN | NaN | NaN | NaN |
33 | CL:0001021|CL:0001025 | CD34-positive, CD38-positive common lymphoid p... | NaN | NaN | NaN | NaN |
34 | CL:0001023|CL:0001026 | Kit-positive, CD34-positive common myeloid pro... | NaN | NaN | NaN | NaN |
35 | CL:0002031|CL:0002032 | hematopoietic lineage restricted progenitor ce... | NaN | NaN | NaN | NaN |
36 | CL:0002036|CL:0002043 | Slamf1-positive multipotent progenitor cell|CD... | NaN | NaN | NaN | NaN |
37 | CL:0008011|CL:0008020 | skeletal muscle satellite stem cell|skeletal m... | NaN | NaN | NaN | NaN |
38 | CL:0008046|CL:0008047 | extrafusal muscle fiber|intrafusal muscle fiber | NaN | NaN | NaN | NaN |
39 | _:riog00151338 | NaN | NaN | NaN | NaN | NaN |
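A table of pairs is awkward when the question is "which classes is a given class declared disjoint with?". The sketch below indexes the pairs by class ID. The sample cells are copied from the table above, and the function name disjoint_partners is an assumption for illustration. It covers asserted axioms only; disjointness inherited by subclasses is not computed.

```python
from collections import defaultdict
from itertools import combinations

# classIds cells copied from the table above.
CELLS = [
    "CL:0000049|CL:0000557",
    "CL:0000049|CL:0000051",
    "CL:0000049|CL:0000050",
    "CL:0000050|CL:0000557",
    "_:riog00151338",
]


def disjoint_partners(cells):
    """Map each class ID to the set of IDs it is asserted disjoint with."""
    partners = defaultdict(set)
    for cell in cells:
        ids = cell.split("|")
        # A cell with a single ID (e.g. a lone blank node) yields no pairs.
        for a, b in combinations(ids, 2):
            partners[a].add(b)
            partners[b].add(a)
    return dict(partners)


partners = disjoint_partners(CELLS)
print(sorted(partners["CL:0000049"]))
# → ['CL:0000050', 'CL:0000051', 'CL:0000557']
```

Using combinations rather than unpacking exactly two IDs means a setwise axiom with three or more members would also be expanded into its pairs.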
Disjoint Class Expressions
Some ontologies like Uberon make use of more advanced disjointness concepts in order to express things like spatial disjointness. See Uberon wiki.
In OWL terms these are formally known as “General Class Inclusion Axioms”. However, OAK shields you from this and provides these using a simple data model.
To include part-of in lookups, use the --predicates (-p) option (this is a standard OAK option for any command involving relationship types).
Here we will find all spatial disjointness axioms between major organism subdivisions in Uberon:
[16]:
alias uberon runoak -i sqlite:obo:uberon
[18]:
uberon disjoints -p i,p .desc//p=i "subdivision of organism along main body axis"
classExpressions:
- fillerId: UBERON:0000026
propertyId: BFO:0000050
- fillerId: UBERON:0000915
propertyId: BFO:0000050
---
classExpressions:
- fillerId: UBERON:0000026
propertyId: BFO:0000050
- fillerId: UBERON:0002100
propertyId: BFO:0000050
---
classExpressions:
- fillerId: UBERON:0000033
propertyId: BFO:0000050
- fillerId: UBERON:0000915
propertyId: BFO:0000050
---
classExpressions:
- fillerId: UBERON:0000033
propertyId: BFO:0000050
- fillerId: UBERON:0000948
propertyId: BFO:0000050
---
classExpressions:
- fillerId: UBERON:0000033
propertyId: BFO:0000050
- fillerId: UBERON:0002100
propertyId: BFO:0000050
---
classExpressions:
- fillerId: UBERON:0000033
propertyId: BFO:0000050
- fillerId: UBERON:0005886
propertyId: BFO:0000050
---
classExpressions:
- fillerId: UBERON:0000915
propertyId: BFO:0000050
- fillerId: UBERON:0002417
propertyId: BFO:0000050
---
classIds:
- _:riog00226101
---
classIds:
- _:riog00226236
---
classIds:
- _:riog00226251
---
classIds:
- _:riog00226988
The OAK OboGraphs data model here allows each axiom to include a list of class expressions. Each class expression is a tuple of a predicate (property) and a filler.
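To make the shape of such a record concrete, here is a small sketch using plain dataclasses. These are illustrative stand-ins that mirror the field names in the YAML above (classIds, classExpressions, propertyId, fillerId); they are not the classes that OAK itself provides.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ClassExpression:
    """A (predicate, filler) pair, read as `propertyId some fillerId`."""
    propertyId: str
    fillerId: str


@dataclass
class DisjointAxiom:
    """Illustrative stand-in for one record in the YAML output."""
    classIds: List[str] = field(default_factory=list)
    classExpressions: List[ClassExpression] = field(default_factory=list)

    def members(self):
        """All operands of the axiom, rendered as strings."""
        named = list(self.classIds)
        anonymous = [
            f"({e.propertyId} some {e.fillerId})" for e in self.classExpressions
        ]
        return named + anonymous


# First record from the YAML output above.
axiom = DisjointAxiom(
    classExpressions=[
        ClassExpression("BFO:0000050", "UBERON:0000026"),
        ClassExpression("BFO:0000050", "UBERON:0000915"),
    ]
)
rendered = " DisjointWith ".join(axiom.members())
print(rendered)
# → (BFO:0000050 some UBERON:0000026) DisjointWith (BFO:0000050 some UBERON:0000915)
```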
We can look at the flattened view:
[22]:
uberon disjoints -p i,p .desc//p=i "subdivision of organism along main body axis" -O csv -o output/uberon-part-disjoint-subdivisions.tsv
[23]:
df = pd.read_csv("output/uberon-part-disjoint-subdivisions.tsv", sep="\t")
df
[23]:
 | classIds | unionEquivalentTo | unionEquivalentToExpression | classExpressionPropertyIds | classExpressionPropertyIds_label | classExpressionFillerIds | classExpressionFillerIds_label |
---|---|---|---|---|---|---|---|
0 | NaN | NaN | NaN | BFO:0000050|BFO:0000050 | part of|part of | UBERON:0000026|UBERON:0000915 | appendage|thoracic segment of trunk |
1 | NaN | NaN | NaN | BFO:0000050|BFO:0000050 | part of|part of | UBERON:0000026|UBERON:0002100 | appendage|trunk |
2 | NaN | NaN | NaN | BFO:0000050|BFO:0000050 | part of|part of | UBERON:0000033|UBERON:0000915 | head|thoracic segment of trunk |
3 | NaN | NaN | NaN | BFO:0000050|BFO:0000050 | part of|part of | UBERON:0000033|UBERON:0000948 | head|heart |
4 | NaN | NaN | NaN | BFO:0000050|BFO:0000050 | part of|part of | UBERON:0000033|UBERON:0002100 | head|trunk |
5 | NaN | NaN | NaN | BFO:0000050|BFO:0000050 | part of|part of | UBERON:0000033|UBERON:0005886 | head|post-hyoid pharyngeal arch skeleton |
6 | NaN | NaN | NaN | BFO:0000050|BFO:0000050 | part of|part of | UBERON:0000915|UBERON:0002417 | thoracic segment of trunk|abdominal segment of... |
7 | _:riog00226101 | NaN | NaN | NaN | NaN | NaN | NaN |
8 | _:riog00226236 | NaN | NaN | NaN | NaN | NaN | NaN |
9 | _:riog00226251 | NaN | NaN | NaN | NaN | NaN | NaN |
10 | _:riog00226988 | NaN | NaN | NaN | NaN | NaN | NaN |
Here the disjointness axiom states that all classIds and all predicate-filler expressions are mutually disjoint.
This is telling us that nothing is part of both an “appendage” and “thoracic segment of trunk”, i.e. there is no spatial overlap.
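The property and filler columns are parallel pipe-delimited lists, so the n-th property belongs with the n-th filler. A minimal sketch of turning one flattened row into a readable statement follows; the function name describe is hypothetical, and the label cells are copied from row 0 of the table above.

```python
def describe(property_labels, filler_labels):
    """Render parallel pipe-delimited label cells as one readable statement."""
    props = property_labels.split("|")
    fillers = filler_labels.split("|")
    if len(props) != len(fillers):
        raise ValueError("property and filler cells must have the same length")
    expressions = [f"{p} some '{f}'" for p, f in zip(props, fillers)]
    return "nothing is both " + " and ".join(expressions)


statement = describe("part of|part of", "appendage|thoracic segment of trunk")
print(statement)
# → nothing is both part of some 'appendage' and part of some 'thoracic segment of trunk'
```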
Generating disjointness axioms
Many ontologies are under-axiomatized. Editors sometimes struggle to add the appropriate disjointness axioms.
OAK provides a heuristic approach to suggesting disjointness axioms.
First we will explore this using the Zebrafish anatomy ontology as an example. We will find candidate pairwise disjoints under “bone element”:
[34]:
alias zfa runoak -i sqlite:obo:zfa
[37]:
zfa generate-disjoints "bone element" -O csv -o output/zfa-bone-element-gen-disjoint.tsv
[38]:
df = pd.read_csv("output/zfa-bone-element-gen-disjoint.tsv", sep="\t")
df
[38]:
 | classIds | classIds_label | unionEquivalentTo | unionEquivalentToExpression | classExpressionPropertyIds | classExpressionFillerIds |
---|---|---|---|---|---|---|
0 | ZFA:0000170|ZFA:0000658 | basibranchial|epibranchial bone | NaN | NaN | NaN | NaN |
1 | ZFA:0000442|ZFA:0000658 | supraneural|epibranchial bone | NaN | NaN | NaN | NaN |
2 | ZFA:0000442|ZFA:0000170 | supraneural|basibranchial | NaN | NaN | NaN | NaN |
3 | ZFA:0001066|ZFA:0000658 | neural arch|epibranchial bone | NaN | NaN | NaN | NaN |
4 | ZFA:0001066|ZFA:0000170 | neural arch|basibranchial | NaN | NaN | NaN | NaN |
... | ... | ... | ... | ... | ... | ... |
60 | ZFA:0001418|ZFA:0001551 | dorsal fin lepidotrichium|pectoral fin lepidot... | NaN | NaN | NaN | NaN |
61 | ZFA:0001421|ZFA:0001552 | anal fin lepidotrichium|pelvic fin lepidotrichium | NaN | NaN | NaN | NaN |
62 | ZFA:0001421|ZFA:0001550 | anal fin lepidotrichium|caudal fin lepidotrichium | NaN | NaN | NaN | NaN |
63 | ZFA:0001421|ZFA:0001551 | anal fin lepidotrichium|pectoral fin lepidotri... | NaN | NaN | NaN | NaN |
64 | ZFA:0001421|ZFA:0001418 | anal fin lepidotrichium|dorsal fin lepidotrichium | NaN | NaN | NaN | NaN |
65 rows × 6 columns
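This notebook does not spell out the heuristic behind generate-disjoints. Purely as an illustration, and not as a description of OAK's implementation, one simple heuristic of this kind is to propose pairs of sibling classes that share no descendants. The toy hierarchy and its class names below are made up.

```python
from itertools import combinations

# Toy is-a hierarchy (child -> parents); the names are made up for illustration.
PARENTS = {
    "fin ray": ["bone"],
    "vertebra": ["bone"],
    "dermal bone": ["bone"],
    "dorsal fin ray": ["fin ray"],
    "dermal fin ray": ["fin ray", "dermal bone"],
}


def descendants(term, parents):
    """Return term plus every class reachable by following is-a downwards."""
    found = {term}
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for child, ps in parents.items():
            if current in ps and child not in found:
                found.add(child)
                frontier.append(child)
    return found


def candidate_disjoints(root, parents):
    """Pair up direct children of root whose descendant sets do not overlap."""
    children = sorted(c for c, ps in parents.items() if root in ps)
    return [
        (a, b)
        for a, b in combinations(children, 2)
        if not descendants(a, parents) & descendants(b, parents)
    ]


candidates = candidate_disjoints("bone", PARENTS)
print(candidates)
# → [('dermal bone', 'vertebra'), ('fin ray', 'vertebra')]
```

The pair (dermal bone, fin ray) is not proposed because "dermal fin ray" sits under both, so declaring them disjoint would make that class unsatisfiable. Candidates from any heuristic like this still need review by an editor.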
Generating spatial disjointness axioms
Pass in predicates to also generate candidate OWL axioms of the form
(part-of some X) DisjointWith (part-of some Y)
[40]:
zfa generate-disjoints "paired fin skeleton" -p i,p -O csv -o output/zfa-skel-gen-part-disjoint.tsv
[41]:
df = pd.read_csv("output/zfa-skel-gen-part-disjoint.tsv", sep="\t")
df
[41]:
 | classIds | unionEquivalentTo | unionEquivalentToExpression | classExpressionPropertyIds | classExpressionPropertyIds_label | classExpressionFillerIds | classExpressionFillerIds_label |
---|---|---|---|---|---|---|---|
0 | NaN | NaN | NaN | BFO:0000050|BFO:0000050 | part of|part of | ZFA:0000943|ZFA:0001387 | pectoral fin skeleton|pelvic fin skeleton |
1 | NaN | NaN | NaN | BFO:0000050|BFO:0000050 | part of|part of | ZFA:0000257|ZFA:0001586 | pectoral fin cartilage|pectoral fin radial |
2 | NaN | NaN | NaN | BFO:0000050|BFO:0000050 | part of|part of | ZFA:0001551|ZFA:0001586 | pectoral fin lepidotrichium|pectoral fin radial |
3 | NaN | NaN | NaN | BFO:0000050|BFO:0000050 | part of|part of | ZFA:0001551|ZFA:0000257 | pectoral fin lepidotrichium|pectoral fin carti... |
4 | NaN | NaN | NaN | BFO:0000050|BFO:0000050 | part of|part of | ZFA:0000407|ZFA:0001586 | pectoral girdle|pectoral fin radial |
5 | NaN | NaN | NaN | BFO:0000050|BFO:0000050 | part of|part of | ZFA:0000407|ZFA:0001551 | pectoral girdle|pectoral fin lepidotrichium |
6 | NaN | NaN | NaN | BFO:0000050|BFO:0000050 | part of|part of | ZFA:0001552|ZFA:0000508 | pelvic fin lepidotrichium|pelvic radial |
7 | NaN | NaN | NaN | BFO:0000050|BFO:0000050 | part of|part of | ZFA:0001459|ZFA:0000508 | pelvic fin cartilage|pelvic radial |
8 | NaN | NaN | NaN | BFO:0000050|BFO:0000050 | part of|part of | ZFA:0001459|ZFA:0001552 | pelvic fin cartilage|pelvic fin lepidotrichium |
The first row here tells us that the pectoral and pelvic fin skeletons have no parts in common.
Note this is a stronger axiom than simply saying the two structures are class-disjoint.
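The difference can be seen in a toy model. In the sketch below, which uses made-up individuals and ignores transitivity of part-of for brevity, classes X and Y have no instance in common, yet one individual is part of an instance of each. The model therefore satisfies X DisjointWith Y but violates (part-of some X) DisjointWith (part-of some Y).

```python
# Toy model: two structures that are distinct individuals but share a part.
INSTANCES = {
    "X": {"x1"},
    "Y": {"y1"},
}
# Direct part-of assertions as (part, whole) pairs; "p" is part of both.
PART_OF = {("p", "x1"), ("p", "y1")}


def parts_of_some(cls):
    """Individuals that are part of some instance of cls.

    Part-of is treated as reflexive here, so each instance counts as a part
    of itself.
    """
    wholes = INSTANCES[cls]
    return set(wholes) | {part for part, whole in PART_OF if whole in wholes}


class_disjoint = not (INSTANCES["X"] & INSTANCES["Y"])
part_disjoint = not (parts_of_some("X") & parts_of_some("Y"))
print(class_disjoint, part_disjoint)
# → True False
```

Because each instance is treated as part of itself, any model that satisfies the part-of form also satisfies plain class disjointness, but not the other way round.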