OAK taxon-constraints command
This notebook supplements the main OAK CLI docs.
It provides examples for the taxon-constraints command, which looks up direct and inferred taxon constraints for terms.
Background Material
We strongly recommend you first read the Taxon Constraints Explainer on the OBook.
Help Option
You can get help on any OAK command using --help.
[5]:
!runoak taxon-constraints --help
Usage: runoak taxon-constraints [OPTIONS] [TERMS]...

  Compute all taxon constraints for a term or terms.

  This will apply rules using the inferred ancestors of subject terms, as well
  as inferred ancestors/descendants of taxon terms.

  The input ontology MUST include both the taxon constraint relationships AND
  the relevant portion of NCBI Taxonomy

  Example:

      runoak -i db/go.db taxon-constraints GO:0034357 --include-redundant -p
      i,p

  Example:

      runoak -i sqlite:obo:uberon taxon-constraints UBERON:0003884
      UBERON:0003941 -p i,p

  This command is a wrapper onto taxon_constraints_utils:

  - https://incatools.github.io/ontology-access-kit/src/oaklib.utilities.taxon.taxon_constraints_utils

Options:
  -o, --output FILENAME           Output file, e.g. obo file
  -O, --output-type TEXT          Desired output type
  -p, --predicates TEXT           A comma-separated list of predicates
  -M, --graph-traversal-method [HOP|ENTAILMENT]
                                  Desired output type
  -A, --all / --no-A, --no-all    if specified then perform for all terms
                                  [default: no-A]
  --include-redundant / --no-include-redundant
                                  if specified then include redundant taxon
                                  constraints from ancestral subjects
                                  [default: no-include-redundant]
  --direct / --no-direct          only include directly asserted taxon
                                  constraints  [default: no-direct]
  --help                          Show this message and exit.
Set up an alias
For convenience we will set up an alias for use in this notebook, so that "go" stands for "runoak -i sqlite:obo:go".
[1]:
alias go runoak -i sqlite:obo:go
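The alias only works inside an IPython session. Outside a notebook, the same invocation can be assembled explicitly. The following is a minimal sketch, assuming only the flags listed in the help output above; the helper name build_taxon_constraints_command and its parameters are hypothetical and not part of OAK.

```python
import shlex


def build_taxon_constraints_command(
    terms,
    input_spec="sqlite:obo:go",
    direct=False,
    include_redundant=False,
    predicates=None,
):
    """Build the argv list for a `runoak taxon-constraints` call.

    Hypothetical helper for illustration; it only assembles arguments
    and does not run anything.
    """
    argv = ["runoak", "-i", input_spec, "taxon-constraints"]
    if direct:
        argv.append("--direct")
    if include_redundant:
        argv.append("--include-redundant")
    if predicates:
        # -p takes a comma-separated list of predicates, e.g. "i,p"
        argv += ["-p", ",".join(predicates)]
    argv.extend(terms)
    return argv


command = build_taxon_constraints_command(["nuclear membrane"])
# shlex.join quotes arguments that contain spaces, so the result is
# safe to paste into a shell.
print(shlex.join(command))
```

If runoak is installed, the resulting list can be passed to subprocess.run to execute the command.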
Taxon Constraints for nuclear membrane
[3]:
go taxon-constraints --no-include-redundant "nuclear membrane"
id: GO:0031965
label: nuclear membrane
description: 'Term GO:0031965 "nuclear membrane" is ONLY found in NCBITaxon:2759 "Eukaryota"
  (NOT asserted: original term = GO:0005634 "nucleus"); no additional constraints'
only_in:
- subject: GO:0005634
  predicate: RO:0002160
  asserted: false
  redundant: false
  taxon:
    id: NCBITaxon:2759
    label: Eukaryota
  via_terms:
  - id: GO:0005634
    label: nucleus
The YAML here conforms to the Taxon Constraints data model defined in OAK.
Here we can see that "nuclear membrane" is only applicable to eukaryotes.
Note the via_terms field: the constraint is not asserted on nuclear membrane itself (asserted: false) but was inferred via the nucleus term, since the nucleus is a eukaryote-specific feature.
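The inference described above can be illustrated with a toy example. This is a minimal sketch, not OAK's implementation: the PARENTS and ASSERTED_ONLY_IN dictionaries are hand-written assumptions (the real path from nuclear membrane to nucleus in GO is longer), and the function names are hypothetical.

```python
# Toy data only: a hand-written fragment, not loaded from GO or OAK.
# Each term maps to its direct parents (is_a or part_of), simplified.
PARENTS = {
    "GO:0031965": ["GO:0005634"],  # nuclear membrane -> nucleus
    "GO:0005634": [],              # nucleus (no parents in this toy graph)
}

# Directly asserted only_in_taxon (RO:0002160) constraints.
ASSERTED_ONLY_IN = {
    "GO:0005634": "NCBITaxon:2759",  # nucleus is only in Eukaryota
}


def ancestors(term):
    """Return the term itself plus everything reachable via PARENTS."""
    seen = []
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.append(current)
        stack.extend(PARENTS.get(current, []))
    return seen


def only_in_constraints(term):
    """Collect only_in constraints for a term, recording their origin."""
    results = []
    for anc in ancestors(term):
        taxon = ASSERTED_ONLY_IN.get(anc)
        if taxon is not None:
            results.append(
                {
                    "subject": anc,
                    "taxon": taxon,
                    # Asserted only if the constraint sits on the term itself.
                    "asserted": anc == term,
                    "via_terms": [anc],
                }
            )
    return results


inferred = only_in_constraints("GO:0031965")
print(inferred)
```

In this sketch the constraint on nuclear membrane carries asserted set to False and lists nucleus under via_terms, mirroring the shape of the YAML output.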
Direct Taxon Constraints
To show ONLY direct taxon constraints, use --direct:
[18]:
go taxon-constraints --direct GO:0005634
id: GO:0005634
label: nucleus
description: Term GO:0005634 "nucleus" is ONLY found in NCBITaxon:2759 "Eukaryota"
  (IS asserted); no additional constraints
only_in:
- subject: GO:0005634
  predicate: RO:0002160
  asserted: true
  redundant: false
  taxon:
    id: NCBITaxon:2759
    label: Eukaryota
  via_terms:
  - id: GO:0005634
    label: nucleus
Taxon Constraints from Other Ontologies
[7]:
go taxon-constraints --no-include-redundant ossification
id: GO:0001503
label: ossification
description: 'Term GO:0001503 "ossification" is ONLY found in NCBITaxon:7742 "Vertebrata
  <vertebrates>" (NOT asserted: original term = UBERON:0001474 "bone element"); is
  NEVER found in NCBITaxon:4896 "Schizosaccharomyces pombe" OR NCBITaxon:4932 "Saccharomyces
  cerevisiae" OR NCBITaxon:2157 "Archaea" OR NCBITaxon:2 "Bacteria" IS asserted'
only_in:
- subject: UBERON:0001474
  predicate: RO:0002160
  asserted: false
  redundant: false
  taxon:
    id: NCBITaxon:7742
    label: Vertebrata <vertebrates>
  via_terms:
  - id: UBERON:0001474
    label: bone element
never_in:
- subject: GO:0032501
  predicate: RO:0002161
  asserted: false
  redundant: false
  redundant_with_only_in: true
  taxon:
    id: NCBITaxon:4932
    label: Saccharomyces cerevisiae
  via_terms:
  - id: GO:0032501
    label: multicellular organismal process
- subject: GO:0032501
  predicate: RO:0002161
  asserted: false
  redundant: false
  redundant_with_only_in: true
  taxon:
    id: NCBITaxon:4896
    label: Schizosaccharomyces pombe
  via_terms:
  - id: GO:0032501
    label: multicellular organismal process
- subject: GO:0032501
  predicate: RO:0002161
  asserted: false
  redundant: false
  redundant_with_only_in: true
  taxon:
    id: NCBITaxon:2157
    label: Archaea
  via_terms:
  - id: GO:0032501
    label: multicellular organismal process
- subject: GO:0032501
  predicate: RO:0002161
  asserted: false
  redundant: false
  redundant_with_only_in: true
  taxon:
    id: NCBITaxon:2
    label: Bacteria
  via_terms:
  - id: GO:0032501
    label: multicellular organismal process
In this case we can see that the primary only_in constraint comes not from GO but from Uberon: it originates on UBERON:0001474 (bone element).
We can use the paths command to explore specific paths further:
go paths --directed --target NCBITaxon:7742 ossification
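Each never_in entry in the ossification output carries redundant_with_only_in: true. One plausible reading, stated here as an assumption rather than as OAK's documented logic, is that a never_in taxon adds no information when it lies entirely outside the only_in taxon. The sketch below checks that reading against a toy taxonomy; the TAXON_PARENT table skips most intermediate ranks, and the function names are hypothetical.

```python
# Toy taxonomy: child -> parent, with most intermediate ranks omitted.
TAXON_PARENT = {
    "NCBITaxon:40674": "NCBITaxon:7742",   # Mammalia -> Vertebrata
    "NCBITaxon:7742": "NCBITaxon:2759",    # Vertebrata -> Eukaryota
    "NCBITaxon:4932": "NCBITaxon:2759",    # S. cerevisiae -> Eukaryota
    "NCBITaxon:4896": "NCBITaxon:2759",    # S. pombe -> Eukaryota
    "NCBITaxon:2759": "NCBITaxon:131567",  # Eukaryota -> cellular organisms
    "NCBITaxon:2157": "NCBITaxon:131567",  # Archaea -> cellular organisms
    "NCBITaxon:2": "NCBITaxon:131567",     # Bacteria -> cellular organisms
}


def lineage(taxon):
    """Return the taxon followed by all of its ancestors, nearest first."""
    chain = [taxon]
    while chain[-1] in TAXON_PARENT:
        chain.append(TAXON_PARENT[chain[-1]])
    return chain


def is_within(taxon, clade):
    """True if taxon equals clade or descends from it."""
    return clade in lineage(taxon)


def never_in_is_redundant(never_taxon, only_taxon):
    """Assumed rule: never_in is redundant if it is disjoint from only_in.

    In a tree, two clades share members only if one contains the other.
    """
    overlaps = is_within(never_taxon, only_taxon) or is_within(only_taxon, never_taxon)
    return not overlaps


ONLY_IN = "NCBITaxon:7742"  # Vertebrata, from the only_in block
NEVER_IN = ["NCBITaxon:4932", "NCBITaxon:4896", "NCBITaxon:2157", "NCBITaxon:2"]

flags = {taxon: never_in_is_redundant(taxon, ONLY_IN) for taxon in NEVER_IN}
print(flags)
```

Under this assumption all four never_in taxa are flagged as redundant, while a hypothetical never_in constraint on a taxon inside Vertebrata (such as Mammalia) would not be.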
Evaluating Candidate Taxon Constraints
The related apply-taxon-constraints command can be used to test candidate taxon constraints before adding them.
[13]:
go apply-taxon-constraints GO:0031965 only NCBITaxon:2759
id: GO:0031965
label: nuclear membrane
only_in:
- subject: GO:0031965
  predicate: RO:0002160
  redundant: true
  taxon:
    id: NCBITaxon:2759
    label: Eukaryota
  redundant_with:
  - subject: GO:0005634
    predicate: RO:0002160
    asserted: false
    redundant: false
    taxon:
      id: NCBITaxon:2759
      label: Eukaryota
    via_terms:
    - id: GO:0005634
      label: nucleus
  comments:
  - Redundant with pre-existing constraint GO:0005634
    // Taxon(id='NCBITaxon:2759', label='Eukaryota')
This tells us that the addition is valid but redundant: the same constraint is already inferred from GO:0005634 (nucleus).
[16]:
go apply-taxon-constraints GO:0031965 only NCBITaxon:2
id: GO:0031965
label: nuclear membrane
description: 'Unsatisfiable taxon constraints: NCBITaxon:2759 and NCBITaxon:2 are
  disjoint'
unsatisfiable: true
only_in:
- subject: GO:0031965
  predicate: RO:0002160
  taxon:
    id: NCBITaxon:2
  candidate: true
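Here the candidate is rejected: nuclear membrane is already restricted to Eukaryota, and Bacteria is disjoint from Eukaryota, so the combined constraints are unsatisfiable. The two outcomes shown in this section (redundant and unsatisfiable) can be sketched with a toy classifier. This is an illustration under stated assumptions, not OAK's algorithm: the taxonomy table is hand-written and the function evaluate_only_in and its return labels are hypothetical.

```python
# Toy taxonomy: child -> parent, with most intermediate ranks omitted.
TAXON_PARENT = {
    "NCBITaxon:7742": "NCBITaxon:2759",    # Vertebrata -> Eukaryota
    "NCBITaxon:2759": "NCBITaxon:131567",  # Eukaryota -> cellular organisms
    "NCBITaxon:2": "NCBITaxon:131567",     # Bacteria -> cellular organisms
}


def lineage(taxon):
    """Return the taxon followed by all of its ancestors, nearest first."""
    chain = [taxon]
    while chain[-1] in TAXON_PARENT:
        chain.append(TAXON_PARENT[chain[-1]])
    return chain


def evaluate_only_in(candidate, existing):
    """Classify a candidate only_in taxon against an existing only_in taxon."""
    if candidate != existing and existing in lineage(candidate):
        # Candidate is strictly narrower than what is already known.
        return "informative"
    if candidate in lineage(existing):
        # Existing constraint is the same as, or narrower than, the candidate.
        return "redundant"
    # Neither taxon contains the other, so no organism satisfies both.
    return "unsatisfiable"


EXISTING = "NCBITaxon:2759"  # nuclear membrane is only in Eukaryota

results = {
    candidate: evaluate_only_in(candidate, EXISTING)
    for candidate in ["NCBITaxon:2759", "NCBITaxon:2", "NCBITaxon:7742"]
}
print(results)
```

The first two candidates reproduce the outcomes of the two apply-taxon-constraints calls; the third (Vertebrata) is an extra hypothetical case showing a candidate that would narrow the existing constraint.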