OAK taxon-constraints command

This notebook is intended as a supplement to the main OAK CLI docs.

This notebook provides examples for the taxon-constraints command, which can be used to look up direct and inferred taxon constraints for terms.

Background Material

We strongly recommend you first read the Taxon Constraints Explainer on the OBook.

Help Option

You can get help on any OAK command using the --help option.

[5]:
!runoak taxon-constraints --help
Usage: runoak taxon-constraints [OPTIONS] [TERMS]...

  Compute all taxon constraints for a term or terms.

  This will apply rules using the inferred ancestors of subject terms, as well
  as inferred ancestors/descendants of taxon terms.

  The input ontology MUST include both the taxon constraint relationships AND
  the relevant portion of NCBI Taxonomy

  Example:

      runoak -i db/go.db taxon-constraints GO:0034357 --include-redundant -p
      i,p

  Example:

      runoak -i sqlite:obo:uberon taxon-constraints UBERON:0003884
      UBERON:0003941 -p i,p

  This command is a wrapper onto taxon_constraints_utils:

  - https://incatools.github.io/ontology-access-
  kit/src/oaklib.utilities.taxon.taxon_constraints_utils

Options:
  -o, --output FILENAME           Output file, e.g. obo file
  -O, --output-type TEXT          Desired output type
  -p, --predicates TEXT           A comma-separated list of predicates
  -M, --graph-traversal-method [HOP|ENTAILMENT]
                                  Desired output type
  -A, --all / --no-A, --no-all    if specified then perform for all terms
                                  [default: no-A]
  --include-redundant / --no-include-redundant
                                  if specified then include redundant taxon
                                  constraints from ancestral subjects
                                  [default: no-include-redundant]
  --direct / --no-direct          only include directly asserted taxon
                                  constraints  [default: no-direct]
  --help                          Show this message and exit.

Set up an alias

For convenience, we will set up an alias for use in this notebook:

[1]:
alias go runoak -i sqlite:obo:go

Taxon Constraints for nuclear membrane

[3]:
go taxon-constraints --no-include-redundant "nuclear membrane"
id: GO:0031965
label: nuclear membrane
description: 'Term GO:0031965 "nuclear membrane" is ONLY found in NCBITaxon:2759 "Eukaryota"
  (NOT asserted: original term = GO:0005634 "nucleus"); no additional constraints'
only_in:
- subject: GO:0005634
  predicate: RO:0002160
  asserted: false
  redundant: false
  taxon:
    id: NCBITaxon:2759
    label: Eukaryota
  via_terms:
  - id: GO:0005634
    label: nucleus

The YAML here conforms to the Taxon Constraints data model defined in OAK.

Here we can see that “nuclear membrane” is only applicable for eukaryotes.

Note the via_terms field: it shows that the constraint was inferred via the nucleus term (the nucleus is a eukaryote-specific feature), which is why asserted is false for nuclear membrane itself.
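The inference described here can be sketched in a few lines of Python. This is a toy illustration, not OAK's implementation: the `PARENTS` edges are hand-written and simplified, and the names `PARENTS`, `ASSERTED_ONLY_IN`, `ancestors`, and `inferred_only_in` are hypothetical. It only shows the idea that a term inherits an only_in constraint from any ancestor that carries one.

```python
# Toy data only: hand-written and simplified, not loaded from GO.
# Each term maps to its direct parents (is_a or part_of, not distinguished here).
PARENTS = {
    "GO:0031965": ["GO:0005635"],  # nuclear membrane -> nuclear envelope
    "GO:0005635": ["GO:0005634"],  # nuclear envelope -> nucleus
    "GO:0005634": [],              # nucleus
}

# Directly asserted only_in constraints (term -> taxon).
ASSERTED_ONLY_IN = {"GO:0005634": "NCBITaxon:2759"}  # nucleus only in Eukaryota


def ancestors(term):
    """Return the term itself plus every term reachable through PARENTS."""
    seen = []
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(PARENTS.get(current, []))
    return seen


def inferred_only_in(term):
    """Collect only_in constraints from the term and its ancestors."""
    results = []
    for anc in ancestors(term):
        taxon = ASSERTED_ONLY_IN.get(anc)
        if taxon is not None:
            # asserted is True only when the constraint sits on the query term.
            results.append({"subject": anc, "taxon": taxon, "asserted": anc == term})
    return results


print(inferred_only_in("GO:0031965"))
```

In this sketch the constraint for nuclear membrane is reported with subject GO:0005634 and asserted set to False, mirroring the shape of the output above.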

Direct Taxon Constraints

To show ONLY direct taxon constraints, use --direct:

[18]:
go taxon-constraints --direct GO:0005634
id: GO:0005634
label: nucleus
description: Term GO:0005634 "nucleus" is ONLY found in NCBITaxon:2759 "Eukaryota"
  (IS asserted); no additional constraints
only_in:
- subject: GO:0005634
  predicate: RO:0002160
  asserted: true
  redundant: false
  taxon:
    id: NCBITaxon:2759
    label: Eukaryota
  via_terms:
  - id: GO:0005634
    label: nucleus

Taxon Constraints from Other Ontologies

[7]:
go taxon-constraints --no-include-redundant ossification
id: GO:0001503
label: ossification
description: 'Term GO:0001503 "ossification" is ONLY found in NCBITaxon:7742 "Vertebrata
  <vertebrates>" (NOT asserted: original term = UBERON:0001474 "bone element"); is
  NEVER found in NCBITaxon:4896 "Schizosaccharomyces pombe" OR NCBITaxon:4932 "Saccharomyces
  cerevisiae" OR NCBITaxon:2157 "Archaea" OR NCBITaxon:2 "Bacteria" IS asserted'
only_in:
- subject: UBERON:0001474
  predicate: RO:0002160
  asserted: false
  redundant: false
  taxon:
    id: NCBITaxon:7742
    label: Vertebrata <vertebrates>
  via_terms:
  - id: UBERON:0001474
    label: bone element
never_in:
- subject: GO:0032501
  predicate: RO:0002161
  asserted: false
  redundant: false
  redundant_with_only_in: true
  taxon:
    id: NCBITaxon:4932
    label: Saccharomyces cerevisiae
  via_terms:
  - id: GO:0032501
    label: multicellular organismal process
- subject: GO:0032501
  predicate: RO:0002161
  asserted: false
  redundant: false
  redundant_with_only_in: true
  taxon:
    id: NCBITaxon:4896
    label: Schizosaccharomyces pombe
  via_terms:
  - id: GO:0032501
    label: multicellular organismal process
- subject: GO:0032501
  predicate: RO:0002161
  asserted: false
  redundant: false
  redundant_with_only_in: true
  taxon:
    id: NCBITaxon:2157
    label: Archaea
  via_terms:
  - id: GO:0032501
    label: multicellular organismal process
- subject: GO:0032501
  predicate: RO:0002161
  asserted: false
  redundant: false
  redundant_with_only_in: true
  taxon:
    id: NCBITaxon:2
    label: Bacteria
  via_terms:
  - id: GO:0032501
    label: multicellular organismal process

In this case we can see that the primary only_in constraint comes not from GO but from Uberon.
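One quick way to see where each constraint originates is to group the subject identifiers by CURIE prefix. The sketch below is only an illustration: the subject IDs are copied by hand from the ossification output above, and the helper name `prefix` is hypothetical.

```python
from collections import Counter

# Subject CURIEs copied by hand from the ossification output above.
subjects = {
    "only_in": ["UBERON:0001474"],
    "never_in": ["GO:0032501", "GO:0032501", "GO:0032501", "GO:0032501"],
}


def prefix(curie):
    """Return the ontology prefix of a CURIE, e.g. 'GO' for 'GO:0032501'."""
    return curie.split(":", 1)[0]


# Count how many constraints of each kind come from each ontology.
sources = {kind: Counter(prefix(s) for s in ids) for kind, ids in subjects.items()}
print(sources)
```

Here the single only_in constraint has an Uberon subject, while all four never_in constraints have a GO subject.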

We can use the paths command to explore specific paths further:

go paths --directed --target NCBITaxon:7742 ossification

Evaluating Candidate Taxon Constraints

The related apply-taxon-constraints command can be used to test candidate taxon constraints before adding them to a term:

[13]:
go apply-taxon-constraints GO:0031965 only NCBITaxon:2759
id: GO:0031965
label: nuclear membrane
only_in:
- subject: GO:0031965
  predicate: RO:0002160
  redundant: true
  taxon:
    id: NCBITaxon:2759
    label: Eukaryota
  redundant_with:
  - subject: GO:0005634
    predicate: RO:0002160
    asserted: false
    redundant: false
    taxon:
      id: NCBITaxon:2759
      label: Eukaryota
    via_terms:
    - id: GO:0005634
      label: nucleus
  comments:
  - Redundant with pre-existing constraint                                 GO:0005634
    // Taxon(id='NCBITaxon:2759', label='Eukaryota')

This tells us that the addition is valid but redundant: the same constraint is already inferred from nucleus.

[16]:
go apply-taxon-constraints GO:0031965 only NCBITaxon:2
id: GO:0031965
label: nuclear membrane
description: 'Unsatisfiable taxon constraints: NCBITaxon:2759 and NCBITaxon:2 are
  disjoint'
unsatisfiable: true
only_in:
- subject: GO:0031965
  predicate: RO:0002160
  taxon:
    id: NCBITaxon:2
  candidate: true
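In this second case the candidate is rejected: nuclear membrane is already restricted to Eukaryota, and Eukaryota and Bacteria are disjoint, so no taxon could satisfy both constraints. The two outcomes (redundant and unsatisfiable) can be sketched with a toy taxonomy. This is a simplified illustration under the assumption that the taxonomy is a tree, so that two taxa on separate branches are disjoint. It is not how OAK implements the check, and `TAXON_PARENT`, `lineage`, and `evaluate_only_in` are hypothetical names.

```python
# Toy taxonomy (child -> parent), hand-written for illustration only.
TAXON_PARENT = {
    "NCBITaxon:2759": "NCBITaxon:131567",  # Eukaryota -> cellular organisms
    "NCBITaxon:2": "NCBITaxon:131567",     # Bacteria -> cellular organisms
    "NCBITaxon:7742": "NCBITaxon:2759",    # Vertebrata -> Eukaryota (ranks omitted)
}


def lineage(taxon):
    """Return the taxon followed by its ancestors up to the root."""
    result = [taxon]
    while taxon in TAXON_PARENT:
        taxon = TAXON_PARENT[taxon]
        result.append(taxon)
    return result


def evaluate_only_in(existing, candidate):
    """Classify a candidate only_in taxon against an existing only_in taxon."""
    if candidate in lineage(existing):
        return "redundant"      # the existing constraint is at least as specific
    if existing in lineage(candidate):
        return "more specific"  # the candidate narrows the existing constraint
    return "unsatisfiable"      # separate branches of a tree are disjoint


# Existing inferred constraint for nuclear membrane: only in Eukaryota.
print(evaluate_only_in("NCBITaxon:2759", "NCBITaxon:2759"))
print(evaluate_only_in("NCBITaxon:2759", "NCBITaxon:2"))
```

The first call corresponds to the Eukaryota candidate and the second to the Bacteria candidate shown above.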