Command Line

Preamble

Warning

Currently there is a single main command, runoak, which has a growing number of subcommands. In the future we may decide to split the subcommands into separate commands.

Note

We follow the CLIG guidelines as far as possible.

General Guidelines

Note

If you are running this as an internal OAK developer, you need to run poetry shell before the command.

The general structure is:

runoak -i <INPUT SPECIFICATION> SUBCOMMAND <SUBCOMMAND OPTIONS AND ARGUMENTS>

The value for --input (which can be shortened to -i) is specified in the Ontology Implementation Selectors documentation.

You can specify further implementations with -a, which creates an aggregator implementation that wraps multiple implementations. For example, you can multiplex queries over different endpoints.

Common Patterns

Term Lists

Many commands take a term or a list of terms as their primary argument. These are typically one of:

  • a CURIE such as UBERON:0000955

  • a Search Syntax term, which is either:

    • an exact match to a label; for example “limb” or “plasma membrane”

    • a compound search term such as t~limb which finds terms with partial matches to limb

Search terms are expanded to matching CURIEs, which are then fed into the command.

Be warned that a search term can match many CURIEs, which can make some commands “explode”.
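The expansion step can be pictured with a small sketch. It is not how OAK itself resolves terms: the label table, the EX: identifiers, and the helper name expand_term are all made up for illustration, and the only compound form handled is t~, which is assumed here to mean a case-insensitive partial match on labels.

```python
import re

# Toy label table standing in for an ontology (EX: CURIEs are invented).
LABELS = {
    "EX:0000001": "limb",
    "EX:0000002": "forelimb",
    "EX:0000003": "plasma membrane",
}

# Rough CURIE shape: a prefix, a colon, then a local ID with no spaces.
CURIE_PATTERN = re.compile(r"^[A-Za-z][\w.]*:\S+$")


def expand_term(term: str) -> list[str]:
    """Expand one command-line term into a list of matching CURIEs."""
    if term.startswith("t~"):
        # Compound search term: partial, case-insensitive match (assumption).
        needle = term[2:].lower()
        return [c for c, label in LABELS.items() if needle in label.lower()]
    if CURIE_PATTERN.match(term):
        # Already a CURIE: pass it through unchanged.
        return [term]
    # Otherwise treat the term as an exact label match.
    return [c for c, label in LABELS.items() if label == term]
```

In this toy table, t~limb matches two entries while the exact label limb matches one, which is the sense in which search terms can widen the input to a command.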

Predicates

Many commands take a --predicates option (shortened to -p). This specifies a list of predicates (aka relationship types, see Predicates) to be used in filtering. The list is specified as a comma-delimited list (no spaces) of CURIEs.

For many biological ontologies, it can be useful to filter on is_a (rdfs:subClassOf) and part_of (BFO:0000050) so the command line interface understands shortcuts for these:

  • i: is-a (i.e. rdfs:subClassOf between two named classes)

  • p: part-of

However, this library is not restricted to biological ontologies, and in future we may allow customizable shortcuts.
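As a minimal sketch of how a --predicates value might be interpreted, the function below splits the comma-delimited list and expands the two shortcuts described above. The function name parse_predicates is hypothetical and this is not the library's own parser; anything that is not a known shortcut is assumed to be a CURIE already.

```python
# Shortcuts named in the text above; everything else is passed through as-is.
PREDICATE_SHORTCUTS = {
    "i": "rdfs:subClassOf",
    "p": "BFO:0000050",
}


def parse_predicates(value: str) -> list[str]:
    """Split a comma-delimited --predicates value and expand known shortcuts."""
    return [PREDICATE_SHORTCUTS.get(item, item) for item in value.split(",")]
```

With this sketch, "i,p" and "i,BFO:0000050" both expand to the same pair of CURIEs.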

Command Line Docs

runoak

Run the oaklib Command Line.

A subcommand must be passed - for example: ancestors, terms, …

Most commands require an input ontology to be specified:

runoak -i <INPUT SPECIFICATION> SUBCOMMAND <SUBCOMMAND OPTIONS AND ARGUMENTS>

Get help on any command, e.g.:

runoak viz -h

runoak [OPTIONS] COMMAND [ARGS]...

Options

-v, --verbose
-q, --quiet <quiet>
--save-as <save_as>

For commands that mutate the ontology, this specifies where changes are saved to

--autosave, --no-autosave

For commands that mutate the ontology, this determines if these are automatically saved in place

--named-prefix-map <named_prefix_map>

the name of a prefix map, e.g. obo, prefixcc

--prefix <prefix>

prefix=expansion pair

--import-depth <import_depth>

Maximum depth in the import tree to traverse. Currently this is only used by the pronto adapter

-i, --input <input>

input implementation specification. This is either a path to a file, or an ontology selector

-I, --input-type <input_type>

Input format. Permissible values vary depending on the context

-a, --add <add>

additional implementation specification.

aliases

List aliases for a term or set of terms

Example:

runoak -i ubergraph:uberon aliases UBERON:0001988

TERMS should be either an explicit list of terms or queries, or can be a selector query, such as ‘.all’ to fetch all terms in the ontology

Show all aliases:

runoak -i db/envo.db aliases .all

Currently the core behavior of this command assumes a simple datamodel for aliases, where an alias is a simple property-value tuple, with the property being from some standard vocabulary (e.g. skos:altLabel, oboInOwl, etc.)

If you know the synonyms follow the OBO/oboInOwl datamodel, you can pass --obo-model. This will give back richer data if present in the ontology, including synonym categories/types and synonym provenance.

In future, this may become the default

runoak aliases [OPTIONS] [TERMS]...

Options

--obo-model, --no-obo-model

If true, assume the OBO synonym datamodel

-o, --output <output>

Output file, e.g. obo file

Arguments

TERMS

Optional argument(s)

ancestors

List all ancestors of a given term or terms.

Here ancestor means the transitive closure of the parent relationship, where a parent includes all relationship types, not just is-a.

Example:

runoak -i cl.owl ancestors CL:4023094

This will show ancestry over the full relationship graph. Like any relational OAK command, this can be filtered by relationship type (predicate), using --predicates (-p). For example, constrained to is-a and part-of:

runoak -i cl.owl ancestors CL:4023094 -p i,BFO:0000050

Multiple backends can be used, including ubergraph:

runoak -i ubergraph: ancestors CL:4023094 -p i,BFO:0000050

Search terms can also be used:

runoak -i cl.owl ancestors 'goblet cell'

Multiple terms can be passed:

runoak -i sqlite:go.db ancestors GO:0005773 GO:0005737 -p i,p
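The idea of ancestors as a transitive closure, optionally restricted by predicate, can be sketched over a toy edge list. The EX: identifiers and the function below are illustrative assumptions, not OAK's implementation; in particular this sketch does not include the start terms in the result.

```python
# Toy edge list of (subject, predicate, object); identifiers are invented.
EDGES = [
    ("EX:nucleus", "rdfs:subClassOf", "EX:organelle"),
    ("EX:nucleus", "BFO:0000050", "EX:cell"),
    ("EX:organelle", "rdfs:subClassOf", "EX:cellular_component"),
    ("EX:cell", "EX:related_to", "EX:organism"),
]


def ancestors(starts, predicates=None):
    """Return every node reachable from `starts` by following parent edges.

    If `predicates` is given, only edges with those predicates are followed.
    """
    seen = set()
    stack = list(starts)
    while stack:
        node = stack.pop()
        for subj, pred, obj in EDGES:
            if subj != node:
                continue
            if predicates is not None and pred not in predicates:
                continue
            if obj not in seen:
                seen.add(obj)
                stack.append(obj)
    return seen
```

Without a predicate filter the walk reaches EX:organism through the EX:related_to edge; restricting to is-a and part-of drops it.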

runoak ancestors [OPTIONS] [TERMS]...

Options

-p, --predicates <predicates>

A comma-separated list of predicates

-O, --output-type <output_type>

Desired output type

--statistics, --no-statistics

For each ancestor, show statistics.

Default

False

-o, --output <output>

Output file, e.g. obo file

Arguments

TERMS

Optional argument(s)

annotate

Annotate a piece of text using Named Entity Recognition

Example:

runoak -i bioportal: annotate "enlarged nucleus in T-cells from peripheral blood"

Currently BioPortal is the only implementation. Volunteers sought to implement for OLS.

See the ontorunner framework for plugins for SciSpacy and OGER

runoak annotate [OPTIONS] [WORDS]...

Options

-W, --matches-whole-text, --no-W, --no-matches-whole-text

if true, then only show matches that span the entire input text

Default

False

--text-file <text_file>

Text file to annotate. Each newline separated entry is a distinct text.

-o, --output <output>

Output file, e.g. obo file

-O, --output-type <output_type>

Desired output type

Arguments

WORDS

Optional argument(s)

apply

Applies a patch to an ontology. The patch should be specified using KGCL syntax, see https://github.com/INCATools/kgcl

Example:

runoak -i cl.owl.ttl apply "rename CL:0000561 to 'amacrine neuron'" -o cl.owl.ttl -O ttl

On an OBO format file:

runoak -i simpleobo:go-edit.obo apply "rename GO:0005634 from 'nucleus' to 'foo'" -o go-edit-new.obo

With URIs:

runoak -i cl.owl.ttl apply "rename <http://purl.obolibrary.org/obo/CL_0000561> from 'amacrine cell' to 'amacrine neuron'" -o cl.owl.ttl -O ttl

WARNING:

This command is still experimental. Some things to bear in mind:

  • for some ontologies, CURIEs may not work, instead specify a full URI surrounded by <>s

  • only a subset of KGCL commands are supported by each backend

runoak apply [OPTIONS] [COMMANDS]...

Options

-o, --output <output>
--changes-input <changes_input>

Path to an input changes file

--changes-format <changes_format>

Format of the changes file (json or kgcl)

--parse-only, --no-parse-only

if true, only perform the parse of KGCL and do not apply

Default

False

-O, --output-type <output_type>

Desired output type

--overwrite, --no-overwrite

If set, any changes applied will be saved back to the input file/source

Arguments

COMMANDS

Optional argument(s)

apply-obsolete

Sets an ontology element to be obsolete

Example:

runoak -i my.obo apply-obsolete MY:0002200 -o my-modified.obo

Multiple terms can be passed, as labels, IDs, or using OAK queries:

runoak -i my.obo apply-obsolete MY:1 MY:2 MY:3 … -o my-modified.obo

This may be chained, for example to take all terms matching a search query and then obsolete them all:

runoak -i my.db search 'l/^Foo/' | runoak -i my.db --autosave apply-obsolete -

This command is partially redundant with the more general “apply” command

runoak apply-obsolete [OPTIONS] [TERMS]...

Options

-o, --output <output>
-O, --output-type <output_type>

Desired output type

Arguments

TERMS

Optional argument(s)

axioms

Filters axioms

Example:

runoak -i cl.ofn axioms

The above will write all axioms.

You can filter by axiom type:

Example:

runoak -i cl.ofn axioms --axiom-type SubClassOf

Note this currently only works with the funowl adapter, on functional syntax files

runoak axioms [OPTIONS] [TERMS]...

Options

-o, --output <output>

Output file, e.g. obo file

-O, --output-type <output_type>

Desired output type

--axiom-type <axiom_type>

Type of axiom, e.g. SubClassOf

--about <about>

CURIE that the axiom is about

--references <references>

CURIEs that the axiom references

Arguments

TERMS

Optional argument(s)

definitions

Show textual definitions for term or set of terms

Example:

runoak -i sqlite:obo:envo definitions 'tropical biome' 'temperate biome'

You can use the “.all” selector to show all definitions for all terms in the ontology:

Example:

runoak -i sqlite:obo:envo definitions .all

runoak definitions [OPTIONS] [TERMS]...

Options

-o, --output <output>

Output file, e.g. obo file

-D, --display <display>

A comma-separated list of display options. Use ‘all’ for all

-O, --output-type <output_type>

Desired output type

Options

obo | obojson | ofn | rdf | json | yaml | csv | nl

--if-absent <if_absent>

determines behavior when the value is not present or is empty.

Options

absent-only | present-only

-S, --set-value <set_value>

the value to set for all terms for the given property.

Arguments

TERMS

Optional argument(s)

descendants

List all descendants of a term

Example:

runoak -i sqlite:obo:obi descendants assay -p i

Example:

runoak -i sqlite:obo:uberon descendants heart -p i,p

This is the inverse of the ‘ancestors’ command; see the documentation for that command. Note, however, that ‘descendants’ has the potential to be more “explosive” than ‘ancestors’, especially for high-level terms and when predicates are not specified.

runoak descendants [OPTIONS] [TERMS]...

Options

-p, --predicates <predicates>

A comma-separated list of predicates

-D, --display <display>

A comma-separated list of display options. Use ‘all’ for all

-O, --output-type <output_type>

Desired output type

-o, --output <output>

Output file, e.g. obo file

Arguments

TERMS

Optional argument(s)

diff

Diff between two ontologies

Produces a list of Changes that are required to go from the main input ontology to the other ontology

The --simple option will compare the lists of terms in each ontology. This is currently implemented for most endpoints.

If --simple is not set, then this will do a complete diff, and return the diff as KGCL change commands.
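A term-list comparison of this kind can be sketched with plain set operations. The function name and the shape of the result are assumptions made for illustration; the actual command's output format is not described here.

```python
def simple_diff(main_terms, other_terms):
    """Compare two collections of term IDs and report what differs."""
    main, other = set(main_terms), set(other_terms)
    return {
        "only_in_main": sorted(main - other),
        "only_in_other": sorted(other - main),
    }
```

A complete diff has to compare labels, definitions, relationships, and so on, which is why it is more expensive than this kind of ID-level comparison.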

Current limitations

  • complete diffs can only be done using local RDF files

  • Parsing using rdflib can be slow

  • Currently the return format is ONLY the KGCL change DSL. In future YAML, JSON, RDF will be an option

runoak diff [OPTIONS]

Options

-X, --other-ontology <other_ontology>

other ontology

--simple, --no-simple

perform a quick difference showing only terms that differ

Default

False

-o, --output <output>

Output file, e.g. obo file

-O, --output-type <output_type>

Desired output type

diff-terms

Compares a pair of terms in two ontologies

EXPERIMENTAL

runoak diff-terms [OPTIONS] [TERMS]...

Options

--other-ontology <other_ontology>

other ontology

-o, --output <output>

Output file, e.g. obo file

Arguments

TERMS

Optional argument(s)

diff-via-mappings

Calculates cross-ontology diff using mappings

Given a pair of ontologies, and mappings that connect terms in both ontologies, this command will perform a structural comparison of all mapped pairs of terms

Example:

runoak -i sqlite:obo:uberon diff-via-mappings --other-input sqlite:obo:zfa --source UBERON --source ZFA -O csv

Note the above command does not have any mapping file specified; the mappings that are distributed within each ontology are used (in this case, Uberon contains mappings to ZFA)

If the mappings are provided externally:

runoak -i ont1.obo diff-via-mappings --other-input ont2.obo --mapping-input mappings.sssom.tsv

(in the above example, --source is not passed, so all mappings are tested)

If there are no existing mappings, you can use the lexmatch command to generate them:

runoak -i ont1.obo -a ont2.obo lexmatch -o mappings.sssom.tsv
runoak -i ont1.obo diff-via-mappings --other-input ont2.obo --mapping-input mappings.sssom.tsv

The output from this command follows the cross-ontology-diff data model (https://incatools.github.io/ontology-access-kit/datamodels/cross-ontology-diff/index.html)

This can be serialized in YAML or TSV form

runoak diff-via-mappings [OPTIONS] [TERMS]...

Options

-S, --source <source>

ontology prefixes e.g. HP, MP

--mapping-input <mapping_input>

File of mappings in SSSOM format. If not provided then mappings in ontology(ies) are used

-X, --other-input <other_input>

Additional input file

--other-input-type <other_input_type>

Type of additional input file

--intra, --no-intra

If true, then all sources are in the main input ontology

Default

False

--autolabel, --no-autolabel

If set, results will automatically have labels assigned

Default

True

--include-identity-mappings, --no-include-identity-mappings

Use identity relation as mapping; use this for two versions of the same ontology

Default

False

--filter-category-identical, --no-filter-category-identical

Do not report cases where a relationship has not changed

Default

False

-p, --predicates <predicates>

A comma-separated list of predicates

-o, --output <output>

Output file, e.g. obo file

-O, --output-type <output_type>

Desired output type

Arguments

TERMS

Optional argument(s)

dump

Exports (dumps) the entire contents of an ontology

Example:

runoak -i pato.obo dump -o pato.json -O json

Example:

runoak -i pato.owl dump -o pato.ttl -O turtle

Currently each implementation only supports a subset of formats.

The dump command is also blocked for remote endpoints such as Ubergraph, to avoid killer queries

runoak dump [OPTIONS] [TERMS]...

Options

-o, --output <output>
-O, --output-type <output_type>

Desired output type

Arguments

TERMS

Optional argument(s)

eval-taxon-constraints

Test candidate taxon constraints

Multiple candidate constraints can be passed as arguments. These are in the form of triples separated by periods.

Example:

runoak -i db/go.db eval-taxon-constraints -p i,p GO:0005743 only NCBITaxon:2759 never NCBITaxon:2 . GO:0005634 only NCBITaxon:2
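To make the argument format concrete, here is a sketch of a parser for the tokens in the example above. It assumes that each period-delimited group is a subject followed by one or more predicate/taxon pairs, which is an interpretation of the example rather than a documented grammar; the function name parse_constraints is hypothetical.

```python
def parse_constraints(args):
    """Parse CLI-style tokens into (subject, predicate, taxon) triples.

    Assumes each period-delimited group is a subject followed by one or
    more predicate/taxon pairs.
    """
    triples = []
    group = []
    # A trailing "." sentinel flushes the final group.
    for token in list(args) + ["."]:
        if token != ".":
            group.append(token)
            continue
        if not group:
            continue
        subject, rest = group[0], group[1:]
        if not rest or len(rest) % 2 != 0:
            raise ValueError(f"Malformed constraint group: {group}")
        for pred, taxon in zip(rest[0::2], rest[1::2]):
            triples.append((subject, pred, taxon))
        group = []
    return triples
```

Applied to the tokens of the example command, this yields two triples for GO:0005743 and one for GO:0005634.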

The --evolution-file (-E) option can be used to pass in a file of candidates. This should follow the format used in https://arxiv.org/abs/1802.06004

E.g.

GO:0000229,Gain|NCBITaxon:1(root);>Loss|NCBITaxon:2759(Eukaryota);

Example:

runoak -i db/go.db eval-taxon-constraints -p i,p -E tests/input/go-evo-gains-losses.csv

runoak eval-taxon-constraints [OPTIONS] [CONSTRAINTS]...

Options

-E, --evolution-file <evolution_file>

path to file containing gains and losses

-o, --output <output>

Output file, e.g. obo file

-p, --predicates <predicates>

A comma-separated list of predicates

Arguments

CONSTRAINTS

Optional argument(s)

expand-subsets

For each subset provide a mapping of each term in the ontology to a subset

Example:

runoak -i db/pato.db expand-subsets attribute_slim value_slim

runoak expand-subsets [OPTIONS] [SUBSETS]...

Options

-o, --output <output>

Output file, e.g. obo file

-p, --predicates <predicates>

A comma-separated list of predicates

Arguments

SUBSETS

Optional argument(s)

extract-triples

Extracts a subontology as triples

Currently the only endpoint to implement this is ubergraph. Ontobee seems to have performance issues with the query

This will soon be supported in the SqlDatabase/Sqlite endpoint

Example:

runoak -v -i ubergraph: extract-triples GO:0005635 CL:0000099 -o test.ttl -O ttl

runoak extract-triples [OPTIONS] [TERMS]...

Options

-p, --predicates <predicates>

A comma-separated list of predicates

-o, --output <output>

Output file, e.g. obo file

-O, --output-type <output_type>

Desired output type

Arguments

TERMS

Optional argument(s)

fill-table

Fills missing values in a table of ontology elements

See https://incatools.github.io/ontology-access-kit/src/oaklib.utilities.table_filler

Given a TSV with a populated ID column, and unpopulated columns for definition, label, mappings, ancestors, this will iterate through each row filling in each missing value by performing ontology lookups.

In some cases, this can also perform reverse lookups; i.e. given a table with labels populated and blank IDs, it will fill in the IDs

In the most basic scenario, you have a table with two columns ‘id’ and ‘label’. These are the “conventional” column headers for a table of ontology elements (see later for configuration when you don’t follow conventions)

Example:

runoak -i cl.owl.ttl fill-table my-table.tsv

(any implementation can be used)

The same command will work for the reverse scenario - when you have labels populated, but IDs are not populated

By default this will throw an error if a lookup is not successful; this can be relaxed

Relaxed:

runoak -i cl.owl.ttl fill-table --allow-missing my-table.tsv

In this case missing values that cannot be populated will remain empty

To explicitly populate a value:

runoak -i cl.owl.ttl fill-table --missing-value-token NO_DATA my-table.tsv
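The three behaviours described above (strict, relaxed, and explicit token) can be sketched for the basic id/label case. This is a toy under stated assumptions: the lookup table, the EX: identifiers, and the function fill_labels are invented, and the real command performs ontology lookups rather than dictionary lookups.

```python
import csv
import io

# Toy lookup standing in for an ontology; identifiers are invented.
LABELS = {"EX:1": "neuron", "EX:2": "astrocyte"}


def fill_labels(tsv_text, allow_missing=False, missing_value_token=None):
    """Fill blank 'label' cells in a TSV that has 'id' and 'label' columns."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    rows = []
    for row in reader:
        if not row["label"]:
            label = LABELS.get(row["id"])
            if label is None:
                if missing_value_token is not None:
                    # Explicit token requested for unresolvable values.
                    label = missing_value_token
                elif allow_missing:
                    # Relaxed mode: leave the cell empty.
                    label = ""
                else:
                    # Strict mode: a failed lookup is an error.
                    raise KeyError(f"No label found for {row['id']}")
            row["label"] = label
        rows.append(row)
    return rows
```

The strict default is a reasonable choice for curation pipelines, where a silently empty cell is harder to notice than a failed run.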

Currently the following columns are recognized:

  • id – the unique identifier of the element

  • label – the rdfs:label of the element

  • definition – the definition of the element

  • mappings – mappings for the element

  • ancestors – ancestors for the element (this can be parameterized)

The metadata inference procedure will also work for when you have denormalized TSV files with columns such as “foo_id” and “foo_name”. This will be recognized as an implicit normalized label relation between id and name of a foo element.

You can be more explicit in one of two ways:

  1. Pass in a YAML structure (on command line or in a YAML file) listing relations

  2. Pass in a LinkML data definitions YAML file

For the first method, you can pass in multiple relations using the --relation arg. For example, given a TSV with columns cl_identifier and cl_display_label you can say:

Example:

runoak -i cl.owl.ttl fill-table --relation "{primary_key: cl_identifier, dependent_column: cl_display_label, relation: label}"

You can also specify this in a YAML file

For the second method, you need to specify a LinkML schema with a class where (1) at least one field is annotated as being an identifier and (2) one or more slots have slot_uri elements mapping them to standard metadata elements such as rdfs:label.

For example, my-schema.yaml:

classes:
  Person:
    attributes:
      id:
        identifier: true
      name:
        slot_uri: rdfs:label

This is a powerful command with many ways of configuring it. We will add separate docs for this soon; for now, please file an issue on GitHub with any questions.

  • TODO: allow for an option that will perform fuzzy matches of labels

  • TODO: reverse lookup is not provided for all fields, such as definitions

  • TODO: add an option to detect inconsistencies

  • TODO: add logic for obsoletion/replaced by

  • TODO: use most optimized method for whichever backend

runoak fill-table [OPTIONS] TABLE_FILE

Options

--allow-missing, --no-allow-missing

Allow some dependent values to be blank, post-processing

Default

False

--missing-value-token <missing_value_token>

Populate all missing values with this token

--schema <schema>

Path to linkml schema

--delimiter <delimiter>

Delimiter between columns in input and output

Default

--relation <relation>

Serialized YAML string corresponding to a normalized relation between two columns

--relation-file <relation_file>

Path to YAML file corresponding to a list of normalized relation between two columns

-o, --output <output>

Output file, e.g. obo file

Arguments

TABLE_FILE

Required argument

info

Show information on term or set of terms

Example:

runoak -i sqlite:obo:cl info CL:4023094

The default output is minimal, showing only ID and label

The --output-type (-O) option can be used to specify other formats for the output.

Currently only a few output types are supported. More will be provided in future.

In OBO format:

runoak -i cl.owl info CL:4023094 -O obo

As CSV:

runoak -i cl.obo info CL:4023094 -O csv

The info output format can be parameterized with --display (-D)

With xrefs and definitions:

runoak -i cl.owl info CL:4023094 -D x,d

With all information:

runoak -i cl.owl info CL:4023094 -D all

Like all OAK commands, input term lists can be multivalued, a mixture of IDs and labels, as well as queries that can be combined using boolean logic

Info on two STATO terms:

runoak -i ontobee:stato info STATO:0000286 STATO:0000287 -O obo

All terms in ENVO with the string “forest” in them:

runoak -i sqlite:obo:envo info l~forest

Info on all subtypes of “statistical hypothesis test” in STATO:

runoak -i sqlite:obo:stato info .desc//p=i 'statistical hypothesis test'

runoak info [OPTIONS] [TERMS]...

Options

-o, --output <output>

Output file, e.g. obo file

-D, --display <display>

A comma-separated list of display options. Use ‘all’ for all

-O, --output-type <output_type>

Desired output type

Arguments

TERMS

Optional argument(s)

labels

Show labels for term or list of terms

Example:

runoak -i cl.owl labels CL:4023093 CL:4023094

You can use the “.all” selector to show all labels:

Example:

runoak -i cl.owl labels .all

(this may be blocked for remote endpoints)

You can query for terms that have no label, or restrict the results to terms that have one:

Nodes with no labels:

runoak -i cl.owl labels .all --if-absent absent-only

runoak labels [OPTIONS] [TERMS]...

Options

-o, --output <output>

Output file, e.g. obo file

-D, --display <display>

A comma-separated list of display options. Use ‘all’ for all

-O, --output-type <output_type>

Desired output type

Options

obo | obojson | ofn | rdf | json | yaml | csv | nl

--if-absent <if_absent>

determines behavior when the value is not present or is empty.

Options

absent-only | present-only

-S, --set-value <set_value>

the value to set for all terms for the given property.

Arguments

TERMS

Optional argument(s)

leafs

List all leaf nodes in the ontology

Like all OAK relational commands, this is parameterized by --predicates (-p). Note that the default is to return the leaves of the relation graph over all predicates

Example:

runoak -i db/cob.db leafs

This command is a wrapper onto the “leafs” command in the BasicOntologyInterface.

runoak leafs [OPTIONS]

Options

-o, --output <output>

Output file, e.g. obo file

-p, --predicates <predicates>

A comma-separated list of predicates

--filter-obsoletes, --no-filter-obsoletes

If set, results will exclude obsoletes

Default

True

lexmatch

Performs lexical matching between pairs of terms in one or more ontologies

Examples:

runoak -i foo.obo lexmatch -o foo.sssom.tsv

In this example, the input ontology file is assumed to contain all pairs of terms to be mapped.

It is more common to map between all pairs of terms in two ontology files. To avoid a merge preprocessing step:

runoak -i foo.obo -a bar.obo lexmatch -o foo.sssom.tsv

lexmatch implements a simple algorithm:

  • create a lexical index, keyed by normalized strings of labels, synonyms

  • report all pairs of entities that have the same key
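The two steps above can be sketched in a few lines. The normalization used here (lowercasing and whitespace collapsing) is an assumption chosen for illustration; the rules OAK actually applies are configurable and are not reproduced here, and the function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def normalize(text):
    """Lowercase and collapse whitespace so near-identical strings share a key."""
    return " ".join(text.lower().split())


def lexmatch(entities):
    """Index entities by normalized label/synonym and pair up those sharing a key.

    `entities` maps an entity ID to a list of its labels and synonyms.
    """
    # Step 1: build the lexical index, keyed by normalized string.
    index = defaultdict(set)
    for entity_id, strings in entities.items():
        for s in strings:
            index[normalize(s)].add(entity_id)
    # Step 2: report every pair of distinct entities that share a key.
    pairs = set()
    for key, ids in index.items():
        for a, b in combinations(sorted(ids), 2):
            pairs.add((a, b, key))
    return sorted(pairs)
```

Because the index is built once and then only read, it is natural to cache it on disk between runs.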

The lexical index can be exported (in native YAML) using -L:

runoak -i foo.obo lexmatch -L foo.index.yaml -o foo.sssom.tsv

Note: if you run the above command a second time it will be faster as the index will be reused.

Using custom rules:

runoak -i foo.obo lexmatch -R match_rules.yaml -L foo.index.yaml -o foo.sssom.tsv

Full documentation: oaklib.utilities.lexical.lexical_indexer

runoak lexmatch [OPTIONS]

Options

-L, --lexical-index-file <lexical_index_file>

path to lexical index. This is recreated each time unless --no-recreate is passed

-R, --rules-file <rules_file>

path to rules file. Conforms to rules_datamodel. e.g. https://github.com/INCATools/ontology-access-kit/blob/main/tests/input/matcher_rules.yaml

--add-labels, --no-add-labels

Populate empty labels with URI fragments or CURIE local IDs, for ontologies that use semantic IDs

Default

False

--recreate, --no-recreate

if true and lexical index is specified, always recreate, otherwise load from index

Default

True

-o, --output <output>

Output file, e.g. obo file

lint

Lints an ontology, applying changes in place

The current implementation is highly incomplete, and only handles linting of syntactic patterns (chains of whitespace, trailing whitespace) in labels and definitions
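The kind of syntactic clean-up described above might look like the following sketch. The function names are hypothetical and the dry-run handling is only meant to mirror the idea of reporting proposed changes without applying them; note that this version also strips leading whitespace, which goes slightly beyond what the text states.

```python
import re


def lint_text(text):
    """Collapse runs of whitespace and strip leading/trailing whitespace."""
    return re.sub(r"\s+", " ", text).strip()


def lint_labels(labels, dry_run=True):
    """Return proposed (id, old, new) changes; apply them unless dry_run is set."""
    changes = [
        (term_id, old, lint_text(old))
        for term_id, old in labels.items()
        if lint_text(old) != old
    ]
    if not dry_run:
        for term_id, _, new in changes:
            labels[term_id] = new
    return changes
```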

Implementations

  • owl, in functional syntax

  • obo

runoak lint [OPTIONS]

Options

-o, --output <output>
--report-format <report_format>

Output format for reporting proposed/applied changes

--dry-run, --no-dry-run

If true, nothing will be modified by executing command

-O, --output-type <output_type>

Desired output type

logical-definitions

Show all logical definitions for a term or terms

runoak logical-definitions [OPTIONS] [TERMS]...

Options

-p, --predicates <predicates>

A comma-separated list of predicates

--direction <direction>

direction of traversal over edges, where up is subject to object and down is object to subject.

Options

up | down | both

--autolabel, --no-autolabel

If set, results will automatically have labels assigned

Default

True

-O, --output-type <output_type>

Desired output type

-o, --output <output>

Output file, e.g. obo file

--if-absent <if_absent>

determines behavior when the value is not present or is empty.

Options

absent-only | present-only

-S, --set-value <set_value>

the value to set for all terms for the given property.

Arguments

TERMS

Optional argument(s)

mappings

List all mappings encoded in the ontology

Example:

runoak -i sqlite:obo:envo mappings

The default output is SSSOM YAML. To use the (canonical) csv format:

runoak -i sqlite:obo:envo mappings -O sssom

By default, labels are not included. Use --autolabel to include labels (but note that if the label is not in the source ontology, then no label will be retrieved)

runoak -i sqlite:obo:envo mappings -O sssom --autolabel

To constrain the mapped object source:

runoak -i sqlite:obo:foodon mappings -O sssom --maps-to-source SUBSET_SIREN

runoak mappings [OPTIONS] [TERMS]...

Options

-o, --output <output>

Output file, e.g. obo file

-O, --output-type <output_type>

Desired output type

--autolabel, --no-autolabel

If set, results will automatically have labels assigned

Default

True

--maps-to-source <maps_to_source>

Return only mappings with subject or object source equal to this

Arguments

TERMS

Optional argument(s)

migrate-curies

Rewires an ontology replacing all instances of an ID or IDs

Note: the specified ontology is modified in place

The input for this command is a list of equals-separated pairs, specifying the source and the target

Example:

runoak -i db/uberon.db migrate-curies --replace SRC1=TGT1 SRC2=TGT2
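The pair format and the rewiring it drives can be sketched as follows. Both functions are hypothetical illustrations operating on a plain list of (subject, predicate, object) tuples; they are not the PatcherInterface implementation.

```python
def parse_curie_pairs(pairs):
    """Turn ['SRC1=TGT1', ...] into a {'SRC1': 'TGT1', ...} mapping."""
    mapping = {}
    for pair in pairs:
        source, sep, target = pair.partition("=")
        if not sep or not source or not target:
            raise ValueError(f"Expected SRC=TGT, got: {pair!r}")
        mapping[source] = target
    return mapping


def migrate_curies(edges, mapping):
    """Rewrite every subject, predicate, and object that appears in `mapping`."""
    return [tuple(mapping.get(x, x) for x in edge) for edge in edges]
```

This sketch returns a new edge list rather than modifying its input, which makes it easy to compare before and after.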

This command is a wrapper onto the “migrate_curies” command in the PatcherInterface

oaklib.interfaces.patcher_interface.PatcherInterface.migrate_curies

runoak migrate-curies [OPTIONS] [CURIE_PAIRS]...

Options

--replace, --no-replace

If true, will update in place

Default

False

-O, --output-type <output_type>

Desired output type

-o, --output <output>

Output file, e.g. obo file

Arguments

CURIE_PAIRS

Optional argument(s)

obsoletes

Shows all obsolete nodes

Example:

runoak -i obolibrary:go.obo obsoletes

TODO: this command should be parameterizable

runoak obsoletes [OPTIONS]

Options

-O, --output-type <output_type>

Desired output type

Options

obo | obojson | ofn | rdf | json | yaml | csv | nl

-o, --output <output>

Output file, e.g. obo file

ontologies

Shows all ontologies

If the input is a pre-merged ontology, then the output of this command is trivially a single line, with the name of the input ontology

This command is more meaningful when the input is a multi-ontology endpoint, e.g.:

runoak -i ubergraph: ontologies

In future this command will be expanded to allow showing more metadata about each ontology

runoak ontologies [OPTIONS]

Options

-o, --output <output>

Output file, e.g. obo file

ontology-metadata

Shows ontology metadata

Example:

runoak -i bioportal: ontology-metadata obi uberon foodon

Use the --all option to show all ontologies

Example:

runoak -i bioportal: ontology-metadata --all

By default the output is YAML. You can get the results as TSV:

Example:

runoak -i bioportal: ontology-metadata --all -O csv

Warning

The output data model is not yet standardized – this may change in future

runoak ontology-metadata [OPTIONS] [ONTOLOGIES]...

Options

-o, --output <output>

Output file, e.g. obo file

-O, --output-type <output_type>

Desired output type

--all, --no-all

If true, show all ontologies. Use in place of passing an explicit list

Default

False

Arguments

ONTOLOGIES

Optional argument(s)

ontology-versions

Shows ontology versions

Currently only implemented for BioPortal

Example:

runoak -i bioportal: ontology-versions mp

All ontologies:

runoak -i bioportal: ontology-versions --all

runoak ontology-versions [OPTIONS] [ONTOLOGIES]...

Options

-o, --output <output>

Output file, e.g. obo file

--all, --no-all

If true, show all ontologies. Use in place of passing an explicit list

Default

False

Arguments

ONTOLOGIES

Optional argument(s)

paths

List all paths from one or more start CURIEs

Example:

runoak -i sqlite:obo:go paths -p i,p 'nuclear membrane'

This shows all shortest paths from nuclear membrane to all ancestors

Example:

runoak -i sqlite:obo:go paths -p i,p 'nuclear membrane' --target cytoplasm

This shows shortest paths between two nodes

Example:

runoak -i sqlite:obo:go paths -p i,p 'nuclear membrane' 'thylakoid' --target cytoplasm 'thylakoid membrane'

This shows all shortest paths between 4 combinations of starts and ends

Example:

runoak -i sqlite:obo:go paths -p i,p 'nuclear membrane' --target cytoplasm --predicate-weights "{i: 0.0001, p: 999}"

This shows all shortest paths after weighting relations
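The effect of predicate weights on the chosen path can be illustrated with a small Dijkstra sketch over a toy graph. Everything here is an assumption for illustration: the EX: nodes, the directed subject-to-object traversal, the default weight of 1.0 for unlisted predicates, and the function name. It returns a single shortest path rather than all of them.

```python
import heapq

# Toy edges of (subject, predicate, object); identifiers are invented.
EDGES = [
    ("EX:a", "i", "EX:b"),
    ("EX:b", "i", "EX:d"),
    ("EX:a", "p", "EX:d"),
]


def shortest_path(start, target, predicate_weights=None):
    """Dijkstra over the toy graph; unlisted predicates get a weight of 1.0."""
    weights = predicate_weights or {}
    queue = [(0.0, start, [start])]
    best = {}
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return path
        if node in best and best[node] <= cost:
            # Already settled this node at an equal or lower cost.
            continue
        best[node] = cost
        for subj, pred, obj in EDGES:
            if subj == node:
                step = weights.get(pred, 1.0)
                heapq.heappush(queue, (cost + step, obj, path + [obj]))
    return None
```

With equal weights the direct part-of edge wins; with a very low weight on i and a very high weight on p, as in the example above, the two-step is-a route becomes the shortest.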

Example:

alias go="runoak -i sqlite:obo:go"
go paths -p i,p 'nuclear membrane' --target cytoplasm --flat | go viz --fill-gaps -

This visualizes the path by first exporting the path as a flat list, then passing the results to viz, using the fill-gaps option

runoak paths [OPTIONS] [TERMS]...

Options

--target <target>

end point of path

--flat, --no-flat

If true then output path is written as a list of terms

Default

False

--autolabel, --no-autolabel

If set, results will automatically have labels assigned

Default

True

-p, --predicates <predicates>

A comma-separated list of predicates

-O, --output-type <output_type>

Desired output type

--predicate-weights <predicate_weights>

key-value pairs specified in YAML where keys are predicates or shorthands and values are weights

-o, --output <output>

Output file, e.g. obo file

Arguments

TERMS

Optional argument(s)

relationships

Show all relationships for a term or terms

By default, this shows all relationships where the input term(s) are the subjects

Example:

runoak -i cl.db relationships CL:4023094

Like all OAK commands, a label can be passed instead of a CURIE

Example:

runoak -i cl.db relationships neuron

To reverse the direction, and query where the search term(s) are objects, use the --direction flag:

Example:

runoak -i cl.db relationships --direction down neuron

Multiple terms can be passed

Example:

runoak -i uberon.db relationships heart liver lung

And like all OAK commands, a query can be passed rather than an explicit term list

The following query lists all arteries in the limb together with the structures they supply

Query:

runoak -i uberon.db relationships -p RO:0002178 .desc//p=i "artery" .and .desc//p=i,p "limb"

runoak relationships [OPTIONS] [TERMS]...

Options

-p, --predicates <predicates>

A comma-separated list of predicates

--direction <direction>

direction of traversal over edges, where up is subject to object and down is object to subject.

Options

up | down | both

--autolabel, --no-autolabel

If set, results will automatically have labels assigned

Default

True

-O, --output-type <output_type>

Desired output type

-o, --output <output>

Output file, e.g. obo file

--if-absent <if_absent>

determines behavior when the value is not present or is empty.

Options

absent-only | present-only

-S, --set-value <set_value>

the value to set for all terms for the given property.

--include-entailed, --no-include-entailed

Include entailed indirect relationships

Default

False

--include-tbox, --no-include-tbox

Include class-class relationships (subclass and existentials)

Default

True

--include-abox, --no-include-abox

Include instance relationships (class and object property assertions)

Default

True

Arguments

TERMS

Optional argument(s)

roots

List all root nodes in the ontology

Like all OAK relational commands, this is parameterized by --predicates (-p). Note that the default is to return the roots of the relation graph over all predicates. This can sometimes give unintuitive results, so we recommend always passing an explicit predicate list.

Example:

runoak -i db/cob.db roots

This command is a wrapper onto the “roots” command in the BasicOntologyInterface.
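Why the choice of predicates changes the set of roots can be seen in a minimal Python sketch. It treats a root as a term with no outgoing edge over the selected predicates; the toy edges and the `roots` helper are assumptions for illustration and do not reproduce the BasicOntologyInterface code.

```python
# Toy edges (subject, predicate, object) using the i / p shorthands;
# hypothetical data for illustration only.
EDGES = [
    ("nucleus", "i", "organelle"),
    ("organelle", "p", "cell"),
    ("cell", "i", "entity"),
]


def roots(edges, predicates=None):
    """Terms with no outgoing edge over the selected predicates.

    predicates=None means "all predicates".
    """
    nodes = set()
    has_parent = set()
    for s, p, o in edges:
        nodes.update((s, o))
        if predicates is None or p in predicates:
            has_parent.add(s)
    return sorted(nodes - has_parent)


if __name__ == "__main__":
    print(roots(EDGES))         # over all predicates
    print(roots(EDGES, {"i"}))  # over is-a only
```

Here "organelle" is a root of the is-a graph but not of the graph over all predicates, because it has a part-of parent.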

runoak roots [OPTIONS]

Options

-o, --output <output>

Output file, e.g. obo file

-p, --predicates <predicates>

A comma-separated list of predicates

set-apikey

Sets an API key

Example:

runoak set-apikey -e bioportal MY-KEY-VALUE

This is stored in an OS-dependent path

runoak set-apikey [OPTIONS] KEYVAL

Options

-e, --endpoint <endpoint>

Required. Name of endpoint, e.g. bioportal

Arguments

KEYVAL

Required argument

siblings

List all siblings of a specified term or terms

Example:

runoak -i cl.owl siblings CL:4023094

Note that siblings is by default computed over ALL relationship types, so we recommend always being explicit and passing a predicate using -p (--predicates)
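The effect of the predicate list on siblings can be sketched as follows. The sketch assumes that siblings are the other children of a term's parents over the selected predicates; the toy edges and the `siblings` helper are hypothetical and are not the oaklib implementation.

```python
# Toy edges (subject, predicate, object); hypothetical data for illustration.
EDGES = [
    ("motor neuron", "i", "neuron"),
    ("sensory neuron", "i", "neuron"),
    ("motor neuron", "p", "nervous system"),
    ("spinal cord", "p", "nervous system"),
]


def siblings(edges, term, predicates=None):
    """Other children of the term's parents, over the selected predicates."""
    selected = [
        (s, o) for s, p, o in edges if predicates is None or p in predicates
    ]
    parents = {o for s, o in selected if s == term}
    return sorted({s for s, o in selected if o in parents and s != term})


if __name__ == "__main__":
    print(siblings(EDGES, "motor neuron", {"i"}))  # is-a siblings only
    print(siblings(EDGES, "motor neuron"))         # all relationship types
```

With all relationship types, anything that is part of the same structure is also reported as a sibling, which is rarely what is wanted.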

runoak siblings [OPTIONS] [TERMS]...

Options

-p, --predicates <predicates>

A comma-separated list of predicates

-o, --output <output>

Output file, e.g. obo file

-O, --output-type <output_type>

Desired output type

Options

obo | obojson | ofn | rdf | json | yaml | csv | nl

Arguments

TERMS

Optional argument(s)

similarity

All by all similarity

This calculates a similarity matrix for two sets of terms.

Input sets of terms can be specified in different ways:

  • via a file

  • via explicit lists of terms or queries

Example:

runoak -i hp.db similarity -p i --set1-file HPO-TERMS1 --set2-file HPO-TERMS2 -O csv

This will compare every term in HPO-TERMS1 against every term in HPO-TERMS2

Alternatively standard OAK term queries can be used, with “@” separating the two lists

Example:

runoak -i hp.db similarity -p i TERM_1 TERM_2 … TERM_N @ TERM_N+1 … TERM_M
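How an "@" separator divides one argument list into two term sets can be shown with a short Python sketch. The `split_term_sets` helper is hypothetical and only illustrates the convention; it is not how runoak parses its arguments.

```python
def split_term_sets(args, separator="@"):
    """Split a flat argument list into (set1, set2) at the first separator."""
    if separator not in args:
        raise ValueError(f"expected a {separator!r} between the two term lists")
    idx = args.index(separator)
    return args[:idx], args[idx + 1:]


if __name__ == "__main__":
    set1, set2 = split_term_sets(["nucleus", "membrane", "@", "vacuole"])
    print(set1, set2)
```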

The .all term syntax can be used to select all terms in an ontology

Example:

runoak -i ma.db similarity -p i,p .all @ .all

This can be mixed with other term selectors; for example to calculate the similarity of “neuron” vs all terms in CL:

runoak -i cl.db similarity -p i,p .all @ neuron

An example pipeline to do all by all over all phenotypes in HPO:

Explicit:

runoak -i hp.db descendants -p i HP:0000118 > HPO

runoak -i hp.db similarity -p i --set1-file HPO --set2-file HPO -O csv -o RESULTS.tsv

The same thing can be done more compactly with term queries:

runoak -i hp.db similarity -p i .desc//p=i HP:0000118 @ .desc//p=i HP:0000118
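One of the scores this command can filter on is the Jaccard score. As a rough illustration of an all-by-all comparison, the Python sketch below computes Jaccard similarity between the reflexive ancestor sets of two terms. The toy graph and the helper names are assumptions; this is not the oaklib implementation, and the other scores the command reports are not covered.

```python
from itertools import product

# Toy edges (subject, predicate, object); hypothetical data for illustration.
EDGES = [
    ("a", "i", "c"),
    ("b", "i", "c"),
    ("c", "i", "root"),
    ("d", "i", "root"),
]


def ancestors(term, edges, predicates):
    """Reflexive ancestors of term over the selected predicates."""
    seen = {term}
    stack = [term]
    while stack:
        current = stack.pop()
        for s, p, o in edges:
            if s == current and p in predicates and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen


def jaccard(a, b, edges, predicates):
    """Size of the shared ancestor set divided by the size of the union."""
    anc_a = ancestors(a, edges, predicates)
    anc_b = ancestors(b, edges, predicates)
    return len(anc_a & anc_b) / len(anc_a | anc_b)


def all_by_all(set1, set2, edges, predicates, jaccard_minimum=0.0):
    """Compare every term in set1 with every term in set2."""
    rows = []
    for a, b in product(set1, set2):
        score = jaccard(a, b, edges, predicates)
        if score >= jaccard_minimum:
            rows.append((a, b, score))
    return rows


if __name__ == "__main__":
    for row in all_by_all(["a", "b"], ["b", "d"], EDGES, {"i"}):
        print(row)
```

Because the ancestor sets depend on the predicates, changing the predicate list changes every score, which is one reason to pass -p explicitly.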

runoak similarity [OPTIONS] [TERMS]...

Options

-p, --predicates <predicates>

A comma-separated list of predicates

--set1-file <set1_file>

ID file for set1

--set2-file <set2_file>

ID file for set2

--jaccard-minimum <jaccard_minimum>

Minimum value for jaccard score

--ic-minimum <ic_minimum>

Minimum value for information content

-o, --output <output>

path to output

--main-score-field <main_score_field>

Score used for summarization

Default

phenodigm_score

--autolabel, --no-autolabel

If set, results will automatically have labels assigned

Default

True

-O, --output-type <output_type>

Desired output type

Arguments

TERMS

Optional argument(s)

similarity-pair

Determine pairwise similarity between two terms using a variety of metrics

NOTE: this command may be deprecated, consider using similarity

Note: We recommend always specifying explicit predicate lists

Example:

runoak -i ubergraph: similarity-pair -p i,p CL:0000540 CL:0000000

You can omit predicates if you like, but be warned that this may yield hard-to-interpret results.

E.g.

runoak -i ubergraph: similarity-pair CL:0000540 GO:0001750

yields “fully formed stage” (i.e. these are both found in the adult) as the most recent common ancestor (MRCA)

For phenotype ontologies, UPHENO relationship types connect phenotype terms to anatomy, etc:

runoak -i ubergraph: similarity-pair MP:0010922 HP:0010616 -p i,p,UPHENO:0000001

Background: https://incatools.github.io/ontology-access-kit/interfaces/semantic-similarity.html

runoak similarity-pair [OPTIONS] [TERMS]...

Options

-p, --predicates <predicates>

A comma-separated list of predicates

-o, --output <output>

Output file, e.g. obo file

Arguments

TERMS

Optional argument(s)

singletons

List all singleton nodes in the ontology

Like all OAK relational commands, this is parameterized by --predicates (-p). Note that the default is to return the singletons of the relation graph over all predicates.

Obsoletes are filtered by default

Example:

runoak -i db/cob.db singletons

This command is a wrapper onto the “singletons” command in the BasicOntologyInterface.
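As a rough illustration, a singleton can be thought of as a term that takes part in no edge, in either direction, over the selected predicates. The Python sketch below follows that reading and also drops obsolete terms by default; the data and the `singletons` helper are hypothetical and do not reproduce the BasicOntologyInterface code.

```python
# Toy edges (subject, predicate, object) and a term -> is_obsolete table;
# hypothetical data for illustration only.
EDGES = [
    ("a", "i", "b"),
    ("c", "p", "b"),
]
TERMS = {"a": False, "b": False, "c": False, "d": False, "e": True}


def singletons(terms, edges, predicates=None, filter_obsoletes=True):
    """Terms that are neither subject nor object of any selected edge."""
    connected = set()
    for s, p, o in edges:
        if predicates is None or p in predicates:
            connected.update((s, o))
    return sorted(
        term
        for term, obsolete in terms.items()
        if term not in connected and not (filter_obsoletes and obsolete)
    )


if __name__ == "__main__":
    print(singletons(TERMS, EDGES))
    print(singletons(TERMS, EDGES, predicates={"i"}))
    print(singletons(TERMS, EDGES, filter_obsoletes=False))
```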

runoak singletons [OPTIONS]

Options

-o, --output <output>

Output file, e.g. obo file

-p, --predicates <predicates>

A comma-separated list of predicates

--filter-obsoletes, --no-filter-obsoletes

If set, results will exclude obsoletes

Default

True

subsets

Shows information on subsets

Example:

runoak -i obolibrary:go.obo subsets

Example:

runoak -i cl.owl subsets

For background on subsets, see https://incatools.github.io/ontology-access-kit/concepts.html#subsets

Note you can use subsets in selector queries for other commands; e.g. to fetch all terms (directly) in goslim_generic in GO:

Example:

runoak -i sqlite:obo:go info .in goslim_generic

See also:

term-subsets command, which shows relationships of terms to subsets

runoak subsets [OPTIONS]

Options

-o, --output <output>

Output file, e.g. obo file

taxon-constraints

Compute all taxon constraints for a term or terms

Note that this computes the taxon constraints rather than doing a lookup

Example:

runoak -i db/go.db taxon-constraints GO:0034357 --include-redundant -p i,p

Example:

runoak -i sqlite:obo:uberon taxon-constraints UBERON:0003884 UBERON:0003941 -p i,p

This command is a wrapper onto taxon_constraints_utils:

runoak taxon-constraints [OPTIONS] [TERMS]...

Options

-o, --output <output>

Output file, e.g. obo file

-p, --predicates <predicates>

A comma-separated list of predicates

-A, --all, --no-A, --no-all

if specified then perform for all terms

Default

False

--include-redundant, --no-include-redundant

if specified then include redundant taxon constraints from ancestral subjects

Default

False

Arguments

TERMS

Optional argument(s)

term-categories

List categories for a term or set of terms

TODO

runoak term-categories [OPTIONS] [TERMS]...

Options

-o, --output <output>

Output file, e.g. obo file

-O, --output-type <output_type>

Desired output type

--category-system <category_system>

Example: biolink, cob, bfo, dbpedia, …

Arguments

TERMS

Optional argument(s)

term-metadata

Shows term metadata

runoak term-metadata [OPTIONS] [TERMS]...

Options

-o, --output <output>

Output file, e.g. obo file

-O, --output-type <output_type>

Desired output type

--reification, --no-reification

if true then fetch axiom triples with annotations

Default

False

Arguments

TERMS

Optional argument(s)

term-subsets

List subsets for a term or set of terms

runoak term-subsets [OPTIONS] [TERMS]...

Options

-o, --output <output>

Output file, e.g. obo file

-O, --output-type <output_type>

Desired output type

Arguments

TERMS

Optional argument(s)

terms

List all terms in the ontology

Example:

runoak -i db/cob.db terms

All terms without obsoletes:

runoak -i prontolib:cl.obo terms --filter-obsoletes

By default “terms” is considered to be any entity type in the ontology. Use --owl-type to constrain this:

Classes:

runoak -i sqlite:obo:ro terms --owl-type owl:Class

Relationship types (Object properties):

runoak -i sqlite:obo:ro terms --owl-type owl:ObjectProperty

Annotation properties:

runoak -i sqlite:obo:omo terms --owl-type owl:AnnotationProperty

runoak terms [OPTIONS]

Options

--filter-obsoletes, --no-filter-obsoletes

If set, results will exclude obsoletes

Default

True

-o, --output <output>

Output file, e.g. obo file

--owl-type <owl_type>

only include entities of this type, e.g. owl:Class, rdf:Property

termset-similarity

Termset similarity

This calculates a similarity matrix for two sets of terms.

Example:

runoak -i go.db termset-similarity -p i,p nucleus membrane @ "nuclear membrane" vacuole

runoak termset-similarity [OPTIONS] [TERMS]...

Options

-p, --predicates <predicates>

A comma-separated list of predicates

-o, --output <output>

Output file, e.g. obo file

-O, --output-type <output_type>

Desired output type

--autolabel, --no-autolabel

If set, results will automatically have labels assigned

Default

True

Arguments

TERMS

Optional argument(s)

tree

Display an ancestor graph as an ascii/markdown tree

For general instructions, see the viz command, to which this is analogous

Example:

runoak -i envo.db tree ENVO:00000372 -p i,p

This produces output like:


* [i] ENVO:00000094 ! volcanic feature
    * [i] ENVO:00000247 ! volcano
        * [i] ENVO:00000403 ! shield volcano
            * [i] **ENVO:00000372 ! pyroclastic shield volcano**

Note: for many ontologies the tree view will explode, especially if no predicates are specified. You may wish to start with the is-a tree (-p i).

You can use the --gap-fill option to create a minimal tree:

Example:

runoak -i envo.db tree --gap-fill 'pyroclastic shield volcano' 'subglacial volcano' volcano -p i

This will show the tree containing only these terms, and the most direct inferred relationships between them.

You can also give a list of leaf terms and specify --add-mrcas alongside --gap-fill to fill in the most informative intermediate classes:

Example:

runoak -i envo.db tree --add-mrcas --gap-fill 'pyroclastic shield volcano' 'subglacial volcano' 'mud volcano' -p i

This will fill in the term “volcano”, as it is the most recent common ancestor of the specified terms.
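The notion of a most recent common ancestor (MRCA) can be made concrete with a small Python sketch. It works on a hand-written is-a table using the labels from the examples above; the placement of "mud volcano" directly under "volcano" is an assumption, and the helper names are hypothetical rather than oaklib functions.

```python
from itertools import combinations

# Hand-written is-a parents, keyed by label. The placement of "mud volcano"
# is an assumption made for this illustration.
IS_A = {
    "pyroclastic shield volcano": ["shield volcano"],
    "shield volcano": ["volcano"],
    "subglacial volcano": ["volcano"],
    "mud volcano": ["volcano"],
    "volcano": ["volcanic feature"],
    "volcanic feature": [],
}


def ancestors(term):
    """Reflexive is-a ancestors of term."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(IS_A.get(current, []))
    return seen


def mrcas(a, b):
    """Common ancestors that have no other common ancestor beneath them."""
    common = ancestors(a) & ancestors(b)
    return {
        c
        for c in common
        if not any(c in ancestors(other) for other in common if other != c)
    }


def add_mrcas(seeds):
    """Extend the seed list with all pairwise MRCAs."""
    result = set(seeds)
    for a, b in combinations(seeds, 2):
        result |= mrcas(a, b)
    return result


if __name__ == "__main__":
    seeds = ["pyroclastic shield volcano", "subglacial volcano", "mud volcano"]
    print(sorted(add_mrcas(seeds)))
```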

The --max-hops option limits the distance from the seed terms:

runoak -i envo.db tree 'pyroclastic shield volcano' 'subglacial volcano' --max-hops 1 -p i

This will generate:

* [] ENVO:00000247 ! volcano
    * [i] ENVO:00000403 ! shield volcano
        * [i] **ENVO:00000372 ! pyroclastic shield volcano**
    * [i] **ENVO:00000407 ! subglacial volcano**

Note that ‘volcano’ is included as the root even though it is 2 hops from one of the seed terms, because it can be connected to at least one of the seeds (highlighted with asterisks) by a path of length 1.
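The behaviour described in this note, where a node is kept if it is within the hop limit of at least one seed, can be sketched in Python. The is-a table below is hand-written from the example output, and the helper names are hypothetical; the real trimming logic in OAK may differ in detail.

```python
from collections import deque

# Hand-written is-a parents taken from the example output above.
IS_A = {
    "pyroclastic shield volcano": ["shield volcano"],
    "shield volcano": ["volcano"],
    "subglacial volcano": ["volcano"],
    "volcano": ["volcanic feature"],
    "volcanic feature": [],
}


def hops_up(seed):
    """Number of is-a hops from seed to each of its ancestors (BFS)."""
    dist = {seed: 0}
    queue = deque([seed])
    while queue:
        term = queue.popleft()
        for parent in IS_A.get(term, []):
            if parent not in dist:
                dist[parent] = dist[term] + 1
                queue.append(parent)
    return dist


def within_max_hops(seeds, max_hops):
    """Keep every node within max_hops of at least one seed."""
    keep = set()
    for seed in seeds:
        keep |= {term for term, d in hops_up(seed).items() if d <= max_hops}
    return keep


if __name__ == "__main__":
    seeds = ["pyroclastic shield volcano", "subglacial volcano"]
    print(sorted(within_max_hops(seeds, 1)))
```

In this sketch "volcano" survives because it is one hop from "subglacial volcano", while "volcanic feature" is dropped because it is at least two hops from both seeds.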

runoak tree [OPTIONS] [TERMS]...

Options

--down, --no-down

traverse down

Default

False

--gap-fill, --no-gap-fill

If set then find the minimal graph that spans all input curies

Default

False

--add-mrcas, --no-add-mrcas

If set then extend input seed list to include all pairwise MRCAs

Default

False

-S, --stylemap <stylemap>

a json file to configure visualization. See https://berkeleybop.github.io/kgviz-model/

-C, --configure <configure>

overrides for stylemap, specified as yaml. E.g. `-C "styles: [filled, rounded]"`

--max-hops <max_hops>

Trim nodes that are equal to or greater than this distance from terms

--skip <skip>

Exclude paths that contain this node

--root <root>

Use this node or nodes as roots

-p, --predicates <predicates>

A comma-separated list of predicates

-O, --output-type <output_type>

Desired output type

-o, --output <output>

Output file, e.g. obo file

Arguments

TERMS

Optional argument(s)

validate

Validate an ontology against ontology metadata

Implementation notes: Currently only works on SQLite

Example:

runoak -i db/ecto.db validate -o results.tsv

For more information, see the OAK how-to guide:

runoak validate [OPTIONS]

Options

--cutoff <cutoff>

maximum results to report for any (type, predicate) pair

Default

50

-o, --output <output>

Output file, e.g. obo file

validate-definitions

Check definitions

REDUNDANT WITH VALIDATE - may be obsoleted

runoak validate-definitions [OPTIONS]

Options

-o, --output <output>

Output file, e.g. obo file

validate-multiple

Validate multiple ontologies against ontology metadata

See the validate command - this is the same except you can pass a list of databases

For more information, see the OAK how-to guide:

runoak validate-multiple [OPTIONS] [DBS]...

Options

--cutoff <cutoff>

maximum results to report for any (type, predicate) pair

Default

50

-s, --schema <schema>

Path to schema (if you want to override the bundled OMO schema)

-o, --output <output>

Output file, e.g. obo file

Arguments

DBS

Optional argument(s)

viz

Visualize an ancestor graph using obographviz

For general background on what is meant by a graph in OAK, see https://incatools.github.io/ontology-access-kit/interfaces/obograph.html

Note

This requires that obographviz is installed.

Example:

runoak -i sqlite:cl.db viz CL:4023094

Same query on ubergraph:

runoak -i ubergraph: viz CL:4023094

Example, showing only is-a:

runoak -i sqlite:cl.db viz CL:4023094 -p i

Example, showing only is-a and part-of, to include Uberon:

runoak -i sqlite:cl.db viz CL:4023094 -p i,p

As above, including develops-from:

runoak -i sqlite:cl.db viz CL:4023094 -p i,p,RO:0002202

With abbreviation:

runoak -i sqlite:cl.db viz CL:4023094 -p i,p,d

We can also limit the number of “hops” from the seed terms; for example, all is-a and develops-from ancestors of T-cell, limiting to a distance of 2:

runoak -i sqlite:cl.db viz 'T cell' -p i,d --max-hops 2

runoak viz [OPTIONS] [TERMS]...

Options

--view, --no-view

if view is set then open the image after rendering

Default

True

--down, --no-down

traverse down

Default

False

--gap-fill, --no-gap-fill

If set then find the minimal graph that spans all input curies

Default

False

--add-mrcas, --no-add-mrcas

If set then extend input seed list to include all pairwise MRCAs

Default

False

-S, --stylemap <stylemap>

a json file to configure visualization. See https://berkeleybop.github.io/kgviz-model/

-C, --configure <configure>

overrides for stylemap, specified as yaml. E.g. `-C "styles: [filled, rounded]"`

--max-hops <max_hops>

Trim nodes that are equal to or greater than this distance from terms

--meta, --no-meta

Add metadata object to graph nodes, including xrefs, definitions

Default

False

-p, --predicates <predicates>

A comma-separated list of predicates

-O, --output-type <output_type>

Desired output type

-o, --output <output>

Path to output file

Arguments

TERMS

Optional argument(s)