Glossary

This section contains a glossary of the terms used in the OAK documentation and API, as well as terms used more broadly by different ontology communities.

For a deeper dive into some of these concepts, see the OAK Guide.

Ontology

A flexible concept loosely encompassing any collection of Ontology Elements and statements or relationships connecting them.

  • See also Basics in the Guide.

Ontology Element

A discrete part of an Ontology, with a unique persistent identifier. The most important elements are Terms, but other elements can include various metadata artefacts like Annotation Properties or Subsets.

Term

A core element in an ontology, typically a Class, but sometimes used to include Instances or Relationship Types, depending on context. Note that in some contexts, the term “term” means something like a Label or Synonym, but here we follow standard usage and use “term” to refer to the main elements in an ontology.

Concept

See Term

Class

An Ontology Element that formally represents something that can be instantiated. For example, the class “heart” is represented in the Uberon ontology by the CURIE UBERON:0000948. In most bio-ontologies, “Class” is often synonymous with Term.

Identifier

An identifier is a string that serves to uniquely identify some kind of entity such as an Ontology Element. In Semantic Web and Linked Data technologies, identifiers are always IRIs, although they may be shortened to CURIEs within individual documents.

CURIE

A CURIE is a compact URI. For example, CL:0000000 is the CURIE for the root Class (“cell”) in the Cell Ontology (which has the Prefix CL).

URI

A Uniform Resource Identifier, a generalization of URL. Most people think of URLs as being solely addresses for web pages (or APIs), but in semantic web technologies, URLs can serve as actual identifiers for entities like ontology terms. Data models like OWL and RDF use URIs as identifiers. In OAK, URIs are mapped to CURIEs.
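As a rough illustration of how a CURIE relates to a URI, the sketch below expands and contracts identifiers using a small prefix map. The PREFIX_MAP dictionary and both function names are hypothetical and are not part of the OAK API; the URI bases follow the usual OBO PURL convention.

```python
# Hypothetical prefix map (not OAK's own); OBO ontologies conventionally
# use PURLs of the form http://purl.obolibrary.org/obo/<PREFIX>_<local id>.
PREFIX_MAP = {
    "UBERON": "http://purl.obolibrary.org/obo/UBERON_",
    "CL": "http://purl.obolibrary.org/obo/CL_",
}


def expand_curie(curie: str) -> str:
    """Turn a CURIE such as UBERON:0000948 into a full URI."""
    prefix, local_id = curie.split(":", 1)
    return PREFIX_MAP[prefix] + local_id


def contract_uri(uri: str) -> str:
    """Turn a full URI back into a CURIE, if a known prefix matches."""
    for prefix, base in PREFIX_MAP.items():
        if uri.startswith(base):
            return f"{prefix}:{uri[len(base):]}"
    raise ValueError(f"No known prefix for {uri}")


heart_uri = expand_curie("UBERON:0000948")
heart_curie = contract_uri(heart_uri)
```

Here the round trip returns the CURIE it started from, which is the property a prefix map needs in order to be usable in both directions.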

Label

Usually refers to a human-readable label corresponding to the rdfs:label predicate. Labels are typically unique per ontology. In OBO Format and in the bio-ontology literature, labels are sometimes called Names. Sometimes in the machine learning literature, and in databases such as Neo4J, “label” actually refers to a Category. In the context of OAK, Label is used to refer to the rdfs:label Predicate, or sometimes skos:prefLabel.

Name

Usually synonymous with Label, but in the formal logic and OWL community, “Name” sometimes denotes an Identifier.

Category

The term Category is frequently ambiguous. In the context of OAK it refers to a high-level grouping Class that may come from an upper ontology like COB or a schema language like Biolink or schema.org.

Annotation

The term Annotation is frequently ambiguous. It can refer to:

  • Text Annotation

  • OWL Annotation

  • Association

Association

In the context of OAK, an Association is a statement that connects some kind of named entity (such as a gene, a person, a sample, or a disease) to an Ontology Element. In the context of many bio-ontologies like the Gene Ontology or Human Phenotype Ontology, Associations are usually called “annotations”. Associations can be seen as special cases of Edges, but it is often convenient to treat them differently (for example, associations frequently have additional metadata and evidence, and often have nuanced semantics that differ from standard ontology edges). Despite the differences, we still use the same terminology for associations as for Edges. The Subject of an association is the named entity that the association is about; it could be a gene, a person, a sample, a document, a disease, or any number of things. It could potentially be represented by a node in an ontology, but it is more typically a database entity. The Object is the ontology term that is used as a descriptor for the subject. (Confusingly, in some formats, the “database object” actually refers to the subject of the association.)

Text Annotation

The process of annotating spans of text within a text document with references to ontology terms, or the result of this process. This is frequently done automatically. The Bioportal implementation provides text annotation services. More advanced annotation services will be available through AI plugins in OAK in the future.

Mapping

The term Mapping is often used differently by different communities. In the context of OAK it means a pairwise association between two Ontology Elements, where those elements are conceptually similar or close in meaning. OAK adheres closely to the SSSOM data model. Note that OAK treats mappings as distinct from ontology Associations or Edges, due to different use cases for each of these structures. However, there are commonalities, and we use the terms Subject, Object, and Predicate in the same way for each of these structures.

SSSOM

Simple Standard for Sharing Ontological Mappings. SSSOM is the primary Datamodel in OAK for passing around Mappings.

Graph

Formally, a graph is a data structure consisting of Nodes and Edges. There are different forms of graphs, but for the purposes of OAK, an ontology graph has all Terms as nodes, and relationships connecting terms (is-a, part-of) as edges. Note that the concept of an ontology graph and an RDF graph do not necessarily fully align: RDF graphs of OWL ontologies employ numerous blank nodes that obscure the ontology structure. See Ontology Graph Projection.

Edge

See Relationship

Relationship

A Relationship is a typed connection between two ontology elements. The first element is called the Subject, and the second one the Object, with the type of connection being the Predicate. Sometimes Relationships are equated with Triples in RDF, but this can be confusing, because some relationships map to multiple triples when following the OWL RDF serialization. An example is the relationship “finger part-of hand”, which in OWL is represented using an Existential Restriction that maps to 4 triples.
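To make the “finger part-of hand” example concrete, the following sketch lists the four triples that the standard OWL-to-RDF mapping produces for a single existential restriction. The function name, the ex: identifiers, and the blank node label are placeholders chosen for illustration.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]


def existential_to_triples(
    subject: str, predicate: str, obj: str, bnode: str = "_:r1"
) -> List[Triple]:
    """Expand 'subject SubClassOf (predicate some obj)' into RDF triples."""
    return [
        (subject, "rdfs:subClassOf", bnode),     # the class points at a blank node
        (bnode, "rdf:type", "owl:Restriction"),  # the blank node is a restriction
        (bnode, "owl:onProperty", predicate),    # ... on the part-of property
        (bnode, "owl:someValuesFrom", obj),      # ... with the filler class
    ]


# ex:finger and ex:hand are placeholder identifiers, not real ontology CURIEs.
triples = existential_to_triples("ex:finger", "BFO:0000050", "ex:hand")
```

A single Relationship (one edge in an ontology graph) therefore corresponds to four triples and one blank node in the RDF graph.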

Triple

The term “triple” is generally only used in the context of the RDF data model. A triple is a simple statement consisting of a Subject, Predicate, and Object. The concept of triple is closely related to, but not identical to, the concept of Relationship.

Node

A Node (aka Vertex) is one of the two main elements that make up a Graph. The other element is an Edge. The nodes in a graph typically represent Classes but this depends on the Ontology Graph Projection. The nodes of a graph might also be Instances or Relationship Types, or metadata elements such as Subset definitions.

Subject

The subject of a Relationship or Association is the first element. The subject is always a Node. Note that the same node can be the Subject of one edge, and the Object of another edge. For example, the node for “Scoliosis” in the Human Phenotype Ontology is the subject of the SubClassOf edge whose object is “Abnormality of the vertebral column”; it may also be the object of a gene-phenotype association edge.

Object

The term “Object” is highly overloaded. In a general programming context, it refers to an instance of a (programmatic) class. But typically in the OAK context, it refers to the second element in a Relationship or Association. It is the counterpart to Subject.

Relationship Type

See Predicate

Predicate

An Ontology Element that represents the type of a Relationship. Typically corresponds to an ObjectProperty in OWL, but this is not always true; in particular, the is-a relationship type is a builtin construct SubClassOf in OWL. Examples:

  • IS_A (rdfs:subClassOf)

  • Part Of (BFO:0000050)

IS_A

The is-a relationship type. This is a builtin construct in OWL and is not represented as an Ontology Element. In OAK, the IS_A relationship type is represented as a Predicate with the CURIE rdfs:subClassOf.

Part Of

The Part Of relationship type. This is one of the most important relationship types in many ontologies such as GO, Uberon, and others. In OAK, the Part Of relationship type is represented as a Predicate with the CURIE BFO:0000050.

Ancestor

The Ancestors of an entity are all the entities that are reachable by following Relationships from Subject to Object. Ancestor traversal is frequently parameterized by a set of Predicates. The concept of Ancestor and graph traversal is closely related to the concept of Entailment in OWL.
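The idea of ancestor traversal parameterized by predicates can be sketched with a small breadth-first walk. This is a minimal standalone illustration, not OAK's implementation: the ancestors function, its parameters, and the toy edges (which are simplified and not copied from any ontology) are all assumptions.

```python
from collections import deque
from typing import Iterable, Optional, Set, Tuple

IS_A = "rdfs:subClassOf"
PART_OF = "BFO:0000050"

# Toy (subject, predicate, object) edges, simplified for illustration only.
EDGES = [
    ("finger", PART_OF, "hand"),
    ("finger", IS_A, "digit"),
    ("hand", PART_OF, "limb"),
    ("digit", IS_A, "anatomical structure"),
    ("limb", IS_A, "anatomical structure"),
]


def ancestors(
    start: str,
    edges: Iterable[Tuple[str, str, str]],
    predicates: Optional[Set[str]] = None,
    reflexive: bool = True,
) -> Set[str]:
    """Collect nodes reachable from start by walking edges subject -> object."""
    edge_list = list(edges)
    visited = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for subject, predicate, obj in edge_list:
            if subject != node:
                continue
            # predicates=None means "follow every predicate".
            if predicates is not None and predicate not in predicates:
                continue
            if obj not in visited:
                visited.add(obj)
                queue.append(obj)
    # reflexive=True keeps the start node itself in the result.
    return visited if reflexive else visited - {start}


is_a_only = ancestors("finger", EDGES, predicates={IS_A})
is_a_and_part_of = ancestors("finger", EDGES, predicates={IS_A, PART_OF})
```

Restricting the predicates to is-a yields only the subclass hierarchy, while adding part-of also pulls in “hand” and “limb”.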

Descendant

The converse of Ancestor.

Closure

In the context of ontologies and OAK, “closure” refers to the closure of a predicate, i.e. the set of all Ancestors that are reachable by following the predicate or predicates.

Subject Closure

The Subject Closure of an edge is the set of all entities that are reachable by following the Subject of the edge or association, over a specified set of predicates (called the Subject Closure Predicates). For example, in a disease phenotype association, if the disease is “Mucopolysaccharidosis type I”, then the subject closure would include “Mucopolysaccharidosis”, “Lysosomal Storage Disease”, “Disease”. In cases where the subject is a database entity rather than an ontology term, the subject closure may trivially be a singleton containing only the subject.

Object Closure

The Object Closure of an edge is the set of all entities that are reachable by following the Object of the edge or association, over a specified set of predicates (called the Object Closure Predicates). For example, in a disease to phenotype association, if the phenotype is “Abnormality of the vertebral column”, then the object closure would include “Abnormality of the vertebral column”, “Abnormality of the musculoskeletal system”, etc.
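The two closures can be sketched for a single association as follows. The PARENTS table is a toy is-a hierarchy built from the term names used in the examples above (real ontologies contain intermediate terms), and the closure function, the association record, and the GENE:1 identifier are hypothetical.

```python
from typing import Dict, List, Set

# Toy is-a edges (child -> parents); intermediate terms are omitted.
PARENTS: Dict[str, List[str]] = {
    "Mucopolysaccharidosis type I": ["Mucopolysaccharidosis"],
    "Mucopolysaccharidosis": ["Lysosomal Storage Disease"],
    "Lysosomal Storage Disease": ["Disease"],
    "Abnormality of the vertebral column": [
        "Abnormality of the musculoskeletal system"
    ],
}


def closure(term: str) -> Set[str]:
    """Return the term plus everything reachable through the toy is-a edges."""
    result: Set[str] = set()
    stack = [term]
    while stack:
        node = stack.pop()
        if node not in result:
            result.add(node)
            stack.extend(PARENTS.get(node, []))
    return result


# A hypothetical disease-to-phenotype association.
association = {
    "subject": "Mucopolysaccharidosis type I",
    "predicate": "has phenotype",
    "object": "Abnormality of the vertebral column",
}

subject_closure = closure(association["subject"])
object_closure = closure(association["object"])
```

A subject with no entry in the hierarchy, such as a database entity, produces a singleton closure containing only itself.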

Ontology Graph Projection

The mapping of an ontology represented in some formalism such as OWL onto a Graph. This is a non-trivial process, because OWL ontologies are not natively represented as graphs; instead, they are represented as collections of Axioms. The most common projection is the RDF mapping, but this results in a structure that is not well suited to graph operations due to the use of Blank Nodes to represent OWL expressions. OAK makes use of a simple projection where OWL existential axioms are mapped to Edges, similar to Relation Graph.

Relation Graph

Relation Graph is both a tool and an Ontology Graph Projection. Relation Graph is used behind the scenes in both Ubergraph and in Semantic SQL. For the tool, see INCATools/relation-graph.

Ontology Format

A syntax for serializing an Ontology as text. Examples include OWL Functional Syntax, various RDF formats such as Turtle, or OBO Format. In OAK we take a broad view of the term “Ontology”, and also include things such as RDF serializations of SKOS.

OWL

An ontology language that uses constructs from Description Logic. OWL is not itself an ontology format; it can be serialized through different Ontology Formats such as Functional Syntax, and it can be mapped to RDF and serialized via an RDF format.

RDF

A Datamodel consisting of simple Subject Predicate Object Triples organized into an RDF Graph.

FunOWL

FunOWL is a Python Ontology Library that provides a simple API for working with OWL ontologies conceptualized using the native OWL Functional Syntax representation.

Functional Syntax

A syntax / Ontology Format that directly expresses the OWL data model.

OBO Format

An Ontology Format designed for easy viewing, direct editing, and readable diffs. It is popular in bioinformatics, but not widely used or known outside the genomics sphere. OBO Format is mapped to OWL, but only expresses a subset, and provides some OWL abstractions in a more easily understood fashion.

Pronto

An Ontology Library for parsing OBO Format with some support for OWL files. OAK provides a wrapper around Pronto via the Pronto / OBO Files Adapter.

OBO Graphs

A JSON-based serialization Ontology Format and also a Datamodel for representing Ontology Graphs. OBO Graphs are designed to be an abstraction that is more suited to data science tasks than OWL or RDF, and utilizes a different Ontology Graph Projection than RDF.

Input Selector

A syntax that provides a shorthand for selecting an Adapter to communicate with an ontology. These may be command line based or for a remote endpoint. The syntax is typically <selector>:<path>, but if only a path is specified, a default adapter will be used.
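The <selector>:<path> shape can be illustrated with a few lines of string handling. This sketch is not OAK's parser: split_selector is a hypothetical helper, the descriptor strings are illustrative, and it assumes that a descriptor without a selector is handed to a default adapter.

```python
from typing import Optional, Tuple


def split_selector(descriptor: str) -> Tuple[Optional[str], str]:
    """Split '<selector>:<path>' into its two parts.

    Returns (None, path) when no selector is present, in which case the
    caller would fall back to a default adapter. Windows drive letters
    such as 'C:' are deliberately not handled in this sketch.
    """
    if ":" in descriptor:
        selector, path = descriptor.split(":", 1)
        return selector, path
    return None, descriptor


with_selector = split_selector("pronto:cl.obo")
path_only = split_selector("cl.obo")
```

Splitting on the first colon only means that any further colons stay inside the path part, which matters when the path is itself a compound identifier.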

OWL Annotation

In the context of OWL, the term Annotation means a piece of metadata that does not have a strict logical interpretation. Annotations can be on entities, for example, Label annotations, or annotations can be on Axioms.

Named Individual

An Ontology Element that represents an instance of a class. For example, the instance “John” or “John’s heart”. Note that instances are not commonly directly represented in bio-ontologies, but may be more common in other domains.

Property

An Ontology Element that represents an attribute or a characteristic of an element. In OWL, properties are divided into disjoint categories:

  • ObjectProperty

  • AnnotationProperty

  • DatatypeProperty

ObjectProperty

In OWL, an ObjectProperty is a Property that connects two Named Individuals. Object Properties are also used in Class Axioms, to express generalizations about how instances of those classes are necessarily related.

AnnotationProperty

In OWL, an AnnotationProperty is a Property that connects an Ontology Element to another element for the purposes of assigning metadata. Annotation Properties are “logically silent”. In OAK interfaces, we typically use the term Metadata property when referring to annotation properties.

DatatypeProperty

In OWL, a DatatypeProperty is a Property that connects an Ontology Element to a Literal. Datatype properties are not widely used in most bio-ontologies, and currently OAK has limited support for working with them.

Logical Definition

A Logical Definition is a particular kind of Axiom that is used to provide a definition of a term that is computable.

Subset

An Ontology Element that represents a named collection of elements, typically grouped for some purpose. Subsets are commonly used in ontologies like the Gene Ontology.

Reasoner

An ontology tool that will perform inference over an ontology to yield new axioms (e.g. new Edges) or to determine if an ontology is logically Coherent.

Reasoning

See Reasoner and Entailed

Bioportal

An Ontology Repository that is a comprehensive collection of multiple biologically relevant ontologies. Bioportal exposes an API endpoint that is utilized by the OAK Bioportal Adapter.

OntoPortal

A framework for Ontology Repositories that is used by Bioportal, as well as AgroPortal, EcoPortal, etc. See Bioportal Adapter.

Asserted

An Axiom or Edge that is directly asserted in an ontology, as opposed to being Entailed. Note that asserted edges or axioms usually correspond to Direct (one-hop) edges, but this isn’t always the case.

Entailed

An Axiom or Edge that is inferred by a Reasoner. Note that all asserted edges or axioms are also entailed. Note also that sometimes entailed axioms can include trivial Tautologies.

Graph Traversal

A strategy for walking graphs, such as from a start node to all ancestors or descendants. In some cases, graph traversal can be used in place of Reasoning. See the section on Relationships and Graphs in the OAK guide.

Reflexive

An Edge or Axiom that connects an Ontology Element to itself. These are trivially true (see Tautology), but in general they are included by default in operations involving Reasoning and Graph Traversal. See also the RO guide to reflexivity (https://oborel.github.io/obo-relations/reflexivity/).

Tautology

An Axiom or Edge that is trivially true.

OLS

Ontology Lookup Service. An Ontology Repository that is a curated collection of multiple biologically relevant ontologies, many from OBO. OLS exposes an API endpoint that is utilized by the OAK OLS Implementation.

Triplestore

A Graph database that stores Triples in an RDF Graph. Triplestores are used to store Ontology data, and to provide SPARQL querying over the data.

SPARQL

A Query Language for querying RDF Graphs. SPARQL is the standard query language for Triplestores. SPARQL queries are typically executed against a remote SPARQL Endpoint but they can also be executed against a local RDF file. OAK typically abstracts away from languages like SPARQL, but it is possible to pass-through SPARQL.

SQL

A Query Language for querying relational databases. While the use of SPARQL is more common for ontologies, one of the most performant OAK Implementations is a Semantic SQL database.

Ubergraph

A Triplestore and an Ontology Repository that allows for SPARQL querying of integrated OBO ontologies. Accessible via Ubergraph Adapter. Ubergraph includes inferred Relation Graph edges as triples.

Ontobee

A Triplestore and an Ontology Repository that allows for SPARQL querying of integrated OBO ontologies. Accessible via Ontobee Adapter.

Semantic SQL

Semantic SQL is a proposed standardized schema for representing any RDF/OWL ontology, plus a set of tools for building a database conforming to this schema from RDF/OWL files.

Diff

A representation of an individual difference between two Ontologies.

Patch

A representation of a set of Diffs that are intended to be applied.

KGCL

Knowledge Graph Change Language (KGCL) is a Datamodel for communicating desired changes (aka Patch) to an ontology. It can also be used to communicate Diffs between two ontologies. See KGCL docs.

Semantic Similarity

A means of measuring similarity between either pairs of ontology concepts, or between entities annotated using ontology concepts. There is a wide variety of different methods for calculating semantic similarity, for example Jaccard Similarity and Information Content based measures.
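As one example of such a measure, Jaccard Similarity between two terms can be computed over their ancestor sets: the size of the intersection divided by the size of the union. The sketch below uses toy ancestor sets that are invented for illustration, and the jaccard function name is an assumption.

```python
from typing import Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Toy reflexive ancestor sets, invented for illustration.
finger_ancestors = {"finger", "digit", "anatomical structure"}
toe_ancestors = {"toe", "digit", "anatomical structure"}

similarity = jaccard(finger_ancestors, toe_ancestors)
```

The two sets share 2 of 4 distinct members, so the score is 0.5; identical sets would score 1.0 and disjoint sets 0.0.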

Information Content

A measure of how informative an ontology concept is; broader concepts are less informative as they encompass many things, whereas more specific concepts are more unique. This is usually measured as -log2(Pr(term)). The method of calculating the probability varies, depending on which predicates are taken into account (for many ontologies, it makes sense to use part-of as well as is-a), and whether the probability is the probability of observing a descendant term, or of an entity annotated using that term.
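The formula can be worked through with a few numbers. This sketch assumes the probability is estimated as the fraction of annotated entities that fall under a term; the function name and the counts are invented for illustration.

```python
import math


def information_content(term_count: int, total: int) -> float:
    """Return -log2(Pr(term)), with Pr(term) = term_count / total."""
    if term_count <= 0 or total <= 0 or term_count > total:
        raise ValueError("counts must satisfy 0 < term_count <= total")
    # log2(total / term_count) is the same quantity as -log2(term_count / total).
    return math.log2(total / term_count)


# Invented counts: 1000 annotated entities in total.
ic_root = information_content(1000, 1000)     # every entity falls under the root
ic_specific = information_content(125, 1000)  # one entity in eight
```

The root term carries no information (0 bits), while a term covering one eighth of the entities scores 3 bits, which matches the intuition that broader concepts are less informative.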

Iterator

A programming language construct used frequently in OAK - it allows for passing of results from API calls without fetching everything in advance. See https://realpython.com/python-iterators-iterables/.
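The benefit of iterators can be seen in a small generator that only does work for the items a caller actually consumes. The lazy_lookup function and the fetched list are hypothetical stand-ins for an expensive call such as a remote lookup; they are not OAK functions.

```python
from itertools import islice
from typing import Iterable, Iterator, List

fetched: List[str] = []  # records which identifiers were actually processed


def lazy_lookup(curies: Iterable[str]) -> Iterator[str]:
    """Yield one result at a time instead of building a full list."""
    for curie in curies:
        fetched.append(curie)  # stands in for an expensive fetch
        yield f"record for {curie}"


all_ids = ["UBERON:0000948", "BFO:0000050", "rdfs:subClassOf"]

# Only the first two results are requested, so the third is never fetched.
first_two = list(islice(lazy_lookup(all_ids), 2))
```

Wrapping the call in list() forces every result to be fetched; iterating with a for loop or islice keeps the evaluation lazy.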

Interface

A programmatic abstraction that allows us to focus on what something should do rather than how it is done. Contrast with Implementation, which manages the how.

Implementation

Also known as Adapter. Typically the details of implementation should not be exposed, and developers of applications that use OAK should always Code to the Interface. For example, the method to query for all Relationships from a term should have the same meaning regardless of whether the adapter implementing the interface is a remote triplestore like Ubergraph, a Semantic SQL adapter, or a local OBO Graphs file. See the list of all implementations.
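The principle of coding to the interface can be sketched with an abstract base class and two interchangeable backends. All class and function names here are hypothetical and do not correspond to OAK's own classes; the edges are toy data.

```python
from abc import ABC, abstractmethod
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

Edge = Tuple[str, str, str]  # (subject, predicate, object)


class RelationshipInterface(ABC):
    """Hypothetical interface: says what can be asked, not how it is answered."""

    @abstractmethod
    def relationships(self, subject: str) -> Iterator[Edge]:
        """Yield every edge whose subject is the given term."""


class ListBackedAdapter(RelationshipInterface):
    """Scans a plain list of edges on every query."""

    def __init__(self, edges: Iterable[Edge]) -> None:
        self._edges: List[Edge] = list(edges)

    def relationships(self, subject: str) -> Iterator[Edge]:
        return (edge for edge in self._edges if edge[0] == subject)


class IndexedAdapter(RelationshipInterface):
    """Builds a subject index up front, then answers by lookup."""

    def __init__(self, edges: Iterable[Edge]) -> None:
        self._by_subject: Dict[str, List[Edge]] = defaultdict(list)
        for edge in edges:
            self._by_subject[edge[0]].append(edge)

    def relationships(self, subject: str) -> Iterator[Edge]:
        return iter(self._by_subject.get(subject, []))


def objects_of(adapter: RelationshipInterface, subject: str) -> List[str]:
    """Client code: depends only on the interface, never on a backend."""
    return sorted(obj for _, _, obj in adapter.relationships(subject))


EDGES = [("finger", "BFO:0000050", "hand"), ("finger", "rdfs:subClassOf", "digit")]
from_list = objects_of(ListBackedAdapter(EDGES), "finger")
from_index = objects_of(IndexedAdapter(EDGES), "finger")
```

Because objects_of only calls the interface method, swapping one adapter for the other does not change its result.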

Datamodel

Aka schema. OAK follows a pluralistic worldview, and includes many different datamodels for different purposes. Examples defined elsewhere in this glossary include SSSOM, OBO Graphs, and KGCL.

OntoGPT

A framework built on OAK that combines ontologies and Large Language Models.