.. _associations:

Associations and Curated Annotations
====================================

Background
----------

The main purpose of OAK is to provide uniform access to an :term:`Ontology`. Ontologies are
frequently used in combination with some form of tagging of data entities. In bio-ontologies and
other realms, this kind of tagging is usually called :term:`Annotation` (although this term can be ambiguous).

There are many different formats and data models for associations. The :term:`Gene Ontology` uses
the :term:`GAF` format, which associates genes or gene products with terms in the ontology, alongside
additional contextual information and provenance. There is a similar format for
:term:`Human Phenotype Ontology` associations, which associate disease identifiers with phenotypic
feature terms, alongside information about severity and age of onset, as well as provenance.

Outside the bio-ontology world, the :term:`Open Annotation` standard provides a way of associating
a wide range of entities of different types.

The differences in use cases make supporting a single data model challenging. However, there are
a number of core elements that are typically shared. The association:

- is typically about something, i.e., the :term:`Subject` of the association
- relates the subject to another thing (the :term:`Object`), typically a class from an ontology
- may have an (explicit or implicit) :term:`Predicate` indicating the nature of the relationship between subject and object
- should have provenance, typically indicated via CURIEs for publications, such as DOIs or PMIDs
- may have some kind of semantic modifier, including a negation flag
- may have any number of pieces of additional evidence, provenance, or administrative metadata
- may include additional *denormalized* fields for convenience.

The first three of these constitute the OAK :term:`Edge` data model.

You may well ask, why treat associations differently from other kinds of edges in the ontology?
There are a variety of answers to this question. Some are pragmatically oriented:

- associations have historically been separated from ontology relationships in many domains
- the operations we may want to perform on one may differ from those on the other
- associations typically emphasize the importance of provenance and additional metadata, whereas ontology relationships are taken "as given"
- associations are typically curated by different groups than those that curate ontologies

Other answers are more formally oriented:

- ontology relationships have strict OWL logical semantics (usually some combination of :term:`SubClassOf` and :term:`SomeValuesFrom`), whereas associations don't have defined semantics (or are weak Some-Some axioms)
- ontology relationships represent *term* invariant relationships, whereas associations are *contingent*

For a more detailed treatment of these formal aspects, see
`On beyond Gruber: "Ontologies" in today's biomedical information systems and the limits of OWL `_.

Association support in OAK
--------------------------

.. warning::

    The current way associations are loaded and modeled in OAK is subject to change.

Data Model
~~~~~~~~~~

See the `Association data model `_ for details of the data model.

The data model is intentionally minimalist, and is intended to capture the core features of multiple
association data models. A generic ``PropertyValue`` object captures domain-specific extensions.

Selecting association sources
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There are a number of ways to select an association source.

On the command line you can supplement the main ontology input (passed with ``--input`` or ``-i``)
with an ``--associations`` option (shorthand ``-g``). You will also need to specify the
association format (``--associations-type`` or ``-G``).

The following queries the HPO associations for any diseases associated with
"Abnormal lacrimal gland morphology" or any is-a :term:`Descendant` of that term:

.. code-block:: bash

    # Download the HPO association file
    wget http://purl.obolibrary.org/obo/hp/hpoa/phenotype.hpoa
    # Query associations for the term and its is-a descendants
    runoak -i sqlite:obo:hp -G hpoa -g phenotype.hpoa associations -p i HP:0011482

Further reading
---------------

- `Gene Ontology: tool for the unification of biology `_
- `On beyond Gruber: "Ontologies" in today's biomedical information systems and the limits of OWL `_