{ "cells": [ { "cell_type": "markdown", "id": "97be5edf-91b5-4364-9ae0-f91c1cce9567", "metadata": {}, "source": [ "# OAK disjoints command\n", "\n", "This notebook is intended as a supplement to the [main OAK CLI docs](https://incatools.github.io/ontology-access-kit/cli.html).\n", "\n", "This notebook provides examples for the `disjoints` command, which can be used to **look up and summarize disjointness axioms**.\n", "\n", "For more on disjointness, see [The OBook](https://oboacademy.github.io/obook/tutorial/disjointness/).\n", "\n", "## Help Option\n", "\n", "You can get help on any OAK command using `--help`" ] }, { "cell_type": "code", "execution_count": 2, "id": "6f4f7723-21a4-4f10-85c5-facde3dde4a3", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Usage: runoak disjoints [OPTIONS] [TERMS]...\n", "\n", " Show all disjoints for a set of terms, or whole ontology.\n", "\n", " Leave off all arguments for defaults - all terms, YAML OboGraph model\n", " serialization:\n", "\n", " Example:\n", "\n", " runoak -i sqlite:obo:uberon disjoints\n", "\n", " Note that this will include pairwise disjoints, setwise disjoints, disjoint\n", " unions, and disjoints involving simple class expressions.\n", "\n", " A tabular format can be easier to browse, and includes labels by default:\n", "\n", " Example:\n", "\n", " runoak -i sqlite:obo:uberon disjoints --autolabel -O csv\n", "\n", " To perform this on a subset:\n", "\n", " Example:\n", "\n", " runoak -i sqlite:obo:cl disjoints --autolabel -O csv .desc//p=i \"immune\n", " cell\"\n", "\n", " Data model:\n", "\n", " https://w3id.org/oak/obograph\n", "\n", "Options:\n", " -p, --predicates TEXT A comma-separated list of predicates. 
This\n", " may be a shorthand (i, p) or CURIE\n", " --autolabel / --no-autolabel If set, results will automatically have\n", " labels assigned [default: autolabel]\n", " -O, --output-type TEXT Desired output type\n", " --named-classes-only / --no-namde-classes-only\n", " Only show disjointness axioms between two\n", " named classes. [default: no-namde-classes-\n", " only]\n", " -o, --output FILENAME Output file, e.g. obo file\n", " --help Show this message and exit.\n" ] } ], "source": [ "!runoak disjoints --help" ] }, { "cell_type": "markdown", "id": "1d977eab-a0bc-48b3-ba85-bf6deec4d615", "metadata": {}, "source": [ "## Set up an alias\n", "\n", "For convenience we will set up an alias for use in this notebook" ] }, { "cell_type": "code", "execution_count": 3, "id": "da3b3732-9282-499c-b0ae-fd8af97c6110", "metadata": {}, "outputs": [], "source": [ "alias cl runoak -i sqlite:obo:cl" ] }, { "cell_type": "markdown", "id": "26d68485-5702-4daf-aaf6-f9eb2512f02c", "metadata": {}, "source": [ "## All simple disjointness axioms\n", "\n", "Let's first look at all simple disjointness axioms in the ontology - i.e. 
those between named classes" ] }, { "cell_type": "code", "execution_count": 6, "id": "190fe0cb-db67-4d88-8e2a-d308b93d46d7", "metadata": {}, "outputs": [], "source": [ "cl disjoints --named-classes-only > output/cl-disjoints.yaml" ] }, { "cell_type": "code", "execution_count": 7, "id": "36c5c3e5-4714-4ccf-86a6-3e1f045b2ccf", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "classIds:\n", "- BFO:0000002\n", "- BFO:0000003\n", "\n", "---\n", "classIds:\n", "- BFO:0000004\n", "- BFO:0000031\n", "\n", "---\n", "classIds:\n", "- BFO:0000004\n", "- BFO:0000020\n", "\n", "---\n", "classIds:\n", "- BFO:0000016\n", "- BFO:0000023\n", "\n", "---\n", "classIds:\n", "- BFO:0000017\n", "- BFO:0000019\n", "\n", "---\n", "classIds:\n", "- BFO:0000020\n", "- BFO:0000031\n", "\n", "---\n", "classIds:\n", "- BFO:0000040\n", "- BFO:0000141\n", "\n", "---\n", "classIds:\n", "- CARO:0000006\n", "- CARO:0000007\n", "\n", "---\n" ] } ], "source": [ "!head -40 output/cl-disjoints.yaml" ] }, { "cell_type": "markdown", "id": "089a6f44-2204-45d7-ba80-2268662f41fe", "metadata": {}, "source": [ "The YAML here conforms to the OboGraphs data model. However, it's not very convenient for viewing,\n", "so let's get a flattened view as both OBO format and TSV." ] }, { "cell_type": "code", "execution_count": 19, "id": "8c7c4a49-d643-41d4-a83a-2204fa2304f8", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "WARNING:root:Skipping DisjointClassExpressionsAxiom with only one class: DisjointClassExpressionsAxiom(meta=None, classIds=['_:riog00151338'], classExpressions=[], unionEquivalentTo=None, unionEquivalentToExpression=None)\n" ] } ], "source": [ "cl disjoints --named-classes-only -O obo > output/cl-disjoints.obo" ] }, { "cell_type": "code", "execution_count": 21, "id": "c9a2b5ce-3a9f-44c5-8a73-8e5196bee4d8", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[Term]\n", "id: BFO:0000002 ! 
continuant\n", "disjoint_from: BFO:0000003 ! occurrent\n", "\n", "\n", "[Term]\n", "id: BFO:0000004 ! independent continuant\n", "disjoint_from: BFO:0000031 ! generically dependent continuant\n", "\n", "\n", "[Term]\n", "id: BFO:0000004 ! independent continuant\n", "disjoint_from: BFO:0000020 ! specifically dependent continuant\n", "\n", "\n", "[Term]\n", "id: BFO:0000016 ! disposition\n", "disjoint_from: BFO:0000023 ! role\n", "\n", "\n" ] } ], "source": [ "!head -20 output/cl-disjoints.obo" ] }, { "cell_type": "code", "execution_count": 8, "id": "5d8f2791-9810-4870-807c-eecbc45396c9", "metadata": {}, "outputs": [], "source": [ "cl disjoints --named-classes-only -O csv > output/cl-disjoints.tsv" ] }, { "cell_type": "code", "execution_count": 9, "id": "d91f88b1-b012-4541-b52a-50180e35c349", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th>classIds</th><th>classIds_label</th><th>unionEquivalentTo</th><th>unionEquivalentToExpression</th><th>classExpressionPropertyIds</th><th>classExpressionFillerIds</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>BFO:0000002|BFO:0000003</td><td>continuant|occurrent</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>1</th><td>BFO:0000004|BFO:0000031</td><td>independent continuant|generically dependent c...</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>2</th><td>BFO:0000004|BFO:0000020</td><td>independent continuant|specifically dependent ...</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>3</th><td>BFO:0000016|BFO:0000023</td><td>disposition|role</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>4</th><td>BFO:0000017|BFO:0000019</td><td>realizable entity|quality</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
"<tr><th>309</th><td>UBERON:0035165|UBERON:0035523</td><td>posterior surface of prostate|anterior surface...</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>310</th><td>UBERON:2001156|UBERON:2001316</td><td>posterior lateral line placode|anterior latera...</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>311</th><td>UBERON:2001314|UBERON:2001391</td><td>posterior lateral line ganglion|anterior later...</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>312</th><td>UBERON:2001468|UBERON:2001471</td><td>anterior lateral line system|posterior lateral...</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>313</th><td>_:riog00151338</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"</tbody>\n",
"</table>\n",
"<p>314 rows × 6 columns</p>\n",
"</div>
" ], "text/plain": [ " classIds \\\n", "0 BFO:0000002|BFO:0000003 \n", "1 BFO:0000004|BFO:0000031 \n", "2 BFO:0000004|BFO:0000020 \n", "3 BFO:0000016|BFO:0000023 \n", "4 BFO:0000017|BFO:0000019 \n", ".. ... \n", "309 UBERON:0035165|UBERON:0035523 \n", "310 UBERON:2001156|UBERON:2001316 \n", "311 UBERON:2001314|UBERON:2001391 \n", "312 UBERON:2001468|UBERON:2001471 \n", "313 _:riog00151338 \n", "\n", " classIds_label unionEquivalentTo \\\n", "0 continuant|occurrent NaN \n", "1 independent continuant|generically dependent c... NaN \n", "2 independent continuant|specifically dependent ... NaN \n", "3 disposition|role NaN \n", "4 realizable entity|quality NaN \n", ".. ... ... \n", "309 posterior surface of prostate|anterior surface... NaN \n", "310 posterior lateral line placode|anterior latera... NaN \n", "311 posterior lateral line ganglion|anterior later... NaN \n", "312 anterior lateral line system|posterior lateral... NaN \n", "313 NaN NaN \n", "\n", " unionEquivalentToExpression classExpressionPropertyIds \\\n", "0 NaN NaN \n", "1 NaN NaN \n", "2 NaN NaN \n", "3 NaN NaN \n", "4 NaN NaN \n", ".. ... ... \n", "309 NaN NaN \n", "310 NaN NaN \n", "311 NaN NaN \n", "312 NaN NaN \n", "313 NaN NaN \n", "\n", " classExpressionFillerIds \n", "0 NaN \n", "1 NaN \n", "2 NaN \n", "3 NaN \n", "4 NaN \n", ".. ... 
\n", "309 NaN \n", "310 NaN \n", "311 NaN \n", "312 NaN \n", "313 NaN \n", "\n", "[314 rows x 6 columns]" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd\n", "df = pd.read_csv(\"output/cl-disjoints.tsv\", sep=\"\\t\")\n", "df" ] }, { "cell_type": "markdown", "id": "99626f2d-665a-441e-bb2f-d5d927033cdb", "metadata": {}, "source": [ "Note that several of the columns will always be empty as long as we query only simple disjoints (those between named classes).\n", "\n", "The results also include axioms from the many ontologies that are merged into CL, such as BFO and Uberon.\n", "\n", "We can filter the results by ID prefix using an `i^` (identifier starts with) query:" ] }, { "cell_type": "code", "execution_count": 14, "id": "a7289029-0a53-4fb6-a5c7-7c6598ddbc38", "metadata": {}, "outputs": [], "source": [ "cl disjoints --named-classes-only -O csv i^CL: > output/cl-disjoints-cell-types.tsv" ] }, { "cell_type": "code", "execution_count": 15, "id": "e923e12e-fb68-4500-a1ab-ac658b13b157", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th>classIds</th><th>classIds_label</th><th>unionEquivalentTo</th><th>unionEquivalentToExpression</th><th>classExpressionPropertyIds</th><th>classExpressionFillerIds</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>CL:0000000|GO:0043226</td><td>cell|organelle</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>1</th><td>CL:0000000|GO:0032991</td><td>cell|protein-containing complex</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>2</th><td>CL:0000000|GO:0031012</td><td>cell|extracellular matrix</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>3</th><td>CL:0000039|CL:0002371</td><td>germ line cell|somatic cell</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>4</th><td>CL:0000049|CL:0000557</td><td>common myeloid progenitor|granulocyte monocyte...</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>5</th><td>CL:0000049|CL:0000051</td><td>common myeloid progenitor|common lymphoid prog...</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>6</th><td>CL:0000049|CL:0000050</td><td>common myeloid progenitor|megakaryocyte-erythr...</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>7</th><td>CL:0000050|CL:0002009</td><td>megakaryocyte-erythroid progenitor cell|macrop...</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>8</th><td>CL:0000050|CL:0000557</td><td>megakaryocyte-erythroid progenitor cell|granul...</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>9</th><td>CL:0000066|CL:0000738</td><td>epithelial cell|leukocyte</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>10</th><td>CL:0000084|CL:0000945</td><td>T cell|lymphocyte of B lineage</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>11</th><td>CL:0000225|CL:0002242</td><td>anucleate cell|nucleate cell</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>12</th><td>CL:0000255|CL:0000520</td><td>eukaryotic cell|prokaryotic cell</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>13</th><td>CL:0000451|CL:0000542</td><td>dendritic cell|lymphocyte</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>14</th><td>CL:0000521|CL:0000548</td><td>fungal cell|animal cell</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>15</th><td>CL:0000542|CL:0000766</td><td>lymphocyte|myeloid leukocyte</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>16</th><td>CL:0000556|CL:0000764</td><td>megakaryocyte|erythroid lineage cell</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>17</th><td>CL:0000624|CL:0000625</td><td>CD4-positive, alpha-beta T cell|CD8-positive, ...</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>18</th><td>CL:0000737|CL:0008000</td><td>striated muscle cell|non-striated muscle cell</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>19</th><td>CL:0000785|CL:0000818</td><td>mature B cell|transitional stage B cell</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>20</th><td>CL:0000785|CL:0000817</td><td>mature B cell|precursor B cell</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>21</th><td>CL:0000785|CL:0000816</td><td>mature B cell|immature B cell</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>22</th><td>CL:0000789|CL:0000798</td><td>alpha-beta T cell|gamma-delta T cell</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>23</th><td>CL:0000813|CL:0000898</td><td>memory T cell|naive T cell</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>24</th><td>CL:0000817|CL:0000826</td><td>precursor B cell|pro-B cell</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>25</th><td>CL:0000823|CL:0000937</td><td>immature natural killer cell|pre-natural kille...</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>26</th><td>CL:0000823|CL:0000824</td><td>immature natural killer cell|mature natural ki...</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>27</th><td>CL:0000837|CL:0002032</td><td>hematopoietic multipotent progenitor cell|hema...</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>28</th><td>CL:0000838|CL:0000839</td><td>lymphoid lineage restricted progenitor cell|my...</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>29</th><td>CL:0000851|CL:0000855</td><td>neuromast mantle cell|sensory hair cell</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>30</th><td>CL:0000852|CL:0000855</td><td>neuromast supporting cell|sensory hair cell</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>31</th><td>CL:0000955|CL:0000956</td><td>pre-B-II cell|pre-B-I cell</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>32</th><td>CL:0001008|CL:0001024</td><td>Kit and Sca1-positive hematopoietic stem cell|...</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>33</th><td>CL:0001021|CL:0001025</td><td>CD34-positive, CD38-positive common lymphoid p...</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>34</th><td>CL:0001023|CL:0001026</td><td>Kit-positive, CD34-positive common myeloid pro...</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>35</th><td>CL:0002031|CL:0002032</td><td>hematopoietic lineage restricted progenitor ce...</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>36</th><td>CL:0002036|CL:0002043</td><td>Slamf1-positive multipotent progenitor cell|CD...</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>37</th><td>CL:0008011|CL:0008020</td><td>skeletal muscle satellite stem cell|skeletal m...</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>38</th><td>CL:0008046|CL:0008047</td><td>extrafusal muscle fiber|intrafusal muscle fiber</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>39</th><td>_:riog00151338</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"</tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " classIds classIds_label \\\n", "0 CL:0000000|GO:0043226 cell|organelle \n", "1 CL:0000000|GO:0032991 cell|protein-containing complex \n", "2 CL:0000000|GO:0031012 cell|extracellular matrix \n", "3 CL:0000039|CL:0002371 germ line cell|somatic cell \n", "4 CL:0000049|CL:0000557 common myeloid progenitor|granulocyte monocyte... \n", "5 CL:0000049|CL:0000051 common myeloid progenitor|common lymphoid prog... \n", "6 CL:0000049|CL:0000050 common myeloid progenitor|megakaryocyte-erythr... \n", "7 CL:0000050|CL:0002009 megakaryocyte-erythroid progenitor cell|macrop... \n", "8 CL:0000050|CL:0000557 megakaryocyte-erythroid progenitor cell|granul... \n", "9 CL:0000066|CL:0000738 epithelial cell|leukocyte \n", "10 CL:0000084|CL:0000945 T cell|lymphocyte of B lineage \n", "11 CL:0000225|CL:0002242 anucleate cell|nucleate cell \n", "12 CL:0000255|CL:0000520 eukaryotic cell|prokaryotic cell \n", "13 CL:0000451|CL:0000542 dendritic cell|lymphocyte \n", "14 CL:0000521|CL:0000548 fungal cell|animal cell \n", "15 CL:0000542|CL:0000766 lymphocyte|myeloid leukocyte \n", "16 CL:0000556|CL:0000764 megakaryocyte|erythroid lineage cell \n", "17 CL:0000624|CL:0000625 CD4-positive, alpha-beta T cell|CD8-positive, ... \n", "18 CL:0000737|CL:0008000 striated muscle cell|non-striated muscle cell \n", "19 CL:0000785|CL:0000818 mature B cell|transitional stage B cell \n", "20 CL:0000785|CL:0000817 mature B cell|precursor B cell \n", "21 CL:0000785|CL:0000816 mature B cell|immature B cell \n", "22 CL:0000789|CL:0000798 alpha-beta T cell|gamma-delta T cell \n", "23 CL:0000813|CL:0000898 memory T cell|naive T cell \n", "24 CL:0000817|CL:0000826 precursor B cell|pro-B cell \n", "25 CL:0000823|CL:0000937 immature natural killer cell|pre-natural kille... \n", "26 CL:0000823|CL:0000824 immature natural killer cell|mature natural ki... \n", "27 CL:0000837|CL:0002032 hematopoietic multipotent progenitor cell|hema... 
\n", "28 CL:0000838|CL:0000839 lymphoid lineage restricted progenitor cell|my... \n", "29 CL:0000851|CL:0000855 neuromast mantle cell|sensory hair cell \n", "30 CL:0000852|CL:0000855 neuromast supporting cell|sensory hair cell \n", "31 CL:0000955|CL:0000956 pre-B-II cell|pre-B-I cell \n", "32 CL:0001008|CL:0001024 Kit and Sca1-positive hematopoietic stem cell|... \n", "33 CL:0001021|CL:0001025 CD34-positive, CD38-positive common lymphoid p... \n", "34 CL:0001023|CL:0001026 Kit-positive, CD34-positive common myeloid pro... \n", "35 CL:0002031|CL:0002032 hematopoietic lineage restricted progenitor ce... \n", "36 CL:0002036|CL:0002043 Slamf1-positive multipotent progenitor cell|CD... \n", "37 CL:0008011|CL:0008020 skeletal muscle satellite stem cell|skeletal m... \n", "38 CL:0008046|CL:0008047 extrafusal muscle fiber|intrafusal muscle fiber \n", "39 _:riog00151338 NaN \n", "\n", " unionEquivalentTo unionEquivalentToExpression \\\n", "0 NaN NaN \n", "1 NaN NaN \n", "2 NaN NaN \n", "3 NaN NaN \n", "4 NaN NaN \n", "5 NaN NaN \n", "6 NaN NaN \n", "7 NaN NaN \n", "8 NaN NaN \n", "9 NaN NaN \n", "10 NaN NaN \n", "11 NaN NaN \n", "12 NaN NaN \n", "13 NaN NaN \n", "14 NaN NaN \n", "15 NaN NaN \n", "16 NaN NaN \n", "17 NaN NaN \n", "18 NaN NaN \n", "19 NaN NaN \n", "20 NaN NaN \n", "21 NaN NaN \n", "22 NaN NaN \n", "23 NaN NaN \n", "24 NaN NaN \n", "25 NaN NaN \n", "26 NaN NaN \n", "27 NaN NaN \n", "28 NaN NaN \n", "29 NaN NaN \n", "30 NaN NaN \n", "31 NaN NaN \n", "32 NaN NaN \n", "33 NaN NaN \n", "34 NaN NaN \n", "35 NaN NaN \n", "36 NaN NaN \n", "37 NaN NaN \n", "38 NaN NaN \n", "39 NaN NaN \n", "\n", " classExpressionPropertyIds classExpressionFillerIds \n", "0 NaN NaN \n", "1 NaN NaN \n", "2 NaN NaN \n", "3 NaN NaN \n", "4 NaN NaN \n", "5 NaN NaN \n", "6 NaN NaN \n", "7 NaN NaN \n", "8 NaN NaN \n", "9 NaN NaN \n", "10 NaN NaN \n", "11 NaN NaN \n", "12 NaN NaN \n", "13 NaN NaN \n", "14 NaN NaN \n", "15 NaN NaN \n", "16 NaN NaN \n", "17 NaN NaN \n", "18 NaN NaN \n", "19 NaN 
NaN \n", "20 NaN NaN \n", "21 NaN NaN \n", "22 NaN NaN \n", "23 NaN NaN \n", "24 NaN NaN \n", "25 NaN NaN \n", "26 NaN NaN \n", "27 NaN NaN \n", "28 NaN NaN \n", "29 NaN NaN \n", "30 NaN NaN \n", "31 NaN NaN \n", "32 NaN NaN \n", "33 NaN NaN \n", "34 NaN NaN \n", "35 NaN NaN \n", "36 NaN NaN \n", "37 NaN NaN \n", "38 NaN NaN \n", "39 NaN NaN " ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.read_csv(\"output/cl-disjoints-cell-types.tsv\", sep=\"\\t\")\n", "df" ] }, { "cell_type": "markdown", "id": "17d95808-674d-457d-8682-5ffda72bf518", "metadata": {}, "source": [ "## Disjoint Class Expressions\n", "\n", "Some ontologies like Uberon make use of more advanced disjointness concepts in order to\n", "express things like spatial disjointness. See [Uberon wiki](https://github.com/obophenotype/uberon/wiki/Part-disjointness-Design-Pattern).\n", "\n", "In OWL terms these are formally known as \"General Class Inclusion Axioms\". However, OAK shields you from\n", "this and provides these using a simple data model.\n", "\n", "To include part-of in lookups, use the `--predicates` (`-p`) option (this is a standard OAK option for\n", "any command involving relationship types).\n", "\n", "Here we will find all spatial disjointness axioms between major organism subdivisions in Uberon:" ] }, { "cell_type": "code", "execution_count": 16, "id": "d2fb7525-1fe1-4da9-9c3f-43be03520f7e", "metadata": {}, "outputs": [], "source": [ "alias uberon runoak -i sqlite:obo:uberon" ] }, { "cell_type": "code", "execution_count": 18, "id": "df0f1d23-2485-484d-bc8c-2597e76cc687", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "classExpressions:\n", "- fillerId: UBERON:0000026\n", " propertyId: BFO:0000050\n", "- fillerId: UBERON:0000915\n", " propertyId: BFO:0000050\n", "\n", "---\n", "classExpressions:\n", "- fillerId: UBERON:0000026\n", " propertyId: BFO:0000050\n", "- fillerId: UBERON:0002100\n", " 
propertyId: BFO:0000050\n", "\n", "---\n", "classExpressions:\n", "- fillerId: UBERON:0000033\n", " propertyId: BFO:0000050\n", "- fillerId: UBERON:0000915\n", " propertyId: BFO:0000050\n", "\n", "---\n", "classExpressions:\n", "- fillerId: UBERON:0000033\n", " propertyId: BFO:0000050\n", "- fillerId: UBERON:0000948\n", " propertyId: BFO:0000050\n", "\n", "---\n", "classExpressions:\n", "- fillerId: UBERON:0000033\n", " propertyId: BFO:0000050\n", "- fillerId: UBERON:0002100\n", " propertyId: BFO:0000050\n", "\n", "---\n", "classExpressions:\n", "- fillerId: UBERON:0000033\n", " propertyId: BFO:0000050\n", "- fillerId: UBERON:0005886\n", " propertyId: BFO:0000050\n", "\n", "---\n", "classExpressions:\n", "- fillerId: UBERON:0000915\n", " propertyId: BFO:0000050\n", "- fillerId: UBERON:0002417\n", " propertyId: BFO:0000050\n", "\n", "---\n", "classIds:\n", "- _:riog00226101\n", "\n", "---\n", "classIds:\n", "- _:riog00226236\n", "\n", "---\n", "classIds:\n", "- _:riog00226251\n", "\n", "---\n", "classIds:\n", "- _:riog00226988\n" ] } ], "source": [ "uberon disjoints -p i,p .desc//p=i \"subdivision of organism along main body axis\"" ] }, { "cell_type": "markdown", "id": "5b4d0673-4cd9-4886-83b1-550296464d86", "metadata": {}, "source": [ "The OAK OboGraphs data model allows each axiom to include a list of *class expressions*. Each class expression is\n", "a pair consisting of a predicate (property) and a filler.\n", "\n", "We can look at the flattened view:" ] }, { "cell_type": "code", "execution_count": 22, "id": "e8f67c9e-c02f-4ab8-bb32-d22af9204872", "metadata": {}, "outputs": [], "source": [ "uberon disjoints -p i,p .desc//p=i \"subdivision of organism along main body axis\" -O csv -o output/uberon-part-disjoint-subdivisions.tsv" ] }, { "cell_type": "code", "execution_count": 23, "id": "84ee02ab-4bac-4656-8ae5-48293e4316c4", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th>classIds</th><th>unionEquivalentTo</th><th>unionEquivalentToExpression</th><th>classExpressionPropertyIds</th><th>classExpressionPropertyIds_label</th><th>classExpressionFillerIds</th><th>classExpressionFillerIds_label</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>NaN</td><td>NaN</td><td>NaN</td><td>BFO:0000050|BFO:0000050</td><td>part of|part of</td><td>UBERON:0000026|UBERON:0000915</td><td>appendage|thoracic segment of trunk</td></tr>\n",
"<tr><th>1</th><td>NaN</td><td>NaN</td><td>NaN</td><td>BFO:0000050|BFO:0000050</td><td>part of|part of</td><td>UBERON:0000026|UBERON:0002100</td><td>appendage|trunk</td></tr>\n",
"<tr><th>2</th><td>NaN</td><td>NaN</td><td>NaN</td><td>BFO:0000050|BFO:0000050</td><td>part of|part of</td><td>UBERON:0000033|UBERON:0000915</td><td>head|thoracic segment of trunk</td></tr>\n",
"<tr><th>3</th><td>NaN</td><td>NaN</td><td>NaN</td><td>BFO:0000050|BFO:0000050</td><td>part of|part of</td><td>UBERON:0000033|UBERON:0000948</td><td>head|heart</td></tr>\n",
"<tr><th>4</th><td>NaN</td><td>NaN</td><td>NaN</td><td>BFO:0000050|BFO:0000050</td><td>part of|part of</td><td>UBERON:0000033|UBERON:0002100</td><td>head|trunk</td></tr>\n",
"<tr><th>5</th><td>NaN</td><td>NaN</td><td>NaN</td><td>BFO:0000050|BFO:0000050</td><td>part of|part of</td><td>UBERON:0000033|UBERON:0005886</td><td>head|post-hyoid pharyngeal arch skeleton</td></tr>\n",
"<tr><th>6</th><td>NaN</td><td>NaN</td><td>NaN</td><td>BFO:0000050|BFO:0000050</td><td>part of|part of</td><td>UBERON:0000915|UBERON:0002417</td><td>thoracic segment of trunk|abdominal segment of...</td></tr>\n",
"<tr><th>7</th><td>_:riog00226101</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>8</th><td>_:riog00226236</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>9</th><td>_:riog00226251</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>10</th><td>_:riog00226988</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"</tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " classIds unionEquivalentTo unionEquivalentToExpression \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "5 NaN NaN NaN \n", "6 NaN NaN NaN \n", "7 _:riog00226101 NaN NaN \n", "8 _:riog00226236 NaN NaN \n", "9 _:riog00226251 NaN NaN \n", "10 _:riog00226988 NaN NaN \n", "\n", " classExpressionPropertyIds classExpressionPropertyIds_label \\\n", "0 BFO:0000050|BFO:0000050 part of|part of \n", "1 BFO:0000050|BFO:0000050 part of|part of \n", "2 BFO:0000050|BFO:0000050 part of|part of \n", "3 BFO:0000050|BFO:0000050 part of|part of \n", "4 BFO:0000050|BFO:0000050 part of|part of \n", "5 BFO:0000050|BFO:0000050 part of|part of \n", "6 BFO:0000050|BFO:0000050 part of|part of \n", "7 NaN NaN \n", "8 NaN NaN \n", "9 NaN NaN \n", "10 NaN NaN \n", "\n", " classExpressionFillerIds \\\n", "0 UBERON:0000026|UBERON:0000915 \n", "1 UBERON:0000026|UBERON:0002100 \n", "2 UBERON:0000033|UBERON:0000915 \n", "3 UBERON:0000033|UBERON:0000948 \n", "4 UBERON:0000033|UBERON:0002100 \n", "5 UBERON:0000033|UBERON:0005886 \n", "6 UBERON:0000915|UBERON:0002417 \n", "7 NaN \n", "8 NaN \n", "9 NaN \n", "10 NaN \n", "\n", " classExpressionFillerIds_label \n", "0 appendage|thoracic segment of trunk \n", "1 appendage|trunk \n", "2 head|thoracic segment of trunk \n", "3 head|heart \n", "4 head|trunk \n", "5 head|post-hyoid pharyngeal arch skeleton \n", "6 thoracic segment of trunk|abdominal segment of... 
\n", "7 NaN \n", "8 NaN \n", "9 NaN \n", "10 NaN " ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.read_csv(\"output/uberon-part-disjoint-subdivisions.tsv\", sep=\"\\t\")\n", "df" ] }, { "cell_type": "markdown", "id": "894a55b7-d984-4fd0-99d1-04145a87a552", "metadata": {}, "source": [ "Each disjointness axiom states that all of its classIds and all of its predicate-filler expressions are mutually disjoint.\n", "\n", "For example, the first row tells us that nothing is part of both an \"appendage\" and a \"thoracic segment of trunk\", i.e. there is no\n", "spatial overlap." ] }, { "cell_type": "markdown", "id": "d78865e0-de82-4656-9db0-610662288222", "metadata": {}, "source": [ "## Generating disjointness axioms\n", "\n", "Many ontologies are under-axiomatized. Editors sometimes struggle to add the appropriate disjointness axioms.\n", "\n", "OAK provides a heuristic approach to suggesting disjointness axioms.\n", "\n", "First we will explore this using the Zebrafish anatomy ontology (ZFA) as an example. We will find candidate pairwise\n", "disjoints under \"bone element\":" ] }, { "cell_type": "code", "execution_count": 34, "id": "bbc734fc-81da-4f5a-88be-b74136ae6ee6", "metadata": {}, "outputs": [], "source": [ "alias zfa runoak -i sqlite:obo:zfa" ] }, { "cell_type": "code", "execution_count": 37, "id": "9d4f94e3-54de-4ea0-819c-4dd92f637813", "metadata": {}, "outputs": [], "source": [ "zfa generate-disjoints \"bone element\" -O csv -o output/zfa-bone-element-gen-disjoint.tsv" ] }, { "cell_type": "code", "execution_count": 38, "id": "da52528e-f85a-4e70-8a72-ba12c23e21ff", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th>classIds</th><th>classIds_label</th><th>unionEquivalentTo</th><th>unionEquivalentToExpression</th><th>classExpressionPropertyIds</th><th>classExpressionFillerIds</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>ZFA:0000170|ZFA:0000658</td><td>basibranchial|epibranchial bone</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>1</th><td>ZFA:0000442|ZFA:0000658</td><td>supraneural|epibranchial bone</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>2</th><td>ZFA:0000442|ZFA:0000170</td><td>supraneural|basibranchial</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>3</th><td>ZFA:0001066|ZFA:0000658</td><td>neural arch|epibranchial bone</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>4</th><td>ZFA:0001066|ZFA:0000170</td><td>neural arch|basibranchial</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
"<tr><th>60</th><td>ZFA:0001418|ZFA:0001551</td><td>dorsal fin lepidotrichium|pectoral fin lepidot...</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>61</th><td>ZFA:0001421|ZFA:0001552</td><td>anal fin lepidotrichium|pelvic fin lepidotrichium</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>62</th><td>ZFA:0001421|ZFA:0001550</td><td>anal fin lepidotrichium|caudal fin lepidotrichium</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>63</th><td>ZFA:0001421|ZFA:0001551</td><td>anal fin lepidotrichium|pectoral fin lepidotri...</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>64</th><td>ZFA:0001421|ZFA:0001418</td><td>anal fin lepidotrichium|dorsal fin lepidotrichium</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"</tbody>\n",
"</table>\n",
"<p>65 rows × 6 columns</p>\n",
"</div>
" ], "text/plain": [ " classIds \\\n", "0 ZFA:0000170|ZFA:0000658 \n", "1 ZFA:0000442|ZFA:0000658 \n", "2 ZFA:0000442|ZFA:0000170 \n", "3 ZFA:0001066|ZFA:0000658 \n", "4 ZFA:0001066|ZFA:0000170 \n", ".. ... \n", "60 ZFA:0001418|ZFA:0001551 \n", "61 ZFA:0001421|ZFA:0001552 \n", "62 ZFA:0001421|ZFA:0001550 \n", "63 ZFA:0001421|ZFA:0001551 \n", "64 ZFA:0001421|ZFA:0001418 \n", "\n", " classIds_label unionEquivalentTo \\\n", "0 basibranchial|epibranchial bone NaN \n", "1 supraneural|epibranchial bone NaN \n", "2 supraneural|basibranchial NaN \n", "3 neural arch|epibranchial bone NaN \n", "4 neural arch|basibranchial NaN \n", ".. ... ... \n", "60 dorsal fin lepidotrichium|pectoral fin lepidot... NaN \n", "61 anal fin lepidotrichium|pelvic fin lepidotrichium NaN \n", "62 anal fin lepidotrichium|caudal fin lepidotrichium NaN \n", "63 anal fin lepidotrichium|pectoral fin lepidotri... NaN \n", "64 anal fin lepidotrichium|dorsal fin lepidotrichium NaN \n", "\n", " unionEquivalentToExpression classExpressionPropertyIds \\\n", "0 NaN NaN \n", "1 NaN NaN \n", "2 NaN NaN \n", "3 NaN NaN \n", "4 NaN NaN \n", ".. ... ... \n", "60 NaN NaN \n", "61 NaN NaN \n", "62 NaN NaN \n", "63 NaN NaN \n", "64 NaN NaN \n", "\n", " classExpressionFillerIds \n", "0 NaN \n", "1 NaN \n", "2 NaN \n", "3 NaN \n", "4 NaN \n", ".. ... 
\n", "60 NaN \n", "61 NaN \n", "62 NaN \n", "63 NaN \n", "64 NaN \n", "\n", "[65 rows x 6 columns]" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.read_csv(\"output/zfa-bone-element-gen-disjoint.tsv\", sep=\"\\t\")\n", "df" ] }, { "cell_type": "markdown", "id": "2fddad6f-88e8-4a1e-82c0-bce4d36358a0", "metadata": {}, "source": [ "### Generating spatial disjointness axioms\n", "\n", "Pass in predicates to also generate candidate OWL axioms of the form\n", "\n", "`(part-of some X) DisjointWith (part-of some Y)`" ] }, { "cell_type": "code", "execution_count": 40, "id": "46ae0ca6-9a46-421e-a07a-f57087b73c45", "metadata": {}, "outputs": [], "source": [ "zfa generate-disjoints \"paired fin skeleton\" -p i,p -O csv -o output/zfa-skel-gen-part-disjoint.tsv" ] }, { "cell_type": "code", "execution_count": 41, "id": "55f70ab7-5d6d-4946-ac49-53a70bb7fc78", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n",
" <th></th>\n", " <th>classIds</th>\n", " <th>unionEquivalentTo</th>\n", " <th>unionEquivalentToExpression</th>\n", " <th>classExpressionPropertyIds</th>\n", " <th>classExpressionPropertyIds_label</th>\n", " <th>classExpressionFillerIds</th>\n", " <th>classExpressionFillerIds_label</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n",
" <tr>\n", " <th>0</th>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>BFO:0000050|BFO:0000050</td>\n", " <td>part of|part of</td>\n", " <td>ZFA:0000943|ZFA:0001387</td>\n", " <td>pectoral fin skeleton|pelvic fin skeleton</td>\n", " </tr>\n",
" <tr>\n", " <th>1</th>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>BFO:0000050|BFO:0000050</td>\n", " <td>part of|part of</td>\n", " <td>ZFA:0000257|ZFA:0001586</td>\n", " <td>pectoral fin cartilage|pectoral fin radial</td>\n", " </tr>\n",
" <tr>\n", " <th>2</th>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>BFO:0000050|BFO:0000050</td>\n", " <td>part of|part of</td>\n", " <td>ZFA:0001551|ZFA:0001586</td>\n", " <td>pectoral fin lepidotrichium|pectoral fin radial</td>\n", " </tr>\n",
" <tr>\n", " <th>3</th>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>BFO:0000050|BFO:0000050</td>\n", " <td>part of|part of</td>\n", " <td>ZFA:0001551|ZFA:0000257</td>\n", " <td>pectoral fin lepidotrichium|pectoral fin carti...</td>\n", " </tr>\n",
" <tr>\n", " <th>4</th>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>BFO:0000050|BFO:0000050</td>\n", " <td>part of|part of</td>\n", " <td>ZFA:0000407|ZFA:0001586</td>\n", " <td>pectoral girdle|pectoral fin radial</td>\n", " </tr>\n",
" <tr>\n", " <th>5</th>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>BFO:0000050|BFO:0000050</td>\n", " <td>part of|part of</td>\n", " <td>ZFA:0000407|ZFA:0001551</td>\n", " <td>pectoral girdle|pectoral fin lepidotrichium</td>\n", " </tr>\n",
" <tr>\n", " <th>6</th>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>BFO:0000050|BFO:0000050</td>\n", " <td>part of|part of</td>\n", " <td>ZFA:0001552|ZFA:0000508</td>\n", " <td>pelvic fin lepidotrichium|pelvic radial</td>\n", " </tr>\n",
" <tr>\n", " <th>7</th>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>BFO:0000050|BFO:0000050</td>\n", " <td>part of|part of</td>\n", " <td>ZFA:0001459|ZFA:0000508</td>\n", " <td>pelvic fin cartilage|pelvic radial</td>\n", " </tr>\n",
" <tr>\n", " <th>8</th>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>BFO:0000050|BFO:0000050</td>\n", " <td>part of|part of</td>\n", " <td>ZFA:0001459|ZFA:0001552</td>\n", " <td>pelvic fin cartilage|pelvic fin lepidotrichium</td>\n", " </tr>\n",
" </tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " classIds unionEquivalentTo unionEquivalentToExpression \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "5 NaN NaN NaN \n", "6 NaN NaN NaN \n", "7 NaN NaN NaN \n", "8 NaN NaN NaN \n", "\n", " classExpressionPropertyIds classExpressionPropertyIds_label \\\n", "0 BFO:0000050|BFO:0000050 part of|part of \n", "1 BFO:0000050|BFO:0000050 part of|part of \n", "2 BFO:0000050|BFO:0000050 part of|part of \n", "3 BFO:0000050|BFO:0000050 part of|part of \n", "4 BFO:0000050|BFO:0000050 part of|part of \n", "5 BFO:0000050|BFO:0000050 part of|part of \n", "6 BFO:0000050|BFO:0000050 part of|part of \n", "7 BFO:0000050|BFO:0000050 part of|part of \n", "8 BFO:0000050|BFO:0000050 part of|part of \n", "\n", " classExpressionFillerIds classExpressionFillerIds_label \n", "0 ZFA:0000943|ZFA:0001387 pectoral fin skeleton|pelvic fin skeleton \n", "1 ZFA:0000257|ZFA:0001586 pectoral fin cartilage|pectoral fin radial \n", "2 ZFA:0001551|ZFA:0001586 pectoral fin lepidotrichium|pectoral fin radial \n", "3 ZFA:0001551|ZFA:0000257 pectoral fin lepidotrichium|pectoral fin carti... \n", "4 ZFA:0000407|ZFA:0001586 pectoral girdle|pectoral fin radial \n", "5 ZFA:0000407|ZFA:0001551 pectoral girdle|pectoral fin lepidotrichium \n", "6 ZFA:0001552|ZFA:0000508 pelvic fin lepidotrichium|pelvic radial \n", "7 ZFA:0001459|ZFA:0000508 pelvic fin cartilage|pelvic radial \n", "8 ZFA:0001459|ZFA:0001552 pelvic fin cartilage|pelvic fin lepidotrichium " ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.read_csv(\"output/zfa-skel-gen-part-disjoint.tsv\", sep=\"\\t\")\n", "df" ] }, { "cell_type": "markdown", "id": "12fcee23-bfda-4715-b895-923100961808", "metadata": {}, "source": [ "The first row here tells us that the pectoral and pelvic fin skeletons have no parts in common.\n", "\n", "Note this is a stronger axiom than simply saying the two structures are class-disjoint." 
] }, { "cell_type": "code", "execution_count": null, "id": "f76a4e68-90df-4b6a-990a-75547a6bd3d0", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.5" } }, "nbformat": 4, "nbformat_minor": 5 }