{ "cells": [ { "cell_type": "markdown", "id": "0f6c4513", "metadata": {}, "source": [ "# OAK rollup command\n", "\n", "This notebook is intended as a supplement to the [main OAK CLI docs](https://incatools.github.io/ontology-access-kit/cli.html).\n", "\n", "This notebook provides examples for the `rollup` command which produces a summarization of ontology associations\n", "between entities and \"rolled-up\" ancestor terms.\n", "\n", "## Help Option\n", "\n", "You can get help on any OAK command using `--help`" ] }, { "cell_type": "code", "execution_count": 1, "id": "65db4b53", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Usage: runoak rollup [OPTIONS] [TERMS]...\r\n", "\r\n", " Produce an association rollup report.\r\n", "\r\n", " The report will list associations where the subject is one of the terms\r\n", " provided. The associations will be grouped by any provided --object-group\r\n", " options. This option can be provided multiple times. If the value is a comma\r\n", " separated list of object IDs, the first will be used as a primary grouping\r\n", " dimension and the remainder will be used to create sub-groups.\r\n", "\r\n", " Example:\r\n", "\r\n", " runoak -i sqlite:go.db -g wb.gaf -G gaf rollup --object-\r\n", " group GO:0032502,GO:0007568,GO:0048869,GO:0098727 --object-\r\n", " group GO:0008152,GO:0009056,GO:0044238,GO:1901275 --object-\r\n", " group GO:0050896,GO:0051716,GO:0051606,GO:0051606,GO:0014823\r\n", " --object-group=GO:0023052 --output rollup.html\r\n", " WB:WBGene00000417 WB:WBGene00000912 WB:WBGene00000898 WB:WBGene00006752\r\n", "\r\n", " By default, is-a relationships between association objects are used to\r\n", " perform the rollup. Use the -p/--predicates option to change this behavior.\r\n", "\r\n", "Options:\r\n", " -o, --output FILENAME Output file, e.g. obo file\r\n", " -p, --predicates TEXT A comma-separated list of predicates\r\n", " --autolabel / --no-autolabel If set, results will automatically have labels\r\n", " assigned [default: autolabel]\r\n", " -O, --output-type TEXT Desired output type\r\n", " --object-group TEXT An object ID to group by. If a comma separated\r\n", " list of IDs is provided, the first one is\r\n", " interpreted as a top-level grouping and the\r\n", " remaining IDs are interpreted as sub-groups\r\n", " within.\r\n", " --help Show this message and exit.\r\n" ] } ], "source": [ "!runoak rollup --help" ] }, { "cell_type": "markdown", "id": "c8878ac5", "metadata": {}, "source": [ "## Download example file and setup\n", "\n", "We will use the HPO Association file" ] }, { "cell_type": "code", "execution_count": 1, "id": "12a41f0d", "metadata": {}, "outputs": [], "source": [ "!curl -L -s http://purl.obolibrary.org/obo/hp/hpoa/genes_to_phenotype.txt > input/g2p.tsv" ] }, { "cell_type": "markdown", "id": "d57ac006", "metadata": {}, "source": [ "next we will set up an hpo alias" ] }, { "cell_type": "code", "execution_count": 1, "id": "dc71c543", "metadata": {}, "outputs": [], "source": [ "alias hp runoak -i sqlite:obo:hp" ] }, { "cell_type": "markdown", "id": "6033aa66", "metadata": {}, "source": [ "Test this out by querying for associations for a particular orpha disease.\n", "\n", "We need to pass in the association file we downloaded, as well as specify the file type (with `-G`):" ] }, { "cell_type": "code", "execution_count": 3, "id": "2cfa1be8", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "subject\tsubject_label\tpredicate\tobject\tobject_label\tproperty_values\r", "\r\n", "NCBIGene:8195\tNone\tNone\tHP:0000218\tHigh palate\t[]\r", "\r\n", "NCBIGene:8195\tNone\tNone\tHP:0001156\tBrachydactyly\t[]\r", "\r\n", "NCBIGene:8195\tNone\tNone\tHP:0001830\tPostaxial foot polydactyly\t[]\r", "\r\n", "NCBIGene:8195\tNone\tNone\tHP:0001508\tFailure to thrive\t[]\r", "\r\n", "NCBIGene:8195\tNone\tNone\tHP:0008678\tRenal hypoplasia/aplasia\t[]\r", "\r\n", "NCBIGene:8195\tNone\tNone\tHP:0006101\tFinger syndactyly\t[]\r", "\r\n", "NCBIGene:8195\tNone\tNone\tHP:0001643\tPatent ductus arteriosus\t[]\r", "\r\n", "NCBIGene:8195\tNone\tNone\tHP:0004383\tHypoplastic left heart\t[]\r", "\r\n", "NCBIGene:8195\tNone\tNone\tHP:0000807\tGlandular hypospadias\t[]\r", "\r\n" ] } ], "source": [ "hp -G hpoa_g2p -g input/g2p.tsv associations -Q subject NCBIGene:8195 -O csv | head" ] }, { "cell_type": "markdown", "id": "c9047f55", "metadata": {}, "source": [ "## Enrichment\n", "\n", "Next we will enrich based on a gene set" ] }, { "cell_type": "code", "execution_count": 4, "id": "ab30433e", "metadata": {}, "outputs": [], "source": [ "alias mondo runoak -i sqlite:obo:mondo" ] }, { "cell_type": "code", "execution_count": null, "id": "f70ff2bc", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.5" } }, "nbformat": 4, "nbformat_minor": 5 }