{ "cells": [ { "cell_type": "markdown", "id": "b3b0dbee", "metadata": {}, "source": [ "# PhenIO tutorial\n", "\n", "* author: Chris Mungall\n", "* created: 2022-08-16\n", "\n", "This tutorial walks through phenio, the goals are:\n", "\n", "- to help understand the structure of phenio\n", "- to show how to do advanced OAK queries (CLI and programmatic) on PhenIO\n", "- to explore semsim techniques\n" ] }, { "cell_type": "markdown", "id": "9aced0d9", "metadata": {}, "source": [ "## Create an alias\n", "\n", "For convenience we will set a bash alias.\n", "\n", "This assumes you have built the sqlite version of phenio in the `../../db` folder\n", "\n", "__NOTE__ an alternative is:\n", "\n", "`%alias phenio runoak -i sqlite:obo:phenio`" ] }, { "cell_type": "code", "execution_count": 2, "id": "44bd10cd", "metadata": {}, "outputs": [], "source": [ "%alias phenio runoak -i ../../db/phenio.db" ] }, { "cell_type": "markdown", "id": "09595dfc", "metadata": {}, "source": [ "### Basic lookup queries\n", "\n", "Let's check it's working:" ] }, { "cell_type": "code", "execution_count": 3, "id": "81399308", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "HP:0008609 ! Morphological abnormality of the middle ear (HPO)\r\n" ] } ], "source": [ "phenio info HP:0008609" ] }, { "cell_type": "code", "execution_count": 4, "id": "fb640127", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "HP:0000370 ! Abnormality of the middle ear (HPO)\r\n", "HP:0004452 ! Abnormality of the middle ear ossicles (HPO)\r\n", "HP:0008609 ! Morphological abnormality of the middle ear (HPO)\r\n", "HP:0011452 ! Functional abnormality of the middle ear (HPO)\r\n" ] } ], "source": [ "phenio info \"l~abnormality of the middle ear\"" ] }, { "cell_type": "markdown", "id": "f8b16045", "metadata": {}, "source": [ "## Exploring the structure of PhenIO" ] }, { "cell_type": "code", "execution_count": 5, "id": "1e122ab2", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "subject\tpredicate\tobject\tsubject_label\tpredicate_label\tobject_label\n", "HP:0008609\trdfs:subClassOf\tHP:0000370\tMorphological abnormality of the middle ear (HPO)\tNone\tAbnormality of the middle ear (HPO)\n", "HP:0008609\trdfs:subClassOf\tUPHENO:0021692\tMorphological abnormality of the middle ear (HPO)\tNone\tabnormal middle ear morphology\n" ] } ], "source": [ "phenio relationships HP:0008609" ] }, { "cell_type": "markdown", "id": "66be6c73", "metadata": {}, "source": [ "__NOTE__ I am surprised there is no linkages to Uberon here. Something has gone wrong.\n", "\n", "Let's try a different example" ] }, { "cell_type": "code", "execution_count": 6, "id": "e14dacae", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "subject\tpredicate\tobject\tsubject_label\tpredicate_label\tobject_label\r", "\r\n", "HP:0008609\trdfs:subClassOf\tHP:0000370\tMorphological abnormality of the middle ear (HPO)\tNone\tAbnormality of the middle ear (HPO)\r", "\r\n", "HP:0008609\trdfs:subClassOf\tUPHENO:0021692\tMorphological abnormality of the middle ear (HPO)\tNone\tabnormal middle ear morphology\r", "\r\n", "HP:0011453\tUPHENO:0000001\tUBERON:0001688\tAbnormality of the incus (HPO)\thas phenotype affecting\tincus bone\r", "\r\n", "HP:0011453\tUPHENO:0000003\tUBERON:0001688\tAbnormality of the incus (HPO)\tphenotype has associated entity\tincus bone\r", "\r\n", "HP:0011453\trdfs:subClassOf\tHP:0004452\tAbnormality of the incus (HPO)\tNone\tAbnormality of the middle ear ossicles (HPO)\r", "\r\n", "HP:0011453\trdfs:subClassOf\tUPHENO:0075925\tAbnormality of the incus (HPO)\tNone\tabnormal incus bone\r", "\r\n" ] } ], "source": [ "phenio relationships HP:0008609 HP:0011453" ] }, { "cell_type": "markdown", "id": "7d3dace1", "metadata": {}, "source": [ "Here we can see linkages to external ontologies using two relations\n", "\n", "We can find out more about these:" ] }, { "cell_type": "code", "execution_count": 7, "id": "bb2399c3", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "UPHENO:0000001 ! has phenotype affecting\r\n", "UPHENO:0000003 ! phenotype has associated entity\r\n" ] } ], "source": [ "phenio info UPHENO:0000001 UPHENO:0000003" ] }, { "cell_type": "markdown", "id": "b79f675d", "metadata": {}, "source": [ "Here we can see linkages to external ontologies using two relations\n", "\n", "### Querying ancestors\n", "\n", "We will try finding all ancestors of HP:0008609 (Morphological abnormality of the middle ear)\n", "\n", "__IMPORTANT__ in OAK, all graph commands are parameterized by predicate lists. Consult the OAK docs if you\n", "don't understand what this means!\n", "\n", "To find all is-a ancestors (i.e. ancestors following SubClassOf between named classes) we use `-p i`:" ] }, { "cell_type": "code", "execution_count": 9, "id": "90d3806c", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "BFO:0000001 ! entity\r\n", "BFO:0000002 ! continuant\r\n", "BFO:0000020 ! specifically dependent continuant\r\n", "HP:0000001 ! All (HPO)\r\n", "HP:0000118 ! Phenotypic abnormality (HPO)\r\n", "HP:0000152 ! Abnormality of head or neck (HPO)\r\n", "HP:0000234 ! Abnormality of the head (HPO)\r\n", "HP:0000370 ! Abnormality of the middle ear (HPO)\r\n", "HP:0000598 ! Abnormality of the ear (HPO)\r\n", "HP:0008609 ! Morphological abnormality of the middle ear (HPO)\r\n", "HP:0031703 ! Abnormal ear morphology (HPO)\r\n", "PATO:0000001 ! quality\r\n", "UPHENO:0001001 ! Phenotype\r\n", "UPHENO:0001001 ! phenotype\r\n", "UPHENO:0001002 ! Phenotypic abnormality\r\n", "UPHENO:0001003 ! phenotype by ontology source\r\n", "UPHENO:0001005 ! abnormal phenotype by ontology source\r\n", "UPHENO:0002536 ! abnormal anatomical entity\r\n", "UPHENO:0002764 ! abnormal craniocervical region\r\n", "UPHENO:0002844 ! abnormal head\r\n", "UPHENO:0002903 ! abnormal ear\r\n", "UPHENO:0003017 ! abnormal middle ear\r\n", "UPHENO:0015280 ! abnormal anatomical entity morphology in the independent continuant\r\n", "UPHENO:0020584 ! abnormal anatomical entity morphology\r\n", "UPHENO:0021692 ! abnormal middle ear morphology\r\n", "UPHENO:0075696 ! abnormal anatomical entity\r\n", "UPHENO:0076692 ! abnormal anatomical entity morphology\r\n", "UPHENO:0076730 ! abnormal ear morphology\r\n", "UPHENO:0081581 ! abnormal multicellular organism morphology\r\n" ] } ], "source": [ "phenio ancestors -p i HP:0008609" ] }, { "cell_type": "markdown", "id": "58f23581", "metadata": {}, "source": [ "Next we will generate a visualization from this using the `viz` command:" ] }, { "cell_type": "code", "execution_count": 10, "id": "c10dc41f", "metadata": {}, "outputs": [], "source": [ "phenio viz -p i HP:0008609 -o output/HP_0008609.png" ] }, { "cell_type": "markdown", "id": "a354b6da", "metadata": {}, "source": [ "![img](output/HP_0008609.png)" ] }, { "cell_type": "markdown", "id": "216f2c87", "metadata": {}, "source": [ "Note we are using the default OAK stylesheet which colors Upheno in black, HPO in blue-green, etc. For more info\n", "on visualization and stylesheets see [OboGraphViz](https://github.com/INCATools/obographviz/)\n", "\n", "#### siblings\n", "\n", "Next we will explore siblings. Again, we parameterize the predicate list to avoid flooding results:" ] }, { "cell_type": "code", "execution_count": 10, "id": "c2a52d6e", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "HP:0000388 ! Otitis media (HPO)\n", "HP:0008609 ! Morphological abnormality of the middle ear (HPO)\n", "HP:0011452 ! Functional abnormality of the middle ear (HPO)\n", "HP:0040090 ! Abnormality of the tympanic membrane (HPO)\n", "HP:0040099 ! Abnormality of the round window (HPO)\n", "HP:0040100 ! Abnormality of the vestibular window (HPO)\n", "HP:0040115 ! Abnormality of the Eustachian tube (HPO)\n", "HP:0040262 ! Glue ear (HPO)\n", "HP:0040268 ! Recurrent infections of the middle ear (HPO)\n", "HP:0100799 ! Neoplasm of the middle ear (HPO)\n", "MP:0000049 ! abnormal middle ear morphology (MPO)\n", "XPO:0102637 ! abnormal middle ear morphology (XPO)\n" ] } ], "source": [ "phenio siblings -p i HP:0008609 " ] }, { "cell_type": "markdown", "id": "be928d3a", "metadata": {}, "source": [ "let's take a closer look at some of these siblings.\n", "\n", "the `viz` command, like most commands in OAK, takes as argument an open ended list of queries:" ] }, { "cell_type": "code", "execution_count": 11, "id": "9247b790", "metadata": {}, "outputs": [], "source": [ "phenio viz -p i HP:0008609 MP:0000049 -o output/HP_0008609_MP_0000049.png" ] }, { "cell_type": "markdown", "id": "4dc87998", "metadata": {}, "source": [ "![img](output/HP_0008609_MP_0000049.png)" ] }, { "cell_type": "code", "execution_count": 12, "id": "6d39a58a", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "MP:0000029 ! abnormal malleus morphology (MPO)\n", "MP:0000030 ! abnormal tympanic ring morphology (MPO)\n", "MP:0000040 ! absent middle ear ossicles (MPO)\n", "MP:0000049 ! abnormal middle ear morphology (MPO)\n", "MP:0003110 ! absent malleus processus brevis (MPO)\n", "MP:0003138 ! absent tympanic ring (MPO)\n", "MP:0003330 ! abnormal auditory tube morphology (MPO)\n", "MP:0003739 ! dense middle ear ossicles (MPO)\n", "MP:0003740 ! fusion of middle ear ossicles (MPO)\n", "MP:0004204 ! absent stapes (MPO)\n", "MP:0004290 ! abnormal stapes footplate morphology (MPO)\n", "MP:0004318 ! absent incus (MPO)\n", "MP:0004319 ! absent malleus (MPO)\n", "MP:0004479 ! abnormal oval window morphology (MPO)\n", "MP:0004480 ! abnormal round window morphology (MPO)\n", "MP:0004541 ! absent auditory tube (MPO)\n", "MP:0004665 ! abnormal stapedial artery morphology (MPO)\n", "MP:0004666 ! absent stapedial artery (MPO)\n", "MP:0005105 ! abnormal middle ear ossicle morphology (MPO)\n", "MP:0005106 ! abnormal incus morphology (MPO)\n", "MP:0005107 ! abnormal stapes morphology (MPO)\n", "MP:0006018 ! abnormal tympanic membrane morphology (MPO)\n", "MP:0006019 ! absent tympanic membrane (MPO)\n", "MP:0006020 ! decreased tympanic ring size (MPO)\n", "MP:0008372 ! small malleus (MPO)\n", "MP:0008373 ! short malleus (MPO)\n", "MP:0008374 ! abnormal malleus manubrium morphology (MPO)\n", "MP:0008375 ! short malleus manubrium (MPO)\n", "MP:0008376 ! small malleus manubrium (MPO)\n", "MP:0008377 ! absent malleus manubrium (MPO)\n", "MP:0008378 ! small malleus processus brevis (MPO)\n", "MP:0008379 ! absent malleus head (MPO)\n", "MP:0008380 ! abnormal gonial bone morphology (MPO)\n", "MP:0008381 ! absent gonial bone (MPO)\n", "MP:0008382 ! gonial bone hypoplasia (MPO)\n", "MP:0008383 ! enlarged gonial bone (MPO)\n", "MP:0010324 ! abnormal malleus processus brevis morphology (MPO)\n", "MP:0010325 ! abnormal malleus head morphology (MPO)\n", "MP:0010326 ! malleus hypoplasia (MPO)\n", "MP:0010327 ! abnormal malleus neck morphology (MPO)\n", "MP:0010328 ! thin malleus neck (MPO)\n", "MP:0013871 ! abnormal stapedial artery topology (MPO)\n", "MP:0020900 ! abnormal middle ear epithelium morphology (MPO)\n", "MP:0020901 ! abnormal middle ear goblet cell morphology (MPO)\n", "MP:0020902 ! abnormal middle ear goblet cell number (MPO)\n", "MP:0020903 ! increased middle ear goblet cell number (MPO)\n", "MP:0020904 ! decreased middle ear goblet cell number (MPO)\n", "MP:0030084 ! tympanic ring hypoplasia (MPO)\n", "MP:0030106 ! small incus (MPO)\n", "MP:0030107 ! incus hypoplasia (MPO)\n", "MP:0030110 ! incudomalleolar fusion (MPO)\n", "MP:0030123 ! small middle ear ossicles (MPO)\n", "MP:0030124 ! middle ear ossicle hypoplasia (MPO)\n", "MP:0030125 ! small gonial bone (MPO)\n", "MP:0030127 ! small stapes (MPO)\n", "MP:0030128 ! stapes hypoplasia (MPO)\n", "MP:0030154 ! abnormal tympanic cavity morphology (MPO)\n", "MP:0030155 ! absent tympanic cavity (MPO)\n", "MP:0030156 ! abnormal tympanic cavity muscle morphology (MPO)\n", "MP:0030157 ! abnormal stapedius muscle morphology (MPO)\n", "MP:0030158 ! absent stapedius muscle (MPO)\n", "MP:0030159 ! abnormal tensor tympani muscle morphology (MPO)\n", "MP:0030187 ! abnormal epitympanic recess morphology (MPO)\n", "MP:0030213 ! gonial bone hyperplasia (MPO)\n", "MP:0030226 ! middle ear polyps (MPO)\n", "MP:0030227 ! abnormal tubotympanic recess morphology (MPO)\n", "MP:0030228 ! absent tubotympanic recess (MPO)\n", "MP:0030321 ! abnormal tegmen tympani morphology (MPO)\n", "MP:0030394 ! abnormal incus short process morphology (MPO)\n", "MP:0030395 ! absent incus short process (MPO)\n", "MP:0030396 ! abnormal incus long process morphology (MPO)\n", "MP:0030397 ! abnormal incus lenticular process morphology (MPO)\n", "MP:0030398 ! absent incus lenticular process (MPO)\n", "MP:0030399 ! abnormal incus body morphology (MPO)\n", "MP:0030400 ! abnormal stapes annular ligament morphology (MPO)\n", "MP:0030401 ! absent stapes annular ligament (MPO)\n", "MP:0030402 ! abnormal stapes head morphology (MPO)\n", "MP:0030403 ! absent stapes head (MPO)\n", "MP:0030404 ! abnormal stapes obturator foramen morphology (MPO)\n", "MP:0030405 ! small stapes obturator foramen (MPO)\n", "MP:0030406 ! absent stapes obturator foramen (MPO)\n", "MP:0030407 ! abnormal stapes crus morpholgy (MPO)\n", "MP:0030408 ! abnormal stapes posterior crus morphology (MPO)\n", "MP:0030409 ! abnormal stapes anterior crus morphology (MPO)\n", "MP:0030410 ! middle ear effusion (MPO)\n", "MP:0030411 ! small round window (MPO)\n", "MP:0030412 ! absent round window (MPO)\n", "MP:0030413 ! tympanic membrane retraction (MPO)\n", "MP:0030414 ! tympanic membrane perforation (MPO)\n", "MP:0030465 ! absent oval window (MPO)\n" ] } ], "source": [ "phenio descendants -p i \"abnormal middle ear morphology (MPO)\" " ] }, { "cell_type": "markdown", "id": "6264a795", "metadata": {}, "source": [ "### Pairwise term similarity\n", "\n", "Next we will explore the nascent semantic similarity functions in OAK\n", "\n", "Note that the data model and signatures may change slightly here in the future.\n", "\n", "Once again, it is important to understand how OAK handles graphs - all similarity methods are parameterized\n", "by predicate lists. Let's start with the simple case of is-a hierarchies:" ] }, { "cell_type": "code", "execution_count": 13, "id": "c082b7b6", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "subject_id: HP:0000388\r\n", "object_id: MP:0030395\r\n", "ancestor_id: UPHENO:0003017\r\n", "ancestor_information_content: 12.860723034493532\r\n", "jaccard_similarity: 0.22077922077922077\r\n", "phenodigm_score: 1.6850461151591245\r\n" ] } ], "source": [ "phenio similarity -p i HP:0000388 MP:0030395" ] }, { "cell_type": "markdown", "id": "85aa90d3", "metadata": {}, "source": [ "TODOs:\n", "\n", "- allow calculation of IC from background annotations\n", "- add an `--autolabel` option (other OAK commands have this)\n", "\n", "to see what the MRCA is:" ] }, { "cell_type": "code", "execution_count": 15, "id": "2fa9ad42", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "UPHENO:0003017 ! abnormal middle ear\r\n" ] } ], "source": [ "phenio info UPHENO:0003017" ] }, { "cell_type": "code", "execution_count": 16, "id": "b5e20c50", "metadata": {}, "outputs": [], "source": [ "# this is how we would do all-by-all\n", "#phenio all-similarity -p i,p --set1 i^HP: --set2 i^MP: --ic-minimum 3.0 --jaccard-minimum 0.65" ] }, { "cell_type": "code", "execution_count": 17, "id": "ec989314", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "* [] UPHENO:0003017 ! abnormal middle ear\n", " * [i] HP:0000370 ! Abnormality of the middle ear (HPO)\n", " * [i] **HP:0000388 ! Otitis media (HPO)**\n", "* [] HP:0031703 ! Abnormal ear morphology (HPO)\n", " * [i] HP:0000370 ! Abnormality of the middle ear (HPO)\n", " * [i] **HP:0000388 ! Otitis media (HPO)**\n", "* [] HP:0012647 ! Abnormal inflammatory response (HPO)\n", " * [i] HP:0012649 ! Increased inflammatory response (HPO)\n", " * [i] **HP:0000388 ! Otitis media (HPO)**\n", "* [] UPHENO:0074782 ! increased biological_process in middle ear\n", " * [i] UPHENO:0074629 ! increased inflammatory response in middle ear\n", " * [i] **HP:0000388 ! Otitis media (HPO)**\n", "* [] UPHENO:0074685 ! increased inflammatory response in independent continuant\n", " * [i] UPHENO:0074629 ! increased inflammatory response in middle ear\n", " * [i] **HP:0000388 ! Otitis media (HPO)**\n", "* [] MP:0005106 ! abnormal incus morphology (MPO)\n", " * [i] MP:0030394 ! abnormal incus short process morphology (MPO)\n", " * [i] **MP:0030395 ! absent incus short process (MPO)**\n" ] } ], "source": [ "phenio tree -p i HP:0000388 MP:0030395 --max-hops 2" ] }, { "cell_type": "markdown", "id": "c6380351", "metadata": {}, "source": [ "### Using other ontologies\n", "\n", "We can parameterize the similarity calculation to make use of is-a (i), part-of (p) (which typically traverse *within* an ontology), and UPHENO:0000003 (phenotype has associated entity) (which traverses \"sideways\" from a phenotype ontology to an \"entity\" ontology).\n", "\n", "Here we will ask for the similarity between the concept of the *stapes bone* and the concept of *Abnormality of the incus*:" ] }, { "cell_type": "code", "execution_count": 18, "id": "64484901", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "subject_id: UBERON:0001687\r\n", "object_id: HP:0011453\r\n", "ancestor_id: UBERON:0001686\r\n", "ancestor_information_content: 12.16586147141704\r\n", "jaccard_similarity: 0.4329896907216495\r\n", "phenodigm_score: 2.2951454411150713\r\n" ] } ], "source": [ "phenio similarity -p i,p,UPHENO:0000003 UBERON:0001687 HP:0011453" ] }, { "cell_type": "code", "execution_count": 20, "id": "b3695681", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "UBERON:0001686 ! auditory ossicle bone\r\n" ] } ], "source": [ "phenio info UBERON:0001686" ] }, { "cell_type": "code", "execution_count": 19, "id": "5276e2b2", "metadata": {}, "outputs": [], "source": [ "phenio viz HP:0011453 -p i,UPHENO:0000003 -o output/with-uberon.png" ] }, { "cell_type": "markdown", "id": "bd861a02", "metadata": {}, "source": [ "![img](output/with-uberon.png)" ] }, { "cell_type": "code", "execution_count": null, "id": "0cfdb2db", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.5" } }, "nbformat": 4, "nbformat_minor": 5 }