{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "0f6c4513",
   "metadata": {},
   "source": [
    "# command: crawl\n",
    "\n",
    "This notebook is intended as a supplement to the [main OAK CLI docs](https://incatools.github.io/ontology-access-kit/cli.html).\n",
    "\n",
    "This notebook provides examples for the `crawl` command, which walks across multiple ontologies and endpoints by following edges and mappings between terms.\n",
    "\n",
    "## Help Option\n",
    "\n",
    "You can get help on any OAK command using `--help`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "65db4b53",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Usage: runoak crawl [OPTIONS] [TERMS]...\n",
      "\n",
      "  Crawl one or more ontologies, hopping over edges and mappings\n",
      "\n",
      "  Crawl is a powerful command that allows for multi-ontology traversal,\n",
      "  particularly on mapping paths. Multiple ontologies and ontology sources\n",
      "  (e.g. BioPortal, OLS) provide mappings between terms. No single ontology is\n",
      "  likely to have a complete source. Using crawl, you can walk across the union\n",
      "  of mappings in all ontologies, with custom rules for each ontology (e.g.\n",
      "  normalizing prefixes).\n",
      "\n",
      "  Documentation for this command will be provided in a separate notebook.\n",
      "\n",
      "Options:\n",
      "  -o, --output TEXT               Path to output file\n",
      "  -O, --output-type TEXT          Desired output type\n",
      "  --autolabel / --no-autolabel    If set, results will automatically have\n",
      "                                  labels assigned  [default: autolabel]\n",
      "  -M, --maps-to-source TEXT       Return only mappings with subject or object\n",
      "                                  source equal to this\n",
      "  --mapper TEXT                   A selector for an adapter that is to be used\n",
      "                                  for the main lookup operation\n",
      "  --unmelt / --no-unmelt          Use a wide table for display.  [default: no-\n",
      "                                  unmelt]\n",
      "  --adapters TEXT                 A comma-separated list of adapters\n",
      "  --allowed-prefixes TEXT         A comma-separated list of prefixes to\n",
      "                                  traverse over\n",
      "  --mapping-predicates TEXT       A comma-separated list of mapping predicates\n",
      "                                  to traverse over\n",
      "  --viz / --no-viz                If true then draw a graph  [default: no-viz]\n",
      "  -d, --directory TEXT            Directory to write output files\n",
      "  --whole-ontology / --no-whole-ontology\n",
      "                                  Run over whole ontology  [default: no-whole-\n",
      "                                  ontology]\n",
      "  -C, --config-yaml TEXT\n",
      "  --help                          Show this message and exit.\n"
     ]
    }
   ],
   "source": [
    "!runoak crawl --help"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "d0fdbd5f-8d04-4e85-86c8-5ef0f7873ac7",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "adapter_configs:\n",
      "  OMIM:\n",
      "  DOID:\n",
      "    prefix_normalization_map:\n",
      "      \"MIM:PS\": \"OMIMPS:\"\n",
      "      MIM: OMIM\n",
      "      UMLS_CUI: UMLS\n",
      "      SNOMEDCT_US_2023_03_01: SCTID\n",
      "      NCI: NCIT\n",
      "  ORDO:\n",
      "    prefix_normalization_map:\n",
      "      MeSH: MESH\n",
      "  NCIT:\n",
      "  GARD:\n",
      "adapter_specs:\n",
      "  OMIM: /Users/cjm/repos/semantic-sql/db/omim.db\n",
      "allowed_prefixes: [OMIM, DOID, ORDO, OMIMPS, GARD, NCIT, OMIMPS]\n",
      "mapping_predicates:\n",
      "  - oio:hasDbXref\n",
      "  - skos:exactMatch\n"
     ]
    }
   ],
   "source": [
    "!cat input/mapping-crawler-config.yaml"
   ]
  },
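  {
   "cell_type": "markdown",
   "id": "5c2e9a41-prefix-normalization-note",
   "metadata": {},
   "source": [
    "Each `prefix_normalization_map` in the configuration above rewrites prefixes that one source uses in its mappings (for example `MIM`) into the form used by the other sources (`OMIM`). The next cell is a minimal sketch of that idea in plain Python. It is not OAK's implementation: `normalize_curie` is a hypothetical helper, and its matching rules (longest key first, keys containing a colon treated as plain string prefixes) are assumptions made for illustration. The identifiers passed to it are examples only."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5c2e9a41-prefix-normalization-sketch",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Hypothetical helper, not part of OAK: a sketch of prefix normalization on CURIEs.\n",
    "# The map is copied from the DOID entry of the configuration shown above.\n",
    "doid_prefix_map = {\n",
    "    \"MIM:PS\": \"OMIMPS:\",\n",
    "    \"MIM\": \"OMIM\",\n",
    "    \"UMLS_CUI\": \"UMLS\",\n",
    "    \"SNOMEDCT_US_2023_03_01\": \"SCTID\",\n",
    "    \"NCI\": \"NCIT\",\n",
    "}\n",
    "\n",
    "\n",
    "def normalize_curie(curie, prefix_map):\n",
    "    \"\"\"Return curie with its leading prefix rewritten, or unchanged if no key matches.\"\"\"\n",
    "    # Assumption: longer keys are tried first, so \"MIM:PS\" wins over \"MIM\".\n",
    "    for old in sorted(prefix_map, key=len, reverse=True):\n",
    "        new = prefix_map[old]\n",
    "        if \":\" in old:\n",
    "            # Assumption: a key that spans the colon is a plain string prefix.\n",
    "            if curie.startswith(old):\n",
    "                return new + curie[len(old):]\n",
    "        elif curie.startswith(old + \":\"):\n",
    "            # A bare key replaces only the part before the colon.\n",
    "            return new + curie[len(old):]\n",
    "    return curie\n",
    "\n",
    "\n",
    "for example in [\"MIM:PS214100\", \"MIM:214100\", \"NCI:C34568\", \"DOID:905\"]:\n",
    "    print(example, \"->\", normalize_curie(example, doid_prefix_map))"
   ]
  },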
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "7b051587-1399-4b7b-ad76-a6ca64496d1e",
   "metadata": {},
   "outputs": [],
   "source": [
    "!runoak crawl -C input/mapping-crawler-config.yaml -d output/refsum-analysis GARD:4648 --viz -o output/refsum.png"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "408e4bc2-a14f-44bb-ac06-bcee10862675",
   "metadata": {},
   "source": [
    "![Graph drawn by the crawl from GARD:4648](output/refsum.png)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "9091520d-4615-4943-ac57-df20cde245f8",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "/Users/cjm/Library/Caches/pypoetry/virtualenvs/oaklib-OeQZizwE-py3.9/lib/python3.9/site-packages/sssom/util.py:162: FutureWarning: Downcasting behavior in `replace` is deprecated and will be removed in a future version. To retain the old behavior, explicitly call `result.infer_objects(copy=False)`. To opt-in to the future behavior, set `pd.set_option('future.no_silent_downcasting', True)`\n",
      "  df.replace(\"\", np.nan, inplace=True)\n"
     ]
    }
   ],
   "source": [
    "!runoak crawl -C input/mapping-crawler-config.yaml -d output/refsum-analysis --whole-ontology GARD:4648 GARD:6322"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "1d81be3c-d986-4a40-9fbc-ec6b904eaf06",
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "output/refsum-analysis\n",
      "output/refsum-analysis/clique_summary.csv\n",
      "output/refsum-analysis/clique_results.csv\n",
      "output/refsum-analysis/cliques\n",
      "output/refsum-analysis/cliques/GARD_4648.sssom.tsv\n",
      "output/refsum-analysis/cliques/GARD_6322.sssom.tsv\n"
     ]
    }
   ],
   "source": [
    "!find output/refsum-analysis"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "ea17b5d8-f758-4f52-a17a-b8ad142572a9",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "16ac02c7-d026-4aec-9ce2-198bf0d05289",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Unnamed: 0</th>\n",
       "      <th>mapping_count</th>\n",
       "      <th>entity_count</th>\n",
       "      <th>average_incoherency</th>\n",
       "      <th>max_incoherency</th>\n",
       "      <th>incoherency_GARD</th>\n",
       "      <th>incoherency_ORDO</th>\n",
       "      <th>incoherency_OMIM</th>\n",
       "      <th>incoherency_DOID</th>\n",
       "      <th>incoherency_NCIT</th>\n",
       "      <th>incoherency_OMIMPS</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>count</td>\n",
       "      <td>2.000000</td>\n",
       "      <td>2.000000</td>\n",
       "      <td>2.000000</td>\n",
       "      <td>2.00000</td>\n",
       "      <td>2.00000</td>\n",
       "      <td>2.000000</td>\n",
       "      <td>1.0</td>\n",
       "      <td>2.00000</td>\n",
       "      <td>2.000000</td>\n",
       "      <td>1.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>mean</td>\n",
       "      <td>145.500000</td>\n",
       "      <td>44.500000</td>\n",
       "      <td>7.900000</td>\n",
       "      <td>12.50000</td>\n",
       "      <td>12.50000</td>\n",
       "      <td>2.000000</td>\n",
       "      <td>24.0</td>\n",
       "      <td>12.50000</td>\n",
       "      <td>0.500000</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>std</td>\n",
       "      <td>194.454365</td>\n",
       "      <td>55.861436</td>\n",
       "      <td>11.172287</td>\n",
       "      <td>17.67767</td>\n",
       "      <td>17.67767</td>\n",
       "      <td>2.828427</td>\n",
       "      <td>NaN</td>\n",
       "      <td>17.67767</td>\n",
       "      <td>0.707107</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>min</td>\n",
       "      <td>8.000000</td>\n",
       "      <td>5.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.00000</td>\n",
       "      <td>0.00000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>24.0</td>\n",
       "      <td>0.00000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>25%</td>\n",
       "      <td>76.750000</td>\n",
       "      <td>24.750000</td>\n",
       "      <td>3.950000</td>\n",
       "      <td>6.25000</td>\n",
       "      <td>6.25000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>24.0</td>\n",
       "      <td>6.25000</td>\n",
       "      <td>0.250000</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>50%</td>\n",
       "      <td>145.500000</td>\n",
       "      <td>44.500000</td>\n",
       "      <td>7.900000</td>\n",
       "      <td>12.50000</td>\n",
       "      <td>12.50000</td>\n",
       "      <td>2.000000</td>\n",
       "      <td>24.0</td>\n",
       "      <td>12.50000</td>\n",
       "      <td>0.500000</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>75%</td>\n",
       "      <td>214.250000</td>\n",
       "      <td>64.250000</td>\n",
       "      <td>11.850000</td>\n",
       "      <td>18.75000</td>\n",
       "      <td>18.75000</td>\n",
       "      <td>3.000000</td>\n",
       "      <td>24.0</td>\n",
       "      <td>18.75000</td>\n",
       "      <td>0.750000</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>max</td>\n",
       "      <td>283.000000</td>\n",
       "      <td>84.000000</td>\n",
       "      <td>15.800000</td>\n",
       "      <td>25.00000</td>\n",
       "      <td>25.00000</td>\n",
       "      <td>4.000000</td>\n",
       "      <td>24.0</td>\n",
       "      <td>25.00000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  Unnamed: 0  mapping_count  entity_count  average_incoherency  \\\n",
       "0      count       2.000000      2.000000             2.000000   \n",
       "1       mean     145.500000     44.500000             7.900000   \n",
       "2        std     194.454365     55.861436            11.172287   \n",
       "3        min       8.000000      5.000000             0.000000   \n",
       "4        25%      76.750000     24.750000             3.950000   \n",
       "5        50%     145.500000     44.500000             7.900000   \n",
       "6        75%     214.250000     64.250000            11.850000   \n",
       "7        max     283.000000     84.000000            15.800000   \n",
       "\n",
       "   max_incoherency  incoherency_GARD  incoherency_ORDO  incoherency_OMIM  \\\n",
       "0          2.00000           2.00000          2.000000               1.0   \n",
       "1         12.50000          12.50000          2.000000              24.0   \n",
       "2         17.67767          17.67767          2.828427               NaN   \n",
       "3          0.00000           0.00000          0.000000              24.0   \n",
       "4          6.25000           6.25000          1.000000              24.0   \n",
       "5         12.50000          12.50000          2.000000              24.0   \n",
       "6         18.75000          18.75000          3.000000              24.0   \n",
       "7         25.00000          25.00000          4.000000              24.0   \n",
       "\n",
       "   incoherency_DOID  incoherency_NCIT  incoherency_OMIMPS  \n",
       "0           2.00000          2.000000                 1.0  \n",
       "1          12.50000          0.500000                 0.0  \n",
       "2          17.67767          0.707107                 NaN  \n",
       "3           0.00000          0.000000                 0.0  \n",
       "4           6.25000          0.250000                 0.0  \n",
       "5          12.50000          0.500000                 0.0  \n",
       "6          18.75000          0.750000                 0.0  \n",
       "7          25.00000          1.000000                 0.0  "
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pd.read_csv(\"output/refsum-analysis/clique_summary.csv\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "f3a06b4b-00f0-4b61-a612-2647338a83ad",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>seed</th>\n",
       "      <th>name</th>\n",
       "      <th>mapping_count</th>\n",
       "      <th>entity_count</th>\n",
       "      <th>entities</th>\n",
       "      <th>entity_labels</th>\n",
       "      <th>predicates</th>\n",
       "      <th>mapping_sources</th>\n",
       "      <th>average_incoherency</th>\n",
       "      <th>max_incoherency</th>\n",
       "      <th>sources</th>\n",
       "      <th>incoherency_GARD</th>\n",
       "      <th>incoherency_ORDO</th>\n",
       "      <th>incoherency_OMIM</th>\n",
       "      <th>incoherency_DOID</th>\n",
       "      <th>incoherency_NCIT</th>\n",
       "      <th>incoherency_OMIMPS</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>GARD:4648</td>\n",
       "      <td>NaN</td>\n",
       "      <td>283</td>\n",
       "      <td>84</td>\n",
       "      <td>['OMIM:614867', 'ORDO:772', 'OMIM:614873', 'GA...</td>\n",
       "      <td>{'GARD:4648': 'Infantile Refsum disease', 'ORD...</td>\n",
       "      <td>['rdfs:subClassOf', 'oio:hasDbXref', 'skos:exa...</td>\n",
       "      <td>['obo:ORDO', 'obo:GARD', 'obo:DOID', 'obo:OMIM']</td>\n",
       "      <td>15.8</td>\n",
       "      <td>25</td>\n",
       "      <td>['GARD', 'ORDO', 'OMIM', 'DOID', 'NCIT']</td>\n",
       "      <td>25</td>\n",
       "      <td>4</td>\n",
       "      <td>24.0</td>\n",
       "      <td>25</td>\n",
       "      <td>1</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>GARD:6322</td>\n",
       "      <td>NaN</td>\n",
       "      <td>8</td>\n",
       "      <td>5</td>\n",
       "      <td>['ORDO:98249', 'DOID:13359', 'NCIT:C34568', 'O...</td>\n",
       "      <td>{'DOID:13359': 'Ehlers-Danlos syndrome', 'GARD...</td>\n",
       "      <td>['oio:hasDbXref', 'skos:exactMatch']</td>\n",
       "      <td>['obo:DOID', 'obo:GARD']</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0</td>\n",
       "      <td>['DOID', 'GARD', 'ORDO', 'OMIMPS', 'NCIT']</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        seed  name  mapping_count  entity_count  \\\n",
       "0  GARD:4648   NaN            283            84   \n",
       "1  GARD:6322   NaN              8             5   \n",
       "\n",
       "                                            entities  \\\n",
       "0  ['OMIM:614867', 'ORDO:772', 'OMIM:614873', 'GA...   \n",
       "1  ['ORDO:98249', 'DOID:13359', 'NCIT:C34568', 'O...   \n",
       "\n",
       "                                       entity_labels  \\\n",
       "0  {'GARD:4648': 'Infantile Refsum disease', 'ORD...   \n",
       "1  {'DOID:13359': 'Ehlers-Danlos syndrome', 'GARD...   \n",
       "\n",
       "                                          predicates  \\\n",
       "0  ['rdfs:subClassOf', 'oio:hasDbXref', 'skos:exa...   \n",
       "1               ['oio:hasDbXref', 'skos:exactMatch']   \n",
       "\n",
       "                                    mapping_sources  average_incoherency  \\\n",
       "0  ['obo:ORDO', 'obo:GARD', 'obo:DOID', 'obo:OMIM']                 15.8   \n",
       "1                          ['obo:DOID', 'obo:GARD']                  0.0   \n",
       "\n",
       "   max_incoherency                                     sources  \\\n",
       "0               25    ['GARD', 'ORDO', 'OMIM', 'DOID', 'NCIT']   \n",
       "1                0  ['DOID', 'GARD', 'ORDO', 'OMIMPS', 'NCIT']   \n",
       "\n",
       "   incoherency_GARD  incoherency_ORDO  incoherency_OMIM  incoherency_DOID  \\\n",
       "0                25                 4              24.0                25   \n",
       "1                 0                 0               NaN                 0   \n",
       "\n",
       "   incoherency_NCIT  incoherency_OMIMPS  \n",
       "0                 1                 NaN  \n",
       "1                 0                 0.0  "
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pd.read_csv(\"output/refsum-analysis/clique_results.csv\")"
   ]
  },
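  {
   "cell_type": "markdown",
   "id": "9d7b3e60-clique-results-lists-note",
   "metadata": {},
   "source": [
    "List-valued columns such as `sources` appear in the table above as Python-style list literals, so `read_csv` returns them as plain strings. One way to work with them, sketched in the next cell on two values copied from that table, is to parse them with `ast.literal_eval` and then use `explode` to get one row per seed and source. The sketch assumes these columns always hold valid Python literals; `results_sample` and `seed_sources` are names introduced here for illustration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9d7b3e60-clique-results-lists-sketch",
   "metadata": {},
   "outputs": [],
   "source": [
    "import ast\n",
    "\n",
    "import pandas as pd\n",
    "\n",
    "# Two rows copied from the clique_results.csv table above (seed and sources columns only).\n",
    "results_sample = pd.DataFrame(\n",
    "    {\n",
    "        \"seed\": [\"GARD:4648\", \"GARD:6322\"],\n",
    "        \"sources\": [\n",
    "            \"['GARD', 'ORDO', 'OMIM', 'DOID', 'NCIT']\",\n",
    "            \"['DOID', 'GARD', 'ORDO', 'OMIMPS', 'NCIT']\",\n",
    "        ],\n",
    "    }\n",
    ")\n",
    "\n",
    "# Parse each string into a real Python list (assumes valid list literals).\n",
    "results_sample[\"sources\"] = results_sample[\"sources\"].apply(ast.literal_eval)\n",
    "\n",
    "# One row per (seed, source) pair.\n",
    "seed_sources = results_sample.explode(\"sources\")\n",
    "seed_sources"
   ]
  },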
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "8c511b79-c03e-4fd4-ba77-080c8d06ee46",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>subject_id</th>\n",
       "      <th>subject_label</th>\n",
       "      <th>predicate_id</th>\n",
       "      <th>object_id</th>\n",
       "      <th>object_label</th>\n",
       "      <th>mapping_justification</th>\n",
       "      <th>mapping_source</th>\n",
       "      <th>other</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>DOID:0080476</td>\n",
       "      <td>peroxisome biogenesis disorder 1A</td>\n",
       "      <td>oio:hasDbXref</td>\n",
       "      <td>OMIM:214100</td>\n",
       "      <td>peroxisome biogenesis disorder 1a (zellweger)</td>\n",
       "      <td>semapv:UnspecifiedMatching</td>\n",
       "      <td>obo:DOID</td>\n",
       "      <td>distance: 6, direction: -1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>DOID:0080476</td>\n",
       "      <td>peroxisome biogenesis disorder 1A</td>\n",
       "      <td>oio:hasDbXref</td>\n",
       "      <td>OMIM:214100</td>\n",
       "      <td>peroxisome biogenesis disorder 1a (zellweger)</td>\n",
       "      <td>semapv:UnspecifiedMatching</td>\n",
       "      <td>obo:DOID</td>\n",
       "      <td>distance: 7, direction: 1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>DOID:0080476</td>\n",
       "      <td>peroxisome biogenesis disorder 1A</td>\n",
       "      <td>rdfs:subClassOf</td>\n",
       "      <td>DOID:905</td>\n",
       "      <td>Zellweger syndrome</td>\n",
       "      <td>semapv:ManualMappingCuration</td>\n",
       "      <td>obo:DOID</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>DOID:0080477</td>\n",
       "      <td>peroxisome biogenesis disorder 2A</td>\n",
       "      <td>oio:hasDbXref</td>\n",
       "      <td>OMIM:214110</td>\n",
       "      <td>peroxisome biogenesis disorder 2a (zellweger)</td>\n",
       "      <td>semapv:UnspecifiedMatching</td>\n",
       "      <td>obo:DOID</td>\n",
       "      <td>distance: 6, direction: -1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>DOID:0080477</td>\n",
       "      <td>peroxisome biogenesis disorder 2A</td>\n",
       "      <td>oio:hasDbXref</td>\n",
       "      <td>OMIM:214110</td>\n",
       "      <td>peroxisome biogenesis disorder 2a (zellweger)</td>\n",
       "      <td>semapv:UnspecifiedMatching</td>\n",
       "      <td>obo:DOID</td>\n",
       "      <td>distance: 7, direction: 1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>278</th>\n",
       "      <td>ORDO:912</td>\n",
       "      <td>Zellweger syndrome</td>\n",
       "      <td>oio:hasDbXref</td>\n",
       "      <td>OMIM:614887</td>\n",
       "      <td>peroxisome biogenesis disorder 13a (zellweger)</td>\n",
       "      <td>semapv:UnspecifiedMatching</td>\n",
       "      <td>obo:ORDO</td>\n",
       "      <td>distance: 3, direction: 1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>279</th>\n",
       "      <td>ORDO:912</td>\n",
       "      <td>Zellweger syndrome</td>\n",
       "      <td>oio:hasDbXref</td>\n",
       "      <td>OMIM:614887</td>\n",
       "      <td>peroxisome biogenesis disorder 13a (zellweger)</td>\n",
       "      <td>semapv:UnspecifiedMatching</td>\n",
       "      <td>obo:ORDO</td>\n",
       "      <td>distance: 4, direction: -1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>280</th>\n",
       "      <td>ORDO:912</td>\n",
       "      <td>Zellweger syndrome</td>\n",
       "      <td>oio:hasDbXref</td>\n",
       "      <td>OMIM:617370</td>\n",
       "      <td>peroxisome biogenesis disorder 10b</td>\n",
       "      <td>semapv:UnspecifiedMatching</td>\n",
       "      <td>obo:ORDO</td>\n",
       "      <td>distance: 2, direction: -1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>281</th>\n",
       "      <td>ORDO:912</td>\n",
       "      <td>Zellweger syndrome</td>\n",
       "      <td>oio:hasDbXref</td>\n",
       "      <td>OMIM:617370</td>\n",
       "      <td>peroxisome biogenesis disorder 10b</td>\n",
       "      <td>semapv:UnspecifiedMatching</td>\n",
       "      <td>obo:ORDO</td>\n",
       "      <td>distance: 3, direction: 1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>282</th>\n",
       "      <td>ORDO:912</td>\n",
       "      <td>Zellweger syndrome</td>\n",
       "      <td>rdfs:subClassOf</td>\n",
       "      <td>ORDO:79189</td>\n",
       "      <td>Peroxisome biogenesis disorder</td>\n",
       "      <td>semapv:ManualMappingCuration</td>\n",
       "      <td>obo:ORDO</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>283 rows × 8 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "       subject_id                      subject_label     predicate_id  \\\n",
       "0    DOID:0080476  peroxisome biogenesis disorder 1A    oio:hasDbXref   \n",
       "1    DOID:0080476  peroxisome biogenesis disorder 1A    oio:hasDbXref   \n",
       "2    DOID:0080476  peroxisome biogenesis disorder 1A  rdfs:subClassOf   \n",
       "3    DOID:0080477  peroxisome biogenesis disorder 2A    oio:hasDbXref   \n",
       "4    DOID:0080477  peroxisome biogenesis disorder 2A    oio:hasDbXref   \n",
       "..            ...                                ...              ...   \n",
       "278      ORDO:912                 Zellweger syndrome    oio:hasDbXref   \n",
       "279      ORDO:912                 Zellweger syndrome    oio:hasDbXref   \n",
       "280      ORDO:912                 Zellweger syndrome    oio:hasDbXref   \n",
       "281      ORDO:912                 Zellweger syndrome    oio:hasDbXref   \n",
       "282      ORDO:912                 Zellweger syndrome  rdfs:subClassOf   \n",
       "\n",
       "       object_id                                    object_label  \\\n",
       "0    OMIM:214100   peroxisome biogenesis disorder 1a (zellweger)   \n",
       "1    OMIM:214100   peroxisome biogenesis disorder 1a (zellweger)   \n",
       "2       DOID:905                              Zellweger syndrome   \n",
       "3    OMIM:214110   peroxisome biogenesis disorder 2a (zellweger)   \n",
       "4    OMIM:214110   peroxisome biogenesis disorder 2a (zellweger)   \n",
       "..           ...                                             ...   \n",
       "278  OMIM:614887  peroxisome biogenesis disorder 13a (zellweger)   \n",
       "279  OMIM:614887  peroxisome biogenesis disorder 13a (zellweger)   \n",
       "280  OMIM:617370              peroxisome biogenesis disorder 10b   \n",
       "281  OMIM:617370              peroxisome biogenesis disorder 10b   \n",
       "282   ORDO:79189                  Peroxisome biogenesis disorder   \n",
       "\n",
       "            mapping_justification mapping_source                       other  \n",
       "0      semapv:UnspecifiedMatching       obo:DOID  distance: 6, direction: -1  \n",
       "1      semapv:UnspecifiedMatching       obo:DOID   distance: 7, direction: 1  \n",
       "2    semapv:ManualMappingCuration       obo:DOID                         NaN  \n",
       "3      semapv:UnspecifiedMatching       obo:DOID  distance: 6, direction: -1  \n",
       "4      semapv:UnspecifiedMatching       obo:DOID   distance: 7, direction: 1  \n",
       "..                            ...            ...                         ...  \n",
       "278    semapv:UnspecifiedMatching       obo:ORDO   distance: 3, direction: 1  \n",
       "279    semapv:UnspecifiedMatching       obo:ORDO  distance: 4, direction: -1  \n",
       "280    semapv:UnspecifiedMatching       obo:ORDO  distance: 2, direction: -1  \n",
       "281    semapv:UnspecifiedMatching       obo:ORDO   distance: 3, direction: 1  \n",
       "282  semapv:ManualMappingCuration       obo:ORDO                         NaN  \n",
       "\n",
       "[283 rows x 8 columns]"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pd.read_csv(\"output/refsum-analysis/cliques/GARD_4648.sssom.tsv\", sep=\"\\t\", comment=\"#\")\n"
   ]
  },
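  {
   "cell_type": "markdown",
   "id": "3e8a6f12-sssom-other-column-note",
   "metadata": {},
   "source": [
    "The `other` column in the table above packs two values, `distance` and `direction`, into a single string. The next cell is a small sketch that splits them into integer columns with `str.extract`, using three rows copied from that table. The regular expression assumes the `distance: N, direction: M` layout shown in this output; rows without a value (the `rdfs:subClassOf` rows here) come out as missing. `sssom_sample` and `parsed` are names introduced here for illustration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3e8a6f12-sssom-other-column-sketch",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "# Three rows copied from the SSSOM table above (a subset of its columns).\n",
    "sssom_sample = pd.DataFrame(\n",
    "    {\n",
    "        \"subject_id\": [\"DOID:0080476\", \"DOID:0080476\", \"DOID:0080476\"],\n",
    "        \"object_id\": [\"OMIM:214100\", \"OMIM:214100\", \"DOID:905\"],\n",
    "        \"other\": [\"distance: 6, direction: -1\", \"distance: 7, direction: 1\", None],\n",
    "    }\n",
    ")\n",
    "\n",
    "# Assumption: `other` always follows the \"distance: N, direction: M\" layout seen above.\n",
    "pattern = r\"distance: (?P<distance>-?\\d+), direction: (?P<direction>-?\\d+)\"\n",
    "parsed = sssom_sample[\"other\"].str.extract(pattern)\n",
    "\n",
    "# Convert to nullable integers so rows without a value stay missing.\n",
    "parsed = parsed.apply(pd.to_numeric).astype(\"Int64\")\n",
    "\n",
    "sssom_sample.join(parsed)"
   ]
  },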
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e1ec0bd3-ae44-4194-be2e-2261e26b021f",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}