Summarizing with LLMs

This notebook demonstrates how to summarize the output of OAK commands with an LLM, using the Datasette LLM command line tool (llm).

See also:

Install the LLM command line tool

pip install llm

You may also want to install plugins for your models of choice:

pip install llm-deepseek

Summarize outputs

You can pipe any command output you like to llm. For example, consider this OAK query to get the definitions of all kinds of circulatory organs (including hearts) in Uberon:

[1]:
!runoak -i sqlite:obo:uberon definitions .sub "circulatory organ"
id      label   definition
UBERON:0000948  heart   A myogenic muscular circulatory organ found in the vertebrate cardiovascular system composed of chambers of cardiac muscle. It is the primary circulatory organ.
UBERON:0007100  primary circulatory organ       A hollow, muscular organ, which, by contracting rhythmically, keeps up the circulation of the blood or analogs[GO,modified].
UBERON:0015202  lymph heart     A circulatory organ that is reponsible for pumping lymph throughout the body.
UBERON:0015227  peristaltic circulatory vessel  A vessel down which passes a wave of muscular contraction, that forces the flow of haemolymphatic fluid.
UBERON:0015228  circulatory organ       A hollow, muscular organ, which, by contracting rhythmically, contributes to the circulation of lymph, blood or analogs. Examples: a chambered vertebrate heart; the tubular peristaltic heart of ascidians; the dorsal vessel of an insect; the lymoh heart of a reptile.
UBERON:0015229  accessory circulatory organ     A circulatory organ that is not responsible for primary circulation.
UBERON:0015230  dorsal vessel heart     The caudal, pulsatile region of the dorsal vessel of the arthropod circulatory system.
UBERON:0034961  embryonic lymph heart   A lymph heart that is part of an embryo.
UBERON:0034962  copulatory lymph heart  A lymph heart that assists in the return of lymph from the penis to the venous system.
UBERON:0036259  cardial lymph propulsor A lymphatic propulsor that lies tightly against the truncus arteriosus, the major outflow tract of the amphibian heart.
UBERON:0034959  right lymph heart       None
UBERON:0034960  left lymph heart        None
[2]:
!runoak -i sqlite:obo:uberon definitions .sub "circulatory organ" | llm -m 4o -s "give a summary of these terms and critical comments on definitions"
The terms provided mainly describe various structures within the circulatory system, both in vertebrates and invertebrates, with a focus on definitions from anatomical and biological perspectives. Here is a summary of the terms along with critical comments on their definitions:

1. **Heart (UBERON:0000948):** Defined as a myogenic muscular organ in vertebrates, responsible for circulating blood through cardiac muscle chambers. The term appropriately highlights the heart’s primary function and structural characteristics.

2. **Primary Circulatory Organ (UBERON:0007100):** Described as a hollow, muscular organ that maintains blood circulation through rhythmic contractions. The definition is clear, albeit a bit redundant with the general concept of a "heart."

3. **Lymph Heart (UBERON:0015202):** Defined as an organ pumping lymph throughout the body. The definition clearly states its function, but it might benefit from specifying its presence in particular animal groups.

4. **Peristaltic Circulatory Vessel (UBERON:0015227):** A vessel where muscular contractions propel fluid, specifically haemolymph. The definition is clear, but could use more context or examples of organisms in which such structures are found.

5. **Circulatory Organ (UBERON:0015228):** Broadly covers any hollow, muscular organ aiding blood, lymph, or fluid circulation. The examples provided enhance understanding, though the broad nature of the definition might necessitate more clarity or distinction between types.

6. **Accessory Circulatory Organ (UBERON:0015229):** An organ not primarily responsible for circulation, suggesting a supportive role. This generally makes sense, but would benefit from specific examples for clarity.

7. **Dorsal Vessel Heart (UBERON:0015230):** Describes the pulsatile region of the arthropod dorsal vessel, indicating its role in the circulatory system. The definition is precise for those familiar with arthropod anatomy but could confuse those without such background.

8. **Embryonic Lymph Heart (UBERON:0034961):** Specifies a lymph heart within an embryo. This is straightforward, however, further description of its role during development could enhance understanding.

9. **Copulatory Lymph Heart (UBERON:0034962):** Involves lymph return from the penis to the venous system. The definition is clear, yet specialized, relevant mostly to anatomy involving reproductive systems.

10. **Cardial Lymph Propulsor (UBERON:0036259):** A lymph structure associated with the amphibian heart’s major outflow tract. The definition is precise for specific studies of amphibian physiology.

11. **Right Lymph Heart (UBERON:0034959) & Left Lymph Heart (UBERON:0034960):** These terms lack definitions, which raises questions about their specific roles and anatomical details; adding definitions would improve completeness and understanding.

Critical observations indicate that many definitions would benefit from contextual enlargement or specification of their presence and role in different species, enhancing educational value. Furthermore, terms without definitions need elaboration for comprehensive utility.
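
The model noticed the two terms without definitions, but that particular check does not need an LLM. Below is a minimal Python sketch of a deterministic version. It assumes the definitions output is tab-separated with the `id`, `label`, `definition` header shown above, and that a missing definition is printed as `None`. The function name `missing_definitions` is hypothetical, not part of OAK or llm.

```python
import csv
import io


def missing_definitions(tsv_text: str) -> list[tuple[str, str]]:
    """Return (id, label) pairs whose definition column is empty or 'None'."""
    # QUOTE_NONE: definitions may contain literal quote characters.
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    missing = []
    for row in reader:
        definition = (row.get("definition") or "").strip()
        if definition in ("", "None"):
            missing.append((row["id"], row["label"]))
    return missing


# A few rows copied from the output above (assumed to be tab-separated).
sample = (
    "id\tlabel\tdefinition\n"
    "UBERON:0015229\taccessory circulatory organ\t"
    "A circulatory organ that is not responsible for primary circulation.\n"
    "UBERON:0034959\tright lymph heart\tNone\n"
    "UBERON:0034960\tleft lymph heart\tNone\n"
)
print(missing_definitions(sample))
```

A check like this can run alongside the LLM summary, so that the model's commentary covers definition quality while simple gaps are caught deterministically.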

Templates

The llm tool lets you save a prompt as a named template, so you can reuse it without retyping it. Create or edit a template with:

llm templates edit summarize-definitions

Then in your editor:

system: give a summary of these terms and critical comments on definitions
[3]:
!runoak -i sqlite:obo:uberon definitions .sub "circulatory organ" | llm -m 4o -t summarize-definitions

The dataset provides definitions for various types of circulatory organs and structures within the UBERON ontology, which is a comprehensive multi-species anatomy ontology encompassing multiple biological domains. Here is a summary of each term included:

1. **Heart (UBERON:0000948)**: Described as a myogenic muscular organ within the vertebrate cardiovascular system. It is the primary circulatory organ that functions by moving blood throughout the body via chambers of cardiac muscle.

2. **Primary Circulatory Organ (UBERON:0007100)**: A hollow, muscular organ that maintains blood circulation through rhythmic contractions. The definition is adapted from the Gene Ontology (GO).

3. **Lymph Heart (UBERON:0015202)**: A circulatory organ tasked with pumping lymphatic fluid throughout the body. It serves a function analogous to the heart but specific to lymph circulation.

4. **Peristaltic Circulatory Vessel (UBERON:0015227)**: Described as a vessel in which muscular contractions send waves that propel haemolymphatic fluid forward, common in certain invertebrates.

5. **Circulatory Organ (UBERON:0015228)**: This broader category includes any hollow, muscular organ that contributes to the circulation of blood, lymph, or their analogs. Examples include the vertebrate heart, ascidian peristaltic heart, the insect dorsal vessel, and a reptile's lymph heart.

6. **Accessory Circulatory Organ (UBERON:0015229)**: Any circulatory organ that does not play the primary role in circulation, implying a supportive or secondary role.

7. **Dorsal Vessel Heart (UBERON:0015230)**: A section of the dorsal vessel in arthropods, it is characterized as pulsatile, assisting in their circulatory system.

8. **Embryonic Lymph Heart (UBERON:0034961)**: A lymph heart that is active during the embryonic stage of development.

9. **Copulatory Lymph Heart (UBERON:0034962)**: A specialized lymph heart aiding in returning lymph from the penis to the venous system.

10. **Cardial Lymph Propulsor (UBERON:0036259)**: Positioned closely against the truncus arteriosus in amphibians, it helps propel lymphatic fluid, acting near the heart's outflow tract.

11. **Right Lymph Heart (UBERON:0034959)** and **Left Lymph Heart (UBERON:0034960)**: These entries lack defined descriptions in the dataset, suggesting they might represent specific anatomical structures known in certain species but not yet fully characterized in this context.

**Critical Comments on Definitions:**

- **Incompleteness and Specificity:** Some entries, like the right and left lymph heart, are missing definitions, which indicate a need for further research or input in these areas. Definitions should be expanded to clarify their unique roles or confirm their existence in various species.

- **Overlap and Clarity:** There is potential overlap in the definitions of terms like "circulatory organ" and "primary circulatory organ," which might cause confusion regarding their distinct functionalities and hierarchies. It would be beneficial to delineate these more clearly within the ontology.

- **Use of Examples and Context:** The use of examples, such as the mention of ascidians and insects in the circulatory organ definition, enriches understanding but might require caveats for context-specific interpretations.

Overall, while the given definitions are fairly descriptive of the various components of circulatory systems across different organisms, further refinement and detail, particularly where missing or vague, could improve comprehensiveness and usability in biological research and applications.
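
The template above is just a stored system prompt, so the same call can also be made from Python through the `llm` package. The following is a sketch under some assumptions: that `gpt-4o` is the model the `4o` alias refers to, that an API key is already configured for llm, and that `summarize_definitions` (a hypothetical helper name) receives the OAK output as a string. The model is passed in as a parameter so the helper can be exercised with a stand-in object.

```python
# Same system prompt as the summarize-definitions template.
SYSTEM_PROMPT = "give a summary of these terms and critical comments on definitions"


def summarize_definitions(tsv_text: str, model=None) -> str:
    """Send OAK definitions output to a model with the template's system prompt."""
    if model is None:
        # Imported lazily so the helper can be tested without llm installed.
        import llm

        model = llm.get_model("gpt-4o")  # assumed target of the "4o" alias
    response = model.prompt(tsv_text, system=SYSTEM_PROMPT)
    return response.text()


# Hypothetical usage, with `definitions_tsv` holding the runoak output:
# print(summarize_definitions(definitions_tsv))
```

Keeping the prompt in a template (or a constant, as here) makes it easier to rerun the same review over other branches of the ontology.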

Gene summaries

Create a template for summarizing gene annotations:

llm templates edit summarize-gaf-for-gene

system: I will provide you with GAF for a gene. Summarize the function of the gene.
  Give one short description a biologist would understand.
  You may weave together multiple terms where there is redundancy.
  You should aim to be faithful to the GAF, but be aware that mistakes and over-annotation happen.
  If you see things that are unlikely, you can omit these.
  You may also produce some commentary at the end
  (e.g. 'the GAF showed annotation to X but this contradicts what is known about the gene')
  Do not focus on the evidence, or names, or IDs, or metadata about the annotation,
  just write the biological narrative.
  The exception is if this is really relevant (e.g. you may call into question a very old annotation if it
  does not make sense).
  Be aware that historically there has been over-annotation with experimental codes, for example, phenotypes from downstream effects.
  These are less relevant, and you should focus on the core activity, cellular process, and localization.
  You may however choose to briefly summarize phenotypic annotations (e.g. the role of G in process P has downstream effects E1, ...).
  Use your judgment to explain the story biologically rather than simply regurgitating terms.
  Note that the IBA code (Inferred from Biological aspect of Ancestor) reflects high quality annotations in many species because these terms
  have been reviewed in a phylogenetic context and checked for over-annotation.
  But note that IBAs may sometimes be less complete, especially for organism-specific knowledge.
  Use your own biological knowledge.
  If aspects of the model are not clear, or you think there are errors, then at the end of your summary report on problems or anything that was not clear.
[8]:
!runoak -i amigo:NCBITaxon:9606 associations -p i,p -H  --expand GO:0009229
# Query IDs: GO:0009229
# Ontology closure predicates: rdfs:subClassOf, BFO:0000050
#
# The results include a round of expansion
#
subject predicate       object  property_values subject_label   predicate_label object_label    negated publications    evidence_type   supporting_objects      primary_knowledge_source        aggregator_knowledge_source     subject_closure subject_closure_label   object_closure  object_closure_label    comments
UniProtKB:Q9BZV2        biolink:related_to      GO:0009229              SLC19A3 None    thiamine diphosphate biosynthetic process       False   GO_REF:0000107  IEA             Ensembl infores:go
UniProtKB:Q9H3S4        biolink:related_to      GO:0009229              TPK1    None    thiamine diphosphate biosynthetic process       False   GO_REF:0000041  IEA             UniProt infores:go
UniProtKB:Q9H3S4        biolink:related_to      GO:0009229              TPK1    None    thiamine diphosphate biosynthetic process       False   PMID:11342111   IDA             UniProt infores:go
UniProtKB:Q9H3S4        biolink:related_to      GO:0009229              TPK1    None    thiamine diphosphate biosynthetic process       False   PMID:38547260   IDA             UniProt infores:go
UniProtKB:Q9H3S4        biolink:related_to      GO:0009229              TPK1    None    thiamine diphosphate biosynthetic process       False   GO_REF:0000033  IBA             GO_Central      infores:go
UniProtKB:Q9HC21        biolink:related_to      GO:0009229              SLC25A19        None    thiamine diphosphate biosynthetic process       False   GO_REF:0000107  IEA             Ensembl infores:go
UniProtKB:O60779        biolink:related_to      GO:0009229              SLC19A2 None    thiamine diphosphate biosynthetic process       False   GO_REF:0000107  IEA             Ensembl infores:go
UniProtKB:Q9BU02        biolink:related_to      GO:0009229              THTPA   None    thiamine diphosphate biosynthetic process       False   GO_REF:0000107  IEA             Ensembl infores:go
UniProtKB:Q9BZV2        biolink:related_to      GO:0005515              SLC19A3 None    protein binding False   PMID:32296183   IPI             IntAct  infores:go
UniProtKB:Q9BZV2        biolink:related_to      GO:0005515              SLC19A3 None    protein binding False   PMID:32296183   IPI             IntAct  infores:go
UniProtKB:Q9BZV2        biolink:related_to      GO:0005515              SLC19A3 None    protein binding False   PMID:32296183   IPI             IntAct  infores:go
UniProtKB:Q9BZV2        biolink:related_to      GO:0005515              SLC19A3 None    protein binding False   PMID:32296183   IPI             IntAct  infores:go
UniProtKB:Q9BZV2        biolink:related_to      GO:0005515              SLC19A3 None    protein binding False   PMID:32296183   IPI             IntAct  infores:go
UniProtKB:Q9BZV2        biolink:related_to      GO:0005515              SLC19A3 None    protein binding False   PMID:32296183   IPI             IntAct  infores:go
UniProtKB:Q9BZV2        biolink:related_to      GO:0005515              SLC19A3 None    protein binding False   PMID:32296183   IPI             IntAct  infores:go
UniProtKB:Q9BZV2        biolink:related_to      GO:0005515              SLC19A3 None    protein binding False   PMID:32296183   IPI             IntAct  infores:go
UniProtKB:Q9BZV2        biolink:related_to      GO:0005515              SLC19A3 None    protein binding False   PMID:32296183   IPI             IntAct  infores:go
UniProtKB:Q9BZV2        biolink:related_to      GO:0005515              SLC19A3 None    protein binding False   PMID:32296183   IPI             IntAct  infores:go
UniProtKB:Q9BZV2        biolink:related_to      GO:0005515              SLC19A3 None    protein binding False   PMID:32296183   IPI             IntAct  infores:go
UniProtKB:Q9BZV2        biolink:related_to      GO:0015234              SLC19A3 None    thiamine transmembrane transporter activity     False   GO_REF:0000024  ISS             BHF-UCL infores:go
UniProtKB:Q9BZV2        biolink:related_to      GO:0015234              SLC19A3 None    thiamine transmembrane transporter activity     False   Reactome:R-HSA-199626   TAS             Reactome        infores:go
UniProtKB:Q9BZV2        biolink:related_to      GO:0009229              SLC19A3 None    thiamine diphosphate biosynthetic process       False   GO_REF:0000107  IEA             Ensembl infores:go
UniProtKB:Q9BZV2        biolink:related_to      GO:0015888              SLC19A3 None    thiamine transport      False   PMID:11731220   IDA             UniProt infores:go
UniProtKB:Q9BZV2        biolink:related_to      GO:0015888              SLC19A3 None    thiamine transport      False   PMID:33008889   IDA             UniProt infores:go
UniProtKB:Q9BZV2        biolink:related_to      GO:0015888              SLC19A3 None    thiamine transport      False   PMID:35512554   IDA             UniProt infores:go
UniProtKB:Q9BZV2        biolink:related_to      GO:0015888              SLC19A3 None    thiamine transport      False   PMID:35724964   IMP             UniProt infores:go
UniProtKB:Q9BZV2        biolink:related_to      GO:0031923              SLC19A3 None    pyridoxine transport    False   PMID:33008889   IDA             UniProt infores:go
UniProtKB:Q9BZV2        biolink:related_to      GO:0031923              SLC19A3 None    pyridoxine transport    False   PMID:35512554   IDA             UniProt infores:go
UniProtKB:Q9BZV2        biolink:related_to      GO:0031923              SLC19A3 None    pyridoxine transport    False   PMID:35724964   IMP             UniProt infores:go
UniProtKB:Q9BZV2        biolink:related_to      GO:0031923              SLC19A3 None    pyridoxine transport    False   PMID:36456177   IDA             UniProt infores:go
UniProtKB:Q9BZV2        biolink:related_to      GO:0042723              SLC19A3 None    thiamine-containing compound metabolic process  False   Reactome:R-HSA-196819   TAS             Reactome        infores:go
UniProtKB:Q9BZV2        biolink:related_to      GO:0071934              SLC19A3 None    thiamine transmembrane transport        False   GO_REF:0000024  ISS             BHF-UCL infores:go
UniProtKB:Q9BZV2        biolink:related_to      GO:0005886              SLC19A3 None    plasma membrane False   Reactome:R-HSA-199626   TAS             Reactome        infores:go
UniProtKB:Q9BZV2        biolink:related_to      GO:0016020              SLC19A3 None    membrane        False   PMID:11136550   NAS             UniProt infores:go
UniProtKB:Q9BZV2        biolink:related_to      GO:0005886              SLC19A3 None    plasma membrane False   GO_REF:0000033  IBA             GO_Central      infores:go
UniProtKB:Q9BZV2        biolink:related_to      GO:0055085              SLC19A3 None    transmembrane transport False   GO_REF:0000033  IBA             GO_Central      infores:go
UniProtKB:Q9BZV2        biolink:related_to      GO:0015234              SLC19A3 None    thiamine transmembrane transporter activity     False   GO_REF:0000033  IBA             GO_Central      infores:go
UniProtKB:Q9H3S4        biolink:related_to      GO:0004788              TPK1    None    thiamine diphosphokinase activity       False   PMID:11342111   IDA             UniProt infores:go
UniProtKB:Q9H3S4        biolink:related_to      GO:0004788              TPK1    None    thiamine diphosphokinase activity       False   PMID:38547260   IDA             UniProt infores:go
UniProtKB:Q9H3S4        biolink:related_to      GO:0005515              TPK1    None    protein binding False   PMID:32296183   IPI             IntAct  infores:go
UniProtKB:Q9H3S4        biolink:related_to      GO:0005524              TPK1    None    ATP binding     False   GO_REF:0000043  IEA             UniProt infores:go
UniProtKB:Q9H3S4        biolink:related_to      GO:0016301              TPK1    None    kinase activity False   GO_REF:0000043  IEA             UniProt infores:go
UniProtKB:Q9H3S4        biolink:related_to      GO:0030975              TPK1    None    thiamine binding        False   GO_REF:0000002  IEA             InterPro        infores:go
UniProtKB:Q9H3S4        biolink:related_to      GO:0042802              TPK1    None    identical protein binding       False   PMID:25502805   IPI             IntAct  infores:go
UniProtKB:Q9H3S4        biolink:related_to      GO:0042802              TPK1    None    identical protein binding       False   PMID:29892012   IPI             IntAct  infores:go
UniProtKB:Q9H3S4        biolink:related_to      GO:0042802              TPK1    None    identical protein binding       False   PMID:31515488   IPI             IntAct  infores:go
UniProtKB:Q9H3S4        biolink:related_to      GO:0042802              TPK1    None    identical protein binding       False   PMID:32296183   IPI             IntAct  infores:go
UniProtKB:Q9H3S4        biolink:related_to      GO:0141200              TPK1    None    UTP thiamine diphosphokinase activity   False   PMID:38547260   IDA             UniProt infores:go
UniProtKB:Q9H3S4        biolink:related_to      GO:0006772              TPK1    None    thiamine metabolic process      False   GO_REF:0000107  IEA             Ensembl infores:go
UniProtKB:Q9H3S4        biolink:related_to      GO:0009229              TPK1    None    thiamine diphosphate biosynthetic process       False   GO_REF:0000041  IEA             UniProt infores:go
UniProtKB:Q9H3S4        biolink:related_to      GO:0009229              TPK1    None    thiamine diphosphate biosynthetic process       False   PMID:11342111   IDA             UniProt infores:go
UniProtKB:Q9H3S4        biolink:related_to      GO:0009229              TPK1    None    thiamine diphosphate biosynthetic process       False   PMID:38547260   IDA             UniProt infores:go
UniProtKB:Q9H3S4        biolink:related_to      GO:0010510              TPK1    None    regulation of acetyl-CoA biosynthetic process from pyruvate     False   PMID:38547260   IMP             UniProt infores:go
UniProtKB:Q9H3S4        biolink:related_to      GO:0005829              TPK1    None    cytosol False   Reactome:R-HSA-196761   TAS             Reactome        infores:go
UniProtKB:Q9H3S4        biolink:related_to      GO:0004788              TPK1    None    thiamine diphosphokinase activity       False   GO_REF:0000033  IBA             GO_Central      infores:go
UniProtKB:Q9H3S4        biolink:related_to      GO:0009229              TPK1    None    thiamine diphosphate biosynthetic process       False   GO_REF:0000033  IBA             GO_Central      infores:go
UniProtKB:Q9HC21        biolink:not     GO:0030233              SLC25A19        None    deoxynucleotide transmembrane transporter activity      True    PMID:15539640   IDA             UniProt infores:go
UniProtKB:Q9HC21        biolink:not     GO:0030233              SLC25A19        None    deoxynucleotide transmembrane transporter activity      True    PMID:17035501   IDA             UniProt infores:go
UniProtKB:Q9HC21        biolink:not     GO:0030302              SLC25A19        None    deoxynucleotide transport       True    PMID:15539640   IDA             UniProt infores:go
UniProtKB:Q9HC21        biolink:related_to      GO:0015297              SLC25A19        None    antiporter activity     False   GO_REF:0000043  IEA             UniProt infores:go
UniProtKB:Q9HC21        biolink:related_to      GO:0030233              SLC25A19        None    deoxynucleotide transmembrane transporter activity      False   PMID:11226231   TAS             UniProt infores:go
UniProtKB:Q9HC21        biolink:related_to      GO:0090422              SLC25A19        None    thiamine pyrophosphate transmembrane transporter activity       False   PMID:17035501   IDA             UniProt infores:go
UniProtKB:Q9HC21        biolink:related_to      GO:0090422              SLC25A19        None    thiamine pyrophosphate transmembrane transporter activity       False   Reactome:R-HSA-8875838  TAS             Reactome        infores:go
UniProtKB:Q9HC21        biolink:related_to      GO:0009229              SLC25A19        None    thiamine diphosphate biosynthetic process       False   GO_REF:0000107  IEA             Ensembl infores:go
UniProtKB:Q9HC21        biolink:related_to      GO:0030302              SLC25A19        None    deoxynucleotide transport       False   PMID:11226231   NAS             UniProt infores:go
UniProtKB:Q9HC21        biolink:related_to      GO:0030974              SLC25A19        None    thiamine pyrophosphate transmembrane transport  False   PMID:17035501   IDA             UniProt infores:go
UniProtKB:Q9HC21        biolink:related_to      GO:0042723              SLC25A19        None    thiamine-containing compound metabolic process  False   Reactome:R-HSA-196819   TAS             Reactome        infores:go
UniProtKB:Q9HC21        biolink:related_to      GO:0005634              SLC25A19        None    nucleus False   PMID:21630459   HDA             UniProt infores:go
UniProtKB:Q9HC21        biolink:related_to      GO:0005739              SLC25A19        None    mitochondrion   False   GO_REF:0000052  IDA             HPA     infores:go
UniProtKB:Q9HC21        biolink:related_to      GO:0005739              SLC25A19        None    mitochondrion   False   GO_REF:0000052  IDA             HPA     infores:go
UniProtKB:Q9HC21        biolink:related_to      GO:0005739              SLC25A19        None    mitochondrion   False   GO_REF:0000052  IDA             HPA     infores:go
UniProtKB:Q9HC21        biolink:related_to      GO:0005739              SLC25A19        None    mitochondrion   False   PMID:15539640   IDA             UniProt infores:go
UniProtKB:Q9HC21        biolink:related_to      GO:0005739              SLC25A19        None    mitochondrion   False   PMID:31506564   IDA             UniProt infores:go
UniProtKB:Q9HC21        biolink:related_to      GO:0005739              SLC25A19        None    mitochondrion   False   PMID:34800366   HTP             FlyBase infores:go
UniProtKB:Q9HC21        biolink:related_to      GO:0005743              SLC25A19        None    mitochondrial inner membrane    False   Reactome:R-HSA-8875838  TAS             Reactome        infores:go
UniProtKB:Q9HC21        biolink:related_to      GO:0030974              SLC25A19        None    thiamine pyrophosphate transmembrane transport  False   GO_REF:0000033  IBA             GO_Central      infores:go
UniProtKB:Q9HC21        biolink:related_to      GO:0015234              SLC25A19        None    thiamine transmembrane transporter activity     False   GO_REF:0000033  IBA             GO_Central      infores:go
UniProtKB:Q9HC21        biolink:related_to      GO:0005743              SLC25A19        None    mitochondrial inner membrane    False   GO_REF:0000033  IBA             GO_Central      infores:go
UniProtKB:O60779        biolink:related_to      GO:0005515              SLC19A2 None    protein binding False   PMID:21836059   IPI             UniProt infores:go
UniProtKB:O60779        biolink:related_to      GO:0005515              SLC19A2 None    protein binding False   PMID:21836059   IPI             IntAct  infores:go
UniProtKB:O60779        biolink:related_to      GO:0008517              SLC19A2 None    folic acid transmembrane transporter activity   False   PMID:10542220   NAS             UniProt infores:go
UniProtKB:O60779        biolink:related_to      GO:0015234              SLC19A2 None    thiamine transmembrane transporter activity     False   GO_REF:0000024  ISS             BHF-UCL infores:go
UniProtKB:O60779        biolink:related_to      GO:0015234              SLC19A2 None    thiamine transmembrane transporter activity     False   PMID:10542220   TAS             UniProt infores:go
UniProtKB:O60779        biolink:related_to      GO:0015234              SLC19A2 None    thiamine transmembrane transporter activity     False   PMID:21836059   IDA             UniProt infores:go
UniProtKB:O60779        biolink:related_to      GO:0015234              SLC19A2 None    thiamine transmembrane transporter activity     False   Reactome:R-HSA-199626   TAS             Reactome        infores:go
UniProtKB:O60779        biolink:related_to      GO:0007283              SLC19A2 None    spermatogenesis False   GO_REF:0000107  IEA             Ensembl infores:go
UniProtKB:O60779        biolink:related_to      GO:0009229              SLC19A2 None    thiamine diphosphate biosynthetic process       False   GO_REF:0000107  IEA             Ensembl infores:go
UniProtKB:O60779        biolink:related_to      GO:0015884              SLC19A2 None    folic acid transport    False   GO_REF:0000108  IEA             GOC     infores:go
UniProtKB:O60779        biolink:related_to      GO:0015888              SLC19A2 None    thiamine transport      False   PMID:10391222   IMP             UniProt infores:go
UniProtKB:O60779        biolink:related_to      GO:0015888              SLC19A2 None    thiamine transport      False   PMID:10542220   IDA             UniProt infores:go
UniProtKB:O60779        biolink:related_to      GO:0015888              SLC19A2 None    thiamine transport      False   PMID:10542220   NAS             UniProt infores:go
UniProtKB:O60779        biolink:related_to      GO:0015888              SLC19A2 None    thiamine transport      False   PMID:33008889   IDA             UniProt infores:go
UniProtKB:O60779        biolink:related_to      GO:0015888              SLC19A2 None    thiamine transport      False   PMID:35512554   IDA             UniProt infores:go
UniProtKB:O60779        biolink:related_to      GO:0015888              SLC19A2 None    thiamine transport      False   PMID:35724964   IMP             UniProt infores:go
UniProtKB:O60779        biolink:related_to      GO:0031923              SLC19A2 None    pyridoxine transport    False   PMID:33008889   IDA             UniProt infores:go
UniProtKB:O60779        biolink:related_to      GO:0031923              SLC19A2 None    pyridoxine transport    False   PMID:35512554   IDA             UniProt infores:go
UniProtKB:O60779        biolink:related_to      GO:0042723              SLC19A2 None    thiamine-containing compound metabolic process  False   Reactome:R-HSA-196819   TAS             Reactome        infores:go
UniProtKB:O60779        biolink:related_to      GO:0071934              SLC19A2 None    thiamine transmembrane transport        False   GO_REF:0000024  ISS             BHF-UCL infores:go
UniProtKB:O60779        biolink:related_to      GO:0005886              SLC19A2 None    plasma membrane False   PMID:21836059   IDA             UniProt infores:go
UniProtKB:O60779        biolink:related_to      GO:0005886              SLC19A2 None    plasma membrane False   Reactome:R-HSA-199626   TAS             Reactome        infores:go
UniProtKB:O60779        biolink:related_to      GO:0016020              SLC19A2 None    membrane        False   PMID:10542220   NAS             UniProt infores:go
UniProtKB:O60779        biolink:related_to      GO:0005886              SLC19A2 None    plasma membrane False   GO_REF:0000033  IBA             GO_Central      infores:go
UniProtKB:O60779        biolink:related_to      GO:0015888              SLC19A2 None    thiamine transport      False   GO_REF:0000033  IBA             GO_Central      infores:go
UniProtKB:O60779        biolink:related_to      GO:0015234              SLC19A2 None    thiamine transmembrane transporter activity     False   GO_REF:0000033  IBA             GO_Central      infores:go
UniProtKB:O60779        biolink:related_to      GO:0055085              SLC19A2 None    transmembrane transport False   GO_REF:0000033  IBA             GO_Central      infores:go
UniProtKB:Q9BU02        biolink:related_to      GO:0000287              THTPA   None    magnesium ion binding   False   GO_REF:0000024  ISS             UniProt infores:go
UniProtKB:Q9BU02        biolink:related_to      GO:0005515              THTPA   None    protein binding False   PMID:32296183   IPI             IntAct  infores:go
UniProtKB:Q9BU02        biolink:related_to      GO:0016787              THTPA   None    hydrolase activity      False   PMID:11827967   TAS             UniProt infores:go
UniProtKB:Q9BU02        biolink:related_to      GO:0050333              THTPA   None    thiamine triphosphate phosphatase activity      False   PMID:11827967   IDA             UniProt infores:go
UniProtKB:Q9BU02        biolink:related_to      GO:0006091              THTPA   None    generation of precursor metabolites and energy  False   PMID:11827967   NAS             UniProt infores:go
UniProtKB:Q9BU02        biolink:related_to      GO:0006772              THTPA   None    thiamine metabolic process      False   PMID:11827967   TAS             UniProt infores:go
UniProtKB:Q9BU02        biolink:related_to      GO:0009229              THTPA   None    thiamine diphosphate biosynthetic process       False   GO_REF:0000107  IEA             Ensembl infores:go
UniProtKB:Q9BU02        biolink:related_to      GO:0016311              THTPA   None    dephosphorylation       False   PMID:11827967   IDA             UniProt infores:go
UniProtKB:Q9BU02        biolink:related_to      GO:0042357              THTPA   None    thiamine diphosphate metabolic process  False   GO_REF:0000024  ISS             UniProt infores:go
UniProtKB:Q9BU02        biolink:related_to      GO:0005829              THTPA   None    cytosol False   Reactome:R-HSA-965067   TAS             Reactome        infores:go
UniProtKB:Q9BU02        biolink:related_to      GO:0000287              THTPA   None    magnesium ion binding   False   GO_REF:0000033  IBA             GO_Central      infores:go
UniProtKB:Q9BU02        biolink:related_to      GO:0050333              THTPA   None    thiamine triphosphate phosphatase activity      False   GO_REF:0000033  IBA             GO_Central      infores:go
UniProtKB:Q9BU02        biolink:related_to      GO:0042357              THTPA   None    thiamine diphosphate metabolic process  False   GO_REF:0000033  IBA             GO_Central      infores:go
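
The listing above repeats many rows (for example, the same protein binding annotation for SLC19A3 appears many times), and the template asks the model not to dwell on IDs or publications. One option is to collapse the table before prompting. The sketch below assumes the output is tab-separated, that lines starting with `#` are comments, and that the `subject_label`, `object_label`, `negated`, and `evidence_type` columns are named as in the header shown above. The function name `compress_associations` is hypothetical.

```python
import csv
import io
from collections import defaultdict


def compress_associations(tsv_text: str) -> str:
    """Collapse association rows to one line per (gene, term, negated)."""
    # Drop blank lines and the '#' comment lines that precede the header row.
    lines = [ln for ln in tsv_text.splitlines() if ln and not ln.startswith("#")]
    reader = csv.DictReader(
        io.StringIO("\n".join(lines)), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    # (gene, term, negated) -> set of evidence codes seen for that pair
    evidence = defaultdict(set)
    for row in reader:
        key = (row["subject_label"], row["object_label"], row["negated"] == "True")
        evidence[key].add(row["evidence_type"])
    out = []
    for (gene, term, negated), codes in evidence.items():
        prefix = "NOT " if negated else ""
        out.append(f"{gene}\t{prefix}{term}\t{','.join(sorted(codes))}")
    return "\n".join(out)


# Abbreviated sample: only the first ten columns of the real header.
header = ["subject", "predicate", "object", "property_values", "subject_label",
          "predicate_label", "object_label", "negated", "publications",
          "evidence_type"]
rows = [
    ["UniProtKB:Q9BZV2", "biolink:related_to", "GO:0005515", "", "SLC19A3",
     "None", "protein binding", "False", "PMID:32296183", "IPI"],
    ["UniProtKB:Q9BZV2", "biolink:related_to", "GO:0005515", "", "SLC19A3",
     "None", "protein binding", "False", "PMID:32296183", "IPI"],
    ["UniProtKB:Q9BZV2", "biolink:related_to", "GO:0015888", "", "SLC19A3",
     "None", "thiamine transport", "False", "PMID:11731220", "IDA"],
    ["UniProtKB:Q9BZV2", "biolink:related_to", "GO:0015888", "", "SLC19A3",
     "None", "thiamine transport", "False", "PMID:35724964", "IMP"],
    ["UniProtKB:Q9HC21", "biolink:not", "GO:0030302", "", "SLC25A19",
     "None", "deoxynucleotide transport", "True", "PMID:15539640", "IDA"],
]
sample = "# Query IDs: GO:0009229\n#\n" + "\n".join(
    "\t".join(r) for r in [header] + rows
)
print(compress_associations(sample))
```

The compressed text keeps negation and evidence codes, which the template does refer to, while dropping publication IDs. Whether that trade-off is appropriate depends on how much of the provenance you want the model to comment on.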
[11]:
!runoak -i amigo:NCBITaxon:9606 associations -p i,p -H  --expand GO:0009229 | llm -m 4o -t summarize-gaf-for-gene
The gene products associated with the thiamine diphosphate biosynthetic process (GO:0009229) involve several proteins including SLC19A3, TPK1, SLC25A19, SLC19A2, and THTPA, each contributing to the synthesis and transport of thiamine-related compounds in different capacities.

SLC19A3 is primarily involved in the transport of thiamine and pyridoxine across the plasma membrane, displaying thiamine transmembrane transporter activity. It facilitates thiamine transmembrane transport and is located on the plasma membrane. This protein also plays a role in thiamine transport in general, contributing to the overall metabolic process of thiamine-containing compounds.

TPK1 (thiamine pyrophosphate kinase) catalyzes the ATP-dependent phosphorylation of thiamine to form thiamine diphosphate, which is the active form of thiamine used as a coenzyme in various enzymatic reactions. It has been detected in the cytosol and also possesses kinase and ATP binding activity.

SLC25A19 is responsible for the transmembrane transport of thiamine pyrophosphate into mitochondria. It shows activity as a thiamine pyrophosphate transmembrane transporter, indicating its critical role in mitochondrial functions related to thiamine metabolism. This protein is associated with the mitochondrial inner membrane, further evidencing its role in mitochondrial transport processes.

SLC19A2 serves as both a thiamine and folic acid transporter at the plasma membrane, contributing to the transport of these vitamins into cells, supporting both thiamine and folic acid metabolic processes.

THTPA functions in the conversion of thiamine triphosphate to thiamine diphosphate, acting as a thiamine triphosphate phosphatase. Involved in broad thiamine metabolic processes, it possesses hydrolase activity, with a focus on dephosphorylation reactions. It has been primarily located in the cytosol.

Anomalous annotations include SLC19A2 being associated with spermatogenesis, which may not directly relate to its primary functions in vitamin transport. Additionally, some annotations suggest contradictory activities, such as SLC25A19 annotations to deoxynucleotide transport, which have been negated in other studies.

Overall, these genes coordinate to manage the cellular and systemic availability of thiamine diphosphate, ensuring proper metabolic processes that utilize this crucial co-factor.