OAK associations command

This notebook is intended as a supplement to the main OAK CLI docs.

This notebook provides examples for the associations command, which offers several ways of querying associations.

For more on associations, see Associations and Curated Annotations in the OAK guide.

For more on command line usage in general, see the Command Line Tutorial.

Help Option

You can get help on any OAK command using --help:

[1]:
!runoak associations --help
Usage: runoak associations [OPTIONS] [TERMS]...

  Lookup associations from or to entities.

  Example:

      runoak -i sqlite:obo:hp -g test.hpoa -G hpoa associations

  The above will show all associations

  To query using an ontology term, including is-a closure, specify one or more
  terms or term queries, plus the closure predicate(s), e.g.

  Example:

      runoak -i sqlite:obo:hp -g test.hpoa -G hpoa associations -p i
      HP:0001392

  This shows all annotations either to "Abnormality of the liver"
  (HP:0001392), or to is-a descendants.

  Using input specifications:

  It can be awkward to specify both input ontology and association path and
  format. You can use input specifications to bundle common combinations of
  inputs together.

  For example, the go-dictybase-input-spec combines go plus dictybase
  associations.

  Example:

      runoak --i src/oaklib/conf/go-dictybase-input-spec.yaml associations -p
      i,p GO:0008104

  More examples:

     https://github.com/INCATools/ontology-access-kit/blob/main/notebooks/Commands/Associations.ipynb

Options:
  -o, --output FILENAME           Output file, e.g. obo file
  -p, --predicates TEXT           A comma-separated list of predicates. This
                                  may be a shorthand (i, p) or CURIE
  --autolabel / --no-autolabel    If set, results will automatically have
                                  labels assigned  [default: autolabel]
  -O, --output-type TEXT          Desired output type
  -o, --output FILENAME           Output file, e.g. obo file
  --if-absent [absent-only|present-only]
                                  determines behavior when the value is not
                                  present or is empty.
  -S, --set-value TEXT            the value to set for all terms for the given
                                  property.
  --association-predicates TEXT   A comma-separated list of predicates for the
                                  association relation
  -Q, --terms-role [subject|object|both]
                                  How to interpret query terms.  [default:
                                  object]
  --help                          Show this message and exit.

Set up an alias

We will set up an alias for running OAK bound to GO for the purposes of this notebook:

[2]:
alias go runoak -i sqlite:obo:go
[3]:
go ontology-metadata --all
id:
- obo:go/extensions/go-plus.owl
dce:description:
- The Gene Ontology (GO) provides a framework and set of concepts for describing the
  functions of gene products from all organisms.
dce:title:
- Gene Ontology
dcterms:license:
- <http://creativecommons.org/licenses/by/4.0/>
oio:default-namespace:
- gene_ontology
oio:hasOBOFormatVersion:
- '1.2'
owl:versionIRI:
- obo:go/releases/2023-04-01/extensions/go-plus.owl
owl:versionInfo:
- '2023-04-01'
rdf:type:
- owl:Ontology
sh:prefix:
- obo
schema:url:
- http://purl.obolibrary.org/obo/go/extensions/go-plus.owl
rdfs:isDefinedBy:
- http://purl.obolibrary.org/obo/obo.owl
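
The alias above uses the notebook's own alias syntax. Outside a notebook, a plain shell function can play the same role. The following is a minimal sketch, assuming a POSIX-style shell with runoak on the PATH; the function name go simply mirrors the alias, and it would shadow the Go toolchain's go command if that is installed, so pick another name if needed.

```shell
# Shell-function counterpart of the notebook alias (an assumption, not part of OAK).
# "$@" forwards every argument unchanged, so quoted labels such as
# "kinase activity" stay a single argument.
go() {
  runoak -i sqlite:obo:go "$@"
}

# Usage, once runoak is installed:
#   go info "kinase activity"
```

A function is used rather than a shell alias because functions also work in non-interactive scripts.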

Check that queries work

[4]:
go info "kinase activity"
GO:0016301 ! kinase activity

Query for associations to a gene

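Here we query a previously downloaded GAF file for all associations to a gene. Query terms are interpreted as objects by default, so we pass -Q subject to treat the gene ID as the subject of the association.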

[8]:
go -g input/gene_association.sgd.gaf -G gaf associations -Q subject SGD:S000004294 -O csv | head -20
subject predicate       object  object_label    property_values subject_label   predicate_label negated publications    primary_knowledge_source        aggregator_knowledge_source
SGD:S000004294  None    GO:0003824      None            MET17   None    None    SGD_REF:S000124036      infores:InterPro        None
SGD:S000004294  None    GO:0003824      None            MET17   None    None    SGD_REF:S000124036      infores:InterPro        None
SGD:S000004294  None    GO:0003824      None            MET17   None    None    SGD_REF:S000148669      infores:UniProt None
SGD:S000004294  None    GO:0005737      None            MET17   None    None    SGD_REF:S000148669      infores:UniProt None
SGD:S000004294  None    GO:0005737      None            MET17   None    None    SGD_REF:S000148671      infores:UniProt None
SGD:S000004294  None    GO:0005737      None            MET17   None    None    SGD_REF:S000069459|PMID:11914276        infores:SGD     None
SGD:S000004294  None    GO:0005737      None            MET17   None    None    SGD_REF:S000069459|PMID:11914276        infores:SGD     None
SGD:S000004294  None    GO:0016765      None            MET17   None    None    SGD_REF:S000124036      infores:InterPro        None
SGD:S000004294  None    GO:0030170      None            MET17   None    None    SGD_REF:S000124036      infores:InterPro        None
SGD:S000004294  None    GO:0030170      None            MET17   None    None    SGD_REF:S000185201      infores:GO_Central      None
SGD:S000004294  None    GO:0006520      None            MET17   None    None    SGD_REF:S000124036      infores:InterPro        None
SGD:S000004294  None    GO:0008152      None            MET17   None    None    SGD_REF:S000148669      infores:UniProt None
SGD:S000004294  None    GO:0008652      None            MET17   None    None    SGD_REF:S000148669      infores:UniProt None
SGD:S000004294  None    GO:0009086      None            MET17   None    None    SGD_REF:S000148669      infores:UniProt None
SGD:S000004294  None    GO:0019344      None            MET17   None    None    SGD_REF:S000148669      infores:UniProt None
SGD:S000004294  None    GO:0019344      None            MET17   None    None    SGD_REF:S000075748|PMID:15042590        infores:SGD     None
SGD:S000004294  None    GO:0071266      None            MET17   None    None    SGD_REF:S000204515      infores:GOC     None
SGD:S000004294  None    GO:0003961      None            MET17   None    None    SGD_REF:S000124037      infores:UniProt None
SGD:S000004294  None    GO:0003961      None            MET17   None    None    SGD_REF:S000057063|PMID:3299001 infores:SGD     None

Query for associations to a term

In contrast to gene queries, term queries should make use of ontology relationships. In particular, we typically want to include all is-a and part-of descendants of the query term, which is what -p i,p requests.

[9]:
go -g input/gene_association.sgd.gaf -G gaf associations -p i,p "kinase activity" -O csv | head -20
subject predicate       object  object_label    property_values subject_label   predicate_label negated publications    primary_knowledge_source        aggregator_knowledge_source
SGD:S000001369  None    GO:0016301      None            PFK26   None    None    SGD_REF:S000148669      infores:UniProt None
SGD:S000001369  None    GO:0003873      None            PFK26   None    None    SGD_REF:S000124037      infores:UniProt None
SGD:S000001369  None    GO:0003873      None            PFK26   None    None    SGD_REF:S000124036      infores:InterPro        None
SGD:S000001369  None    GO:0003873      None            PFK26   None    None    SGD_REF:S000051318|PMID:1322693 infores:SGD     None
SGD:S000001369  None    GO:0003873      None            PFK26   None    None    SGD_REF:S000048479|PMID:1657152 infores:SGD     None
SGD:S000002318  None    GO:0004708      None            STE7    None    None    SGD_REF:S000041791|PMID:8668180 infores:SGD     None
SGD:S000002318  None    GO:0004708      None            STE7    None    None    SGD_REF:S000045748|PMID:8384702 infores:SGD     None
SGD:S000003272  None    GO:0004707      None            KSS1    None    None    SGD_REF:S000041791|PMID:8668180 infores:SGD     None
SGD:S000003272  None    GO:0004707      None            KSS1    None    None    SGD_REF:S000045641|PMID:8918885 infores:SGD     None
SGD:S000003272  None    GO:0004707      None            KSS1    None    None    SGD_REF:S000124037      infores:UniProt None
SGD:S000003272  None    GO:0004707      None            KSS1    None    None    SGD_REF:S000124036      infores:InterPro        None
SGD:S000006124  None    GO:0004672      None            TPK2    None    None    SGD_REF:S000124036      infores:InterPro        None
SGD:S000006124  None    GO:0004672      None            TPK2    None    None    SGD_REF:S000124036      infores:InterPro        None
SGD:S000006124  None    GO:0004672      None            TPK2    None    None    SGD_REF:S000113918|PMID:16319894        infores:SGD     None
SGD:S000006124  None    GO:0004672      None            TPK2    None    None    SGD_REF:S000113918|PMID:16319894        infores:SGD     None
SGD:S000002318  None    GO:0016301      None            STE7    None    None    SGD_REF:S000148669      infores:UniProt None
SGD:S000000364  None    GO:0004674      None            CDC28   None    None    SGD_REF:S000086178|PMID:16096060        infores:SGD     None
SGD:S000000364  None    GO:0004674      None            CDC28   None    None    SGD_REF:S000146417|PMID:21841787        infores:SGD     None
SGD:S000000364  None    GO:0004674      None            CDC28   None    None    SGD_REF:S000149310|PMID:22521784        infores:SGD     None

Note that including part-of (p) makes no difference within the molecular function (MF) hierarchy in GO, but it makes a big difference in the other two branches (biological process and cellular component).
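
To make the idea of a closure concrete, here is a toy sketch in plain shell and awk. It is not how OAK implements closures. The two is-a edges and three annotations are a hand-picked, simplified subset used only for illustration, and the helper name descendants is hypothetical.

```shell
# Toy is-a edges, one "child parent" pair per line (illustrative subset).
edges='GO:0004672 GO:0016301
GO:0004674 GO:0004672'

# Toy annotations, one "gene term" pair per line.
annotations='PFK26 GO:0016301
TPK2 GO:0004672
CDC28 GO:0004674'

# Print the root term plus all of its is-a descendants (reflexive closure).
descendants() {
  printf '%s\n' "$edges" | awk -v root="$1" '
    { child[NR] = $1; parent[NR] = $2 }
    END {
      seen[root] = 1
      changed = 1
      # Keep sweeping the edge list until no new descendant is found.
      while (changed) {
        changed = 0
        for (i = 1; i <= NR; i++) {
          if ((parent[i] in seen) && !(child[i] in seen)) {
            seen[child[i]] = 1
            changed = 1
          }
        }
      }
      for (term in seen) print term
    }' | sort
}

# Direct query: only annotations made to the term itself.
direct_count=$(printf '%s\n' "$annotations" |
  awk -v term="GO:0016301" '$2 == term { n++ } END { print n + 0 }')

# Closure query: annotations to the term or any of its descendants.
closure_terms=$(descendants GO:0016301 | tr '\n' ' ')
closure_count=$(printf '%s\n' "$annotations" |
  awk -v terms="$closure_terms" '
    BEGIN { n = split(terms, a, " "); for (i = 1; i <= n; i++) want[a[i]] = 1 }
    ($2 in want) { hits++ }
    END { print hits + 0 }')

echo "direct: $direct_count, with closure: $closure_count"
```

In this toy data the direct query finds one annotation, while the closure query finds all three.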

Important: closures make a big difference

Let’s compare the number of results with and without closures:

[10]:
go -g input/gene_association.sgd.gaf -G gaf associations -p i,p "kinase activity" -O csv | wc
    3209   32091  315394
[11]:
go -g input/gene_association.sgd.gaf -G gaf associations "kinase activity" -O csv | wc
     285    2851   26750
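
The first column of each wc result is the line count. As a small worked example, and assuming each output contains exactly one header line (as in the outputs shown earlier), the ratio of data rows can be computed like this:

```shell
# Line counts reported by wc above (each includes one header line).
with_closure_lines=3209
direct_lines=285

# Subtract the header from each count, then divide.
ratio=$(LC_ALL=C awk -v a="$with_closure_lines" -v b="$direct_lines" \
  'BEGIN { printf "%.1f", (a - 1) / (b - 1) }')

echo "$ratio"   # prints 11.3
```

In other words, the closure query returns roughly eleven times as many rows as the direct query.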

Complex Queries

We can use the OAK graph query language to specify exhaustive lists of direct terms.

For example, to retrieve annotations to any kinase activity that is not a protein kinase activity:

[12]:
go -g input/gene_association.sgd.gaf -G gaf associations  .desc//p=i "kinase activity" .not .desc//p=i "protein kinase activity" -O csv | head -30
subject predicate       object  object_label    property_values subject_label   predicate_label negated publications    primary_knowledge_source        aggregator_knowledge_source
SGD:S000001369  None    GO:0016301      None            PFK26   None    None    SGD_REF:S000148669      infores:UniProt None
SGD:S000001369  None    GO:0003873      None            PFK26   None    None    SGD_REF:S000124037      infores:UniProt None
SGD:S000001369  None    GO:0003873      None            PFK26   None    None    SGD_REF:S000124036      infores:InterPro        None
SGD:S000001369  None    GO:0003873      None            PFK26   None    None    SGD_REF:S000051318|PMID:1322693 infores:SGD     None
SGD:S000001369  None    GO:0003873      None            PFK26   None    None    SGD_REF:S000048479|PMID:1657152 infores:SGD     None
SGD:S000002318  None    GO:0016301      None            STE7    None    None    SGD_REF:S000148669      infores:UniProt None
SGD:S000000605  None    GO:0004618      None            PGK1    None    None    SGD_REF:S000058483|PMID:6254992 infores:SGD     None
SGD:S000000605  None    GO:0004618      None            PGK1    None    None    SGD_REF:S000124036      infores:InterPro        None
SGD:S000000605  None    GO:0004618      None            PGK1    None    None    SGD_REF:S000124036      infores:InterPro        None
SGD:S000000605  None    GO:0004618      None            PGK1    None    None    SGD_REF:S000124036      infores:InterPro        None
SGD:S000000605  None    GO:0004618      None            PGK1    None    None    SGD_REF:S000124036      infores:InterPro        None
SGD:S000000605  None    GO:0004618      None            PGK1    None    None    SGD_REF:S000124037      infores:UniProt None
SGD:S000003818  None    GO:0004798      None            CDC8    None    None    SGD_REF:S000053290|PMID:6088527 infores:SGD     None
SGD:S000003818  None    GO:0004798      None            CDC8    None    None    SGD_REF:S000049877|PMID:6094555 infores:SGD     None
SGD:S000003818  None    GO:0004798      None            CDC8    None    None    SGD_REF:S000130762|PMID:19540237        infores:SGD     None
SGD:S000003818  None    GO:0004798      None            CDC8    None    None    SGD_REF:S000130762|PMID:19540237        infores:SGD     None
SGD:S000003818  None    GO:0004798      None            CDC8    None    None    SGD_REF:S000053290|PMID:6088527 infores:SGD     None
SGD:S000003818  None    GO:0004798      None            CDC8    None    None    SGD_REF:S000042433|PMID:6091111 infores:SGD     None
SGD:S000003818  None    GO:0004798      None            CDC8    None    None    SGD_REF:S000042433|PMID:6091111 infores:SGD     None
SGD:S000003818  None    GO:0004798      None            CDC8    None    None    SGD_REF:S000124037      infores:UniProt None
SGD:S000003818  None    GO:0004798      None            CDC8    None    None    SGD_REF:S000124036      infores:InterPro        None
SGD:S000003818  None    GO:0004798      None            CDC8    None    None    SGD_REF:S000124036      infores:InterPro        None
SGD:S000003818  None    GO:0009041      None            CDC8    None    None    SGD_REF:S000049877|PMID:6094555 infores:SGD     None
SGD:S000002939  None    GO:0004594      None            CAB1    None    None    SGD_REF:S000124037      infores:UniProt None
SGD:S000002939  None    GO:0004594      None            CAB1    None    None    SGD_REF:S000124036      infores:InterPro        None
SGD:S000002939  None    GO:0004594      None            CAB1    None    None    SGD_REF:S000129524|PMID:19266201        infores:SGD     None
SGD:S000002939  None    GO:0004594      None            CAB1    None    None    SGD_REF:S000065722|PMID:9890959 infores:SGD     None
SGD:S000002939  None    GO:0004594      None            CAB1    None    None    SGD_REF:S000124037      infores:UniProt None
SGD:S000002939  None    GO:0004594      None            CAB1    None    None    SGD_REF:S000124036      infores:InterPro        None
SGD:S000001357  None    None    GO:0016301      kinase activity []
SGD:S000001304  None    None    GO:0016301      kinase activity []
SGD:S000006258  None    None    GO:0016301      kinase activity []
SGD:S000006135  None    None    GO:0016301      kinase activity []
SGD:S000003437  None    None    GO:0016301      kinase activity []
SGD:S000003636  None    None    GO:0016301      kinase activity []
SGD:S000001861  None    None    GO:0016301      kinase activity []
SGD:S000003593  None    None    GO:0016301      kinase activity []
SGD:S000003820  None    None    GO:0016301      kinase activity []
SGD:S000003866  None    None    GO:0016301      kinase activity []
SGD:S000002237  None    None    GO:0016301      kinase activity []
SGD:S000003027  None    None    GO:0016301      kinase activity []
SGD:S000003494  None    None    GO:0016301      kinase activity []
SGD:S000005310  None    None    GO:0016301      kinase activity []
SGD:S000005330  None    None    GO:0016301      kinase activity []
SGD:S000005200  None    None    GO:0016301      kinase activity []
SGD:S000005105  None    None    GO:0016301      kinase activity []
SGD:S000004965  None    None    GO:0016301      kinase activity []
SGD:S000004535  None    None    GO:0016301      kinase activity []
SGD:S000004535  None    None    GO:0050354      triokinase activity     []
SGD:S000001072  None    None    GO:0016301      kinase activity []
SGD:S000000871  None    None    GO:0016301      kinase activity []
SGD:S000002898  None    None    GO:0016301      kinase activity []
SGD:S000000999  None    None    GO:0016301      kinase activity []
SGD:S000002604  None    None    GO:0016301      kinase activity []
SGD:S000001622  None    None    GO:0016301      kinase activity []
SGD:S000002924  None    None    GO:0004396      hexokinase activity     []
SGD:S000002924  None    None    GO:0016301      kinase activity []
SGD:S000002516  None    None    GO:0016301      kinase activity []
SGD:S000004250  None    None    GO:0008481      sphinganine kinase activity     []
SGD:S000004250  None    None    GO:0003951      NAD+ kinase activity    []
SGD:S000004438  None    None    GO:0004396      hexokinase activity     []
SGD:S000004438  None    None    GO:0016301      kinase activity []
SGD:S000006109  None    None    GO:0003951      NAD+ kinase activity    []
SGD:S000003958  None    None    GO:0016301      kinase activity []
SGD:S000006179  None    None    GO:0016301      kinase activity []
SGD:S000006157  None    None    GO:0016301      kinase activity []
SGD:S000002325  None    None    GO:0016301      kinase activity []
SGD:S000002427  None    None    GO:0016301      kinase activity []
SGD:S000002183  None    None    GO:0016301      kinase activity []
SGD:S000006071  None    None    GO:0016301      kinase activity []
SGD:S000005645  None    None    GO:0016301      kinase activity []
SGD:S000005488  None    None    GO:0016301      kinase activity []
SGD:S000005460  None    None    GO:0016301      kinase activity []
SGD:S000005697  None    None    GO:0016301      kinase activity []
SGD:S000005697  None    None    GO:0008481      sphinganine kinase activity     []
SGD:S000002915  None    None    GO:0016301      kinase activity []
SGD:S000005422  None    None    GO:0016301      kinase activity []
SGD:S000005473  None    None    GO:0016301      kinase activity []
SGD:S000005496  None    None    GO:0016301      kinase activity []
SGD:S000005947  None    None    GO:0016301      kinase activity []
SGD:S000002885  None    None    GO:0016301      kinase activity []
SGD:S000003664  None    None    GO:0016301      kinase activity []
SGD:S000002416  None    None    GO:0004335      galactokinase activity  []
SGD:S000003272  None    None    GO:0016301      kinase activity []
SGD:S000004818  None    None    GO:0016301      kinase activity []
SGD:S000005878  None    None    GO:0016301      kinase activity []
SGD:S000003426  None    None    GO:0016301      kinase activity []
SGD:S000005874  None    None    GO:0016301      kinase activity []
SGD:S000002874  None    None    GO:0016301      kinase activity []
SGD:S000002554  None    None    GO:0016301      kinase activity []
SGD:S000002939  None    None    GO:0016301      kinase activity []
SGD:S000005793  None    None    GO:0016301      kinase activity []
SGD:S000000972  None    None    GO:0004017      adenylate kinase activity       []
SGD:S000000972  None    None    GO:0016301      kinase activity []
SGD:S000000972  None    None    GO:0019205      nucleobase-containing compound kinase activity  []
SGD:S000002644  None    None    GO:0016301      kinase activity []
SGD:S000002655  None    None    GO:0016301      kinase activity []
SGD:S000002656  None    None    GO:0016301      kinase activity []
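
The query above amounts to a set difference: the is-a descendants of "kinase activity" minus the is-a descendants of "protein kinase activity". The sketch below shows that logic in plain Python on a toy graph. It is an illustration only, not how OAK implements term queries; the `IS_A` mapping and the `is_a_descendants` helper are hypothetical. The term is treated as its own descendant, an assumption based on the output above, which includes annotations made directly to GO:0016301.

```python
# Hypothetical is-a edges (child -> parent); a toy graph, not real GO content.
IS_A = {
    "protein kinase activity": "kinase activity",
    "protein serine kinase activity": "protein kinase activity",
    "hexokinase activity": "kinase activity",
    "NAD+ kinase activity": "kinase activity",
}


def is_a_descendants(term):
    """Return the term and all of its is-a descendants in the toy graph."""
    result = {term}
    changed = True
    while changed:
        changed = False
        for child, parent in IS_A.items():
            if parent in result and child not in result:
                result.add(child)
                changed = True
    return result


# Analogue of: .desc//p=i "kinase activity" .not .desc//p=i "protein kinase activity"
selected = is_a_descendants("kinase activity") - is_a_descendants("protein kinase activity")
print(sorted(selected))
```

The resulting term list is then used as-is, so no closure predicate is needed on the associations command: every term of interest has already been enumerated.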

Querying via API

Some association sources provide an API, so rather than downloading an association file, you can have OAK query the API directly.

Note that API endpoints may not support all OAK options; e.g. the amigo endpoint currently forces you to use IDs:

[13]:
!runoak -i amigo:NCBITaxon:9606 associations -p i,p GO:0016301 | head -30
subject predicate       object  property_values subject_label   predicate_label object_label    negated publications    primary_knowledge_source        aggregator_knowledge_source
UniProtKB:Q13976        None    GO:0004672              PRKG1   None    protein kinase activity None    PMID:25447536   BHF-UCL infores:go
UniProtKB:Q13976        None    GO:0004692              PRKG1   None    cGMP-dependent protein kinase activity  None    PMID:21402151   UniProt infores:go
UniProtKB:Q13976        None    GO:0004692              PRKG1   None    cGMP-dependent protein kinase activity  None    Reactome:R-HSA-418442   Reactome        infores:go
UniProtKB:Q13976        None    GO:0106310              PRKG1   None    protein serine kinase activity  None    GO_REF:0000116  RHEA    infores:go
UniProtKB:Q9HCP0        None    GO:0004674              CSNK1G1 None    protein serine/threonine kinase activity        None    PMID:25500533   ParkinsonsUK-UCL        infores:go
UniProtKB:Q9HCP0        None    GO:0106310              CSNK1G1 None    protein serine kinase activity  None    GO_REF:0000116  RHEA    infores:go
UniProtKB:Q9HCP0        None    GO:0004674              CSNK1G1 None    protein serine/threonine kinase activity        None    PMID:21873635   GO_Central      infores:go
UniProtKB:Q8IWQ3        None    GO:0004674              BRSK2   None    protein serine/threonine kinase activity        None    GO_REF:0000024  ARUK-UCL        infores:go
UniProtKB:Q8IWQ3        None    GO:0004674              BRSK2   None    protein serine/threonine kinase activity        None    PMID:14976552   UniProt infores:go
UniProtKB:Q8IWQ3        None    GO:0050321              BRSK2   None    tau-protein kinase activity     None    GO_REF:0000024  UniProt infores:go
UniProtKB:Q8IWQ3        None    GO:0050321              BRSK2   None    tau-protein kinase activity     None    PMID:21985311   UniProt infores:go
UniProtKB:Q8IWQ3        None    GO:0050321              BRSK2   None    tau-protein kinase activity     None    PMID:28386764   ARUK-UCL        infores:go
UniProtKB:Q8IWQ3        None    GO:0106310              BRSK2   None    protein serine kinase activity  None    GO_REF:0000116  RHEA    infores:go
UniProtKB:Q8IWQ3        None    GO:0050321              BRSK2   None    tau-protein kinase activity     None    PMID:21873635   GO_Central      infores:go
UniProtKB:Q8IWQ3        None    GO:0004674              BRSK2   None    protein serine/threonine kinase activity        None    PMID:21873635   GO_Central      infores:go
UniProtKB:Q96PF2        None    GO:0004674              TSSK2   None    protein serine/threonine kinase activity        None    GO_REF:0000024  UniProt infores:go
UniProtKB:Q96PF2        None    GO:0004674              TSSK2   None    protein serine/threonine kinase activity        None    PMID:18533145   UniProt infores:go
UniProtKB:Q96PF2        None    GO:0004674              TSSK2   None    protein serine/threonine kinase activity        None    PMID:20729278   UniProt infores:go
UniProtKB:Q96PF2        None    GO:0106310              TSSK2   None    protein serine kinase activity  None    GO_REF:0000116  RHEA    infores:go
UniProtKB:Q96PF2        None    GO:0004674              TSSK2   None    protein serine/threonine kinase activity        None    PMID:21873635   GO_Central      infores:go
UniProtKB:P19525        None    GO:0004672              EIF2AK2 None    protein kinase activity None    PMID:12882984   UniProt infores:go
UniProtKB:P19525        None    GO:0004672              EIF2AK2 None    protein kinase activity None    PMID:15229216   UniProt infores:go
UniProtKB:P19525        None    GO:0004672              EIF2AK2 None    protein kinase activity None    PMID:18835251   UniProt infores:go
UniProtKB:P19525        None    GO:0004672              EIF2AK2 None    protein kinase activity None    PMID:21123651   UniProt infores:go
UniProtKB:P19525        None    GO:0004672              EIF2AK2 None    protein kinase activity None    PMID:248628414  UniProt infores:go
UniProtKB:P19525        None    GO:0004674              EIF2AK2 None    protein serine/threonine kinase activity        None    PMID:1695551    PINC    infores:go
UniProtKB:P19525        None    GO:0004694              EIF2AK2 None    eukaryotic translation initiation factor 2alpha kinase activity None    PMID:25329545   UniProt infores:go
UniProtKB:P19525        None    GO:0004715              EIF2AK2 None    non-membrane spanning protein tyrosine kinase activity  None    GO_REF:0000003  UniProt infores:go
UniProtKB:P19525        None    GO:0016301              EIF2AK2 None    kinase activity None    PMID:21123651   UniProt infores:go