Class Enrichment Calculation Interface

class oaklib.interfaces.class_enrichment_calculation_interface.ClassEnrichmentCalculationInterface(resource: ~oaklib.resource.OntologyResource | None = None, strict: bool = False, _multilingual: bool | None = None, autosave: bool = <factory>, exclude_owl_top_and_bottom: bool = <factory>, ontology_metamodel_mapper: ~oaklib.mappers.ontology_metadata_mapper.OntologyMetadataMapper | None = None, _converter: ~curies.api.Converter | None = None, auto_relax_axioms: bool | None = None, cache_lookups: bool = False, property_cache: ~oaklib.utilities.keyval_cache.KeyValCache = <factory>, _edge_index: ~oaklib.indexes.edge_index.EdgeIndex | None = None, _entailed_edge_index: ~oaklib.indexes.edge_index.EdgeIndex | None = None, _prefix_map: ~typing.Mapping[str, str] | None = None, _association_index: ~oaklib.utilities.associations.association_index.AssociationIndex | None = None, normalizers: ~typing.List[~oaklib.interfaces.association_provider_interface.EntityNormalizer] = <factory>)[source]

An interface that provides services to test for over-representation of class membership in a set of entities.

This interface is intended to be used to test for over-representation of a class in a set of entities, for example, to test for over-representation of a disease in a set of genes.
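The statistic behind an over-representation test can be sketched with a one-sided hypergeometric tail. This is an illustration only: the text above does not state which test this interface applies, and the function name and the toy counts below are assumptions made for the example. The argument names echo the sample_count and background_count fields listed under sort_by.

```python
from math import comb


def over_representation_p_value(sample_count, sample_total,
                                background_count, background_total):
    """Probability of seeing at least `sample_count` annotated entities
    when drawing `sample_total` entities from a background of
    `background_total`, of which `background_count` are annotated."""
    upper = min(sample_total, background_count)
    favourable = sum(
        comb(background_count, k)
        * comb(background_total - background_count, sample_total - k)
        for k in range(sample_count, upper + 1)
    )
    return favourable / comb(background_total, sample_total)


# Toy numbers (assumed): 3 of 4 sampled genes carry the class,
# while only 5 of 100 background genes do.
p = over_representation_p_value(3, 4, 5, 100)
print(p < 0.05)
```

A small p-value here means the class appears in the sample more often than random draws from the background would explain, which is the sense of 'greater' in the direction parameter.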

enriched_classes(subjects: Iterable[str] | None = None, item_list: ItemList | None = None, predicates: Iterable[str] | None = None, object_closure_predicates: List[str] | None = None, background: Iterable[str] | None = None, hypotheses: Iterable[str] | None = None, cutoff=0.05, autolabel=False, filter_redundant=False, sort_by: str | None = None, direction='greater') Iterator[ClassEnrichmentResult][source]

Test for over-representation of classes in a set of entities.

>>> from oaklib import get_adapter
>>> adapter = get_adapter("src/oaklib/conf/go-pombase-input-spec.yaml")
>>> sample = ["PomBase:SPAC1142.02c", "PomBase:SPAC3H1.05", "PomBase:SPAC1142.06", "PomBase:SPAC4G8.02c"]
>>> for result in adapter.enriched_classes(sample, autolabel=True):
...    assert result.p_value < 0.05
...    print(f"{result.class_id} {result.class_label}")

GO:0006620 post-translational protein targeting to endoplasmic reticulum membrane
...

By default, results may include redundant terms. If filter_redundant=True, redundant terms are removed unless they are more significant than the descendant term.
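One way to read the redundancy rule is sketched below on toy data. It assumes that a term counts as redundant when one of its descendants is also in the results, and that it survives only if its p-value is strictly lower than that descendant's. The helper name, the EX: identifiers, and the p-values are all hypothetical, and the library's actual filtering may differ.

```python
def filter_redundant_terms(p_values, ancestors):
    """Keep a term unless some descendant in the results is at least as
    significant (has a p-value less than or equal to the term's)."""
    kept = []
    for term, p in p_values.items():
        redundant = any(
            term in ancestors.get(other, set()) and other_p <= p
            for other, other_p in p_values.items()
            if other != term
        )
        if not redundant:
            kept.append(term)
    return kept


# Hypothetical hierarchy: EX:grandchild -> EX:child -> EX:parent
ancestors = {
    "EX:child": {"EX:parent"},
    "EX:grandchild": {"EX:child", "EX:parent"},
}
p_values = {"EX:parent": 0.03, "EX:child": 0.01, "EX:grandchild": 0.02}

kept = filter_redundant_terms(p_values, ancestors)
print(kept)
```

In this toy case EX:parent is dropped because EX:child is more significant, while EX:child is kept because it is more significant than its own descendant.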

Parameters:
  • subjects – The set of entities to test for over-representation of classes

  • item_list – An item list object, as an alternative way to specify subjects

  • background – The set of entities to use as a background for the test (recommended)

  • hypotheses – The set of classes to test for over-representation (default is all)

  • cutoff – The threshold to use for the adjusted p-value

  • autolabel – Whether to include labels (names) for the classes

  • direction – The direction of the test. One of ‘greater’, ‘less’, ‘two-sided’

  • filter_redundant – Whether to filter out redundant hypotheses

  • sort_by – The field to sort by. One of ‘p_value’, ‘sample_count’, ‘background_count’, ‘odds_ratio’

Returns:

An iterator over ClassEnrichmentResult objects

create_self_associations()[source]

Create self-associations for all terms in the ontology.

>>> from oaklib import get_adapter
>>> adapter = get_adapter("tests/input/go-nucleus.obo")
>>> adapter.create_self_associations()
>>> assocs = list(adapter.associations(["GO:0005773"]))
>>> assert len(assocs) == 1
>>> assoc = assocs[0]
>>> print(assoc.subject, assoc.predicate, assoc.object)
GO:0005773 owl:equivalentClass GO:0005773

This is useful for simple over-representation tests over term sets without any annotations.

>>> from oaklib import get_adapter
>>> adapter = get_adapter("tests/input/go-nucleus.obo")
>>> adapter.create_self_associations()
>>> terms = ["GO:0034357", "GO:0031965", "GO:0005773"]
>>> for r in adapter.enriched_classes(terms, autolabel=True, filter_redundant=True):
...     print(r.class_id, r.class_label, round(r.p_value_adjusted,3))
GO:0016020 membrane 0.004
...
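Why self-associations help can be sketched without the library. If every term is treated as annotated to itself, then counting a class for each input term that is that class or one of its descendants gives the per-class tallies an enrichment test needs. The helper name, the EX: identifiers, and the assumption that counts are propagated up to ancestors are illustrative only and are not taken from the implementation.

```python
def self_association_counts(terms, ancestors):
    """Count, for each class, how many input terms are that class or one
    of its descendants (each term is 'annotated' to itself)."""
    counts = {}
    for term in terms:
        # A term contributes to itself and to every one of its ancestors.
        for cls in {term} | ancestors.get(term, set()):
            counts[cls] = counts.get(cls, 0) + 1
    return counts


# Hypothetical hierarchy: EX:a and EX:b are children of EX:root.
ancestors = {"EX:a": {"EX:root"}, "EX:b": {"EX:root"}}
counts = self_association_counts(["EX:a", "EX:b", "EX:c"], ancestors)
print(counts["EX:root"])
```

Here EX:root is counted twice even though it was never in the input, which mirrors how a parent class such as membrane can surface in the results above from a list of more specific terms.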