Quick Start

This page walks through the most common workflow: validating a list of CURIEs, enriching them, and inspecting the results.

Installation

Pandasaurus requires Python 3.9–3.11. Install with pip or Poetry:

pip install pandasaurus

or

poetry add pandasaurus

Validate CURIEs

Use pandasaurus.curie_validator.CurieValidator to confirm that your seed terms exist and aren’t obsoleted:

from pandasaurus.curie_validator import CurieValidator

seeds = ["CL:0000084", "CL:0000787", "CL:0000636"]
terms = CurieValidator.construct_term_list(seeds)
CurieValidator.get_validation_report(terms)  # raises if invalid or obsoleted
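Before calling the validator, it can help to catch obviously malformed identifiers locally. The following is a minimal sketch and not part of Pandasaurus: the regular expression and the split_by_syntax helper are assumptions made for illustration. It only checks the prefix:local_id shape, not whether a term exists or is obsoleted.

```python
import re

# Assumed shape of a CURIE: a prefix starting with a letter, a colon,
# then a non-empty local identifier without whitespace.
CURIE_PATTERN = re.compile(r"[A-Za-z][\w.-]*:\S+")


def split_by_syntax(seeds):
    """Split seeds into (well_formed, malformed) lists, preserving order."""
    well_formed, malformed = [], []
    for seed in seeds:
        if CURIE_PATTERN.fullmatch(seed):
            well_formed.append(seed)
        else:
            malformed.append(seed)
    return well_formed, malformed


if __name__ == "__main__":
    ok, bad = split_by_syntax(["CL:0000084", "CL_0000787", " CL:0000636"])
    print("well formed:", ok)
    print("malformed:", bad)
```

Only the well-formed list would then be passed on to CurieValidator; the malformed entries can be fixed by hand first.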

Handle Invalid Terms

If pandasaurus.utils.pandasaurus_exceptions.InvalidTerm is raised, check the exception message for the invalid IRIs, fix or remove the corresponding entries in your seed list, and rerun the validation.
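One way to update the seed list is to drop every seed that the exception message mentions. The sketch below rests on several assumptions: the message text is invented, the identifiers ending in 99999xx are made up, the drop_reported_seeds helper is hypothetical, and the real InvalidTerm message may be formatted differently. It recognises both the CURIE form (CL:0000084) and the trailing part of an OBO-style IRI (CL_0000084).

```python
import re

# Matches CURIE-like tokens (CL:0000084) and OBO IRI suffixes (CL_0000084).
TOKEN_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9]*[:_][A-Za-z0-9]+")


def drop_reported_seeds(seeds, message):
    """Return the seeds that are not mentioned in the exception message."""
    reported = set()
    for token in TOKEN_PATTERN.findall(message):
        # Normalise the IRI suffix form (PREFIX_ID) to CURIE form (PREFIX:ID).
        if ":" not in token:
            token = token.replace("_", ":", 1)
        reported.add(token)
    return [seed for seed in seeds if seed not in reported]


if __name__ == "__main__":
    # Hypothetical seeds and message text, for illustration only.
    seeds = ["CL:0000084", "CL:9999998", "CL:9999999"]
    message = (
        "Invalid terms: CL:9999998, "
        "http://purl.obolibrary.org/obo/CL_9999999"
    )
    print(drop_reported_seeds(seeds, message))
```

In practice the message would come from str(exc) inside an except block for InvalidTerm, and the filtered list would be passed back to the validator.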

Run an Enrichment

Instantiate pandasaurus.query.Query with your validated CURIEs and call an enrichment method, e.g. simple_enrichment():

from pandasaurus.query import Query

query = Query(seeds, force_fail=True)
df = query.simple_enrichment()
print(df.head())

force_fail=True ensures the constructor raises immediately on invalid or obsoleted terms.

Review Obsoleted Terms

If the seed list contains obsoleted CURIEs and the Query was constructed without force_fail=True (so the constructor did not raise), call update_obsoleted_terms() to replace them with their suggested alternatives:

query.update_obsoleted_terms()

Generate a Graph

Every enrichment populates pandasaurus.query.Query.graph_df, the tabular form of the graph that query.graph exposes as a graph object:

graph = query.graph  # rdflib.Graph after transitive reduction
# or export query.graph_df for plotting
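The comment above mentions transitive reduction. As a rough illustration of what that step does to an edge list, here is a plain-Python sketch. It is not how Pandasaurus implements it: the transitive_reduction function, the EX: identifiers, and the (subject, object) edge layout are all assumptions, and the input is assumed to be acyclic.

```python
from collections import defaultdict


def transitive_reduction(edges):
    """Drop each edge (s, o) that is already implied by a longer path s -> ... -> o.

    Assumes the edges form a directed acyclic graph.
    """
    children = defaultdict(set)
    for s, o in edges:
        children[s].add(o)

    def reachable(start):
        # Iterative depth-first search collecting every node reachable from start.
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for nxt in children.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    reduced = []
    for s, o in edges:
        # The edge is redundant if o can be reached through another
        # direct successor of s.
        others = children[s] - {o}
        if any(o in reachable(mid) for mid in others):
            continue
        reduced.append((s, o))
    return reduced


if __name__ == "__main__":
    # Hypothetical hierarchy: the child -> grandparent edge is redundant.
    edges = [
        ("EX:child", "EX:parent"),
        ("EX:parent", "EX:grandparent"),
        ("EX:child", "EX:grandparent"),
    ]
    print(transitive_reduction(edges))
```

The reduced edge list keeps the same reachability between terms while removing shortcut edges, which usually makes the plotted hierarchy easier to read.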

Next Steps

Explore advanced methods (minimal_slim_enrichment, contextual_slim_enrichment) in the Query API.
See the User and Developer Guides for task-focused recipes.
Visit Contributing & Development to learn how to run tests, linting, and documentation builds locally.