Introduction

About

The goal of this project is to define a high-level language and data model for describing changes in ontologies and, more generally, "knowledge graphs".

The language should sit at a higher level of abstraction than a low-level OWL or RDF diff. For example, changing the parent of a class in an ontology is conceptually a single event, even though it can be broken down into a delete operation and an add operation (see NodeMove).

An example of a NodeMove instance, correcting a misplacement from part-of hindlimb to part-of forelimb:

[ a kgcl:NodeMove ;
  kgcl:about UBERON:0002398 ;         ## manus
  kgcl:old_value UBERON:0002103 ;     ## hindlimb
  kgcl:old_predicate BFO:0000050 ;    ## part_of
  kgcl:new_value UBERON:0002102 ;     ## forelimb
  kgcl:new_predicate BFO:0000050      ## part_of
]
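The decomposition described above can be sketched in Python. This is a minimal illustration only, assuming the ontology is held as a set of (subject, predicate, object) triples of CURIE strings. The `NodeMove` dataclass and the `apply_move` helper are hypothetical names invented for this sketch and are not part of any KGCL library.

```python
from dataclasses import dataclass

Triple = tuple[str, str, str]  # (subject, predicate, object) as CURIE strings


@dataclass(frozen=True)
class NodeMove:
    """Hypothetical stand-in for kgcl:NodeMove, for illustration only."""
    about: str
    old_value: str
    old_predicate: str
    new_value: str
    new_predicate: str


def apply_move(graph: set[Triple], move: NodeMove) -> set[Triple]:
    """Apply one high-level move as a low-level delete plus add."""
    deleted = (move.about, move.old_predicate, move.old_value)
    added = (move.about, move.new_predicate, move.new_value)
    if deleted not in graph:
        raise ValueError(f"edge to delete not found: {deleted}")
    # Return a new graph so the input is left untouched.
    return (graph - {deleted}) | {added}


graph = {("UBERON:0002398", "BFO:0000050", "UBERON:0002103")}
move = NodeMove(
    about="UBERON:0002398",
    old_value="UBERON:0002103",
    old_predicate="BFO:0000050",
    new_value="UBERON:0002102",
    new_predicate="BFO:0000050",
)
moved = apply_move(graph, move)
```

The single `NodeMove` object carries the intent of the edit, while the delete and add steps are an implementation detail of applying it.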

Examples

See the examples/ folder

Apply and Diff operations

Change objects can be used in two directions:

  • Diff: given two ontologies O1 and O2, generate a list of changes C
  • Apply: given a list of changes C, apply them to ontology O1 to generate O2

The Diff operation is intended to give KG/ontology authors a high-level view of changes.
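The relationship between the two directions can be sketched on a toy model. The example below assumes an "ontology" is nothing more than a dict from node ID to label and only detects renames. The functions `diff_labels` and `apply_changes` and the dict layout of a change are hypothetical, chosen for this illustration.

```python
def diff_labels(o1: dict[str, str], o2: dict[str, str]) -> list[dict[str, str]]:
    """Diff: return one rename change per node whose label differs."""
    changes = []
    for node_id in sorted(o1.keys() & o2.keys()):
        if o1[node_id] != o2[node_id]:
            changes.append({
                "type": "NodeRename",
                "about": node_id,
                "old_value": o1[node_id],
                "new_value": o2[node_id],
            })
    return changes


def apply_changes(o1: dict[str, str], changes: list[dict[str, str]]) -> dict[str, str]:
    """Apply: replay the changes on a copy of o1, checking each old value."""
    result = dict(o1)
    for change in changes:
        node_id = change["about"]
        if result.get(node_id) != change["old_value"]:
            raise ValueError(f"{node_id}: current label does not match old_value")
        result[node_id] = change["new_value"]
    return result


o1 = {"UBERON:0002398": "manus", "UBERON:0002102": "forelimb"}
o2 = {"UBERON:0002398": "hand", "UBERON:0002102": "forelimb"}
changes = diff_labels(o1, o2)
round_trip = apply_changes(o1, changes)
```

In this toy setting, applying the diff of O1 and O2 to O1 reproduces O2. Node additions and deletions are ignored here, so the round trip only holds when both inputs contain the same nodes.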

The Apply operation can take diffs as inputs. The diffs can have different serializations:

  • A direct instantiation of the classes defined in the model, in one of:
    • JSON
    • YAML
    • RDF
  • A string serialization
  • A TSV serialization, for use in spreadsheets
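As a rough sketch of how one change could be written in two of these forms, the following uses only the Python standard library. The `type` key and the TSV column layout are assumptions made for this example; the project's actual JSON and TSV layouts may differ.

```python
import csv
import io
import json

# A change held as a plain dict; the "type" key is an assumption of this sketch.
change = {
    "type": "NodeRename",
    "about": "UBERON:0002398",
    "old_value": "manus",
    "new_value": "hand",
}

# JSON: a direct dump of the object.
as_json = json.dumps(change, indent=2)

# TSV: one header row and one row per change, convenient for spreadsheets.
buffer = io.StringIO()
writer = csv.DictWriter(
    buffer, fieldnames=list(change), delimiter="\t", lineterminator="\n"
)
writer.writeheader()
writer.writerow(change)
as_tsv = buffer.getvalue()
```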

For example, a NodeRename instance that changes the primary label of an Uberon entity might be:

[ a kgcl:NodeRename ;
  kgcl:about UBERON:0002398 ;
  kgcl:old_value "manus" ;
  kgcl:new_value "hand" ]

This can be serialized as:

  • rename UBERON:0002398 from 'manus' to 'hand'
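To make the string form concrete, here is a small sketch that parses and regenerates a rename command of exactly the shape shown above. The regular expression and the function names `parse_rename` and `serialize_rename` are assumptions for illustration and are not a specification of the grammar. Labels containing a single quote are not handled.

```python
import re

# Matches: rename <id> from '<old label>' to '<new label>'
RENAME_PATTERN = re.compile(
    r"^rename\s+(?P<about>\S+)\s+from\s+'(?P<old_value>[^']*)'"
    r"\s+to\s+'(?P<new_value>[^']*)'$"
)


def parse_rename(command: str) -> dict[str, str]:
    """Turn a rename command string into a dict of slot values."""
    match = RENAME_PATTERN.match(command.strip())
    if match is None:
        raise ValueError(f"not a rename command: {command!r}")
    return match.groupdict()


def serialize_rename(change: dict[str, str]) -> str:
    """Write the slot values back out in the same string form."""
    return "rename {about} from '{old_value}' to '{new_value}'".format(**change)


text = "rename UBERON:0002398 from 'manus' to 'hand'"
parsed = parse_rename(text)
```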

TBD: we also want an even more compact form:

  • rename 'manus' to 'hand'
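The compact form omits the node ID, so a tool would need to recover it from the old label. One possible approach, sketched below under the assumption that a label index (node ID to label) is available, is to accept the command only when exactly one node carries that label. The function `expand_compact_rename` is hypothetical.

```python
def expand_compact_rename(
    old_label: str, new_label: str, labels: dict[str, str]
) -> dict[str, str]:
    """Resolve a label-only rename to a full change, refusing ambiguous labels."""
    matches = [node_id for node_id, label in labels.items() if label == old_label]
    if len(matches) != 1:
        raise ValueError(
            f"expected exactly one node labelled {old_label!r}, found {len(matches)}"
        )
    return {"about": matches[0], "old_value": old_label, "new_value": new_label}


labels = {"UBERON:0002398": "manus", "UBERON:0002102": "forelimb"}
expanded = expand_compact_rename("manus", "hand", labels)
```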

There are a few user stories for the Apply operation:

  • As an ontology contributor, I want to quickly describe and apply a change to an ontology, so that I do not have to clone a GitHub repo, open Protégé, and make a PR
  • As an ontology tool creator, I want to generate "suggestions" for changes to an ontology, so that a human user can spot-check them and apply all valid ones
    • Example: a tool that suggests lexical changes to text definitions so that they conform to standards
    • Example: an OWL logic tool may suggest redundant axioms that can safely be removed. The curator feels safest vetting these via the intermediate form

Intended use in GitHub

One intended killer app for this language is the ability for a human or agent to specify a set of changes in a GitHub ticket in a human-readable, transparent way, and for a bot to create a PR from the computable description in the ticket.

This would be ideal for "drive-by" edits and Term Brokers.

The overall idea is laid out in this Google doc.
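The first step of such a bot, pulling change commands out of free-text ticket content, might look like the sketch below. It assumes commands appear one per line, optionally as list items, and start with a known verb. The function `extract_commands` and the verb list are hypothetical, and no GitHub API calls are shown.

```python
def extract_commands(issue_body: str, verbs: tuple[str, ...] = ("rename",)) -> list[str]:
    """Collect lines of a ticket body that start with a known change verb."""
    commands = []
    for line in issue_body.splitlines():
        # Drop list markers and surrounding whitespace before checking the verb.
        candidate = line.strip().lstrip("-*• ").strip()
        if candidate.split(" ", 1)[0] in verbs:
            commands.append(candidate)
    return commands


issue_body = (
    "Please fix the label.\n"
    "\n"
    "- rename UBERON:0002398 from 'manus' to 'hand'\n"
    "\n"
    "Thanks!"
)
commands = extract_commands(issue_body)
```

Everything else in the ticket is left as ordinary prose, which keeps the request readable for humans while the extracted lines stay computable.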

Schema Source

The schema source is written in LinkML YAML. The best way to browse it is via the generated markdown.

An example class is kgcl:NodeRename:

classes:
  node rename:
    is_a: node change
    description: >-
      A node change where the name (aka rdfs:label) of the node changes
    slots:
      - old value
      - new value
      - has textual diff
    slot_usage:
      old value:
        multivalued: false
      new value:
        multivalued: false
      change description:
        string_serialization: "renaming {about} from {old value} to {new value}"
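To illustrate what the `string_serialization` template above expresses, the sketch below fills its placeholders from a dict keyed by slot name. This only demonstrates the template idea; how LinkML tooling itself evaluates the template is not shown in this document, and the `render` helper is hypothetical.

```python
import re

TEMPLATE = "renaming {about} from {old value} to {new value}"


def render(template: str, slots: dict[str, str]) -> str:
    """Replace each {slot name} placeholder with that slot's value."""
    def substitute(match: re.Match) -> str:
        return str(slots[match.group(1)])

    # Slot names may contain spaces, so match anything between the braces.
    return re.sub(r"\{([^{}]+)\}", substitute, template)


rendered = render(
    TEMPLATE,
    {"about": "UBERON:0002398", "old value": "manus", "new value": "hand"},
)
```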

Derived Artefacts

Python Classes

Example code:

from kgcl import NodeRename

# Describe renaming the node's label from "limb skin" to "skin of limb".
change = NodeRename(about="UBERON:1234567", old_value="limb skin", new_value="skin of limb")

JSON Schema

Example snippet:

      "NodeRename": {
         "additionalProperties": false,
         "description": "A node change where the name (aka rdfs:label) of the node changes",
         "properties": {
            "about": {
               "type": "string"
            },
            "change_description": {
               "type": "string"
            },
            "has_textual_diff": {
               "$ref": "#/definitions/TextualDiff"
            },
            "new_value": {
               "type": "string"
            },
            "old_value": {
               "type": "string"
            },
            "was_generated_by": {
               "type": "string"
            }
         },
         "required": [],
         "title": "NodeRename",
         "type": "object"
      },
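The constraints in this snippet can be checked by hand, as in the sketch below. It covers only what the snippet states: no properties beyond those listed, and string types for the string-valued properties. The function `check_node_rename` is a hypothetical helper written for this illustration; a full JSON Schema validator would cover more, including the `$ref` to `TextualDiff`.

```python
# Property names copied from the NodeRename schema snippet above.
NODE_RENAME_PROPERTIES = {
    "about",
    "change_description",
    "has_textual_diff",
    "new_value",
    "old_value",
    "was_generated_by",
}
# has_textual_diff is a reference to another definition, not a string.
STRING_PROPERTIES = NODE_RENAME_PROPERTIES - {"has_textual_diff"}


def check_node_rename(instance: dict) -> list[str]:
    """Return a list of problems; an empty list means the checks passed."""
    errors = []
    # "additionalProperties": false forbids any key not listed in the schema.
    for key in sorted(instance.keys() - NODE_RENAME_PROPERTIES):
        errors.append(f"unexpected property: {key}")
    for key in sorted(instance.keys() & STRING_PROPERTIES):
        if not isinstance(instance[key], str):
            errors.append(f"{key} must be a string")
    return errors


ok = check_node_rename(
    {"about": "UBERON:0002398", "old_value": "manus", "new_value": "hand"}
)
bad = check_node_rename({"about": 5, "label": "hand"})
```

Because `required` is an empty list in the snippet, an empty object also passes these checks.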

Implementation

Currently this is only a schema; there is no implementation.

We will likely bind this into owl-diff, so compilation to Scala traits is likely in the future.

Note that transactions can themselves be represented in RDF, either as JSON-LD following the schema above or as native RDF. The ShEx schema constrains the shape of the RDF graph.