Derived Schemas
This section describes rules that can be applied to a schema to obtain a derived (aka induced) schema.
The derived schema can be materialized, or it may be present as a view onto a schema. A derived view may also be referred to as an inferred or induced view.
Derivations happen via rules that are specified below, using a set of convenience functions
Conventions
We use m
to denote the input or asserted schema (model), and m*
to denote the derived schema
Functions
Function: ClassIdentifier
The function ClassIdentifier(c
) takes a ClassDefinition or ClassDefinitionName as input and returns:
- the name of a derived attribute
s
inc
wheres.identifier
is True inm*
- None if there is no such slot
- An error if there are multiple such slots
Function: Closure
The function Closure(x
, s
) takes as input an element x
and a metaslot s
and returns the mathematical closure
of looking up x.<s>
The ReflexiveClosure includes x
Function: Ancestors
The function Ancestors(x
) returns the Closure of the Parents function applied to x
. Parents itself is the union of is_a
and mixins
.
The function ReflexiveAncestors uses the ReflexiveClosure.
Derivation Rules
Rule: Model Imports
Each model imports zero to many imports, indicated by the SchemaDefinition.imports metaslot.
m*
is set to be the union of all schema elements from the ReflexiveClosure of m.imports
When copying an element x
from an import into m*
, the name x.name
must be unique - if the same name has been used in another model, the derivation procedure fails, and an error is thrown.
Note: If two or more models import the same target (e.g. m1
imports m2
and m3
and m2
imports m3
), m3
will be only be resolved once.
Note: Two models are considered to be "identical" if they both
have the same id
. If m2
and m3
both have id: http://models-r.us/modelA
, it is assumed that, despite the different location, they represent the same thing. LinkML will check the model version
field and will raise an error if m2
has version: 1.0.0
and m3
has version: 1.0.1
Each imported module must be resolved - i.e the value of the import slot is mapped to a location on disk or on the web
Rule: fromschema elements
Each element in the schema as assigned a metaslot fromschema
value. This is the value of the id
of the schema in which that element is defined.
This is preserved over imports, such that if m
imports m2
, and m2
defines a class c
, then m*[c].fromschema
= m2
Rule: Applicable Slot Names
The set of applicable slot names for a class c
are determined by taking the union of:
- The names of all values of
x.attributes
- The names of all values of
x.slots
For all x
in ReflexiveAncestors(c)
Rule: Derived Attributes
Derived attributes can be calculated for every applicable slot name for any class c
c.attributes[s] = DerivedAttributes(c,s)
This is the combination of asserted attributes for c, as well as asserted attributes for ancestors of c,
as well as the SlotDefinition corresponding to that attribute, together with any overrides specified by slot_usage
in c
and ancestors of c
- attributes asserted directly in
c.attributes
in the base schema - attributes derived from each SlotDefinition
s
inc.slots
by- looking up
s
inm*.slots
and copying the slot-value assignments from these SlotDefinitions - overriding these slot-value assignments with any slot-value assignments provided by
c.slot_usage[s]
- looking up
- inheriting from parents of
c
using precedence rules - inheriting from parents of
s
The precedence rules for derived attributes are as follows:
If a metaslot s
is declared multivalued
then when copying s
from a parent to a child, the values are appended.
If a metaslot s
is declared multivalued
if a slot is multi valued then copying will append, unless the element already exists.
if the slot is single valued, and intersection rules can be applied to the slot, then these are performed on all values
if the slot is single valued, and intersection rules cannot be applied to the slot, then the following precedence rules are applied:
- metaslot values from slot_usage take the highest priority
- metaslot values from the slot definition take the next highest priority
- direct mixins take the next highest priority. where multiple direct mixins are provided as a list, the last element takes highest priority
- direct is_as take the next highest priority
- the above two rules are applied one level up, and then recursively applied
Intersection rules
metaslot | rule |
---|---|
maximum_value |
min(v1,v2) |
minimum_value |
max(v1,v2) |
pattern |
TBD |
range |
IF subsumes(v1,v2) then v2 ELSE IF subsumes(v2,v1) then v2 ELSE UNDEFINED |
If the result of applying any intersection rule is UNDEFINED then we fall back on precedence rules
Rule: Default Range
For all attributes in the derived model, if a range is not assigned using the above method, then the range is assigned a value corresponding
to the value of default_range
for the schema in which the slot is defined.
Rule: Derived Class and Slot URIs
For each class or slot, if a class_uri or slot_uri is not specified, then this is derived by concatenating m.default_prefix
with the CURIE separator :
followed by the SafeUpperCamelCase encoding of the name of that class or slot definition
Rule: Generation of patterns from structured patterns
For any slot s
, if s.structured_pattern = p
and p
is not None then s.pattern
is assigned a value based on the following
procedure:
If p.interpolated
is True, then the value of s.syntax
is interpolated, by replacing all occurrences of braced text {VAR}
with the value of VAR
. The value of VAR
is obtained using m.settings[VAR]
, where m
is the schema in which p
is introduced.
If p.interpolated
is not True, then the value of s.syntax
is used directly.
If p.partial_match
is not True, then s.pattern
has a '^' character inserted at the begining and a '$' character inserted as the end.
Rule: Generation of ClassDefinitionReferences
For every ClassDefinition c
, if there exists a slot s
such that s.identifer=True
,
then a corresponding ClassDefinitionReference r
is generated.
r.name
is assigned to be the concatenation of c.name
and s.name
r
functionally serves as a foreign key to instances of r
, and allows for non-inlined
representation of instance references in tree-based formats such as JSON.
Structural Conformance Rules
Rule: Each referenced entity must be present
Every ClassDefinition, ClassDefinitionReference, SlotDefinitionReference, EnumDefinitionReference, and TypeDefinitionReference must be resolvable within m*
However, not every element needs to be referenced. For example, it is valid to have a list of SlotDefinitions that are never used in m*
.
ClassDefinition Structural Conformance Rules
Each c
in m*.classes
must conform to the rules below:
c
must be an instance of a ClassDefinitionc
must have a unique namec.name
, and this name must not be shared by any other class or element inm*
c
lists permissible slots inc.slots
, the range of this is a reference to a SlotDefinition inm*.slots
c
defines how slots are used in the context ofc
via a collection of SlotDefinitions specified inc.slot_usage
c
may define local slots usingc.attributes
, the value of this is a. collection of SlotDefinitionsc
may have certain boolean properties defined such asc.mixin
andc.abstract
c
must have exactly one value forc.class_uri
inm*
, and the value must be an instance of the builtin type UriOrCuriec
may have parent ClassDefinitions defined viac.is_a
andc.mixins
- the value of
c.is_a
must be a ClassDefinitionReference - the value of
c.mixins
must be a collection of ClassDefinitonReferences
- the value of
- For any parent
p
ofc
, ifp.mixin
is True, thenc.mixin
SHOULD be True c
includes additional rules inc.rules
andc.classificiation_rules
c
may have any number of additional slot-value assignments consistent with the validation rules provided here with the metamodelMM
SlotDefinition Structural Conformance Rules
Each s
in m*.slots
must conform to the rules below:
s
must be an instance of a SlotDefinitions
must have a unique names.name
, and this name must not be shared by any other type or elements
must have a range specified vias.range
inm*
s
may have an assignments.identifier
which is True ifs
plays the role of a unique identifiers
may have certain boolean properties defined such ass.mixin
ands.abstract
s
must have exactly one value fors.slot_uri
inm*
, and the value must be an instance of the builtin type UriOrCuries
may have parent SlotDefinitions defined vias.is_a
ands.mixins
- the value of
s.is_a
must be a SlotDefinitionReference - the value of
s.mixins
must be a collection of SlotDefinitionReferences - For any parent
p
ofs
, ifp.mixin
is True, thens.mixin
SHOULD be True
- the value of
s
may have any number of additional slot-value assignments consistent with the validation rules provided here with the metamodelMM
TypeDefinition Structural Conformance Rules
Each s
in m*.types
must conform to the rules below:
t
must be an instance of a TypeDefinitiont
must have a unique namet.name
, and this name must not be shared by any other type or elementt
must have a mapping to an xsd type provided viat.uri
inm*
t
may have a parent type declared viat.typeof
t
may have any number of additional slot-value assignments consistent with the validation rules provided here with the metamodelMM
EnumDefinition Structural Conformance Rules
Each e
in m*.enums
must conform to the rules below:
e
must be an instance of a EnumDefinitione
must have a unique namee.name
, and this name must not be shared by any other enum or elemente
lists all static permissible values viae.permissible_values
, the value of which is a list of instances of the MM class PermissibleValuee
may have any number of additional slot-value assignments consistent with the validation rules provided here with the metamodelMM
ClassDefinitionReference Structural Conformance Rules
Each r
in m*.class_references
must conform to the rules below:
r
must be an instance of a ClassDefinitionReferencer
must have a unique namer.name
, and this name must not be shared by any other type or element
Metamodel Conformance Rules
Both the asserted and derived schema should be valid instances of the LinkML metamodel MM using the instance validation rules described in the next section