Overview
django_analyses provides a database-supported pipeline engine meant to facilitate research management.
A general schema for pipeline management is laid out as follows:

Analyses
Each Analysis may be associated with a number of AnalysisVersion instances, and each of those must be provided with an interface, i.e. a Python class exposing some run() method and returning a dictionary of results.
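
As a minimal sketch of what such an interface might look like: the class below is a hypothetical example (WordCount and its lowercase option are illustrative names, not part of django_analyses). It only shows the general shape described above, a plain Python class with a run() method that returns a dictionary of results.

```python
class WordCount:
    """Hypothetical interface: counts the words in a piece of text."""

    def __init__(self, lowercase: bool = True):
        # Configuration is passed at instantiation (an assumption of this sketch).
        self.lowercase = lowercase

    def run(self, text: str) -> dict:
        # Normalize case if requested, then split on whitespace.
        words = text.lower().split() if self.lowercase else text.split()
        # Results are returned as a dictionary keyed by output name.
        return {"word_count": len(words), "unique_words": len(set(words))}


results = WordCount().run("the quick brown fox jumps over the lazy dog")
```

Returning a dictionary keeps every result addressable by name, which is what makes it possible to match results against named output definitions later on.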
Input and Output Specifications
InputSpecification and OutputSpecification each aggregate a number of InputDefinition and OutputDefinition sub-classes (respectively).
Input and Output Definitions
Currently, there are seven types of built-in input definitions and two kinds of supported output definitions.
Each of these InputDefinition and OutputDefinition sub-classes provides its own validation rules (default, minimal/maximal value or length, choices, etc.), and you can easily create more definitions to suit your own needs.
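
To illustrate the kind of validation rules mentioned above, here is a plain-Python sketch. It does not use the actual django_analyses classes; IntegerInputSketch and its field names are hypothetical and only mimic the idea of a definition carrying a default, a minimal and maximal value, and a set of choices.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class IntegerInputSketch:
    """Hypothetical stand-in for an integer input definition."""

    key: str
    default: Optional[int] = None
    min_value: Optional[int] = None
    max_value: Optional[int] = None
    choices: Optional[Sequence[int]] = None

    def validate(self, value: Optional[int] = None) -> int:
        # Fall back to the default when no value is provided.
        if value is None:
            value = self.default
        if value is None:
            raise ValueError(f"{self.key}: no value given and no default set")
        if self.min_value is not None and value < self.min_value:
            raise ValueError(f"{self.key}: {value} is below {self.min_value}")
        if self.max_value is not None and value > self.max_value:
            raise ValueError(f"{self.key}: {value} is above {self.max_value}")
        if self.choices is not None and value not in self.choices:
            raise ValueError(f"{self.key}: {value} is not an allowed choice")
        return value


iterations = IntegerInputSketch("iterations", default=10, min_value=1, max_value=100)
validated_default = iterations.validate()  # falls back to the default
```

Calling iterations.validate(500) in this sketch raises a ValueError, since the value exceeds the configured maximum.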
Pipelines
Pipeline instances are used to reference a particular collection of Node and Pipe instances. Each Node defines a particular combination of analysis version and configuration, and each Pipe connects one node’s output definition to another’s input definition.
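
The relationship between nodes and pipes can be sketched in plain Python. Everything below (NodeSketch, PipeSketch, run_pipeline, and the two toy functions) is hypothetical and independent of the django_analyses models; it also assumes that nodes are listed in execution order, which a real engine would have to work out for itself.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class NodeSketch:
    """Hypothetical node: a callable plus a fixed configuration."""

    name: str
    run: Callable[..., dict]
    configuration: dict = field(default_factory=dict)


@dataclass
class PipeSketch:
    """Hypothetical pipe: routes one node's output to another node's input."""

    source: str
    source_output: str
    destination: str
    destination_input: str


def run_pipeline(nodes: List[NodeSketch], pipes: List[PipeSketch],
                 inputs: Dict[str, dict]) -> Dict[str, dict]:
    # Assumes `nodes` is already sorted so that sources run before destinations.
    results: Dict[str, dict] = {}
    for node in nodes:
        kwargs = {**node.configuration, **inputs.get(node.name, {})}
        for pipe in pipes:
            if pipe.destination == node.name:
                # Feed the upstream output into this node's input.
                kwargs[pipe.destination_input] = results[pipe.source][pipe.source_output]
        results[node.name] = node.run(**kwargs)
    return results


def tokenize(text: str, lowercase: bool = True) -> dict:
    return {"tokens": (text.lower() if lowercase else text).split()}


def count(items: list) -> dict:
    return {"count": len(items)}


nodes = [NodeSketch("tokenize", tokenize, {"lowercase": True}),
         NodeSketch("count", count)]
pipes = [PipeSketch("tokenize", "tokens", "count", "items")]
results = run_pipeline(nodes, pipes, {"tokenize": {"text": "A b c"}})
```

Here the single pipe carries the tokens output of the first node into the items input of the second, so the second node never needs to know where its input came from.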
Runs
Run instances keep a record of every time an analysis version is run with a distinct set of inputs and outputs. If we ever try to execute a run with identical parameters, the RunManager will simply return the existing run.
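
The "return the existing run" behavior can be illustrated with a small in-memory sketch. This is not the RunManager implementation; RunRegistrySketch, get_or_execute, and the add function are hypothetical names, and the lookup key (analysis version plus JSON-serialized inputs) is an assumption made for the example.

```python
import json


class RunRegistrySketch:
    """Hypothetical in-memory registry that reuses runs with identical inputs."""

    def __init__(self):
        self._runs = {}
        self.executions = 0  # counts how many times an analysis actually ran

    def get_or_execute(self, analysis_version: str, interface, **inputs) -> dict:
        # Identical version and inputs map to the same key (assumes JSON-serializable inputs).
        key = (analysis_version, json.dumps(inputs, sort_keys=True))
        if key not in self._runs:
            self.executions += 1
            self._runs[key] = {"inputs": inputs, "outputs": interface(**inputs)}
        return self._runs[key]


def add(x: int, y: int) -> dict:
    return {"sum": x + y}


registry = RunRegistrySketch()
first = registry.get_or_execute("adder-1.0", add, x=1, y=2)
second = registry.get_or_execute("adder-1.0", add, x=1, y=2)  # reuses the first run
```

In this sketch the second call does not execute add again; it returns the very same record as the first call.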