
aspecd.tasks module

Constituents of a recipe-driven data analysis.

One main aspect of tasks is to provide the constituents of a recipe-driven data analysis, i.e. aspecd.tasks.Recipe and aspecd.tasks.Chef. In its simplest form, a recipe gets cooked by a chef, resulting in a series of tasks being performed on a list of datasets.

The idea of recipes here is to provide all necessary information for data processing and analysis in a simple, human-readable and human-writable form. This allows users not familiar with programming to perform even complex tasks. In addition, recipes can even be “executed” from the command line, without the need to start a Python interpreter.

From a user’s perspective, a recipe is usually stored in a YAML file. This makes it easy to create and modify recipes without knowing too much about the underlying processes. For an accessible overview of the YAML syntax, see the introduction provided by ansible.

Recipe-driven data analysis by example

Recipes always consist of two major parts: A list of datasets to operate on, and a list of tasks to be performed on the datasets. Of course, you can specify for each task on which datasets it should be performed, and if possible, whether it should be performed on each dataset separately or combined. The latter is particularly interesting for representations (e.g., plots) consisting of multiple datasets, or analysis steps spanning multiple datasets.

A first recipe

To give a first impression of what such a recipe may look like:

datasets:
  - loi:xxx
  - loi:yyy

tasks:
  - kind: processing
    type: SingleProcessingStep
    properties:
      parameters:
        param1: bar
        param2: foo
      prop2: blub
  - kind: singleanalysis
    type: SingleAnalysisStep
    properties:
      parameters:
        param1: bar
        param2: foo
      prop2: blub
    apply_to:
      - loi:xxx
    result: new_dataset

Here, tasks is a list of dictionary-style entries. The key kind determines which kind of task should be performed. For each kind, a class subclassing aspecd.tasks.Task needs to exist. For details, see below. The key type stores the name of the actual class, such as a concrete processing step derived from aspecd.processing.SingleProcessingStep. The dictionary properties contains keys corresponding to the attributes of the respective class. Depending on the type of task, additional keys can be used, such as apply_to to determine the datasets this task should be applied to, or result providing a label for a dataset created newly by an analysis task.
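To make the structure described above more tangible, here is a minimal sketch of the same recipe as the nested dict/list structure a YAML parser would typically return. The helper summarise_tasks() is hypothetical and not part of ASpecD, and the fallback to all datasets when apply_to is missing is an assumption made for this illustration.

```python
# The recipe from above, as the nested dict/list structure a YAML parser
# would typically return (written by hand here to stay self-contained).
recipe = {
    "datasets": ["loi:xxx", "loi:yyy"],
    "tasks": [
        {
            "kind": "processing",
            "type": "SingleProcessingStep",
            "properties": {
                "parameters": {"param1": "bar", "param2": "foo"},
                "prop2": "blub",
            },
        },
        {
            "kind": "singleanalysis",
            "type": "SingleAnalysisStep",
            "properties": {
                "parameters": {"param1": "bar", "param2": "foo"},
                "prop2": "blub",
            },
            "apply_to": ["loi:xxx"],
            "result": "new_dataset",
        },
    ],
}


def summarise_tasks(recipe_dict):
    """Return one line per task (hypothetical helper, not part of ASpecD)."""
    lines = []
    for task in recipe_dict["tasks"]:
        # Assumption: a task without "apply_to" targets all datasets.
        targets = task.get("apply_to", recipe_dict["datasets"])
        line = f"{task['kind']}/{task['type']} -> {', '.join(targets)}"
        if "result" in task:
            line += f" (result: {task['result']})"
        lines.append(line)
    return lines


for line in summarise_tasks(recipe):
    print(line)
```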

Note

The use of loi: markers in the example above points to a situation in which every dataset can be accessed by a unique identifier. For details, see the LabInform documentation.

Base directory for dataset import

There are different ways to refer to datasets, but the most common (for now) is to specify the (relative or absolute) path to the datasets within the local file system.

At the same time, the “paths” listed in the datasets list are used as internal references within the recipe. Therefore, short names are preferable.

To make things a bit easier, there is a way to define the source directory for datasets:

directories:
  datasets_source: /path/to/my/datasets/

datasets:
  - dataset1
  - dataset2

tasks:
  - kind: processing
    type: SingleProcessingStep

In this case, all dataset names will be treated relative to the source directory. Note that the datasets_source option accepts both an absolute path, as shown here for unixoid file systems, and a relative path, as shown in the second example below.

directories:
  datasets_source: relative/path/to/my/datasets/

datasets:
  - dataset1
  - dataset2

tasks:
  - kind: processing
    type: SingleProcessingStep

Here, paths have been given for unixoid file systems, using / as a separator. Adjust to your needs if necessary.
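The effect of datasets_source can be sketched in a few lines of Python. This is only an illustration of the idea, assuming that the directory is simply prepended to each dataset name; resolve_dataset_sources() is a hypothetical helper, and the actual handling within ASpecD may differ.

```python
import posixpath


def resolve_dataset_sources(recipe_dict):
    """Map each dataset name to the path it would be read from.

    Hypothetical helper for illustration; ASpecD's own logic may differ.
    """
    source_dir = recipe_dict.get("directories", {}).get("datasets_source", "")
    # Joining with an empty directory leaves the dataset name unchanged.
    return {
        name: posixpath.join(source_dir, name)
        for name in recipe_dict["datasets"]
    }


recipe = {
    "directories": {"datasets_source": "/path/to/my/datasets/"},
    "datasets": ["dataset1", "dataset2"],
}
print(resolve_dataset_sources(recipe))
```

Note that the short names (dataset1, dataset2) remain the references used within the recipe; only the location the data are read from changes.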

Output directory

Some tasks, namely plotting and report tasks, can save their results to files. By default, these files usually end up in the directory you cook the recipe from. However, sometimes it is quite convenient to specify an output directory, either relative or absolute.

To do so, simply add the output key to the directories block of your recipe:

directories:
  output: /absolute/path/for/the/outputs

datasets:
  - dataset

tasks:
  - kind: singleplot
    type: SinglePlotter
    properties:
      filename:
        - fancyfigure.pdf

As mentioned, this path can also be relative to the directory you cook your recipes from:

directories:
  output: relative/path/for/the/outputs

datasets:
  - dataset

tasks:
  - kind: singleplot
    type: SinglePlotter
    properties:
      filename:
        - fancyfigure.pdf

Here, paths have been given for unixoid file systems, using / as a separator. Adjust to your needs if necessary.
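As a rough sketch of what specifying an output directory amounts to, the following hypothetical helper output_file() combines the output directory from the recipe with the filename given in a task. It assumes plain path joining and is not the actual ASpecD implementation.

```python
from pathlib import PurePosixPath


def output_file(recipe_dict, filename):
    """Return where a task's file would end up (hypothetical helper)."""
    output_dir = recipe_dict.get("directories", {}).get("output", "")
    # An empty output directory means: relative to where the recipe is cooked.
    return str(PurePosixPath(output_dir) / filename)


recipe = {"directories": {"output": "/absolute/path/for/the/outputs"}}
print(output_file(recipe, "fancyfigure.pdf"))
```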

Tasks from other packages

Usually, you will use classes from your own package to perform the individual tasks. There is a simple way of doing that without having to prefix the kind property of every single task: define the default package name like so:

settings:
  default_package: my_package

datasets:
  - loi:xxx
  - loi:yyy

tasks:
  - kind: processing
    type: SingleProcessingStep

If you would like to use a class from a different package for only one task, feel free to prefix the “kind” attribute of the respective task, as shown:

tasks:
  - kind: some_other_package.processing
    type: SingleProcessingStep

Of course, in order to work, this package termed here “some_other_package” needs to follow the same basic rules and layout as the ASpecD framework and packages derived from it. In particular, if you use the “default_package” directive in your recipe, the given package needs to implement a child of the aspecd.dataset.DatasetFactory class.

To state the obvious: You can, of course, combine both strategies, defining a default package and overriding this for a particular task:

settings:
  default_package: my_package

datasets:
  - loi:xxx
  - loi:yyy

tasks:
  - kind: some_other_package.processing
    type: SingleProcessingStep
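How a default package and a prefixed kind could be combined into a fully qualified class name is sketched below. The helper full_class_name() and the mapping KIND_TO_MODULE are hypothetical; in particular, the mapping of the special kinds onto modules is an assumption made for this illustration, not the documented lookup of ASpecD.

```python
# Assumption: these special kinds map onto the module named on the right;
# every other kind is taken to be the module name itself.
KIND_TO_MODULE = {
    "singleanalysis": "analysis",
    "multianalysis": "analysis",
    "singleplot": "plotting",
    "multiplot": "plotting",
}


def full_class_name(kind, type_, default_package="aspecd"):
    """Return 'package.module.Class' for a task (hypothetical helper)."""
    if "." in kind:
        # An explicit prefix on "kind" overrides the default package.
        package, kind = kind.rsplit(".", 1)
    else:
        package = default_package
    module = KIND_TO_MODULE.get(kind, kind)
    return f"{package}.{module}.{type_}"


print(full_class_name("processing", "SingleProcessingStep", "my_package"))
print(full_class_name("some_other_package.processing", "SingleProcessingStep",
                      "my_package"))
```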

Setting own labels (and properties) for datasets

Usually, you specify the path (or any other unique and supported identifier) to your dataset(s) in the list of datasets at the beginning of a recipe, like this:

datasets:
  - /lengthly/path/to/dataset1
  - /lengthly/path/to/dataset2

In this case, you will have to refer to the datasets by their path (or whatever other identifier you used). Usually, these identifiers are quite lengthy, hence not necessarily convenient for use as labels within a recipe. However, you can set your own ids for datasets:

datasets:
  - source: /lengthly/path/to/dataset1
    id: dataset1
  - source: /lengthly/path/to/dataset2
    id: dataset2

Make sure to set the source value to the identifier of your dataset. For your id, you are free to choose, as long as it is a valid key for a dict. From now on, refer to the datasets by their respective ids throughout the recipe.

Note

If you use the source key but don’t specify an id key as well, the source will be used as id, as before.

However, you can take the whole thing one step further: Suppose you are tired of always having the dataset label (which is by default identical to the source it is imported from) appear in a figure legend, as it simply does not fit your needs. How about this:

datasets:
  - source: /lengthly/path/to/dataset1
    id: dataset1
    label: low concentration
  - source: /lengthly/path/to/dataset2
    id: dataset2
    label: high concentration

In this case, you assign the label field of your datasets upon loading them. The idea behind this: When specifying which dataset to load, you usually know best about such things, and you don’t want to have to deal with this later on when plotting.
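The two ways of listing datasets (plain identifier versus a mapping with source, id and further properties) can be brought into a common form as sketched below. The helper normalise_dataset_entry() is hypothetical and only mirrors the rules described in this section.

```python
def normalise_dataset_entry(entry):
    """Return a dict with at least 'source' and 'id' (hypothetical helper)."""
    if isinstance(entry, str):
        # Plain string: the identifier doubles as source and id.
        return {"source": entry, "id": entry}
    normalised = dict(entry)
    # No explicit id: fall back to the source, as described in the note.
    normalised.setdefault("id", normalised["source"])
    return normalised


datasets = [
    "/lengthly/path/to/dataset1",
    {"source": "/lengthly/path/to/dataset2", "id": "dataset2",
     "label": "high concentration"},
]
for entry in datasets:
    print(normalise_dataset_entry(entry))
```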

Important

Generally, each property of a dataset can be set this way. However, be careful not to override properties that are not scalar and cannot easily be represented in YAML in the recipe, as you will most certainly break things otherwise. A good example of how to definitely break things would be to override the data property of a dataset.

Import datasets from other packages

This may sound strange at first, but it appears to be more common than you might imagine: Sometimes, you need to compare datasets recorded using different methods that are in turn handled by different ASpecD-derived packages.

So how to import a dataset using the importer of a different package than the current one? The syntax for the recipe is much the same as the one described above for setting other properties of a dataset:

datasets:
  - source: /lengthly/path/to/dataset1
  - source: /lengthly/path/to/dataset2
    package: other_package

In this example, the first dataset will be imported using the default package set for the recipe, but the second dataset will be loaded using the aspecd.dataset.DatasetFactory and aspecd.io.DatasetImporterFactory classes from other_package. Of course, you need to make sure that other_package exists and contains both a dataset factory and a dataset importer factory. Furthermore, these two classes need to reside in the same modules as in the ASpecD framework, i.e., the dataset factory needs to reside in the “dataset” module and the dataset importer factory in the “io” module.

Specify importer for datasets

Sometimes it may be necessary to explicitly provide the importer class that shall be used to import a dataset. In this case, you can explicitly say which importer to use:

datasets:
  - source: /lengthly/path/to/dataset1
  - source: /lengthly/path/to/dataset2
    importer: TxtImporter

However, be careful to match data format and importer, as you are overriding the automatic importer determination of the aspecd.io.DatasetImporterFactory this way. Furthermore, make sure the respective importer class exists. Of course, this works as well with providing an alternative package:

datasets:
  - source: /lengthly/path/to/dataset1
  - source: /lengthly/path/to/dataset2
    package: other_package
    importer: TxtImporter

In this particular example, the importer located in other_package.io.TxtImporter would be used to import your dataset. Any importer parameters you provide (see the next section) are passed directly to the importer without further checking, and it is the sole responsibility of the importer class to make sense of them. Have a look at the documentation of the actual importer class you intend to use for the parameters you can set (if any). Note that many importers will not recognise additional parameters.
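How the package and importer keys of a dataset entry combine into the location of the importer class can be sketched as follows. The helper importer_class_path() is hypothetical; returning None merely stands for leaving the choice to the importer factory.

```python
def importer_class_path(entry, default_package="aspecd"):
    """Return 'package.io.Importer', or None if no importer is named.

    Hypothetical helper; None stands for "let the importer factory decide".
    """
    if "importer" not in entry:
        return None
    package = entry.get("package", default_package)
    return f"{package}.io.{entry['importer']}"


entry = {
    "source": "/lengthly/path/to/dataset2",
    "package": "other_package",
    "importer": "TxtImporter",
}
print(importer_class_path(entry))
```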

Specify importer parameters for datasets

Furthermore, sometimes you may want to provide parameters for an importer, e.g. in case of importing text files with headers, and you can do this as well:

datasets:
  - source: /lengthly/path/to/dataset1
  - source: /lengthly/path/to/dataset2
    importer: TxtImporter
    importer_parameters:
      skiprows: 3

You can even provide importer_parameters without explicitly specifying an importer to use, although this may lead to hard-to-detect behaviour, as in this case you rely on the automatic choice of the importer class implemented in aspecd.io.DatasetImporterFactory.

Referring to other datasets and results

Some tasks yield results you usually would want to use later on in the recipe. Prime examples are analysis steps and plots. While analysis steps have a property result that can refer to either a dataset or something else, depending on the actual type of analysis step, plots have a label that can be used to refer to them.

While analysis steps always yield results, processing steps usually operate on a dataset that gets modified in turn. However, sometimes it is desired to return the modified dataset as a new dataset, independent of the original one. In this case, specify a result here, too. For details, see the aspecd.tasks.ProcessingTask documentation below.

Variable replacement

In addition to the labels described above, variables will be parsed and replaced. Currently, the following types of variables are understood:

key1: {{ basename(id) }}
key2: {{ path(id) }}
key3: {{ id(id) }}

Here, id is the id used internally for referring to a dataset, {{ basename(id) }} will be replaced with the file basename of the respective dataset source, {{ path(id) }} will be replaced by the path of the respective dataset source, and {{ id(id) }} will be replaced by the id itself.

Note: The spaces within the double curly brackets only serve readability. They can be omitted, although this is not recommended.

Why is this interesting? Suppose you would like to create a rather generic recipe always performing the same tasks, but for different datasets. A rather minimal example is given below:

datasets:
  - source: /path/to/my_dataset.txt
    id: first_measurement

tasks:
  - kind: processing
    type: SubtractBaseline
    properties:
      parameters:
        kind: polynomial
        order: 0
  - kind: singleplot
    type: SinglePlotter
    properties:
      filename:
        - {{ basename(first_measurement) }}.pdf

Here, you can see that all you would need to do is to replace the source with the actual path to your dataset. This will automatically perform the tasks of the recipe on the given dataset, storing the plot to a file named my_dataset.pdf.
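The replacement of variables can be sketched with a regular expression. The helper replace_variables() below is hypothetical and not the ASpecD implementation; what exactly is substituted for path() is an assumption here (the directory part of the source), whereas the basename example reproduces the file name my_dataset.pdf mentioned above.

```python
import posixpath
import re

# Matches {{ basename(id) }}, {{ path(id) }} and {{ id(id) }}, with or
# without the spaces inside the curly brackets.
VARIABLE = re.compile(r"\{\{\s*(basename|path|id)\(\s*([^)\s]+)\s*\)\s*\}\}")


def replace_variables(text, sources):
    """Replace recipe variables in text (hypothetical helper).

    sources maps dataset ids to their source.
    """
    def substitute(match):
        function, dataset_id = match.groups()
        source = sources[dataset_id]
        if function == "basename":
            # File name without directory and without extension.
            return posixpath.splitext(posixpath.basename(source))[0]
        if function == "path":
            # Assumption: "path" means the directory part of the source.
            return posixpath.dirname(source)
        return dataset_id

    return VARIABLE.sub(substitute, text)


sources = {"first_measurement": "/path/to/my_dataset.txt"}
print(replace_variables("{{ basename(first_measurement) }}.pdf", sources))
```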

Executing recipes: serving the cooked results

As stated above, a recipe gets cooked by a chef, resulting in a series of tasks being performed on a list of datasets. However, as an (end) user you usually don’t care about chefs and recipes besides the human-readable and writable representation of a recipe in YAML format. Therefore, there is a fairly simple way to get a recipe executed, or, in terms of the metaphor of recipe and cook, to get the meal served:

serve <my-recipe>.yaml

No need to run a Python terminal, no need to instantiate any class. Simply execute a command from a terminal, that’s all there is to it. In this particular example, <my-recipe> is a placeholder for your recipe file name.

Of course, you can do the same from within Python:

from aspecd.tasks import serve
serve(recipe_filename='<my-recipe>.yaml')

And if you insist, of course there is an object-oriented way to do it:

from aspecd.tasks import ChefDeService

chef_de_service = ChefDeService()
chef_de_service.serve(recipe_filename='<my-recipe>.yaml')

The good news with all this: It should work for every package derived from the ASpecD framework, as long as you specify the default_package directive within the recipe. And of course, calling the recipe from the command-line will only help you if it creates some kind of output.

History of a recipe

The aspecd.tasks.Chef class takes care of automatically creating a history of the recipe cooked, with a full list of parameters for each task. This history is a dict that follows the same structure as the original recipe. Therefore, you can save this history to a YAML file and use it as a recipe again, perhaps after some modifications.

If you use the aspecd.tasks.ChefDeService class, you need not care about actually writing the history to a YAML file. Therefore, using this class or even the command-line call to serve as described above, is highly recommended. In this case, you will have a full history of all your tasks contained in a human-readable YAML file, together with some additional information on the system and package versions used to cook the recipe, as well as the time for start and end of cooking.

In short: The history of the recipe allows you to perform a fully reproducible data analysis even of multiple datasets and arbitrarily complex tasks without having to care about the details. You get it all for free. That’s what the ASpecD framework is all about. Care about the results of your data analysis and what they mean in terms of answering the scientific questions that originally triggered obtaining and analysing the data. Reproducibility has been taken care of for you.

Suppress automatically writing history

Warning

Not having a history of each individual step of your data analysis is considered bad practice and not consistent with reproducible research. Therefore, use the following only in debugging/development settings, never in real life applications.

In some particular cases, namely debugging and development, writing a history for each individual cooking of a recipe might be inconvenient. Therefore, you can tell ASpecD not to automatically write a history. However, use with extreme caution!

settings:
    write_history: false

To remind you of what you are doing, this will issue a warning on the command line if using serve.
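The interplay of the settings block with its defaults can be sketched as follows. The defaults are those shown for the settings of an empty recipe further below; effective_settings() is a hypothetical helper, and the wording of the warning is made up for this illustration.

```python
import warnings

# Defaults as shown for the "settings" block of an empty recipe.
DEFAULT_SETTINGS = {
    "default_package": "",
    "autosave_plots": True,
    "write_history": True,
}


def effective_settings(recipe_dict):
    """Merge the recipe's settings into the defaults (hypothetical helper)."""
    settings = {**DEFAULT_SETTINGS, **recipe_dict.get("settings", {})}
    if not settings["write_history"]:
        # Illustrative message only; not the text issued by ASpecD.
        warnings.warn("No history will be written for this recipe.")
    return settings


print(effective_settings({"settings": {"default_package": "my_package"}}))
```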

Kinds of tasks

Tasks can be grouped similarly to the way classes of the ASpecD framework are grouped into different modules. Hence, there are different kinds of tasks. Each task is internally represented by an aspecd.tasks.Task object, more precisely an object instantiated from a subclass of aspecd.tasks.Task. This polymorphism of task classes makes it possible to easily extend the scope of recipe-driven data analysis. Therefore, to allow ASpecD to know how to handle your task (i.e., what task object to create), you need to specify the kind of your task within the recipe, besides the type that is the class name of the actual class performing the respective task.

Currently, the following subclasses are implemented:

As you can see from the above list, there are (currently) three special cases of kinds of tasks: processing, analysis and plot tasks. Usually, you will set the kind of a task in a recipe to the module in which the class eventually performing the task resides. As both analyses and plots can span either one or several datasets, we have to discriminate here. Therefore, it is essential that you set the kind value in your recipe for analysis tasks to either singleanalysis or multianalysis. The same is true for plots (singleplot or multiplot). To make this a bit easier to follow, see the example below.

tasks:
  - kind: processing
    type: SingleProcessingStep

  - kind: singleanalysis
    type: AnalysisStep

  - kind: multiplot
    type: MultiPlotter1D

Important

As long as there is no automatic syntax checking of recipes before they get executed, you alone are responsible for providing correct syntax. From experience, a few problems arise frequently: Don’t use analysis, but either singleanalysis or multianalysis as kind in an analysis step. The same applies to plots: Don’t use plotting, but either singleplot or multiplot as kind.
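Until such syntax checking exists, a small check of your own can catch the two mistakes named above before a recipe gets cooked. The following is a minimal sketch operating on a recipe already loaded into a dict; check_kinds() is a hypothetical helper and not part of ASpecD.

```python
# Kinds named above as frequent mistakes, with the kinds to use instead.
COMMON_MISTAKES = {
    "analysis": ("singleanalysis", "multianalysis"),
    "plotting": ("singleplot", "multiplot"),
}


def check_kinds(recipe_dict):
    """Return a list of messages for suspicious kinds (hypothetical helper)."""
    messages = []
    for number, task in enumerate(recipe_dict.get("tasks", []), start=1):
        # Ignore an optional package prefix such as "my_package.processing".
        kind = task.get("kind", "").rsplit(".", 1)[-1]
        if kind in COMMON_MISTAKES:
            alternatives = " or ".join(COMMON_MISTAKES[kind])
            messages.append(f"Task {number}: use {alternatives}, not {kind}")
    return messages


recipe = {
    "tasks": [
        {"kind": "processing", "type": "SingleProcessingStep"},
        {"kind": "analysis", "type": "AnalysisStep"},
        {"kind": "my_package.plotting", "type": "MultiPlotter1D"},
    ]
}
for message in check_kinds(recipe):
    print(message)
```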

Properties of tasks

For each task, you can set all attributes of the underlying class using the properties dictionary in the recipe. Therefore, to find out which parameters can be set for which kind of task, simply check the documentation of the respective classes. For example, for a task represented by an aspecd.tasks.ProcessingTask object, check out the appropriate class from the aspecd.processing module. The same is true for packages derived from ASpecD.

A simple example is the normalisation processing step using the aspecd.processing.Normalisation class:

tasks:
  - kind: processing
    type: Normalisation
    properties:
      parameters:
        kind: amplitude

How to know what properties can be set? Have a look at the aspecd.processing.Normalisation documentation. Note that all properties that are documented there can be set using a recipe. As processing steps always have a property parameters that is a dict, you need to set the individual keys of this dictionary.
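Conceptually, the properties dictionary of a task is mapped onto the attributes of the underlying object. The sketch below illustrates this with a stand-in class that is not the real aspecd.processing.Normalisation; its default values are arbitrary, and apply_properties() is a hypothetical helper, so the actual mechanism in ASpecD may differ.

```python
class Normalisation:
    """Stand-in for a processing step; not the real ASpecD class."""

    def __init__(self):
        # Arbitrary defaults, chosen only for this illustration.
        self.parameters = {"kind": "maximum", "range": None}
        self.comment = ""


def apply_properties(obj, properties):
    """Set attributes of obj from a properties dict (hypothetical helper)."""
    for key, value in properties.items():
        current = getattr(obj, key)  # raises AttributeError for unknown keys
        if isinstance(current, dict) and isinstance(value, dict):
            # Dict-valued attributes such as "parameters": update single keys.
            current.update(value)
        else:
            setattr(obj, key, value)
    return obj


step = apply_properties(Normalisation(), {"parameters": {"kind": "amplitude"}})
print(step.parameters)
```

Updating the parameters dict key by key, rather than replacing it, keeps the defaults of all keys not mentioned in the recipe.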

Additionally, for each task, you can explicitly state which of the datasets it should be applied to. Note that not only the datasets initially loaded can be used here, but also all labels referring to datasets that originate from other tasks.

Furthermore, depending on the kind of task, you may be able to set additional parameters controlling in more detail how the particular task is performed. For details, see the documentation of the respective task subclass in this module below.
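How labels given in apply_to might be looked up among both the datasets initially loaded and the results of previous tasks is sketched below. The helper resolve_targets() is hypothetical, and the fallback to all datasets if apply_to is missing is an assumption of this sketch.

```python
def resolve_targets(task, datasets, results):
    """Return the objects a task operates on (hypothetical helper).

    datasets and results map labels to objects, in the spirit of the
    "datasets" and "results" attributes of a recipe.
    """
    # Assumption: without "apply_to", the task is applied to all datasets.
    labels = task.get("apply_to", list(datasets))
    targets = []
    for label in labels:
        if label in datasets:
            targets.append(datasets[label])
        elif label in results:
            targets.append(results[label])
        else:
            raise KeyError(f"Unknown dataset label: {label}")
    return targets


datasets = {"dataset1": "<dataset 1>", "dataset2": "<dataset 2>"}
results = {"new_dataset": "<result of an analysis task>"}
print(resolve_targets({"apply_to": ["dataset1", "new_dataset"]},
                      datasets, results))
```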

Prerequisites for recipe-driven data analysis

Note

This section is mostly relevant for those developing packages based on the ASpecD framework. Users of recipe-driven data analysis usually need not bother about these details (as others did for them already).

To be able to use recipe-driven data analysis in packages derived from the ASpecD framework, a series of prerequisites needs to be met, i.e., classes implemented. Besides the usual suspects such as aspecd.dataset.Dataset and its constituents as well as the different processing and analysis steps based on aspecd.processing.SingleProcessingStep and aspecd.analysis.SingleAnalysisStep, two different factory classes need to be implemented in particular, subclassing

  • aspecd.dataset.DatasetFactory

  • aspecd.io.DatasetImporterFactory

respectively. Actually, only aspecd.dataset.DatasetFactory is directly used by aspecd.tasks.Recipe. Internally, however, it relies on the existence of aspecd.io.DatasetImporterFactory to return a dataset based solely on a (unique) ID.

Besides implementing these classes, the facilities provided by the aspecd.tasks module should be fully sufficient for regular recipe-driven data analysis. In particular, normally there should be no need to subclass any of the classes within this module in a package derived from the ASpecD framework. One particular design goal of recipe-driven data analysis is to decouple the actual tasks being performed from the general handling of recipes. The former is implemented within each respective package built upon the ASpecD framework, the latter is taken care of fully by the ASpecD framework itself. You might want to implement a simple proxy within a derived package to prevent the user from having to call out to functionality provided directly by the ASpecD framework. The latter might be confusing for those unfamiliar with the underlying details, i.e., most common users. More explicitly, you may want to create proxy classes in the processing and analysis modules of your package, subclassing all the concrete processing and analysis steps already provided with the ASpecD framework.

Notes for developers

Note

This section is only relevant for those further developing the ASpecD framework. Users of recipe-driven data analysis as well as developers of packages derived from the ASpecD framework usually need not bother about these details (as others did for them already).

Recipe-driven data analysis introduces another level of abstraction and indirection with its use of recipes in YAML format. Following the cooking analogy, we have an aspecd.tasks.Recipe consisting of a list of datasets and a list of aspecd.tasks.Task to be performed on the datasets. Such a recipe gets “cooked” by an aspecd.tasks.Chef, and for the convenience of the user of recipe-driven data analysis, the result gets “served” by the aspecd.tasks.ChefDeService. An actual user will not see any of this, but simply call serve <recipe-name.yaml> from the command line.

Internally, recipes are represented by an instance of aspecd.tasks.Recipe, and this representation already takes care of importing the datasets specified in the datasets block of a recipe. Therefore, all handling of data import needs to be done here. Similarly, upon populating a recipe (from dict or by importing), the tasks will already be created using an aspecd.tasks.TaskFactory.

The actual tasks are represented by instances of subclasses of aspecd.tasks.Task, and they in turn create an instance of the actual object internally, applying this to the dataset(s).

“Cooking” a recipe is done by aspecd.tasks.Chef, and this class takes care of writing a history in form of an executable recipe, thus ensuring reproducibility and good scientific practice.

“Serving” the results of a cooked recipe is eventually the responsibility of the aspecd.tasks.ChefDeService, and it is this class calling out to the aspecd.tasks.Chef and writing the history to an actual file that can be used as recipe again. For the convenience of the user, an entry point (console script) is included in the setup.py file calling aspecd.tasks.serve() that in turn takes care of loading the recipe and instantiating an aspecd.tasks.ChefDeService.

Module documentation

Todo

There are a number of things that are not yet implemented, but highly recommended for a working recipe-driven data analysis that follows good practice for reproducible research. This includes (but may not be limited to):

  • Parser for recipes performing a static analysis of their syntax. Useful particularly for larger datasets and/or longer lists of tasks.

class aspecd.tasks.Recipe

Bases: object

Recipes get cooked by chefs in recipe-driven data analysis.

A recipe contains a list of tasks to be performed on a list of datasets. To actually carry out all tasks in a recipe, it is handed over to an aspecd.tasks.Chef object for cooking using the respective aspecd.tasks.Chef.cook() method.

From a user’s perspective, recipes usually reside in YAML files from where they are imported into an aspecd.tasks.Recipe object using its import_from() method and an object of class aspecd.io.RecipeYamlImporter. Similarly, a given recipe can be exported back to a YAML file using the export_to() method and an object of class aspecd.io.RecipeYamlExporter.

In contrast to the persistent form of a recipe (e.g., as file on the file system), the object contains actual datasets and tasks that are objects of the respective classes. Therefore, the attributes of a recipe are normally set by the respective methods from either a file or a dictionary (that in turn will normally be created from contents of a file).

Retrieving datasets is delegated to an aspecd.dataset.DatasetFactory instance stored in dataset_factory. This provides a maximum of flexibility but makes it necessary to specify (and first implement) such factory in packages derived from the ASpecD framework.

Todo

Can recipes have LOIs themselves and therefore be retrieved from the extended data safe? Might be a sensible option, although generic (and at the same time unique) LOIs for recipes are much harder to create than LOIs for datasets and the like.

Generally, the concept of a LOI is nothing a recipe needs to know about. But it does know about an ID of any kind. Whether this ID is a (local) path or a LOI doesn’t matter. Somewhere in the ASpecD framework there may exist a resolver (factory) for handling IDs of any kind and eventually retrieving the respective information.

datasets

Ordered dictionary of datasets the tasks should be performed for

Each dataset is an object of class aspecd.dataset.Dataset.

The keys are the dataset ids.

Type

collections.OrderedDict

tasks

List of tasks to be performed on the datasets

Each task is an object of class aspecd.tasks.Task.

Type

list

results

Ordered dictionary of results originating from analysis tasks

Results can be of any type, but are mostly either instances of aspecd.dataset.Dataset or aspecd.metadata.PhysicalQuantity.

The keys are those defined by aspecd.tasks.SingleanalysisTask.result and aspecd.tasks.MultianalysisTask.result, respectively.

Type

collections.OrderedDict

figures

Ordered dictionary of figures originating from plotting tasks

Each entry is an object of class aspecd.tasks.FigureRecord.

Type

collections.OrderedDict

plotters

Ordered dictionary of plotters originating from plotting tasks

Each entry is an object of class aspecd.plotting.Plotter.

To end up in the list of plotters, the plot task needs to define a result. This is mainly used for tasks involving CompositePlotters, to define the plotters for each individual plot panel.

Type

collections.OrderedDict

dataset_factory

Factory for datasets used to retrieve datasets

If no factory is set, but a recipe is imported from a file or set from a dictionary, an exception will be raised.

Type

aspecd.dataset.DatasetFactory

task_factory

Factory for tasks

Defaults to an object of class aspecd.tasks.TaskFactory.

If no factory is set, but a recipe is imported from a file or set from a dictionary, an exception will be raised.

Type

aspecd.tasks.TaskFactory

format

Information on the format of the recipe

This information is used to automatically convert recipes to the current format.

Dictionary with the following fields:

type: str

Information on the type of file

Shall never be changed.

Default: ASpecD recipe

version: str

Version of the recipe structure (in form X.Y)

This information is used to automatically convert recipes to the current format.

Defaults to the latest version.

New in version 0.4.

Type

dict

settings

General settings relevant for cooking the recipe.

Dictionary with the following fields:

default_package: str

Name of the package the task objects are obtained from

If no name for a default package is supplied, “aspecd” is used.

autosave_plots: bool

Whether to save plots automatically even if no filename is provided.

If true, each aspecd.tasks.SingleplotTask and aspecd.tasks.MultiplotTask will save the plots to default file names. For details, see the documentation of the respective classes.

Default: True

New in version 0.2.

write_history: bool

Whether to write a history when serving the recipe.

If true, for each serving of a recipe, a history will be written into a file with same base name and timestamp appended. In terms of reproducible research, it is highly recommended to always write a history. However, when debugging, you may set this to “False”.

Default: True

New in version 0.4.

Changed in version 0.4: Moved properties to keys in this dictionary

Type

dict

directories

Optional control of different directories.

Dictionary with the following fields:

datasets_source: str

Root directory for the datasets.

Interpreted as absolute path if starting with the system-specific file separator. Otherwise, interpreted as relative to the current directory. If provided, all dataset names will be treated relative to this directory.

output: str

Directory to save output (plots, reports, …) to.

Interpreted as absolute path if starting with the system-specific file separator. Otherwise, interpreted as relative to the current directory. If provided, all output resulting from cooking a recipe will be saved to this path.

Make sure the path actually exists. Otherwise, you may run into trouble when tasks try to save their output.

Changed in version 0.4: Moved properties to keys in this dictionary

Type

dict

filename

Name of the (YAML) file the recipe was loaded from.

Empty string if recipe was loaded from a dictionary instead.

The filename can be used to persist the history of a cooked recipe in form of a YAML file for full reproducibility. This will be done when using the aspecd.tasks.ChefDeService class and its aspecd.tasks.ChefDeService.serve() method.

Type

str

Raises
  • aspecd.tasks.MissingDictError – Raised if no dict is provided.

  • aspecd.tasks.MissingImporterError – Raised if no importer is provided.

  • aspecd.tasks.MissingExporterError – Raised if no exporter is provided.

  • aspecd.tasks.MissingTaskFactoryError – Raised if task_factory is invalid.

Changed in version 0.4: Move properties “default_package” and “autosave_plots” to new property “settings”; move properties “output_directory” and “datasets_source” to new property “directories”.

from_dict(dict_=None)

Set attributes from dictionary.

Loads datasets and creates aspecd.tasks.Task objects that are stored as lists respectively.

Parameters

dict_ (dict) – Dictionary containing information of a recipe.

Raises
  • aspecd.tasks.MissingDictError – Raised if no dict is provided.

  • aspecd.tasks.MissingDatasetFactoryError – Raised if dataset_factory is invalid.

to_dict()

Return dict from attributes.

Returns

dict_ – Dictionary with fields “datasets” and “tasks”

Return type

dict

to_yaml()

Create YAML representation of recipe.

As users will interact with recipes primarily in form of YAML files, this method conveniently returns the YAML representation of an empty recipe that can serve as a starting point for your own recipes.

Returns

yaml – YAML representation of a recipe

Return type

str

Examples

To get the YAML representation of an empty recipe, create a recipe object and call this method:

recipe = aspecd.tasks.Recipe()
print(recipe.to_yaml())

The result of the last line (the print statement from above) will look as follows:

format:
  type: ASpecD recipe
  version: '0.2'
settings:
  default_package: ''
  autosave_plots: true
  write_history: true
directories:
  output: ''
  datasets_source: ''
datasets: []
tasks: []

As you can see, you will get the full set of settings currently allowed within a recipe. Not all of them are strictly necessary, and for details, you still need to look into the documentation. But this can make it much easier to start with writing recipes.

Similarly, you can get YAML representations of tasks that you can add to your recipe. For details, see the documentation of the Task.to_yaml() method.

New in version 0.5.

import_from(importer=None)

Import recipe using importer.

Importers can be created to read recipes from different sources. Thus the recipe as such is entirely independent of the persistence layer.

Parameters

importer (aspecd.io.RecipeImporter) – importer used to actually import recipe

Raises

aspecd.tasks.MissingImporterError – Raised if no importer is provided

export_to(exporter=None)

Export recipe using exporter.

Exporters can be created to write recipes to different targets. Thus the recipe as such is entirely independent of the persistence layer.

Parameters

exporter (aspecd.io.RecipeExporter) – exporter used to actually export recipe

Raises

aspecd.tasks.MissingExporterError – Raised if no exporter is provided

get_dataset(identifier='')

Return dataset corresponding to given identifier.

If you have a list of identifiers, use the similar method aspecd.tasks.Recipe.get_datasets().

Parameters

identifier (str) – Identifier matching the aspecd.dataset.Dataset.id attribute.

Returns

dataset – Dataset corresponding to given identifier

If no dataset corresponding to the given identifier could be found, None is returned.

Return type

aspecd.dataset.Dataset

Raises

aspecd.tasks.MissingDatasetIdentifierError – Raised if no identifier is provided.

get_datasets(identifiers=None)

Return datasets corresponding to given list of identifiers.

If you have a single identifier, use the similar method aspecd.tasks.Recipe.get_dataset().

Parameters

identifiers (list) – Identifiers matching the aspecd.dataset.Dataset.id attribute.

Returns

datasets – Datasets corresponding to given identifiers

Each dataset is an instance of aspecd.dataset.Dataset.

If no datasets corresponding to the given identifiers could be found, an empty list is returned.

Return type

list

Raises

aspecd.tasks.MissingDatasetIdentifierError – Raised if no identifiers are provided.
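The different return values of the two lookup methods (None versus an empty list) can be illustrated as follows. This is a sketch under assumptions: `DatasetStore` and the local exception class are hypothetical stand-ins, and datasets are represented by plain strings.

```python
class MissingDatasetIdentifierError(Exception):
    """Local stand-in for the exception described above."""


class DatasetStore:
    """Hypothetical container mapping identifiers to datasets."""

    def __init__(self, datasets):
        self._datasets = dict(datasets)

    def get_dataset(self, identifier=""):
        if not identifier:
            raise MissingDatasetIdentifierError("No identifier provided")
        # None signals that no matching dataset was found.
        return self._datasets.get(identifier)

    def get_datasets(self, identifiers=None):
        if not identifiers:
            raise MissingDatasetIdentifierError("No identifiers provided")
        # Unknown identifiers are skipped, so the list may be empty.
        return [self._datasets[id_] for id_ in identifiers
                if id_ in self._datasets]


store = DatasetStore({"loi:xxx": "dataset X", "loi:yyy": "dataset Y"})
print(store.get_dataset("loi:xxx"))
print(store.get_dataset("loi:zzz"))
print(store.get_datasets(["loi:yyy", "loi:zzz"]))
```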

class aspecd.tasks.Chef(recipe=None)

Bases: object

Chefs cook recipes in recipe-driven data analysis.

As a result, they create a full history of the tasks performed, including all parameters, implicit and explicit. In this respect, they make the history independent of a single dataset and allow tracing the processing and analysis of multiple datasets.

Note

One necessary prerequisite for full reproducibility is therefore some kind of persistent and unique identifier for each dataset. The “Lab Object Identifier” (LOI), as used within the LabInform framework, is one such identifier.

To persist the history of cooking a recipe, the contents of the history attribute should be saved as a YAML file. There are two ways to do that: manually and fully automated. If you manually instantiate an object of the aspecd.tasks.Chef class, you need to do that on your own, as follows:

import aspecd.tasks
import aspecd.utils

chef = aspecd.tasks.Chef()
# ... obtaining recipe from file
chef.cook(recipe)

# Persist the history of the cooked recipe as a YAML file
yaml = aspecd.utils.Yaml()
yaml.dict = chef.history
yaml.write_to(filename='<my-recipe-history>.yaml')

The other way is to use an instance of the aspecd.tasks.ChefDeService class and its aspecd.tasks.ChefDeService.serve() method:

chef_de_service = aspecd.tasks.ChefDeService()
chef_de_service.serve(recipe_filename='my_recipe.yaml')

This will automatically save the recipe history for you as a YAML file with its filename derived from the original recipe name. For details, see the documentation of the aspecd.tasks.ChefDeService class.

The YAML files generated from saving the history should work as recipes themselves, allowing a full round trip as well as easy modification of a recipe.

recipe

Recipe to cook, i.e. to carry out

Type

aspecd.tasks.Recipe

history

History of cooking the recipe

Contains a complete record of each task performed, including all parameters, implicit and explicit. Additionally, contains system information as collected by the aspecd.system.SystemInfo class.

Can be exported to a YAML file that works as a recipe.

Type

collections.OrderedDict

Parameters

recipe (aspecd.tasks.Recipe) – Recipe to cook, i.e. to carry out

Raises

aspecd.tasks.MissingRecipeError – Raised if no recipe is available to be cooked

cook(recipe=None)

Cook recipe, i.e. carry out tasks contained therein.

A recipe is an object of class aspecd.tasks.Recipe and contains both a list of datasets and a list of tasks to be performed on these datasets.

Parameters

recipe (aspecd.tasks.Recipe) – Recipe to cook, i.e. tasks to carry out on particular datasets

Raises

aspecd.tasks.MissingRecipeError – Raised if no recipe is available to be cooked
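Conceptually, cooking a recipe means performing each task in turn and recording it in the history. The following is a minimal sketch of that idea, not the actual ASpecD code: `MiniTask`, `MiniChef`, the dict-based recipe, and the history keys are all assumptions made for illustration.

```python
from collections import OrderedDict


class MissingRecipeError(Exception):
    """Local stand-in for the exception described above."""


class MiniTask:
    """Hypothetical task that upper-cases the names of its datasets."""

    def __init__(self, kind, type_):
        self.kind = kind
        self.type = type_

    def perform(self, datasets):
        return [dataset.upper() for dataset in datasets]

    def to_dict(self):
        return OrderedDict([("kind", self.kind), ("type", self.type)])


class MiniChef:
    """Hypothetical chef: performs each task and records it in a history."""

    def __init__(self, recipe=None):
        self.recipe = recipe
        self.history = OrderedDict()

    def cook(self, recipe=None):
        if recipe is not None:
            self.recipe = recipe
        if self.recipe is None:
            raise MissingRecipeError("No recipe to cook")
        self.history["datasets"] = list(self.recipe["datasets"])
        self.history["tasks"] = []
        for task in self.recipe["tasks"]:
            task.perform(self.recipe["datasets"])
            # Record what has been done, in the order it was done.
            self.history["tasks"].append(task.to_dict())


recipe = {"datasets": ["loi:xxx"],
          "tasks": [MiniTask("processing", "SingleProcessingStep")]}
chef = MiniChef()
chef.cook(recipe)
print(list(chef.history.keys()))
```

Because the history has the same two top-level parts as a recipe, it is plausible that it can serve as a recipe again, which is the property the documentation describes.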

class aspecd.tasks.Task(recipe=None)

Bases: aspecd.utils.ToDictMixin

Base class storing information for a single task.

Different underlying objects used to actually perform the respective task have different requirements and different signatures. In order to generically perform a task, this class needs to be subclassed for each kind of task, such as processing, analysis, or plotting. For a number of basic tasks available in the ASpecD package, this has already been done; see the subclasses documented in this module, e.g. aspecd.tasks.ProcessingTask and aspecd.tasks.AnalysisTask.

Note that imports of datasets are usually not handled using tasks, as this is taken care of automatically by defining a list of datasets in an aspecd.tasks.Recipe.

Usually, you need not care to instantiate objects of the correct type, as this is done automatically by the aspecd.tasks.Recipe using the aspecd.tasks.TaskFactory.

kind

Kind of task.

Usually corresponds to the module name the type (class) is defined in. See the note below for special cases.

Type

str

type

Type of task.

Corresponds to the class name eventually responsible for performing the task.

Type

str

package

Name of the package the class eventually responsible for performing the task belongs to.

Type

str

properties

Properties necessary to perform the task.

Should have keys corresponding to the properties of the class given as type attribute.

Generally, all keys in aspecd.tasks.Task.properties will be mapped to the underlying object created to perform the actual task.

In contrast, all additional attributes of a given task object subclassing aspecd.tasks.Task that are specific to the task object as such and its operation, but not for the object created by the task object to perform the task, are not part of the aspecd.tasks.Task.properties dict. For a recipe, this means that these additional attributes are at the same level as aspecd.tasks.Task.properties.

Type

dict
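The distinction between keys inside properties (mapped to the underlying object) and keys at the task level (used by the task itself) can be sketched as follows. `FakeProcessingStep` and `apply_properties` are hypothetical names, and the key-by-key update of dict-valued attributes is an assumption made for this illustration rather than documented ASpecD behaviour.

```python
class FakeProcessingStep:
    """Hypothetical underlying object with two settable properties."""

    def __init__(self):
        self.parameters = {"param1": None, "param2": None}
        self.prop2 = ""


def apply_properties(obj, properties):
    """Copy each key of ``properties`` onto the matching attribute of ``obj``.

    Dict-valued attributes are updated key by key rather than replaced,
    so defaults not mentioned in ``properties`` survive.
    """
    for key, value in properties.items():
        current = getattr(obj, key, None)
        if isinstance(current, dict) and isinstance(value, dict):
            current.update(value)
        else:
            setattr(obj, key, value)
    return obj


task_dict = {
    "kind": "processing",
    "type": "SingleProcessingStep",
    "properties": {"parameters": {"param1": "bar"}, "prop2": "blub"},
    "apply_to": ["loi:xxx"],  # task-level key, not passed to the object
}
step = apply_properties(FakeProcessingStep(), task_dict["properties"])
print(step.parameters, step.prop2)
```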

apply_to

List of datasets the task should be applied to.

Defaults to an empty list, meaning that the task will be performed for all datasets contained in an aspecd.tasks.Recipe.

Each dataset is referred to by the value of its aspecd.dataset.Dataset.source attribute. This should be unique and can consist of a filename, path, URL/URI, LOI, or the like.

Type

list
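The default behaviour of apply_to (an empty list selects all datasets of the recipe) amounts to a simple fallback. The helper below is a hypothetical sketch of that rule, with dataset sources given as plain strings.

```python
def select_datasets(recipe_datasets, apply_to=None):
    """Return the sources a task should operate on (hypothetical helper).

    An empty or missing ``apply_to`` selects every dataset in the recipe.
    """
    if not apply_to:
        return list(recipe_datasets)
    return list(apply_to)


all_sources = ["loi:xxx", "loi:yyy"]
print(select_datasets(all_sources))
print(select_datasets(all_sources, apply_to=["loi:xxx"]))
```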

recipe

Recipe containing the task and the list of datasets the task refers to

Type

aspecd.tasks.Recipe

Note

A note to developers: Usually, the aspecd.tasks.Task.kind attribute is identical to the module name the respective class resides in. However, sometimes this is not the case, as with the plotters. In this case, an additional, non-public attribute aspecd.tasks.Task._module can be set in classes derived from aspecd.tasks.Task.

Raises
  • aspecd.tasks.MissingDictError – Raised if no dict is provided when calling from_dict().

  • aspecd.tasks.MissingRecipeError – Raised if no recipe is available upon performing the task.

from_dict(dict_=None)

Set attributes from dictionary.

Parameters

dict_ (dict) – Dictionary containing information of a task.

Raises

aspecd.tasks.MissingDictError – Raised if no dict is provided.

to_dict()

Create dictionary containing public attributes of an object.

Furthermore, replace certain objects with their respective labels provided in the recipe. These objects currently include datasets, results, figures (i.e. figure records), and plotters.

Returns

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type

collections.OrderedDict

Changed in version 0.4: (Implicit) parameters of underlying task object are added

to_yaml()

Create YAML representation of task.

As users will use tasks primarily within recipes, i.e. in YAML files, this method conveniently returns the YAML representation of a task.

Returns

yaml – YAML representation of a task

Return type

str

Examples

To get the YAML representation of a concrete task, you need to create a task object of the appropriate class and assign a type to it:

task = aspecd.tasks.ProcessingTask()
task.type = 'Normalisation'
print(task.to_yaml())

The result of the last line (the print statement from above) will look as follows:

kind: processing
type: Normalisation
properties:
  parameters:
    kind: maximum
    range: null
    range_unit: index
    noise_range: null
    noise_range_unit: percentage
apply_to: []
result: ''
comment: ''

As you can see, you will get the full set of settings for this particular task. Not all of them are strictly necessary, and for details, you still need to look into the respective documentation. But this can make it much easier to add tasks to a recipe.

Similarly, you can get the YAML representation of an empty recipe to which you can add your tasks. For details, see the documentation of the Recipe.to_yaml() method.

New in version 0.5.

perform()

Call the appropriate method of the underlying object.

The actual implementation is contained in the non-public method aspecd.tasks.Task._perform().

Different underlying objects have different methods used to actually perform the respective task. In order to generically perform a task, classes derived from the aspecd.tasks.Task base class need to override aspecd.tasks.Task._perform() accordingly.

Use aspecd.tasks.Task.get_object() to get an instance of the actual object necessary to perform the task, and afterwards call its appropriate method.

Similarly, to get the actual dataset using the dataset id stored in aspecd.tasks.Task.apply_to, use the method aspecd.tasks.Recipe.get_dataset() of the recipe stored in aspecd.tasks.Task.recipe.

Raises

aspecd.tasks.MissingRecipeError – Raised if no recipe is available.
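The split between the public perform() and the non-public _perform() follows the template method pattern. The sketch below illustrates the pattern with hypothetical classes (`BaseTask`, `DoublingTask`) and a plain dict standing in for the recipe and its get_dataset() lookup; it is not the ASpecD implementation.

```python
class MissingRecipeError(Exception):
    """Local stand-in for the exception described above."""


class BaseTask:
    """Hypothetical base class: perform() checks, _perform() does the work."""

    def __init__(self, recipe=None):
        self.recipe = recipe
        self.apply_to = []

    def perform(self):
        if self.recipe is None:
            raise MissingRecipeError("No recipe available")
        self._perform()

    def _perform(self):
        """Overridden by subclasses with the task-specific call."""


class DoublingTask(BaseTask):
    """Hypothetical task doubling all values of the selected datasets."""

    def _perform(self):
        for dataset_id in self.apply_to:
            # The dict lookup stands in for recipe.get_dataset(dataset_id).
            values = self.recipe[dataset_id]
            self.recipe[dataset_id] = [2 * value for value in values]


recipe = {"loi:xxx": [1, 2, 3], "loi:yyy": [4, 5]}
task = DoublingTask(recipe=recipe)
task.apply_to = ["loi:xxx"]
task.perform()
print(recipe)
```

Keeping the checks in perform() means each subclass only needs to describe what is specific to its kind of task.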

get_object()

Return object for a particular task including all attributes.

Returns

obj – Object of a class defined in the type attribute of a task

Return type

object
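One way to picture how an object could be created from the package, kind, and type attributes is dynamic import via importlib. This is a sketch under assumptions: the function below is a hypothetical free-standing helper, not the method documented here, and it uses a standard-library class (json.decoder.JSONDecoder) as a stand-in so that it runs without ASpecD installed.

```python
import importlib


def get_object(package, kind, type_, properties=None):
    """Instantiate ``package.kind.type_`` and set its properties."""
    module = importlib.import_module(f"{package}.{kind}")
    obj = getattr(module, type_)()
    for key, value in (properties or {}).items():
        setattr(obj, key, value)
    return obj


# Stand-in: package "json", module "decoder", class "JSONDecoder"
decoder = get_object("json", "decoder", "JSONDecoder", {"strict": False})
print(type(decoder).__name__, decoder.strict)
```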

class aspecd.tasks.ProcessingTask(recipe=None)

Bases: aspecd.tasks.Task

Processing step defined as task in recipe-driven data analysis.

Processing steps will always be performed individually for each dataset.

For more information on the underlying general class, see aspecd.processing.SingleProcessingStep.

For an example of how such a processing task may be included in a recipe, see the YAML listing below:

kind: processing
type: SingleProcessingStep
properties:
  parameters:
    param1: bar
    param2: foo
  comment: >
    Some free text describing the processing step in more detail
apply_to:
  - loi:xxx

Note that you can refer to datasets and results created during cooking of a recipe using their respective labels. Those labels will automatically be replaced by the actual dataset/result prior to performing the task.

Sometimes it is handy to compare different processing steps on the same original dataset, e.g. a series of different parameters. Think of a polynomial baseline correction where you would like to compare the effect of polynomials of different order. In this case, you want to work on copies of the original dataset and have the results stored in addition. To do so, provide a result label:

kind: processing
type: SingleProcessingStep
result: label

And if you now want to do that for multiple datasets, you can do that as well. However, make sure to provide as many result labels as you have datasets to perform the processing step on, as otherwise no result will be stored and the processing step will operate on the original datasets:

kind: processing
type: SingleProcessingStep
apply_to:
  - loi:xxx
  - loi:yyy
result:
  - label1
  - label2
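The rule that the number of result labels must match the number of datasets can be expressed as a simple check. The helper below is a hypothetical sketch of that rule only; how ASpecD stores results internally is not shown here.

```python
def assign_results(apply_to, result):
    """Pair each dataset with a result label, or return {} on a mismatch.

    ``result`` may be a single label (str) or a list of labels.
    """
    if isinstance(result, str):
        result = [result] if result else []
    if len(result) != len(apply_to):
        # Mismatch: no result is stored, as described above.
        return {}
    return dict(zip(apply_to, result))


print(assign_results(["loi:xxx", "loi:yyy"], ["label1", "label2"]))
print(assign_results(["loi:xxx", "loi:yyy"], ["label1"]))
```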

Another thing that can be very useful for data processing is to add a comment to an individual step, e.g. with an explanation why this step has been performed:

kind: processing
type: SingleProcessingStep
comment: >
  Lorem ipsum dolor sit amet,
  consectetur adipiscing elit.

Note that using the > sign will replace newline characters with spaces. If you want to preserve the newline characters, use | instead.
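The difference between the two YAML block styles can be made concrete by looking at the strings a parser would return for the comment above. The snippet emulates this in plain Python for these two simple lines with default trailing-newline handling; no YAML library is involved, so it is an illustration rather than a parser.

```python
lines = ["Lorem ipsum dolor sit amet,", "consectetur adipiscing elit."]

# ">" (folded style): line breaks between the lines become spaces
folded = " ".join(lines) + "\n"

# "|" (literal style): line breaks are preserved
literal = "\n".join(lines) + "\n"

print(repr(folded))
print(repr(literal))
```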

result

Label for the results of a processing step.

Processing steps always operate on datasets. However, sometimes it is useful to have a processing task return a copy of the processed dataset, in order to compare different processings afterwards. Therefore, you can specify a result label. In this case, the dataset will be copied first, the processing step performed on it, and afterwards the result returned as a new dataset that is accessible throughout the rest of the recipe with the label provided.

In case you perform the processing on several datasets, you may want to provide as many result labels as there are datasets. Otherwise, no result will be assigned.

Type

str

comment

Textual comment regarding the processing step

Type

str

Changed in version 0.3: New attribute comment

to_dict()

Create dictionary containing public attributes of an object.

Furthermore, replace certain objects with their respective labels provided in the recipe. These objects currently include datasets, results, figures (i.e. figure records), and plotters.

Returns

public_attributes – (List of) ordered dictionary/ies containing the public attributes of the object

In case of multiple datasets and added parameters during execution of the task, a list of dicts (one dict for each dataset).

The order of attribute definition is preserved

Return type

collections.OrderedDict | list

Changed in version 0.4: Return list of dicts in case of multiple datasets and added parameters during execution of the task

from_dict(dict_=None)

Set attributes from dictionary.

Parameters

dict_ (dict) – Dictionary containing information of a task.

Raises

aspecd.tasks.MissingDictError – Raised if no dict is provided.

get_object()

Return object for a particular task including all attributes.

Returns

obj – Object of a class defined in the type attribute of a task

Return type

object

perform()

Call the appropriate method of the underlying object.

The actual implementation is contained in the non-public method aspecd.tasks.Task._perform().

Different underlying objects have different methods used to actually perform the respective task. In order to generically perform a task, classes derived from the aspecd.tasks.Task base class need to override aspecd.tasks.Task._perform() accordingly.

Use aspecd.tasks.Task.get_object() to get an instance of the actual object necessary to perform the task, and afterwards call its appropriate method.

Similarly, to get the actual dataset using the dataset id stored in aspecd.tasks.Task.apply_to, use the method aspecd.tasks.Recipe.get_dataset() of the recipe stored in aspecd.tasks.Task.recipe.

Raises

aspecd.tasks.MissingRecipeError – Raised if no recipe is available.

to_yaml()

Create YAML representation of task.

As users will use tasks primarily within recipes, i.e. in YAML files, this method conveniently returns the YAML representation of a task.

Returns

yaml – YAML representation of a task

Return type

str

Examples

To get the YAML representation of a concrete task, you need to create a task object of the appropriate class and assign a type to it:

task = aspecd.tasks.ProcessingTask()
task.type = 'Normalisation'
print(task.to_yaml())

The result of the last line (the print statement from above) will look as follows:

kind: processing
type: Normalisation
properties:
  parameters:
    kind: maximum
    range: null
    range_unit: index
    noise_range: null
    noise_range_unit: percentage
apply_to: []
result: ''
comment: ''

As you can see, you will get the full set of settings for this particular task. Not all of them are strictly necessary, and for details, you still need to look into the respective documentation. But this can make it much easier to add tasks to a recipe.

Similarly, you can get the YAML representation of an empty recipe to which you can add your tasks. For details, see the documentation of the Recipe.to_yaml() method.

New in version 0.5.

class aspecd.tasks.SingleprocessingTask(recipe=None)

Bases: aspecd.tasks.ProcessingTask

Singleprocessing step defined as task in recipe-driven data analysis.

This is a convenience alias class for ProcessingTask. Therefore, the following two tasks are identical:

- kind: processing
  type: SingleProcessingStep

- kind: singleprocessing
  type: SingleProcessingStep

from_dict(dict_=None)

Set attributes from dictionary.

Parameters

dict_ (dict) – Dictionary containing information of a task.

Raises

aspecd.tasks.MissingDictError – Raised if no dict is provided.

get_object()

Return object for a particular task including all attributes.

Returns

obj – Object of a class defined in the type attribute of a task

Return type

object

perform()

Call the appropriate method of the underlying object.

The actual implementation is contained in the non-public method aspecd.tasks.Task._perform().

Different underlying objects have different methods used to actually perform the respective task. In order to generically perform a task, classes derived from the aspecd.tasks.Task base class need to override aspecd.tasks.Task._perform() accordingly.

Use aspecd.tasks.Task.get_object() to get an instance of the actual object necessary to perform the task, and afterwards call its appropriate method.

Similarly, to get the actual dataset using the dataset id stored in aspecd.tasks.Task.apply_to, use the method aspecd.tasks.Recipe.get_dataset() of the recipe stored in aspecd.tasks.Task.recipe.

Raises

aspecd.tasks.MissingRecipeError – Raised if no recipe is available.

to_dict()

Create dictionary containing public attributes of an object.

Furthermore, replace certain objects with their respective labels provided in the recipe. These objects currently include datasets, results, figures (i.e. figure records), and plotters.

Returns

public_attributes – (List of) ordered dictionary/ies containing the public attributes of the object

In case of multiple datasets and added parameters during execution of the task, a list of dicts (one dict for each dataset).

The order of attribute definition is preserved

Return type

collections.OrderedDict | list

Changed in version 0.4: Return list of dicts in case of multiple datasets and added parameters during execution of the task

to_yaml()

Create YAML representation of task.

As users will use tasks primarily within recipes, i.e. in YAML files, this method conveniently returns the YAML representation of a task.

Returns

yaml – YAML representation of a task

Return type

str

Examples

To get the YAML representation of a concrete task, you need to create a task object of the appropriate class and assign a type to it:

task = aspecd.tasks.ProcessingTask()
task.type = 'Normalisation'
print(task.to_yaml())

The result of the last line (the print statement from above) will look as follows:

kind: processing
type: Normalisation
properties:
  parameters:
    kind: maximum
    range: null
    range_unit: index
    noise_range: null
    noise_range_unit: percentage
apply_to: []
result: ''
comment: ''

As you can see, you will get the full set of settings for this particular task. Not all of them are strictly necessary, and for details, you still need to look into the respective documentation. But this can make it much easier to add tasks to a recipe.

Similarly, you can get the YAML representation of an empty recipe to which you can add your tasks. For details, see the documentation of the Recipe.to_yaml() method.

New in version 0.5.

class aspecd.tasks.MultiprocessingTask(recipe=None)

Bases: aspecd.tasks.Task

Multiprocessing step defined as task in recipe-driven data analysis.

Processing steps will always be performed individually for each dataset. Nevertheless, in this particular case, the processing depends on the list of datasets provided in the apply_to field.

For more information on the underlying general class, see aspecd.processing.MultiProcessingStep.

For an example of how such a processing task may be included in a recipe, see the YAML listing below:

kind: multiprocessing
type: MultiProcessingStep
properties:
  parameters:
    param1: bar
    param2: foo
apply_to:
  - loi:xxx
  - loi:yyy

Note that you can refer to datasets and results created during cooking of a recipe using their respective labels. Those labels will automatically be replaced by the actual dataset/result prior to performing the task.

Sometimes it is handy to compare different processing steps on the same original dataset, e.g. a series of different parameters. In this case, you want to work on copies of the original datasets and have the results stored in addition. To do so, provide result labels:

kind: multiprocessing
type: MultiProcessingStep
apply_to:
  - loi:xxx
  - loi:yyy
result:
  - label1
  - label2

result

Labels for the results of a processing step.

Processing steps always operate on datasets. However, sometimes it is useful to have a processing task return a copy of the processed dataset, in order to compare different processings afterwards. Therefore, you can specify a result label. In this case, the dataset will be copied first, the processing step performed on it, and afterwards the result returned as a new dataset that is accessible throughout the rest of the recipe with the label provided.

In case you perform the processing on several datasets, you may want to provide as many result labels as there are datasets. Otherwise, no result will be assigned.

Type

list

from_dict(dict_=None)

Set attributes from dictionary.

Parameters

dict_ (dict) – Dictionary containing information of a task.

Raises

aspecd.tasks.MissingDictError – Raised if no dict is provided.

get_object()

Return object for a particular task including all attributes.

Returns

obj – Object of a class defined in the type attribute of a task

Return type

object

perform()

Call the appropriate method of the underlying object.

The actual implementation is contained in the non-public method aspecd.tasks.Task._perform().

Different underlying objects have different methods used to actually perform the respective task. In order to generically perform a task, classes derived from the aspecd.tasks.Task base class need to override aspecd.tasks.Task._perform() accordingly.

Use aspecd.tasks.Task.get_object() to get an instance of the actual object necessary to perform the task, and afterwards call its appropriate method.

Similarly, to get the actual dataset using the dataset id stored in aspecd.tasks.Task.apply_to, use the method aspecd.tasks.Recipe.get_dataset() of the recipe stored in aspecd.tasks.Task.recipe.

Raises

aspecd.tasks.MissingRecipeError – Raised if no recipe is available.

to_dict()

Create dictionary containing public attributes of an object.

Furthermore, replace certain objects with their respective labels provided in the recipe. These objects currently include datasets, results, figures (i.e. figure records), and plotters.

Returns

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type

collections.OrderedDict

Changed in version 0.4: (Implicit) parameters of underlying task object are added

to_yaml()

Create YAML representation of task.

As users will use tasks primarily within recipes, i.e. in YAML files, this method conveniently returns the YAML representation of a task.

Returns

yaml – YAML representation of a task

Return type

str

Examples

To get the YAML representation of a concrete task, you need to create a task object of the appropriate class and assign a type to it:

task = aspecd.tasks.ProcessingTask()
task.type = 'Normalisation'
print(task.to_yaml())

The result of the last line (the print statement from above) will look as follows:

kind: processing
type: Normalisation
properties:
  parameters:
    kind: maximum
    range: null
    range_unit: index
    noise_range: null
    noise_range_unit: percentage
apply_to: []
result: ''
comment: ''

As you can see, you will get the full set of settings for this particular task. Not all of them are strictly necessary, and for details, you still need to look into the respective documentation. But this can make it much easier to add tasks to a recipe.

Similarly, you can get the YAML representation of an empty recipe to which you can add your tasks. For details, see the documentation of the Recipe.to_yaml() method.

New in version 0.5.

class aspecd.tasks.AnalysisTask(recipe=None)

Bases: aspecd.tasks.Task

Analysis step defined as task in recipe-driven data analysis.

Analysis steps can be performed individually for each dataset or the results combined, depending on the type of analysis step.

Important

An AnalysisTask should not be used directly. Use one of the two classes derived from this class instead, namely aspecd.tasks.SingleanalysisTask and aspecd.tasks.MultianalysisTask.

For more information on the underlying general class, see aspecd.analysis.AnalysisStep.

result

Label for the result of an analysis step.

The result of an analysis step can be everything from a scalar to an entire (new) dataset.

This label will be used to refer to the result later on when further processing the recipe.

Type

str

comment

Textual comment regarding the analysis step

Type

str

Changed in version 0.3: New attribute comment

Changed in version 0.4: Raises warning if perform() is called

from_dict(dict_=None)

Set attributes from dictionary.

Parameters

dict_ (dict) – Dictionary containing information of a task.

Raises

aspecd.tasks.MissingDictError – Raised if no dict is provided.

get_object()

Return object for a particular task including all attributes.

Returns

obj – Object of a class defined in the type attribute of a task

Return type

object

perform()

Call the appropriate method of the underlying object.

The actual implementation is contained in the non-public method aspecd.tasks.Task._perform().

Different underlying objects have different methods used to actually perform the respective task. In order to generically perform a task, classes derived from the aspecd.tasks.Task base class need to override aspecd.tasks.Task._perform() accordingly.

Use aspecd.tasks.Task.get_object() to get an instance of the actual object necessary to perform the task, and afterwards call its appropriate method.

Similarly, to get the actual dataset using the dataset id stored in aspecd.tasks.Task.apply_to, use the method aspecd.tasks.Recipe.get_dataset() of the recipe stored in aspecd.tasks.Task.recipe.

Raises

aspecd.tasks.MissingRecipeError – Raised if no recipe is available.

to_dict()

Create dictionary containing public attributes of an object.

Furthermore, replace certain objects with their respective labels provided in the recipe. These objects currently include datasets, results, figures (i.e. figure records), and plotters.

Returns

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type

collections.OrderedDict

Changed in version 0.4: (Implicit) parameters of underlying task object are added

to_yaml()

Create YAML representation of task.

As users will use tasks primarily within recipes, i.e. in YAML files, this method conveniently returns the YAML representation of a task.

Returns

yaml – YAML representation of a task

Return type

str

Examples

To get the YAML representation of a concrete task, you need to create a task object of the appropriate class and assign a type to it:

task = aspecd.tasks.ProcessingTask()
task.type = 'Normalisation'
print(task.to_yaml())

The result of the last line (the print statement from above) will look as follows:

kind: processing
type: Normalisation
properties:
  parameters:
    kind: maximum
    range: null
    range_unit: index
    noise_range: null
    noise_range_unit: percentage
apply_to: []
result: ''
comment: ''

As you can see, you will get the full set of settings for this particular task. Not all of them are strictly necessary, and for details, you still need to look into the respective documentation. But this can make it much easier to add tasks to a recipe.

Similarly, you can get the YAML representation of an empty recipe to which you can add your tasks. For details, see the documentation of the Recipe.to_yaml() method.

New in version 0.5.

class aspecd.tasks.SingleanalysisTask(recipe=None)

Bases: aspecd.tasks.AnalysisTask

Analysis step defined as task in recipe-driven data analysis.

Singleanalysis steps can only be performed individually for each dataset. For analyses combining multiple datasets, see aspecd.tasks.MultianalysisTask.

For more information on the underlying general class, see aspecd.analysis.SingleAnalysisStep.

For an example of how such an analysis task may be included in a recipe, see the YAML listing below:

kind: singleanalysis
type: SingleAnalysisStep
properties:
  parameters:
    param1: bar
    param2: foo
  comment: >
    Some free text describing the analysis step in more detail
apply_to:
  - loi:xxx
result: label

Note that you can refer to datasets and results created during cooking of a recipe using their respective labels. Those labels will automatically be replaced by the actual dataset/result prior to performing the task.

And if you now want to do that for multiple datasets, you can do that as well. However, make sure to provide as many result labels as you have datasets to perform the analysis step on, as otherwise no result will be stored:

kind: singleanalysis
type: SingleAnalysisStep
apply_to:
  - loi:xxx
  - loi:yyy
result:
  - label1
  - label2

In case you perform the analysis on several datasets, you may want to provide as many result labels as there are datasets. Otherwise, no result will be assigned.

Another thing that can be very useful for data analysis is to add a comment to an individual step, e.g. with an explanation why this step has been performed:

kind: singleanalysis
type: SingleAnalysisStep
comment: >
  Lorem ipsum dolor sit amet,
  consectetur adipiscing elit.

Note that using the > sign will replace newline characters with spaces. If you want to preserve the newline characters, use | instead.

from_dict(dict_=None)

Set attributes from dictionary.

Parameters

dict_ (dict) – Dictionary containing information of a task.

Raises

aspecd.tasks.MissingDictError – Raised if no dict is provided.

get_object()

Return object for a particular task including all attributes.

Returns

obj – Object of a class defined in the type attribute of a task

Return type

object

perform()

Call the appropriate method of the underlying object.

The actual implementation is contained in the non-public method aspecd.tasks.Task._perform().

Different underlying objects have different methods used to actually perform the respective task. In order to generically perform a task, classes derived from the aspecd.tasks.Task base class need to override aspecd.tasks.Task._perform() accordingly.

Use aspecd.tasks.Task.get_object() to get an instance of the actual object necessary to perform the task, and afterwards call its appropriate method.

Similarly, to get the actual dataset using the dataset id stored in aspecd.tasks.Task.apply_to, use the method aspecd.tasks.Recipe.get_dataset() of the recipe stored in aspecd.tasks.Task.recipe.

Raises

aspecd.tasks.MissingRecipeError – Raised if no recipe is available.
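The division of labour between perform(), _perform() and get_object() can be sketched as follows. All classes here (`MiniRecipe`, `MiniTask`, `Doubler`) are hypothetical stand-ins; only the overall pattern follows the description above.

```python
class MissingRecipeError(Exception):
    """Local stand-in for aspecd.tasks.MissingRecipeError."""


class Doubler:
    """Hypothetical underlying object with its own method name."""

    def process(self, values):
        return [2 * value for value in values]


class MiniRecipe:
    def __init__(self, datasets):
        self.datasets = datasets

    def get_dataset(self, identifier):
        return self.datasets[identifier]


class MiniTask:
    def __init__(self, recipe=None):
        self.recipe = recipe
        self.apply_to = []
        self.results = {}

    def perform(self):
        # Public entry point: check prerequisites, then delegate
        if self.recipe is None:
            raise MissingRecipeError("No recipe available")
        self._perform()

    def get_object(self):
        return Doubler()

    def _perform(self):
        # Overridden per kind of task: call the method the object provides
        for dataset_id in self.apply_to:
            dataset = self.recipe.get_dataset(dataset_id)
            self.results[dataset_id] = self.get_object().process(dataset)


task = MiniTask(recipe=MiniRecipe({"loi:xxx": [1, 2, 3]}))
task.apply_to = ["loi:xxx"]
task.perform()
print(task.results)
```

Keeping the checks in perform() and the object-specific call in _perform() means that derived classes only need to override the latter.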

to_dict()

Create dictionary containing public attributes of an object.

Furthermore, replace certain objects with their respective labels provided in the recipe. These objects currently include datasets, results, figures (i.e. figure records), and plotters.

Returns

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type

collections.OrderedDict

Changed in version 0.4: (Implicit) parameters of underlying task object are added
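A minimal sketch of collecting public attributes in definition order is given below. It leaves out the replacement of objects by their labels, and `MiniTask` is a hypothetical class used only for this illustration.

```python
import collections


class MiniTask:
    def __init__(self):
        self.kind = "processing"
        self.type = "Normalisation"
        self.apply_to = []
        self._internal = "not exported"

    def to_dict(self):
        public = collections.OrderedDict()
        # vars() yields attributes in the order they were defined
        for name, value in vars(self).items():
            if not name.startswith("_"):
                public[name] = value
        return public


print(list(MiniTask().to_dict()))
```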

to_yaml()

Create YAML representation of task.

As users will use tasks primarily within recipes, i.e. in YAML files, this method conveniently returns the YAML representation of a task.

Returns

yaml – YAML representation of a task

Return type

str

Examples

To get the YAML representation of a concrete task, you need to create a task object of the appropriate class and assign a type to it:

import aspecd.tasks

task = aspecd.tasks.ProcessingTask()
task.type = 'Normalisation'
print(task.to_yaml())

The result of the last line (the print statement from above) will look as follows:

kind: processing
type: Normalisation
properties:
  parameters:
    kind: maximum
    range: null
    range_unit: index
    noise_range: null
    noise_range_unit: percentage
apply_to: []
result: ''
comment: ''

As you can see, you will get the full set of settings for this particular task. Not all of them are strictly necessary, and for details, you still need to look into the respective documentation. But this can make it much easier to add tasks to a recipe.

Similarly, you can get the YAML representation of an empty recipe to which you can then add your tasks. For details, see the documentation of the Recipe.to_yaml() method.

New in version 0.5.

class aspecd.tasks.MultianalysisTask(recipe=None)

Bases: aspecd.tasks.AnalysisTask

Analysis step defined as task in recipe-driven data analysis.

Multianalysis steps are performed on a list of datasets and combine them in one single analysis. For analyses performed on individual datasets, see aspecd.tasks.SingleanalysisTask.

For more information on the underlying general class, see aspecd.analysis.MultiAnalysisStep.

For an example of how such an analysis task may be included into a recipe, see the YAML listing below:

kind: multianalysis
type: MultiAnalysisStep
properties:
  parameters:
    param1: bar
    param2: foo
  comment: >
    Some free text describing the analysis step in more detail
apply_to:
  - loi:xxx
result:
  - label1
  - label2

Note that you can refer to datasets and results created during cooking of a recipe using their respective labels. Those labels will automatically be replaced by the actual dataset/result prior to performing the task.

In case such a multianalysis step results in a list of resulting datasets, result should be a list of labels, not a single label.

Raises

IndexError – Raised if list of result labels and results are not of same length
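The length check behind this exception might look roughly like the following sketch. The function name `label_results` is hypothetical and the error message is made up for the example.

```python
def label_results(result_labels, results):
    """Pair each result with its label, insisting on equal lengths."""
    if len(result_labels) != len(results):
        raise IndexError("Number of result labels and results differ")
    return dict(zip(result_labels, results))


labelled = label_results(["label1", "label2"], ["first", "second"])
print(labelled)
```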

from_dict(dict_=None)

Set attributes from dictionary.

Parameters

dict_ (dict) – Dictionary containing information of a task.

Raises

aspecd.tasks.MissingDictError – Raised if no dict is provided.

get_object()

Return object for a particular task including all attributes.

Returns

obj – Object of a class defined in the type attribute of a task

Return type

object

perform()

Call the appropriate method of the underlying object.

The actual implementation is contained in the non-public method aspecd.tasks.Task._perform().

Different underlying objects have different methods used to actually perform the respective task. In order to generically perform a task, classes derived from the aspecd.tasks.Task base class need to override aspecd.tasks.Task._perform() accordingly.

Use aspecd.tasks.Task.get_object() to get an instance of the actual object necessary to perform the task, and afterwards call its appropriate method.

Similarly, to get the actual dataset using the dataset id stored in aspecd.tasks.Task.apply_to, use the method aspecd.tasks.Recipe.get_dataset() of the recipe stored in aspecd.tasks.Task.recipe.

Raises

aspecd.tasks.MissingRecipeError – Raised if no recipe is available.

to_dict()

Create dictionary containing public attributes of an object.

Furthermore, replace certain objects with their respective labels provided in the recipe. These objects currently include datasets, results, figures (i.e. figure records), and plotters.

Returns

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type

collections.OrderedDict

Changed in version 0.4: (Implicit) parameters of underlying task object are added

to_yaml()

Create YAML representation of task.

As users will use tasks primarily within recipes, i.e. in YAML files, this method conveniently returns the YAML representation of a task.

Returns

yaml – YAML representation of a task

Return type

str

Examples

To get the YAML representation of a concrete task, you need to create a task object of the appropriate class and assign a type to it:

import aspecd.tasks

task = aspecd.tasks.ProcessingTask()
task.type = 'Normalisation'
print(task.to_yaml())

The result of the last line (the print statement from above) will look as follows:

kind: processing
type: Normalisation
properties:
  parameters:
    kind: maximum
    range: null
    range_unit: index
    noise_range: null
    noise_range_unit: percentage
apply_to: []
result: ''
comment: ''

As you can see, you will get the full set of settings for this particular task. Not all of them are strictly necessary, and for details, you still need to look into the respective documentation. But this can make it much easier to add tasks to a recipe.

Similarly, you can get the YAML representation of an empty recipe to which you can then add your tasks. For details, see the documentation of the Recipe.to_yaml() method.

New in version 0.5.

class aspecd.tasks.AggregatedanalysisTask(recipe=None)

Bases: aspecd.tasks.AnalysisTask

Analysis step defined as task in recipe-driven data analysis.

AggregatedAnalysis steps perform a given aspecd.tasks.SingleanalysisTask on a list of datasets and combine the results in an aspecd.dataset.CalculatedDataset.

For more information on the underlying general class, see aspecd.analysis.AggregatedAnalysisStep.

For an example of how such an analysis task may be included into a recipe, see the YAML listing below:

- kind: aggregatedanalysis
  type: BasicCharacteristics
  properties:
    parameters:
      kind: min
  apply_to:
    - dataset1
    - dataset2
  result: basic_characteristics

New in version 0.5.
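Conceptually, an aggregated analysis applies one single-dataset analysis to each dataset in turn and collects the individual results. The sketch below uses plain lists instead of datasets and a plain list instead of a CalculatedDataset; the function `aggregate` is an assumption made for this example.

```python
def aggregate(datasets, analysis):
    """Apply one single-dataset analysis to each dataset, collect results."""
    return [analysis(data) for data in datasets.values()]


datasets = {"dataset1": [3.0, 1.5, 2.0], "dataset2": [0.5, 4.0]}
# Corresponds loosely to "kind: min" in the recipe above
basic_characteristics = aggregate(datasets, min)
print(basic_characteristics)
```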

from_dict(dict_=None)

Set attributes from dictionary.

Parameters

dict_ (dict) – Dictionary containing information of a task.

Raises

aspecd.tasks.MissingDictError – Raised if no dict is provided.

get_object()

Return object for a particular task including all attributes.

Returns

obj – Object of a class defined in the type attribute of a task

Return type

object

perform()

Call the appropriate method of the underlying object.

The actual implementation is contained in the non-public method aspecd.tasks.Task._perform().

Different underlying objects have different methods used to actually perform the respective task. In order to generically perform a task, classes derived from the aspecd.tasks.Task base class need to override aspecd.tasks.Task._perform() accordingly.

Use aspecd.tasks.Task.get_object() to get an instance of the actual object necessary to perform the task, and afterwards call its appropriate method.

Similarly, to get the actual dataset using the dataset id stored in aspecd.tasks.Task.apply_to, use the method aspecd.tasks.Recipe.get_dataset() of the recipe stored in aspecd.tasks.Task.recipe.

Raises

aspecd.tasks.MissingRecipeError – Raised if no recipe is available.

to_dict()

Create dictionary containing public attributes of an object.

Furthermore, replace certain objects with their respective labels provided in the recipe. These objects currently include datasets, results, figures (i.e. figure records), and plotters.

Returns

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type

collections.OrderedDict

Changed in version 0.4: (Implicit) parameters of underlying task object are added

to_yaml()

Create YAML representation of task.

As users will use tasks primarily within recipes, i.e. in YAML files, this method conveniently returns the YAML representation of a task.

Returns

yaml – YAML representation of a task

Return type

str

Examples

To get the YAML representation of a concrete task, you need to create a task object of the appropriate class and assign a type to it:

import aspecd.tasks

task = aspecd.tasks.ProcessingTask()
task.type = 'Normalisation'
print(task.to_yaml())

The result of the last line (the print statement from above) will look as follows:

kind: processing
type: Normalisation
properties:
  parameters:
    kind: maximum
    range: null
    range_unit: index
    noise_range: null
    noise_range_unit: percentage
apply_to: []
result: ''
comment: ''

As you can see, you will get the full set of settings for this particular task. Not all of them are strictly necessary, and for details, you still need to look into the respective documentation. But this can make it much easier to add tasks to a recipe.

Similarly, you can get the YAML representation of an empty recipe to which you can then add your tasks. For details, see the documentation of the Recipe.to_yaml() method.

New in version 0.5.

class aspecd.tasks.AnnotationTask(recipe=None)

Bases: aspecd.tasks.Task

Annotation step defined as task in recipe-driven data analysis.

Annotation steps will always be performed individually for each dataset.

For more information on the underlying general class, see aspecd.annotation.Annotation.

from_dict(dict_=None)

Set attributes from dictionary.

Parameters

dict_ (dict) – Dictionary containing information of a task.

Raises

aspecd.tasks.MissingDictError – Raised if no dict is provided.

get_object()

Return object for a particular task including all attributes.

Returns

obj – Object of a class defined in the type attribute of a task

Return type

object

perform()

Call the appropriate method of the underlying object.

The actual implementation is contained in the non-public method aspecd.tasks.Task._perform().

Different underlying objects have different methods used to actually perform the respective task. In order to generically perform a task, classes derived from the aspecd.tasks.Task base class need to override aspecd.tasks.Task._perform() accordingly.

Use aspecd.tasks.Task.get_object() to get an instance of the actual object necessary to perform the task, and afterwards call its appropriate method.

Similarly, to get the actual dataset using the dataset id stored in aspecd.tasks.Task.apply_to, use the method aspecd.tasks.Recipe.get_dataset() of the recipe stored in aspecd.tasks.Task.recipe.

Raises

aspecd.tasks.MissingRecipeError – Raised if no recipe is available.

to_dict()

Create dictionary containing public attributes of an object.

Furthermore, replace certain objects with their respective labels provided in the recipe. These objects currently include datasets, results, figures (i.e. figure records), and plotters.

Returns

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type

collections.OrderedDict

Changed in version 0.4: (Implicit) parameters of underlying task object are added

to_yaml()

Create YAML representation of task.

As users will use tasks primarily within recipes, i.e. in YAML files, this method conveniently returns the YAML representation of a task.

Returns

yaml – YAML representation of a task

Return type

str

Examples

To get the YAML representation of a concrete task, you need to create a task object of the appropriate class and assign a type to it:

import aspecd.tasks

task = aspecd.tasks.ProcessingTask()
task.type = 'Normalisation'
print(task.to_yaml())

The result of the last line (the print statement from above) will look as follows:

kind: processing
type: Normalisation
properties:
  parameters:
    kind: maximum
    range: null
    range_unit: index
    noise_range: null
    noise_range_unit: percentage
apply_to: []
result: ''
comment: ''

As you can see, you will get the full set of settings for this particular task. Not all of them are strictly necessary, and for details, you still need to look into the respective documentation. But this can make it much easier to add tasks to a recipe.

Similarly, you can get the YAML representation of an empty recipe to which you can then add your tasks. For details, see the documentation of the Recipe.to_yaml() method.

New in version 0.5.

class aspecd.tasks.PlotTask

Bases: aspecd.tasks.Task

Plot step defined as task in recipe-driven data analysis.

Important

A PlotTask should not be used directly. Use the two classes derived from this class instead, namely aspecd.tasks.SingleplotTask and aspecd.tasks.MultiplotTask.

For more information on the underlying general class, see aspecd.plotting.Plotter.

label

Label for the figure resulting from a plotting step.

This label will be used to refer to the plot later on when further processing the recipe. In the recipe’s aspecd.tasks.Recipe.figures dict, this label is used as a key, and the corresponding value is an aspecd.tasks.FigureRecord object containing all information necessary for further handling the results of the plot.

Type

str

result

Label for the plotter of a plotting step.

This is useful in case of CompositePlotters, where different plotters need to be defined for each of the panels.

Type

str

target

Label of an existing previous plotter the plot should be added to.

Sometimes it is desirable to add something to an already existing plot after this original plot has been created. Programmatically, this would be equivalent to setting the aspecd.plotting.Plotter.figure and aspecd.plotting.Plotter.axes attributes of the underlying plotter object.

The result: Your plot will be a new figure window, but with the original plot contained and the new plot added on top of it.

Type

str

Changed in version 0.4: Added attribute target

to_dict()

Create dictionary containing public attributes of an object.

Furthermore, replace certain objects with their respective labels provided in the recipe. These objects currently include datasets, results, figures (i.e. figure records), and plotters.

Returns

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type

collections.OrderedDict

Changed in version 0.5: Properties of underlying task object are added

perform()

Call the appropriate method of the underlying object.

For details, see the method aspecd.tasks.Task.perform() of the base class.

In addition to what is done in the base class, a PlotTask adds an aspecd.tasks.FigureRecord object to the aspecd.tasks.Recipe.figures property of the underlying recipe if an aspecd.tasks.PlotTask.label has been set.
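The bookkeeping of figure records can be pictured with the sketch below. `MiniFigureRecord` is a hypothetical, much reduced stand-in for aspecd.tasks.FigureRecord, and `register_figure` is a helper invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class MiniFigureRecord:
    """Hypothetical, reduced stand-in for aspecd.tasks.FigureRecord."""

    filename: str = ""
    caption: dict = field(default_factory=dict)


def register_figure(figures, label, filename):
    # Only plots carrying a label can be referred to later on
    if label:
        figures[label] = MiniFigureRecord(filename=filename)
    return figures


figures = {}
register_figure(figures, "overview", "fancyfigure.pdf")
register_figure(figures, "", "unlabelled.pdf")
print(list(figures))
```

A report task can then look up the figure by its label, while the unlabelled plot leaves no trace in the dict.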

save_plot(plot=None)

Save the figure of the plot created by the task.

Parameters

plot (aspecd.plotting.Plotter) – Plot whose figure should be saved

from_dict(dict_=None)

Set attributes from dictionary.

Parameters

dict_ (dict) – Dictionary containing information of a task.

Raises

aspecd.tasks.MissingDictError – Raised if no dict is provided.

get_object()

Return object for a particular task including all attributes.

Returns

obj – Object of a class defined in the type attribute of a task

Return type

object

to_yaml()

Create YAML representation of task.

As users will use tasks primarily within recipes, i.e. in YAML files, this method conveniently returns the YAML representation of a task.

Returns

yaml – YAML representation of a task

Return type

str

Examples

To get the YAML representation of a concrete task, you need to create a task object of the appropriate class and assign a type to it:

import aspecd.tasks

task = aspecd.tasks.ProcessingTask()
task.type = 'Normalisation'
print(task.to_yaml())

The result of the last line (the print statement from above) will look as follows:

kind: processing
type: Normalisation
properties:
  parameters:
    kind: maximum
    range: null
    range_unit: index
    noise_range: null
    noise_range_unit: percentage
apply_to: []
result: ''
comment: ''

As you can see, you will get the full set of settings for this particular task. Not all of them are strictly necessary, and for details, you still need to look into the respective documentation. But this can make it much easier to add tasks to a recipe.

Similarly, you can get the YAML representation of an empty recipe to which you can then add your tasks. For details, see the documentation of the Recipe.to_yaml() method.

New in version 0.5.

class aspecd.tasks.SingleplotTask

Bases: aspecd.tasks.PlotTask

Singleplot step defined as task in recipe-driven data analysis.

Singleplot steps can only be performed individually for each dataset. For plots combining multiple datasets, see aspecd.tasks.MultiplotTask.

For more information on the underlying general class, see aspecd.plotting.SinglePlotter.

For an example of how such a singleplot task may be included into a recipe, see the YAML listing below:

kind: singleplot
type: SinglePlotter
properties:
  properties:
    figure:
      title: My fancy figure title
    drawing:
      color: darkorange
      label: my data
      linewidth: 4
      linestyle: dashed
    legend:
      location: northeast
  parameters:
    show_legend: True
  caption:
    title: >
      Ideally a single sentence summarising the intent of the figure
    text: >
      More text for the figure caption
    parameters:
      - a list of parameters
      - that shall (additionally) be listed
      - in the figure caption
  filename: fancyfigure.pdf
apply_to:
  - loi:xxx
label: label

Note that you can refer to datasets and results created during cooking of a recipe using their respective labels. Those labels will automatically be replaced by the actual dataset/result prior to performing the task.

Note

As soon as you provide a filename in the properties of your recipe, the resulting plot will automatically be saved to that filename, inferring the file format from the extension of the filename. For details of how the format is inferred see the documentation for the matplotlib.figure.Figure.savefig() method.

In case you apply the single plotter to more than one dataset and would like to save individual plots, you can do that by supplying a list of filenames instead of only a single filename. In this case, the plots get saved to the filenames in the list. A minimal example may look like this:

kind: singleplot
type: SinglePlotter
properties:
  filename:
    - fancyfigure1.pdf
    - fancyfigure2.pdf
apply_to:
  - loi:xxx
  - loi:yyy

Important

Make sure to provide the same number of file names in your recipe as the number of datasets you apply the plotter to. Otherwise you may run into trouble.

Note

If the recipe contains the output key in its directories dict, the figure(s) will be saved to this directory.

As long as autosave_plots in the recipe is set to True, the plots will be saved automatically, even if no filename is provided. These automatically generated filenames consist of the last part of the dataset source (excluding a potential file extension) and the name of the plotter used. To prevent the plotters in a recipe from automatically saving the plots, include the autosave_plots directive in the settings dict of your recipe and set it to False.
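How such an automatically generated filename could be put together is sketched below. The underscore as separator and the PDF extension are assumptions made for this example; the actual names produced by ASpecD may differ.

```python
import os


def default_plot_filename(source, plotter_name, extension=".pdf"):
    # Last part of the dataset source, without a potential file extension
    basename = os.path.splitext(os.path.basename(source))[0]
    # Assumption: underscore as separator, PDF as default format
    return f"{basename}_{plotter_name}{extension}"


filename = default_plot_filename(
    "/data/measurements/sample42.csv", "SinglePlotter1D"
)
print(filename)
```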

from_dict(dict_=None)

Set attributes from dictionary.

Parameters

dict_ (dict) – Dictionary containing information of a task.

Raises

aspecd.tasks.MissingDictError – Raised if no dict is provided.

get_object()

Return object for a particular task including all attributes.

Returns

obj – Object of a class defined in the type attribute of a task

Return type

object

perform()

Call the appropriate method of the underlying object.

For details, see the method aspecd.tasks.Task.perform() of the base class.

In addition to what is done in the base class, a PlotTask adds an aspecd.tasks.FigureRecord object to the aspecd.tasks.Recipe.figures property of the underlying recipe if an aspecd.tasks.PlotTask.label has been set.

save_plot(plot=None)

Save the figure of the plot created by the task.

Parameters

plot (aspecd.plotting.Plotter) – Plot whose figure should be saved

to_dict()

Create dictionary containing public attributes of an object.

Furthermore, replace certain objects with their respective labels provided in the recipe. These objects currently include datasets, results, figures (i.e. figure records), and plotters.

Returns

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type

collections.OrderedDict

Changed in version 0.5: Properties of underlying task object are added

to_yaml()

Create YAML representation of task.

As users will use tasks primarily within recipes, i.e. in YAML files, this method conveniently returns the YAML representation of a task.

Returns

yaml – YAML representation of a task

Return type

str

Examples

To get the YAML representation of a concrete task, you need to create a task object of the appropriate class and assign a type to it:

import aspecd.tasks

task = aspecd.tasks.ProcessingTask()
task.type = 'Normalisation'
print(task.to_yaml())

The result of the last line (the print statement from above) will look as follows:

kind: processing
type: Normalisation
properties:
  parameters:
    kind: maximum
    range: null
    range_unit: index
    noise_range: null
    noise_range_unit: percentage
apply_to: []
result: ''
comment: ''

As you can see, you will get the full set of settings for this particular task. Not all of them are strictly necessary, and for details, you still need to look into the respective documentation. But this can make it much easier to add tasks to a recipe.

Similarly, you can get the YAML representation of an empty recipe to which you can then add your tasks. For details, see the documentation of the Recipe.to_yaml() method.

New in version 0.5.

class aspecd.tasks.MultiplotTask

Bases: aspecd.tasks.PlotTask

Multiplot step defined as task in recipe-driven data analysis.

Multiplot steps are performed on a list of datasets and combine them in one single plot. For plots performed on individual datasets, see aspecd.tasks.SingleplotTask.

For more information on the underlying general class, see aspecd.plotting.MultiPlotter.

For an example of how such a multiplot task may be included into a recipe, see the YAML listing below:

kind: multiplot
type: MultiPlotter
properties:
  parameters:
    axes:
      - quantity: wavelength
        unit: nm
      - quantity: intensity
        unit:
    show_legend: True
  caption:
    title: >
      Ideally a single sentence summarising the intent of the figure
    text: >
      More text for the figure caption
    parameters:
      - a list of parameters
      - that shall (additionally) be listed
      - in the figure caption
  filename: fancyfigure.pdf
apply_to:
  - loi:xxx
  - loi:yyy
label: label

Note that you can refer to datasets and results created during cooking of a recipe using their respective labels. Those labels will automatically be replaced by the actual dataset/result prior to performing the task.

A specialty of plots of multiple datasets is that the axis labels cannot necessarily be inferred from the datasets, hence you may want to set them explicitly. This is done using the axes key of the parameters property of the aspecd.plotting.MultiPlotter class, as shown in the recipe example above.
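To illustrate how the axes entries of the recipe could translate into axis labels, here is a small sketch. The label format "quantity / unit" and the helper `axis_label` are assumptions made for this example, not taken from the plotter's implementation. An empty unit in YAML arrives in Python as None.

```python
def axis_label(axis):
    """Build a label from a quantity and an optional unit."""
    quantity = axis.get("quantity") or ""
    unit = axis.get("unit") or ""
    # Assumption: quantity and unit are separated by a slash
    return f"{quantity} / {unit}" if unit else quantity


axes = [
    {"quantity": "wavelength", "unit": "nm"},
    {"quantity": "intensity", "unit": None},
]
labels = [axis_label(axis) for axis in axes]
print(labels)
```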

Note

As soon as you provide a filename in the properties of your recipe, the resulting plot will automatically be saved to that filename, inferring the file format from the extension of the filename. For details of how the format is inferred see the documentation for the matplotlib.figure.Figure.savefig() method.

Note

If the recipe contains the output key in its directories dict, the figure(s) will be saved to this directory.

As long as autosave_plots in the recipe is set to True, the plots will be saved automatically, even if no filename is provided. These automatically generated filenames consist of the last part of the dataset source (excluding a potential file extension) and the name of the plotter used. To prevent the plotters in a recipe from automatically saving the plots, include the autosave_plots directive in the settings dict of your recipe and set it to False.

from_dict(dict_=None)

Set attributes from dictionary.

Parameters

dict_ (dict) – Dictionary containing information of a task.

Raises

aspecd.tasks.MissingDictError – Raised if no dict is provided.

get_object()

Return object for a particular task including all attributes.

Returns

obj – Object of a class defined in the type attribute of a task

Return type

object

perform()

Call the appropriate method of the underlying object.

For details, see the method aspecd.tasks.Task.perform() of the base class.

In addition to what is done in the base class, a PlotTask adds an aspecd.tasks.FigureRecord object to the aspecd.tasks.Recipe.figures property of the underlying recipe if an aspecd.tasks.PlotTask.label has been set.

save_plot(plot=None)

Save the figure of the plot created by the task.

Parameters

plot (aspecd.plotting.Plotter) – Plot whose figure should be saved

to_dict()

Create dictionary containing public attributes of an object.

Furthermore, replace certain objects with their respective labels provided in the recipe. These objects currently include datasets, results, figures (i.e. figure records), and plotters.

Returns

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type

collections.OrderedDict

Changed in version 0.5: Properties of underlying task object are added

to_yaml()

Create YAML representation of task.

As users will use tasks primarily within recipes, i.e. in YAML files, this method conveniently returns the YAML representation of a task.

Returns

yaml – YAML representation of a task

Return type

str

Examples

To get the YAML representation of a concrete task, you need to create a task object of the appropriate class and assign a type to it:

import aspecd.tasks

task = aspecd.tasks.ProcessingTask()
task.type = 'Normalisation'
print(task.to_yaml())

The result of the last line (the print statement from above) will look as follows:

kind: processing
type: Normalisation
properties:
  parameters:
    kind: maximum
    range: null
    range_unit: index
    noise_range: null
    noise_range_unit: percentage
apply_to: []
result: ''
comment: ''

As you can see, you will get the full set of settings for this particular task. Not all of them are strictly necessary, and for details, you still need to look into the respective documentation. But this can make it much easier to add tasks to a recipe.

Similarly, you can get the YAML representation of an empty recipe to which you can then add your tasks. For details, see the documentation of the Recipe.to_yaml() method.

New in version 0.5.

class aspecd.tasks.CompositeplotTask

Bases: aspecd.tasks.PlotTask

Compositeplot step defined as task in recipe-driven data analysis.

Compositeplot steps are performed on a list of plots and combine them in one single figure. For more common plots employing only a single axes, see aspecd.tasks.SingleplotTask and aspecd.tasks.MultiplotTask.

For more information on the underlying general class, see aspecd.plotting.CompositePlotter.

For an example of how such a compositeplot task may be included into a recipe, see the YAML listing below:

- kind: singleplot
  type: SinglePlotter1D
  apply_to:
    - dataset1
  result: 1D_plot

- kind: singleplot
  type: SinglePlotter2D
  apply_to:
    - dataset2
  result: 2D_plot

- kind: compositeplot
  type: CompositePlotter
  properties:
    grid_dimensions: [1, 2]
    subplot_locations:
      - [0, 0, 1, 1]
      - [0, 1, 1, 1]
    plotter:
      - 1D_plot
      - 2D_plot
    filename: composed_plot.pdf

The crucial aspect here is to first define the individual plotters that get used for the respective panels of the CompositePlotter. In this particular example, two different plots on two different datasets are created and afterwards combined into the CompositePlotter. Furthermore, for a CompositePlotter you need to specify both grid dimensions and subplot locations, as they will be set to one single axis by default.
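Before cooking a recipe, it can help to check that all subplot locations actually fit into the grid. The sketch below assumes that grid_dimensions reads [rows, columns] and that each location reads [start_row, start_column, row_span, column_span]; please verify this against the documentation of aspecd.plotting.CompositePlotter. The helper `check_subplot_locations` is hypothetical.

```python
def check_subplot_locations(grid_dimensions, subplot_locations):
    """Return True if every subplot fits into the grid.

    Assumption: a location reads
    [start_row, start_column, row_span, column_span].
    """
    n_rows, n_cols = grid_dimensions
    for row, col, row_span, col_span in subplot_locations:
        if row < 0 or col < 0 or row_span < 1 or col_span < 1:
            return False
        if row + row_span > n_rows or col + col_span > n_cols:
            return False
    return True


fits = check_subplot_locations([1, 2], [[0, 0, 1, 1], [0, 1, 1, 1]])
overflows = check_subplot_locations([1, 1], [[0, 0, 1, 1], [0, 1, 1, 1]])
print(fits, overflows)
```

The second call uses the default grid of one single axis and shows why two panels do not fit unless the grid dimensions are set explicitly.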

Note

As long as autosave_plots in the recipe is set to True, the results of the individual plotters combined in the CompositePlotter will be saved to generic filenames. To prevent this from happening, include the autosave_plots directive in the settings dict of your recipe and set it to False.

to_dict()

Create dictionary containing public attributes of the object.

Returns

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type

collections.OrderedDict

from_dict(dict_=None)

Set attributes from dictionary.

Parameters

dict_ (dict) – Dictionary containing information of a task.

Raises

aspecd.tasks.MissingDictError – Raised if no dict is provided.

get_object()

Return object for a particular task including all attributes.

Returns

obj – Object of a class defined in the type attribute of a task

Return type

object

perform()

Call the appropriate method of the underlying object.

For details, see the method aspecd.tasks.Task.perform() of the base class.

In addition to what is done in the base class, a PlotTask adds an aspecd.tasks.FigureRecord object to the aspecd.tasks.Recipe.figures property of the underlying recipe if an aspecd.tasks.PlotTask.label has been set.

save_plot(plot=None)

Save the figure of the plot created by the task.

Parameters

plot (aspecd.plotting.Plotter) – Plot whose figure should be saved

to_yaml()

Create YAML representation of task.

As users will use tasks primarily within recipes, i.e. in YAML files, this method conveniently returns the YAML representation of a task.

Returns

yaml – YAML representation of a task

Return type

str

Examples

To get the YAML representation of a concrete task, you need to create a task object of the appropriate class and assign a type to it:

import aspecd.tasks

task = aspecd.tasks.ProcessingTask()
task.type = 'Normalisation'
print(task.to_yaml())

The result of the last line (the print statement from above) will look as follows:

kind: processing
type: Normalisation
properties:
  parameters:
    kind: maximum
    range: null
    range_unit: index
    noise_range: null
    noise_range_unit: percentage
apply_to: []
result: ''
comment: ''

As you can see, you will get the full set of settings for this particular task. Not all of them are strictly necessary, and for details, you still need to look into the respective documentation. But this can make it much easier to add tasks to a recipe.

Similarly, you can get the YAML representation of an empty recipe to which you can then add your tasks. For details, see the documentation of the Recipe.to_yaml() method.

New in version 0.5.

class aspecd.tasks.ReportTask

Bases: aspecd.tasks.Task

Reporting step defined as task in recipe-driven data analysis.

For more information on the underlying general class, see aspecd.report.Reporter.

For an example of how such a report task may be included in a recipe, see the YAML listing below:

kind: report
type: LaTeXReporter
properties:
  template: my-fancy-latex-template.tex
  filename: some-filename-for-final-report.tex
  context:
    general:
      title: Some fancy title
      author: John Doe
    free_text:
      intro: >
        Short introduction of the experiment performed
      metadata: >
        Tabular and customisable overview of the dataset's metadata
      history: >
        Presentation of all processing, analysis and representation
        steps
    figures:
      title: my_fancy_figure
compile: True
apply_to:
  - loi:xxx

Note that you can refer to datasets, results, and figures created during cooking of a recipe using their respective labels. Those labels will automatically be replaced by the actual dataset/result prior to performing the task.
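The label replacement described above can be pictured with a small, library-independent sketch. The helper resolve_labels and the registry mapping are hypothetical names introduced only for illustration; how ASpecD performs the replacement internally may well differ.

```python
def resolve_labels(value, registry):
    """Recursively replace strings matching a known label by the stored object."""
    if isinstance(value, dict):
        return {key: resolve_labels(item, registry) for key, item in value.items()}
    if isinstance(value, list):
        return [resolve_labels(item, registry) for item in value]
    if isinstance(value, str) and value in registry:
        return registry[value]
    # Anything that is not a known label is passed through unchanged
    return value


# Hypothetical registry of labelled figures collected while cooking a recipe
registry = {"my_fancy_figure": {"filename": "figure.pdf"}}
properties = {"context": {"figures": {"title": "my_fancy_figure"}}}

resolved = resolve_labels(properties, registry)
```

In this sketch, strings that do not match any label are left untouched, so ordinary free text in the properties is not affected.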

Whatever fields you set as property context can be accessed directly from within the template using the usual Python syntax for accessing keys of dictionaries. The fields shown here assume a certain structure of your template containing user-supplied free text for the introduction to several sections.

Additionally, the context will contain the key dataset containing the result of the aspecd.dataset.Dataset.to_dict() method, thus the full information contained in the dataset.

You can, of course, apply the report task to multiple datasets individually. In this case, you most probably would like to have your reports saved to individual files. This means that the property filename needs to become a list:

datasets:
  - foo
  - bar

tasks:
  - kind: report
    type: LaTeXReporter
    properties:
      template: my-fancy-latex-template.tex
      filename:
        - report1.tex
        - report2.tex

Important

Make sure to provide the same number of file names in your recipe as the number of datasets you apply the report to. Otherwise you may run into trouble.
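To illustrate why the numbers need to match, here is a minimal sketch of pairing datasets with filenames. The function pair_datasets_with_filenames is a hypothetical helper, not part of ASpecD, and raising a ValueError on a mismatch is an assumption made for this example only.

```python
def pair_datasets_with_filenames(dataset_ids, filenames):
    """Pair each dataset id with one filename, refusing mismatched lengths."""
    if isinstance(filenames, str):
        # A single filename is treated as a one-element list
        filenames = [filenames]
    if len(dataset_ids) != len(filenames):
        raise ValueError(
            f"{len(dataset_ids)} datasets but {len(filenames)} filenames"
        )
    return list(zip(dataset_ids, filenames))


pairs = pair_datasets_with_filenames(
    ["foo", "bar"], ["report1.tex", "report2.tex"]
)
```

A check of this kind could be run on a recipe before cooking it, to catch a missing filename early.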

Note

If the recipe contains the output_directory key on the top level, the reports will be written to this directory.

compile

Option for compiling a template.

Some types of templates need an additional “compile” step to create output, most prominently LaTeX templates. If the Reporter class does not support compiling, but compile is set to True, it gets silently ignored.

Type

bool
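The "silently ignored" behaviour can be sketched as follows. Both reporter classes and the run_report function are hypothetical stand-ins, and checking for a compile method with hasattr is only one conceivable way to implement this.

```python
class PlainReporter:
    """Hypothetical reporter without a compile step."""

    def create(self):
        return "created"


class CompilingReporter(PlainReporter):
    """Hypothetical reporter offering an additional compile step."""

    def compile(self):
        return "compiled"


def run_report(reporter, compile_=False):
    """Create the report and compile it only if the reporter supports it."""
    steps = [reporter.create()]
    if compile_ and hasattr(reporter, "compile"):
        steps.append(reporter.compile())
    return steps
```

Here, run_report(PlainReporter(), compile_=True) performs the create step only and raises no error.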

from_dict(dict_=None)

Set attributes from dictionary.

Parameters

dict_ (dict) – Dictionary containing information about a task.

Raises

aspecd.tasks.MissingDictError – Raised if no dict is provided.

get_object()

Return object for a particular task including all attributes.

Returns

obj – Object of a class defined in the type attribute of a task

Return type

object

perform()

Call the appropriate method of the underlying object.

The actual implementation is contained in the non-public method aspecd.tasks.Task._perform().

Different underlying objects have different methods used to actually perform the respective task. In order to generically perform a task, classes derived from the aspecd.tasks.Task base class need to override aspecd.tasks.Task._perform() accordingly.

Use aspecd.tasks.Task.get_object() to get an instance of the actual object necessary to perform the task, and afterwards call its appropriate method.

Similarly, to get the actual dataset using the dataset id stored in aspecd.tasks.Task.apply_to, use the method aspecd.tasks.Recipe.get_dataset() of the recipe stored in aspecd.tasks.Task.recipe.

Raises

aspecd.tasks.MissingRecipeError – Raised if no recipe is available.
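The split between the public perform() method and the non-public _perform() method can be sketched with simplified stand-in classes. None of the classes below are the actual ASpecD classes, and using a plain dict as the "recipe" is an assumption made to keep the example self-contained.

```python
class MissingRecipeError(Exception):
    """Stand-in for the exception raised if no recipe is available."""


class Task:
    """Simplified stand-in for a task base class."""

    def __init__(self, recipe=None):
        self.recipe = recipe
        self.apply_to = []

    def perform(self):
        # Public entry point: check preconditions, then delegate
        if self.recipe is None:
            raise MissingRecipeError("No recipe available")
        return self._perform()

    def _perform(self):
        # Subclasses override this hook with their actual work
        raise NotImplementedError


class UppercaseTask(Task):
    """Hypothetical subclass operating on string 'datasets'."""

    def _perform(self):
        return [self.recipe[dataset_id].upper() for dataset_id in self.apply_to]


task = UppercaseTask(recipe={"loi:xxx": "data"})
task.apply_to = ["loi:xxx"]
result = task.perform()
```

With this split, the precondition check lives in one place, and each derived class only supplies the part that differs.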

to_dict()

Create dictionary containing public attributes of an object.

Furthermore, replace certain objects with their respective labels provided in the recipe. These objects currently include datasets, results, figures (i.e. figure records), and plotters.

Returns

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type

collections.OrderedDict

Changed in version 0.4: (Implicit) parameters of underlying task object are added

to_yaml()

Create YAML representation of task.

As users will use tasks primarily within recipes, i.e. in YAML files, this method conveniently returns the YAML representation of a task.

Returns

yaml – YAML representation of a task

Return type

str

Examples

To get the YAML representation of a concrete task, you need to create a task object of the appropriate class and assign a type to it:

import aspecd.tasks

task = aspecd.tasks.ProcessingTask()
task.type = 'Normalisation'
print(task.to_yaml())

The result of the last line (the print statement from above) will look as follows:

kind: processing
type: Normalisation
properties:
  parameters:
    kind: maximum
    range: null
    range_unit: index
    noise_range: null
    noise_range_unit: percentage
apply_to: []
result: ''
comment: ''

As you can see, you will get the full set of settings for this particular task. Not all of them are strictly necessary, and for details, you still need to look into the respective documentation. But this can make it much easier to add tasks to a recipe.

Similarly, you can get the YAML representation of an empty recipe to which you can then add your tasks. For details, see the documentation of the Recipe.to_yaml() method.

New in version 0.5.

class aspecd.tasks.ModelTask(recipe=None)

Bases: aspecd.tasks.Task

Building a model defined as task in recipe-driven data analysis.

For more information on the underlying general class, see aspecd.model.Model.

For an example of how such a model task may be included into a recipe, see the YAML listing below:

kind: model
type: Model
properties:
  parameters:
    foo: 42
    bar: 21
from_dataset: dataset_label
result: foo

Note that you can refer to datasets and results created during cooking of a recipe using their respective labels. Those labels will automatically be replaced by the actual dataset/result prior to performing the task.

Here, we have used this for the parameter from_dataset in the above recipe excerpt. For an aspecd.model.Model object, you can set the variables explicitly. However, in the context of a recipe, this is rarely useful. Therefore, the from_dataset parameter lets you refer to a dataset by the label used within the recipe. This dataset is passed to the aspecd.model.Model.from_dataset() method, which obtains the variables from it.
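The idea of obtaining the variables from a dataset can be sketched with stand-in classes. Dataset and LineModel below are hypothetical and much simpler than the ASpecD classes; in particular, the attribute axis_values and the model equation are assumptions made for this example.

```python
class Dataset:
    """Stand-in for a dataset exposing the values of its first axis."""

    def __init__(self, axis_values):
        self.axis_values = axis_values


class LineModel:
    """Hypothetical model evaluating slope * x + offset."""

    def __init__(self):
        self.parameters = {"slope": 1.0, "offset": 0.0}
        self.variables = []

    def from_dataset(self, dataset):
        # Take over the variables (here: the axis values) from the dataset
        self.variables = list(dataset.axis_values)

    def create(self):
        slope = self.parameters["slope"]
        offset = self.parameters["offset"]
        return [slope * x + offset for x in self.variables]


# The recipe maps labels to datasets; the task looks the label up
datasets = {"dataset_label": Dataset([0.0, 1.0, 2.0])}
model = LineModel()
model.parameters = {"slope": 2.0, "offset": 1.0}
model.from_dataset(datasets["dataset_label"])
values = model.create()
```

The model is thereby evaluated on exactly the axis of the referenced dataset, which is what makes the result directly comparable to that dataset.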

result

Label for the dataset resulting from the model creation

The result will always be an aspecd.dataset.CalculatedDataset object.

This label will be used to refer to the result later on when further processing the recipe.

Type

str

from_dataset

Label of a dataset to obtain variables from

The label needs to be a valid label of a dataset within the given recipe. The underlying dataset is obtained from the recipe and passed to the aspecd.model.Model.from_dataset() method, which obtains the variables from it.

Type

str

from_dict(dict_=None)

Set attributes from dictionary.

Parameters

dict_ (dict) – Dictionary containing information about a task.

Raises

aspecd.tasks.MissingDictError – Raised if no dict is provided.

get_object()

Return object for a particular task including all attributes.

Returns

obj – Object of a class defined in the type attribute of a task

Return type

object

perform()

Call the appropriate method of the underlying object.

The actual implementation is contained in the non-public method aspecd.tasks.Task._perform().

Different underlying objects have different methods used to actually perform the respective task. In order to generically perform a task, classes derived from the aspecd.tasks.Task base class need to override aspecd.tasks.Task._perform() accordingly.

Use aspecd.tasks.Task.get_object() to get an instance of the actual object necessary to perform the task, and afterwards call its appropriate method.

Similarly, to get the actual dataset using the dataset id stored in aspecd.tasks.Task.apply_to, use the method aspecd.tasks.Recipe.get_dataset() of the recipe stored in aspecd.tasks.Task.recipe.

Raises

aspecd.tasks.MissingRecipeError – Raised if no recipe is available.

to_dict()

Create dictionary containing public attributes of an object.

Furthermore, replace certain objects with their respective labels provided in the recipe. These objects currently include datasets, results, figures (i.e. figure records), and plotters.

Returns

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type

collections.OrderedDict

Changed in version 0.4: (Implicit) parameters of underlying task object are added

to_yaml()

Create YAML representation of task.

As users will use tasks primarily within recipes, i.e. in YAML files, this method conveniently returns the YAML representation of a task.

Returns

yaml – YAML representation of a task

Return type

str

Examples

To get the YAML representation of a concrete task, you need to create a task object of the appropriate class and assign a type to it:

import aspecd.tasks

task = aspecd.tasks.ProcessingTask()
task.type = 'Normalisation'
print(task.to_yaml())

The result of the last line (the print statement from above) will look as follows:

kind: processing
type: Normalisation
properties:
  parameters:
    kind: maximum
    range: null
    range_unit: index
    noise_range: null
    noise_range_unit: percentage
apply_to: []
result: ''
comment: ''

As you can see, you will get the full set of settings for this particular task. Not all of them are strictly necessary, and for details, you still need to look into the respective documentation. But this can make it much easier to add tasks to a recipe.

Similarly, you can get the YAML representation of an empty recipe to which you can then add your tasks. For details, see the documentation of the Recipe.to_yaml() method.

New in version 0.5.

class aspecd.tasks.ExportTask

Bases: aspecd.tasks.Task

Export of datasets.

Sometimes, datasets need to be exported and stored as a file.

For more information on the underlying general class, see aspecd.io.DatasetExporter.

For an example of how such an export task may be included into a recipe, see the YAML listing below:

kind: export
type: AdfExporter
properties:
  target:
    - dataset.adf
apply_to:
  - loi:xxx

Note that you can refer to datasets and results created during cooking of a recipe using their respective labels. Those labels will automatically be replaced by the actual dataset/result prior to performing the task.

In case you apply the task to more than one dataset, you will need to supply a list of filenames instead of only a single filename. A minimal example may look like this:

kind: export
type: AdfExporter
properties:
  target:
    - dataset1.adf
    - dataset2.adf
apply_to:
  - loi:xxx
  - loi:yyy

Important

Make sure to provide the same number of file names in your recipe as the number of datasets you apply the exporter to. Otherwise you may run into trouble.

Note

If the recipe contains the output_directory key on the top level, the datasets will be saved to this directory.

from_dict(dict_=None)

Set attributes from dictionary.

Parameters

dict_ (dict) – Dictionary containing information about a task.

Raises

aspecd.tasks.MissingDictError – Raised if no dict is provided.

get_object()

Return object for a particular task including all attributes.

Returns

obj – Object of a class defined in the type attribute of a task

Return type

object

perform()

Call the appropriate method of the underlying object.

The actual implementation is contained in the non-public method aspecd.tasks.Task._perform().

Different underlying objects have different methods used to actually perform the respective task. In order to generically perform a task, classes derived from the aspecd.tasks.Task base class need to override aspecd.tasks.Task._perform() accordingly.

Use aspecd.tasks.Task.get_object() to get an instance of the actual object necessary to perform the task, and afterwards call its appropriate method.

Similarly, to get the actual dataset using the dataset id stored in aspecd.tasks.Task.apply_to, use the method aspecd.tasks.Recipe.get_dataset() of the recipe stored in aspecd.tasks.Task.recipe.

Raises

aspecd.tasks.MissingRecipeError – Raised if no recipe is available.

to_dict()

Create dictionary containing public attributes of an object.

Furthermore, replace certain objects with their respective labels provided in the recipe. These objects currently include datasets, results, figures (i.e. figure records), and plotters.

Returns

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type

collections.OrderedDict

Changed in version 0.4: (Implicit) parameters of underlying task object are added

to_yaml()

Create YAML representation of task.

As users will use tasks primarily within recipes, i.e. in YAML files, this method conveniently returns the YAML representation of a task.

Returns

yaml – YAML representation of a task

Return type

str

Examples

To get the YAML representation of a concrete task, you need to create a task object of the appropriate class and assign a type to it:

import aspecd.tasks

task = aspecd.tasks.ProcessingTask()
task.type = 'Normalisation'
print(task.to_yaml())

The result of the last line (the print statement from above) will look as follows:

kind: processing
type: Normalisation
properties:
  parameters:
    kind: maximum
    range: null
    range_unit: index
    noise_range: null
    noise_range_unit: percentage
apply_to: []
result: ''
comment: ''

As you can see, you will get the full set of settings for this particular task. Not all of them are strictly necessary, and for details, you still need to look into the respective documentation. But this can make it much easier to add tasks to a recipe.

Similarly, you can get the YAML representation of an empty recipe to which you can then add your tasks. For details, see the documentation of the Recipe.to_yaml() method.

New in version 0.5.

class aspecd.tasks.TabulateTask

Bases: aspecd.tasks.Task

Tabulate step defined as task in recipe-driven data analysis.

Tables will always be created individually for each dataset.

For more information on the underlying general class, see aspecd.table.Table.

For an example of how such a tabulating task may be included into a recipe, see the YAML listing below:

kind: tabulate
type: Table
properties:
  caption:
    title: >
      Ideally a single sentence summarising the intent of the table
    text: >
      More text for the table caption
  filename: fancytable.txt
apply_to:
  - loi:xxx

Note that you can refer to datasets and results created during cooking of a recipe using their respective labels. Those labels will automatically be replaced by the actual dataset/result prior to performing the task.

Note

As soon as you provide a filename in the properties of your recipe, the resulting table will automatically be saved to that filename.

In case you apply the task to more than one dataset and would like to save individual tables, you can do that by supplying a list of filenames instead of only a single filename. In this case, the tables get saved to the filenames in the list. A minimal example may look like this:

kind: tabulate
type: Table
properties:
  filename:
    - fancytable1.pdf
    - fancytable2.pdf
apply_to:
  - loi:xxx
  - loi:yyy

Important

Make sure to provide the same number of file names in your recipe as the number of datasets you create the tables for. Otherwise you may run into trouble.

Note

If the recipe contains the output key in its directories dict, the table(s) will be saved to this directory.

New in version 0.5.

save_table(table=None)

Save the table created by the task.

Parameters

table (aspecd.table.Table) – Table that should be saved

from_dict(dict_=None)

Set attributes from dictionary.

Parameters

dict_ (dict) – Dictionary containing information about a task.

Raises

aspecd.tasks.MissingDictError – Raised if no dict is provided.

get_object()

Return object for a particular task including all attributes.

Returns

obj – Object of a class defined in the type attribute of a task

Return type

object

perform()

Call the appropriate method of the underlying object.

The actual implementation is contained in the non-public method aspecd.tasks.Task._perform().

Different underlying objects have different methods used to actually perform the respective task. In order to generically perform a task, classes derived from the aspecd.tasks.Task base class need to override aspecd.tasks.Task._perform() accordingly.

Use aspecd.tasks.Task.get_object() to get an instance of the actual object necessary to perform the task, and afterwards call its appropriate method.

Similarly, to get the actual dataset using the dataset id stored in aspecd.tasks.Task.apply_to, use the method aspecd.tasks.Recipe.get_dataset() of the recipe stored in aspecd.tasks.Task.recipe.

Raises

aspecd.tasks.MissingRecipeError – Raised if no recipe is available.

to_dict()

Create dictionary containing public attributes of an object.

Furthermore, replace certain objects with their respective labels provided in the recipe. These objects currently include datasets, results, figures (i.e. figure records), and plotters.

Returns

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type

collections.OrderedDict

Changed in version 0.4: (Implicit) parameters of underlying task object are added

to_yaml()

Create YAML representation of task.

As users will use tasks primarily within recipes, i.e. in YAML files, this method conveniently returns the YAML representation of a task.

Returns

yaml – YAML representation of a task

Return type

str

Examples

To get the YAML representation of a concrete task, you need to create a task object of the appropriate class and assign a type to it:

import aspecd.tasks

task = aspecd.tasks.ProcessingTask()
task.type = 'Normalisation'
print(task.to_yaml())

The result of the last line (the print statement from above) will look as follows:

kind: processing
type: Normalisation
properties:
  parameters:
    kind: maximum
    range: null
    range_unit: index
    noise_range: null
    noise_range_unit: percentage
apply_to: []
result: ''
comment: ''

As you can see, you will get the full set of settings for this particular task. Not all of them are strictly necessary, and for details, you still need to look into the respective documentation. But this can make it much easier to add tasks to a recipe.

Similarly, you can get the YAML representation of an empty recipe to which you can then add your tasks. For details, see the documentation of the Recipe.to_yaml() method.

New in version 0.5.

class aspecd.tasks.TaskFactory

Bases: object

Factory for creating task objects based on the kind provided.

The kind reflects the name of the module the actual object required for performing the task resides in. Furthermore, two ways are available for specifying the kind, either directly as argument provided to aspecd.tasks.TaskFactory.get_task() or as key in a dict used as an argument for aspecd.tasks.TaskFactory.get_task_from_dict().

The classes for the different tasks follow a simple convention: “<Module>Task” with “<Module>” being the capitalised module name the actual class necessary for performing the task resides in. Therefore, for each new module tasks should be available for, you will need to create an appropriate task class deriving from aspecd.tasks.Task.

Raises
  • aspecd.tasks.MissingTaskDescriptionError – Raised if the description necessary to create the task is missing.

  • KeyError – Raised if dict with task description does not contain “kind” key.
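The naming convention “<Module>Task” can be illustrated with a small stand-alone sketch. The classes and the explicit namespace dict are stand-ins introduced for this example, and raising a ValueError for a missing description is an assumption; the sketch only shows how a kind such as “report” maps to a class name such as “ReportTask”.

```python
class Task:
    """Stand-in base class."""


class ProcessingTask(Task):
    """Stand-in for a task wrapping objects from a 'processing' module."""


class ReportTask(Task):
    """Stand-in for a task wrapping objects from a 'report' module."""


def get_task(kind=None, namespace=None):
    """Return a task instance whose class is named '<Kind>Task'."""
    if not kind:
        raise ValueError("No kind given")
    class_name = kind.capitalize() + "Task"
    return namespace[class_name]()


def get_task_from_dict(dict_=None, namespace=None):
    """Return a task using the 'kind' key; a missing key raises KeyError."""
    if not dict_:
        raise ValueError("No task description given")
    return get_task(kind=dict_["kind"], namespace=namespace)


namespace = {"ProcessingTask": ProcessingTask, "ReportTask": ReportTask}
task = get_task_from_dict({"kind": "report", "type": "LaTeXReporter"}, namespace)
```

Note that str.capitalize() only turns the first letter into upper case, which is consistent with class names such as SingleanalysisTask for the kind “singleanalysis”.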

get_task(kind=None)

Return task object specified by its kind.

Parameters

kind (str) –

Kind of task to create

Reflects the name of the module the actual object required for performing the task resides in.

Returns

task – Task object

The actual subclass depends on the kind.

Return type

aspecd.tasks.Task

Raises

aspecd.tasks.MissingTaskDescriptionError – Raised if the description necessary to create the task is missing.

get_task_from_dict(dict_=None)

Return task object specified by the “kind” key in the dict.

Parameters

dict_ (dict) –

Dictionary containing “kind” key

The “kind” key reflects the name of the module the actual object required for performing the task resides in.

Returns

task – Task object

The actual subclass depends on the kind.

Return type

aspecd.tasks.Task

Raises
  • aspecd.tasks.MissingTaskDescriptionError – Raised if the description necessary to create the task is missing.

  • KeyError – Raised if dict does not contain “kind” key.

class aspecd.tasks.FigureRecord

Bases: aspecd.utils.ToDictMixin

Information about a figure created by a PlotTask.

Figures created during recipe-driven data analysis may need to be added, e.g., to a report. Therefore, the information contained in the PlotTask needs to be accessible by the recipe and other tasks in turn.

caption

User-supplied information for the figure caption.

Has three fields: “title”, “text”, and “parameters”.

“title” is usually one sentence describing the intent of the figure and is often set in bold face in a figure caption.

“text” is additional text directly following the title, containing more information about the plot.

“parameters” is a list of parameter names that should be included in the figure caption, usually at the very end.

Type

dict

parameters

All parameters necessary for the plot, implicit and explicit

Type

dict
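How the three caption fields and the parameters might be combined into a caption text can be sketched as follows. The function compose_caption and the resulting layout are purely illustrative assumptions; the actual formatting is up to the template or report that uses the figure record.

```python
def compose_caption(caption, parameters):
    """Join title, text, and selected parameters into one caption string."""
    parts = [caption.get("title", "").strip(), caption.get("text", "").strip()]
    listed = [
        f"{name}: {parameters[name]}"
        for name in caption.get("parameters", [])
        if name in parameters
    ]
    if listed:
        # Parameters are appended at the very end of the caption
        parts.append("Parameters: " + ", ".join(listed) + ".")
    return " ".join(part for part in parts if part)


caption = {
    "title": "Normalised spectrum.",
    "text": "Intensity scaled to its maximum.",
    "parameters": ["kind"],
}
text = compose_caption(caption, {"kind": "maximum", "range": None})
```

Only the parameter names listed in the caption appear in the text, even though the record stores all parameters of the plot.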

label

Label the figure should be referred to from within the recipe

Similar to the aspecd.tasks.SingleanalysisTask.result attribute of the aspecd.tasks.SingleanalysisTask class.

Type

str

filename

Name of file to save the plot to

Type

str

Raises

aspecd.tasks.MissingPlotterError – Raised if no plotter is provided

from_plotter(plotter=None)

Set attributes from plotter

Usually, a plotter contains all information necessary for an aspecd.tasks.FigureRecord object.

Parameters

plotter (aspecd.plotting.Plotter) – Plotter the figure record should be created for.

Raises

aspecd.tasks.MissingPlotterError – Raised if no plotter is provided

to_dict()

Create dictionary containing public attributes of an object.

Returns

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type

collections.OrderedDict
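The behaviour of to_dict() described above, collecting public attributes in the order of their definition, can be approximated with a short sketch. ToDictSketch and Record are hypothetical classes and are not the implementation found in aspecd.utils.

```python
import collections


class ToDictSketch:
    """Hypothetical mixin collecting public attributes in definition order."""

    def to_dict(self):
        public = collections.OrderedDict()
        # vars() preserves the order in which attributes were assigned
        for name, value in vars(self).items():
            if not name.startswith("_"):
                public[name] = value
        return public


class Record(ToDictSketch):
    """Hypothetical record with public and non-public attributes."""

    def __init__(self):
        self.caption = {"title": "", "text": "", "parameters": []}
        self.label = ""
        self._internal = "hidden"
        self.filename = ""


keys = list(Record().to_dict().keys())
```

Attributes whose names start with an underscore are skipped in this sketch, so only the public state ends up in the dictionary.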

class aspecd.tasks.ChefDeService

Bases: object

Wrapper for serving the results of recipes given a recipe file name.

In recipe-driven data analysis, a recipe of class aspecd.tasks.Recipe gets cooked by a chef of class aspecd.tasks.Chef. However, this requires obtaining the appropriate dataset factory of class aspecd.dataset.DatasetFactory or of a class inheriting from it, depending on the package actually used.

However, the (end) user would rather not care about those details and simply provide a recipe filename (of a YAML file) to an instance of a class and get the results back. This is where the ChefDeService comes in.

Obtaining the results of a recipe will become as simple as:

chef_de_service = ChefDeService()
chef_de_service.serve(recipe_filename='my_recipe.yaml')

Furthermore, the ChefDeService takes care of persisting the history of the cooked recipe in the form of a YAML file. Therefore, an additional file gets created whose name consists of the filename of the recipe provided, extended by the timestamp of serving the results. These history files can be used as recipes again, allowing for full turnover.
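A history filename of this kind could be built as sketched below. The function history_filename, the separator, and the timestamp format are assumptions chosen for illustration; the exact naming scheme used by ASpecD is not specified here.

```python
from datetime import datetime
from pathlib import Path


def history_filename(recipe_filename, timestamp=None):
    """Build '<stem>-<timestamp><suffix>' next to the recipe file."""
    timestamp = timestamp or datetime.now()
    path = Path(recipe_filename)
    stamp = timestamp.strftime("%Y%m%dT%H%M%S")
    return str(path.with_name(f"{path.stem}-{stamp}{path.suffix}"))


name = history_filename("my_recipe.yaml", datetime(2021, 5, 17, 9, 30, 0))
```

Because the timestamp is part of the name, serving the same recipe repeatedly would not overwrite earlier history files, provided the timestamps differ.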

recipe_filename

Name of the recipe file to serve the cooked results for

Type

str

Raises

aspecd.tasks.MissingRecipeError – Raised if no recipe filename is provided upon trying to serve

serve(recipe_filename='')

Serve the results of cooking a recipe

All you need to do is to provide the filename of a recipe YAML file. Additionally, the history will be served in a YAML file named after the recipe file provided, with the timestamp of serving added to it.

Parameters

recipe_filename (str) – Name of the recipe YAML file to cook

Raises

aspecd.tasks.MissingRecipeError – Raised if no recipe filename is provided upon trying to serve

aspecd.tasks.serve()

Serve the results of cooking a recipe

All you need to do is to provide the filename of a recipe YAML file.

The ASpecD framework creates a console script entry point named “serve”, so that after installing the ASpecD framework you can simply type serve <recipe_name.yaml> on the command line.

Thanks to the modular nature of the ASpecD framework, if your recipe contains the default_package key followed by the name of a package based on the ASpecD framework, the correct instance of the aspecd.dataset.DatasetFactory class will be created and added to the recipe automatically. This allows you to serve recipes that rely on functionality of ASpecD-derived packages.

Raises

aspecd.exceptions.MissingRecipeError – Raised if no recipe filename is provided upon trying to serve
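The role of the default_package key can be pictured with a minimal sketch. The function dataset_factory_location and the package name mypackage are hypothetical, and the assumption that a derived package provides its factory as <package>.dataset.DatasetFactory is made for illustration only.

```python
def dataset_factory_location(recipe_dict):
    """Return the dotted path where a dataset factory would be looked up."""
    # Fall back to the ASpecD framework itself if no package is given
    package = recipe_dict.get("default_package") or "aspecd"
    return f"{package}.dataset.DatasetFactory"


default_location = dataset_factory_location({})
derived_location = dataset_factory_location({"default_package": "mypackage"})
```

In this sketch, a recipe without the default_package key simply falls back to the factory of the ASpecD framework itself.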