Constituents of a recipe-driven data analysis.

One main aspect of tasks is to provide the constituents of a recipe-driven data analysis, i.e. aspecd.tasks.Recipe and aspecd.tasks.Chef. In its simplest form, a recipe gets cooked by a chef, resulting in a series of tasks being performed on a list of datasets.

The idea of recipes here is to provide all necessary information for data processing and analysis in a simple, human-readable and human-writable form. This allows users not familiar with programming to perform even complex tasks. In addition, recipes can even be “executed” from the command line, without needing to start a Python interpreter.

From a user’s perspective, a recipe is usually stored in a YAML file. This allows you to easily create and modify recipes without knowing too much about the underlying processes. For an accessible overview of the YAML syntax, see the introduction provided by Ansible.

## Recipe-driven data analysis by example¶

Recipes always consist of two major parts: A list of datasets to operate on, and a list of tasks to be performed on the datasets. Of course, you can specify for each task on which datasets it should be performed, and if possible, whether it should be performed on each dataset separately or combined. The latter is particularly interesting for representations (e.g., plots) consisting of multiple datasets, or analysis steps spanning multiple datasets.

### A first recipe¶

To give a first impression of what such a recipe may look like:

```yaml
format:
  type: ASpecD recipe
  version: '0.2'

datasets:
  - loi:xxx
  - loi:yyy

tasks:
  - kind: processing
    type: SingleProcessingStep
    properties:
      parameters:
        param1: bar
        param2: foo
      prop2: blub
  - kind: singleanalysis
    type: SingleAnalysisStep
    properties:
      parameters:
        param1: bar
        param2: foo
      prop2: blub
    apply_to:
      - loi:xxx
    result: new_dataset
```


Here, tasks is a list of dictionary-style entries. The key kind determines which kind of task should be performed. For each kind, a class subclassing aspecd.tasks.Task needs to exist; for details, see below. The key type stores the name of the actual class, such as a concrete processing step derived from aspecd.processing.SingleProcessingStep. The dictionary properties contains keys corresponding to the attributes of the respective class. Depending on the type of task, additional keys can be used, such as apply_to to determine the datasets this task should be applied to, or result providing a label for a dataset newly created by an analysis task.

Note

The use of loi: markers in the example above points to a situation in which every dataset can be accessed by a unique identifier. For details, see the LabInform documentation.
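The mapping from the kind key of a recipe entry to a concrete task class can be pictured with a small sketch. Note that this is purely illustrative: the actual dispatch is done by aspecd.tasks.TaskFactory and is considerably more elaborate.

```python
# Illustrative sketch only: the real dispatch is performed by
# aspecd.tasks.TaskFactory and is more elaborate than this.

class Task:
    """Minimal stand-in for aspecd.tasks.Task."""

    def __init__(self, kind="", type=""):
        self.kind = kind
        self.type = type


class ProcessingTask(Task):
    pass


class SingleanalysisTask(Task):
    pass


def create_task(task_dict):
    """Pick a task class based on the 'kind' key of a recipe entry."""
    mapping = {
        "processing": ProcessingTask,
        "singleanalysis": SingleanalysisTask,
    }
    task_class = mapping.get(task_dict["kind"], Task)
    return task_class(kind=task_dict["kind"], type=task_dict["type"])


task = create_task({"kind": "processing", "type": "SingleProcessingStep"})
```

The type value is stored on the task object and later used to instantiate the class actually performing the step.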

### Base directory for dataset import¶

There are different ways to refer to datasets, but the most common (for now) is to specify the (relative or absolute) path to the datasets within the local file system.

At the same time, the “paths” listed in the datasets list are used as internal references within the recipe. Therefore, short names are preferable.

To make things a bit easier, there is a way to define the source directory for datasets:

```yaml
directories:
  datasets_source: /path/to/my/datasets/

datasets:
  - dataset1
  - dataset2

tasks:
  - kind: processing
    type: SingleProcessingStep
```


In this case, all dataset names will be treated as relative to the source directory. Note that the datasets_source option can be either an absolute path, as shown here for unixoid file systems, or a relative path, as shown in the second example below.

```yaml
directories:
  datasets_source: relative/path/to/my/datasets/

datasets:
  - dataset1
  - dataset2

tasks:
  - kind: processing
    type: SingleProcessingStep
```


Here, paths have been given for unixoid file systems, using / as a separator. Adjust to your needs if necessary.
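The resolution of dataset names against the source directory can be sketched as follows. This is an illustrative sketch using the standard library, not the actual ASpecD code:

```python
import os.path


def resolve_dataset_source(source, datasets_source=""):
    """Resolve a dataset name against an optional source directory.

    Illustrative sketch, not the actual ASpecD implementation:
    absolute dataset names are returned unchanged, all others are
    joined with the datasets_source directory (which may itself be
    absolute or relative to the current directory).
    """
    if not datasets_source or os.path.isabs(source):
        return source
    return os.path.join(datasets_source, source)


resolved = resolve_dataset_source("dataset1", "/path/to/my/datasets")
```

With no datasets_source set, names are used as given, exactly as in recipes without a directories block.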

### Output directory¶

Some tasks, namely plotting and report tasks, can save their results to files. By default, these files end up in the directory you cook the recipe from. However, sometimes it is quite convenient to specify an output directory, either relative or absolute.

To do so, simply add the output key to the top level directories key of your recipe:

```yaml
directories:
  output: /absolute/path/for/the/outputs

datasets:
  - dataset

tasks:
  - kind: singleplot
    type: SinglePlotter
    properties:
      filename:
        - fancyfigure.pdf
```


As mentioned, this path can just as well be a relative path with respect to the directory you cook your recipes from:

```yaml
directories:
  output: relative/path/for/the/outputs

datasets:
  - dataset

tasks:
  - kind: singleplot
    type: SinglePlotter
    properties:
      filename:
        - fancyfigure.pdf
```


Here, paths have been given for unixoid file systems, using / as a separator. Adjust to your needs if necessary.

### Default package¶

Usually, the classes performing the individual tasks will come from your own package. There is a simple way of using them without having to prefix the kind property of every single task: define the default package name like so:

```yaml
settings:
  default_package: my_package

datasets:
  - loi:xxx
  - loi:yyy

tasks:
  - kind: processing
    type: SingleProcessingStep
```


If you would like to use a class from a different package for only one task, feel free to prefix the “kind” attribute of the respective task, as shown:

```yaml
tasks:
  - kind: some_other_package.processing
    type: SingleProcessingStep
```


Of course, in order to work, this package termed here “some_other_package” needs to follow the same basic rules and layout as the ASpecD framework and packages derived from it. In particular, if you use the “default_package” directive in your recipe, the given package needs to implement a child of the aspecd.dataset.DatasetFactory class.

To state the obvious: You can, of course, combine both strategies, defining a default package and overriding this for a particular task:

```yaml
settings:
  default_package: my_package

datasets:
  - loi:xxx
  - loi:yyy

tasks:
  - kind: some_other_package.processing
    type: SingleProcessingStep
```


There is one important exception to this rule: if you have defined a default package, and no class corresponding to the type defined in a task exists in that package, the class will be looked up in the ASpecD framework instead. This also means that if a derived package defines a class with the same name as a class in the ASpecD framework, and you want to explicitly use the “original” class from the ASpecD package, you need to explicitly prefix the value of the kind key of the respective task with “aspecd.”.
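The lookup order described above can be sketched with a small helper. This is purely illustrative and not the actual ASpecD code; the real lookup is performed by the task objects themselves:

```python
import importlib


def get_task_class(kind, type_, default_package="aspecd"):
    """Sketch of the class lookup order described above.

    An explicit package prefix in kind wins, then the default package
    is tried, then the ASpecD framework itself. Illustrative only.
    """
    if "." in kind:
        package, _, module = kind.rpartition(".")
    else:
        package, module = default_package, kind
    for candidate in (package, "aspecd"):
        try:
            return getattr(
                importlib.import_module(f"{candidate}.{module}"), type_
            )
        except (ImportError, AttributeError):
            continue
    raise AttributeError(f"No class {type_} found for kind {kind}")
```

For example, with kind "processing" and default package "my_package", the helper first tries my_package.processing and falls back to aspecd.processing if the class does not exist there.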

### Setting own labels (and properties) for datasets¶

Usually, you specify the path (or any other unique and supported identifier) to your dataset(s) in the list of datasets at the beginning of a recipe, like this:

```yaml
datasets:
  - /lengthy/path/to/dataset1
  - /lengthy/path/to/dataset2
```


In this case, you will have to refer to the datasets by their path (or whatever other identifier you used). Usually, these identifiers are quite lengthy and hence not necessarily convenient as labels within a recipe. However, you can set your own ids for datasets:

```yaml
datasets:
  - source: /lengthy/path/to/dataset1
    id: dataset1
  - source: /lengthy/path/to/dataset2
    id: dataset2
```


Make sure to set the source value to the identifier of your dataset. The id is yours to choose, as long as it is a valid key for a dict. From then on, refer to the datasets by their respective ids throughout the recipe.

Note

If you use the source key but don’t specify an id key as well, the source will be used as id, as before.

However, you can take the whole thing one step further: suppose you are tired of the dataset label (by default identical to the source the dataset is imported from) appearing in a figure legend when it simply does not fit your needs. How about this:

```yaml
datasets:
  - source: /lengthy/path/to/dataset1
    id: dataset1
    label: low concentration
  - source: /lengthy/path/to/dataset2
    id: dataset2
    label: high concentration
```


In this case, you assign the label field of your datasets upon loading them. The idea behind this: when specifying which datasets to load, you usually know best about such things, and you don’t want to have to deal with them later on when plotting.

Important

Generally, each property of a dataset can be set this way. However, be careful not to override properties that are not scalar and cannot easily be represented in YAML in the recipe, as you will most certainly break things otherwise. A good example of how to definitely break things would be to override the data property of a dataset.
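The defaulting behaviour described in this section can be summarised with a small sketch. This is not the ASpecD implementation, merely an illustration of how a datasets entry could be normalised:

```python
def normalize_dataset_entry(entry):
    """Normalize a datasets entry to a dict with source, id and label.

    A sketch of the defaulting behaviour described above, not the
    ASpecD code: plain strings become the source, a missing id
    defaults to the source, and so does a missing label.
    """
    if isinstance(entry, str):
        entry = {"source": entry}
    entry.setdefault("id", entry["source"])
    entry.setdefault("label", entry["source"])
    return entry


entry = normalize_dataset_entry("/lengthy/path/to/dataset1")
```

Explicitly provided id and label values are left untouched, as in the recipe examples above.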

### Import datasets from other packages¶

This may sound strange at first, but it is more common than you might imagine: sometimes you need to compare datasets recorded using different methods that are in turn handled by different ASpecD-derived packages.

So how to import a dataset using the importer of a different package than the current one? The syntax for the recipe is much the same as the one described above for setting other properties of a dataset:

```yaml
datasets:
  - source: /lengthy/path/to/dataset1
  - source: /lengthy/path/to/dataset2
    package: other_package
```


In this example, the first dataset will be imported using the default package set for the recipe, but the second dataset will be loaded using the aspecd.dataset.DatasetFactory and aspecd.io.DatasetImporterFactory classes from other_package. Of course, you need to make sure that other_package exists and contains both a dataset factory and a dataset importer factory. Furthermore, these two classes need to reside in the same modules as in the ASpecD framework, i.e., the dataset factory in the “dataset” module and the dataset importer factory in the “io” module.

### Specify importer for datasets¶

Sometimes it may be necessary to explicitly provide the importer class that shall be used to import a dataset. In this case, you can explicitly say which importer to use:

```yaml
datasets:
  - source: /lengthy/path/to/dataset1
  - source: /lengthy/path/to/dataset2
    importer: TxtImporter
```


However, be careful to match data format and importer, as you are overriding the automatic importer determination of the aspecd.io.DatasetImporterFactory this way. Furthermore, make sure the respective importer class exists. Of course, this works as well with providing an alternative package:

```yaml
datasets:
  - source: /lengthy/path/to/dataset1
  - source: /lengthy/path/to/dataset2
    package: other_package
    importer: TxtImporter
```


In this particular example, the importer located in other_package.io.TxtImporter would be used to import your dataset. Parameters will be passed directly to the importer without further checking, and it is the sole responsibility of the importer class to make sense of the parameters provided. Have a look at the documentation of the actual importer class you intend to use for the parameters you can set (if any). Note that many importers will not accept additional parameters.

### Specify importer parameters for datasets¶

Furthermore, sometimes you may want to provide parameters for an importer, e.g. in case of importing text files with headers, and you can do this as well:

```yaml
datasets:
  - source: /lengthy/path/to/dataset1
  - source: /lengthy/path/to/dataset2
    importer: TxtImporter
    importer_parameters:
      skiprows: 3
```


You can even provide importer_parameters without explicitly specifying an importer, although this may lead to behaviour that is hard to track down, as you then rely on the automatic importer selection implemented in aspecd.io.DatasetImporterFactory.

### Referring to other datasets and results¶

Some tasks yield results you usually would want to use later on in the recipe. Prime examples are analysis steps and plots. While analysis steps have a property result that can refer to either a dataset or something else, depending on the actual type of analysis step, plots have a label that can be used to refer to them.

While analysis steps always yield results, processing steps usually operate on a dataset that gets modified in turn. However, sometimes it is desired to return the modified dataset as a new dataset, independent of the original one. In this case, specify a result here, too. For details, see the aspecd.tasks.ProcessingTask documentation below.

### Variable replacement¶

Additionally to the labels described above, variables will be parsed and replaced. Currently, the following types of variables are understood:

```yaml
key1: {{ basename(id) }}
key2: {{ path(id) }}
key3: {{ id(id) }}
```


Here, id is the id used internally for referring to a dataset, {{ basename(id) }} will be replaced with the file basename of the respective dataset source, {{ path(id) }} will be replaced by the path of the respective dataset source, and {{ id(id) }} will be replaced by the id itself.

Note: The spaces within the double curly brackets are only for better readability, they can be omitted, although this is not recommended.

Why is this interesting? Suppose you would like to create a rather generic recipe always performing the same tasks, but for different datasets. A rather minimal example is given below:

```yaml
datasets:
  - source: /path/to/my_dataset.txt
    id: first_measurement

tasks:
  - kind: processing
    type: SubtractBaseline
    properties:
      parameters:
        kind: polynomial
        order: 0
  - kind: singleplot
    type: SinglePlotter
    properties:
      filename:
        - {{ basename(first_measurement) }}.pdf
```


Here you can see that all you need to do is replace the source with the actual path to your dataset. The tasks of the recipe will then automatically be performed on the given dataset, with the plot stored to a file named my_dataset.pdf.
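The substitution mechanism can be sketched with a few lines of Python. This is an illustrative sketch, not the ASpecD implementation; in particular, whether the replaced path carries a trailing separator is an assumption here:

```python
import os.path
import re


def replace_variables(text, datasets):
    """Substitute {{ basename(id) }}, {{ path(id) }} and {{ id(id) }}.

    datasets maps dataset ids to their sources. A minimal sketch of
    the replacement described above, not the ASpecD implementation.
    """

    def substitute(match):
        function, dataset_id = match.groups()
        source = datasets[dataset_id]
        if function == "basename":
            # File basename without extension, so that ".pdf" can be
            # appended in the recipe, as in the example above.
            return os.path.splitext(os.path.basename(source))[0]
        if function == "path":
            # Trailing separator is an assumption of this sketch.
            return os.path.dirname(source) + "/"
        return dataset_id  # function == "id"

    pattern = r"\{\{\s*(basename|path|id)\((\w+)\)\s*\}\}"
    return re.sub(pattern, substitute, text)


datasets = {"first_measurement": "/path/to/my_dataset.txt"}
filename = replace_variables("{{ basename(first_measurement) }}.pdf", datasets)
```

Applied to the example above, the filename becomes my_dataset.pdf, matching the behaviour described in the text.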

## Executing recipes: serving the cooked results¶

As stated above, a recipe gets cooked by a chef, resulting in a series of tasks being performed on a list of datasets. However, as an (end) user you usually don’t care about chefs and recipes besides the human-readable and writable representation of a recipe in YAML format. Therefore, there is a fairly simple way to get a recipe executed, or, in terms of the metaphor of recipe and cook, to get the meal served:

```
serve <my-recipe>.yaml
```


No need to start a Python interpreter, no need to instantiate any class. Simply execute a command from a terminal; that’s all there is to it. In this particular example, <my-recipe> is a placeholder for your recipe file name.

Of course, you need to have the ASpecD package installed (preferably within a virtual environment), and you still need access to a terminal. But that’s all. Once you have hit “enter”, you will usually see a list of lines starting with “INFO” telling you what is happening. If something is notable or goes entirely wrong, you will see lines starting with “WARNING” or “ERROR”, respectively. With standard settings, the latter will not provide you with the complete stack trace, as this is usually not helpful for the user. If, however, you are a developer or otherwise interested in the details (or are asked by a developer to provide more details), use the -v switch to get more verbose output. Conversely, if you insist on not seeing any info of what happened, you can use the -q switch to silence the serve command. To get more information, use the help built into the serve command:

```
serve -h
```


Its output will look similar to the following:

```
usage: serve [-h] [-v | -q] recipe

Process a recipe in context of recipe-driven data analysis

positional arguments:
  recipe         YAML file containing recipe

optional arguments:
  -h, --help     show this help message and exit
  -v, --verbose  show debug output
  -q, --quiet    don't show any output
```
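The command-line interface shown in this help output could be reproduced with the standard library's argparse module as follows. This is an illustrative equivalent, not the actual ASpecD console-script implementation:

```python
import argparse


def build_parser():
    """Rebuild the command-line interface shown in the help output.

    Illustrative only; the actual ASpecD entry point may differ in
    detail, but -v and -q are mutually exclusive as shown above.
    """
    parser = argparse.ArgumentParser(
        prog="serve",
        description="Process a recipe in context of recipe-driven "
                    "data analysis",
    )
    parser.add_argument("recipe", help="YAML file containing recipe")
    group = parser.add_mutually_exclusive_group()
    group.add_argument("-v", "--verbose", action="store_true",
                       help="show debug output")
    group.add_argument("-q", "--quiet", action="store_true",
                       help="don't show any output")
    return parser


args = build_parser().parse_args(["-v", "my-recipe.yaml"])
```

The mutually exclusive group mirrors the `[-v | -q]` notation in the usage line: supplying both switches at once is rejected by the parser.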


Of course, you can do the same from within Python (however, why would you want to do that):

```python
from aspecd.tasks import serve

serve(recipe_filename='<my-recipe>.yaml')
```


And if you insist, of course there is an object-oriented way to do it:

```python
from aspecd.tasks import ChefDeService

chef_de_service = ChefDeService()
chef_de_service.serve(recipe_filename='<my-recipe>.yaml')
```


The good news with all this: It should work for every package derived from the ASpecD framework, as long as you specify the default_package directive within the recipe. And of course, calling the recipe from the command-line will only help you if it creates some kind of output.

## History of a recipe¶

The aspecd.tasks.Chef class takes care of automatically creating a history of the recipe cooked, with a full list of parameters for each task. This history is a dict that follows the same structure as the original recipe. Therefore, you can save this history to a YAML file and use it as a recipe again, perhaps after some modifications.

If you use the aspecd.tasks.ChefDeService class, you need not care about actually writing the history to a YAML file. Therefore, using this class or even the command-line call to serve as described above, is highly recommended. In this case, you will have a full history of all your tasks contained in a human-readable YAML file, together with some additional information on the system and package versions used to cook the recipe, as well as the time for start and end of cooking.
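The kind of additional information mentioned above (system details, start and end of cooking) can be sketched as follows. The exact fields ASpecD writes to a history may differ; this merely illustrates the idea using the standard library:

```python
import datetime
import platform


def history_metadata():
    """Collect the kind of information a recipe history records.

    Illustrative sketch of the system and timing information
    mentioned above; the actual fields written by ASpecD may differ.
    """
    return {
        "info": {
            "start": datetime.datetime.now().isoformat(),
            "end": "",  # filled in once cooking has finished
        },
        "system_info": {
            "python": {"version": platform.python_version()},
            "platform": platform.platform(),
        },
    }


metadata = history_metadata()
```

Together with the full list of task parameters, such metadata makes a history file a complete, replayable record of an analysis.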

To make it short: the history of the recipe allows you to perform a fully reproducible data analysis, even of multiple datasets and with arbitrarily complex tasks, without having to care about the details. You get it all for free. That’s what the ASpecD framework is all about: care about the results of your data analysis and what they mean in terms of answering the scientific questions that originally triggered obtaining and analysing the data. Reproducibility is taken care of for you.

### Suppress automatically writing history¶

Warning

Not having a history of each individual step of your data analysis is considered bad practice and not consistent with reproducible research. Therefore, use the following only in debugging/development settings, never in real life applications.

In some particular cases, namely debugging and development, writing a history for each individual cooking of a recipe might be inconvenient. Therefore, you can tell ASpecD not to automatically write a history. However, use with extreme caution!

```yaml
settings:
  write_history: false
```


To remind you of what you are doing, this will issue a warning on the command line if using serve.

## Types of tasks¶

Tasks can be grouped similarly to the way classes of the ASpecD framework are grouped into different modules. Hence, there are different kinds of tasks. Each task is internally represented by an aspecd.tasks.Task object, more precisely an object instantiated from a subclass of aspecd.tasks.Task. This polymorphism of task classes makes it possible to easily extend the scope of recipe-driven data analysis. Therefore, to allow ASpecD to know how to handle your task (i.e., what task object to create), you need to specify the kind of your task within the recipe, besides the type that is the class name of the actual class performing the respective task.

Currently, the following subclasses are implemented:

There are (currently) three special cases among the kinds of tasks: processing, analysis, and plot tasks. Usually, you will set the kind of a task in a recipe to the module the class eventually performing the task resides in. As both analyses and plots can span either one or several datasets, here we have to discriminate. Therefore, it is essential that you set the kind value in your recipe for these kinds of tasks to singleanalysis or multianalysis, respectively; the same is true for plots (singleplot, multiplot). To make this a bit easier to follow, see the example below.

```yaml
tasks:
  - kind: processing
    type: SingleProcessingStep

  - kind: singleanalysis
    type: AnalysisStep

  - kind: multiplot
    type: MultiPlotter1D
```


Important

As long as there is no automatic syntax checking of recipes before they get executed, you are entirely responsible for providing correct syntax. From our own experience, a few problems arise frequently: don’t use analysis, but either singleanalysis or multianalysis as kind in an analysis step. The same applies to plots: don’t use plotting, but either singleplot or multiplot as kind.
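A minimal static check for exactly these pitfalls could look like the following. This is a sketch only; ASpecD itself does not (yet) provide such a recipe parser:

```python
def check_task_kinds(tasks):
    """Flag ambiguous kind values in a list of task dicts.

    A minimal sketch of a static check for the pitfalls named above,
    not an ASpecD facility: 'analysis' and 'plotting' need to be
    replaced by their single/multi variants.
    """
    ambiguous = {
        "analysis": ("singleanalysis", "multianalysis"),
        "plotting": ("singleplot", "multiplot"),
    }
    problems = []
    for task in tasks:
        # Strip an optional package prefix such as "my_package."
        kind = task.get("kind", "").rpartition(".")[-1]
        if kind in ambiguous:
            problems.append(
                f"'{kind}': use one of {ambiguous[kind]} instead"
            )
    return problems


problems = check_task_kinds(
    [{"kind": "analysis", "type": "SingleAnalysisStep"},
     {"kind": "processing", "type": "SingleProcessingStep"}]
)
```

Running such a check over the tasks list of a recipe before cooking would catch the two most frequent mistakes early.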

For each task, you can set all attributes of the underlying class using the properties dictionary in the recipe. Therefore, knowing which parameters can be set for which kind of task simply means checking the documentation of the respective classes. For example, for a task represented by an aspecd.tasks.ProcessingTask object, check out the appropriate class from the aspecd.processing module. The same is true for packages derived from ASpecD.

A simple example is the normalisation processing step using the aspecd.processing.Normalisation class:

```yaml
tasks:
  - kind: processing
    type: Normalisation
    properties:
      parameters:
        kind: amplitude
```


How to know what properties can be set? Have a look at the aspecd.processing.Normalisation documentation. Note that all properties that are documented there can be set using a recipe. As processing steps always have a property parameters that is a dict, you need to set the individual keys of this dictionary.

Additionally, for each task, you can explicitly state to which of the datasets it should be applied. Note that not only the datasets initially loaded can be used here, but all labels referring to datasets that originate from other tasks.

Furthermore, depending on the kind of task, you may be able to set additional parameters controlling in more detail how the particular task is performed. For details, see the documentation of the respective task subclass in this module below.

## Prerequisites for recipe-driven data analysis¶

Note

This section is mostly relevant for those developing packages based on the ASpecD framework. Users of recipe-driven data analysis usually need not bother about these details (as others did for them already).

To be able to use recipe-driven data analysis in packages derived from the ASpecD framework, a series of prerequisites needs to be met, i.e., classes implemented. Besides the usual suspects such as aspecd.dataset.Dataset and its constituents as well as the different processing and analysis steps based on aspecd.processing.SingleProcessingStep and aspecd.analysis.SingleAnalysisStep, two different factory classes need to be implemented in particular, subclassing aspecd.dataset.DatasetFactory and aspecd.io.DatasetImporterFactory, respectively. Actually, only aspecd.dataset.DatasetFactory is directly used by aspecd.tasks.Recipe; however, internally it relies on the existence of aspecd.io.DatasetImporterFactory to return a dataset based solely on a (unique) ID.

Besides implementing these classes, the facilities provided by the aspecd.tasks module should be fully sufficient for regular recipe-driven data analysis. In particular, there should normally be no need to subclass any of the classes within this module in a package derived from the ASpecD framework. One particular design goal of recipe-driven data analysis is to decouple the actual tasks being performed from the general handling of recipes: the former is implemented within each respective package built upon the ASpecD framework, the latter is taken care of entirely by the ASpecD framework itself. You might want to implement simple proxies within a derived package to prevent the user from having to call out to functionality provided directly by the ASpecD framework, as the latter might be confusing for those unfamiliar with the underlying details, i.e., most common users. More explicitly, you may want to create proxy classes in the processing and analysis modules of your package, subclassing all the concrete processing and analysis steps already provided with the ASpecD framework.

## Notes for developers¶

Note

This section is only relevant for those further developing the ASpecD framework. Users of recipe-driven data analysis as well as developers of packages derived from the ASpecD framework usually need not bother about these details (as others did for them already).

Recipe-driven data analysis introduces another level of abstraction and indirection with its use of recipes in YAML format. Based on this analogy, we have an aspecd.tasks.Recipe consisting of a list of datasets and a list of aspecd.tasks.Task objects to be performed on the datasets. Such a recipe gets “cooked” by an aspecd.tasks.Chef, and for the convenience of the user of recipe-driven data analysis, the result gets “served” by the aspecd.tasks.ChefDeService. An actual user will not see any of this, but simply calls serve <recipe-name.yaml> from the command line.

Internally, recipes are represented by an instance of aspecd.tasks.Recipe, and this representation already takes care of importing the datasets specified in the datasets block of a recipe. Therefore, all handling of data import needs to be done here. Similarly, upon populating a recipe (from a dict or by importing), the tasks will already be created using an aspecd.tasks.TaskFactory.

The actual tasks are represented by instances of subclasses of aspecd.tasks.Task, and they in turn create an instance of the actual object internally, applying this to the dataset(s).

“Cooking” a recipe is done by aspecd.tasks.Chef, and this class takes care of writing a history in form of an executable recipe, thus ensuring reproducibility and good scientific practice.

“Serving” the results of a cooked recipe is eventually the responsibility of the aspecd.tasks.ChefDeService, and it is this class that calls out to the aspecd.tasks.Chef and writes the history to an actual file that can be used as a recipe again. For the convenience of the user, an entry point (console script) is included in the setup.py file calling aspecd.tasks.serve() that in turn takes care of loading the recipe and instantiating an aspecd.tasks.ChefDeService.

## Module documentation¶

Todo

There are a number of things that are not yet implemented, but highly recommended for a working recipe-driven data analysis that follows good practice for reproducible research. This includes (but may not be limited to):

• Parser for recipes performing a static analysis of their syntax. Useful particularly for larger datasets and/or longer lists of tasks.

class aspecd.tasks.Recipe

Bases: object

Recipes get cooked by chefs in recipe-driven data analysis.

A recipe contains a list of tasks to be performed on a list of datasets. To actually carry out all tasks in a recipe, it is handed over to a aspecd.tasks.Chef object for cooking using the respective aspecd.tasks.Chef.cook() method.

From a user’s perspective, recipes reside usually in YAML files from where they are imported into an aspecd.tasks.Recipe object using its respective import_into() method and an object of class aspecd.io.RecipeYamlImporter. Similarly, a given recipe can be exported back to a YAML file using the export_to() method and an object of class aspecd.io.RecipeYamlExporter.

In contrast to the persistent form of a recipe (e.g., as file on the file system), the object contains actual datasets and tasks that are objects of the respective classes. Therefore, the attributes of a recipe are normally set by the respective methods from either a file or a dictionary (that in turn will normally be created from contents of a file).

Retrieving datasets is delegated to an aspecd.dataset.DatasetFactory instance stored in dataset_factory. This provides a maximum of flexibility but makes it necessary to specify (and first implement) such factory in packages derived from the ASpecD framework.

Todo

Can recipes have LOIs themselves and therefore be retrieved from the extended data safe? Might be a sensible option, although generic (and at the same time unique) LOIs for recipes are much harder to create than LOIs for datasets and alike.

Generally, the concept of a LOI is nothing a recipe needs to know about. But it does know about an ID of any kind. Whether this ID is a (local) path or a LOI doesn’t matter. Somewhere in the ASpecD framework there may exist a resolver (factory) for handling IDs of any kind and eventually retrieving the respective information.

datasets

Ordered dictionary of datasets the tasks should be performed for

Each dataset is an object of class aspecd.dataset.Dataset.

The keys are the dataset ids.

Type

collections.OrderedDict

tasks

List of tasks to be performed on the datasets

Each task is an object of class aspecd.tasks.Task.

Type

list

results

Ordered dictionary of results originating from analysis tasks

Results can be of any type, but are mostly either instances of aspecd.dataset.Dataset or aspecd.metadata.PhysicalQuantity.

The keys are those defined by aspecd.tasks.SingleanalysisTask.result and aspecd.tasks.MultianalysisTask.result, respectively.

Type

collections.OrderedDict

figures

Ordered dictionary of figures originating from plotting tasks

Each entry is an object of class aspecd.tasks.FigureRecord.

Type

collections.OrderedDict

plotters

Ordered dictionary of plotters originating from plotting tasks

Each entry is an object of class aspecd.plotting.Plotter.

To end up in the list of plotters, the plot task needs to define a result. This is mainly used for tasks involving CompositePlotters, to define the plotters for each individual plot panel.

Type

collections.OrderedDict

dataset_factory

Factory for datasets used to retrieve datasets

If no factory is set, but a recipe is imported from a file or set from a dictionary, an exception will be raised.

Type

aspecd.dataset.DatasetFactory

task_factory

Defaults to an object of class aspecd.tasks.TaskFactory.

If no factory is set, but a recipe is imported from a file or set from a dictionary, an exception will be raised.

Type

aspecd.tasks.TaskFactory

format

Information on the format of the recipe

This information is used to automatically convert recipes to the current format.

Dictionary with the following fields:

type: str

Information on the type of file

Shall never be changed.

Default: ASpecD recipe

version: str

Version of the recipe structure (in form X.Y)

This information is used to automatically convert recipes to the current format.

New in version 0.4.

Type

dict

settings

General settings relevant for cooking the recipe.

Dictionary with the following fields:

default_package: str

Name of the package the task objects are obtained from

If no name for a default package is supplied, “aspecd” is used.

autosave_plots: bool

Whether to save plots automatically even if no filename is provided.

If true, each aspecd.tasks.SingleplotTask and aspecd.tasks.MultiplotTask will save the plots to default file names. For details, see the documentation of the respective classes.

Default: True

New in version 0.2.

write_history: bool

Whether to write a history when serving the recipe.

If true, for each serving of a recipe, a history will be written into a file with same base name and timestamp appended. In terms of reproducible research, it is highly recommended to always write a history. However, when debugging, you may set this to “False”.

Default: True

New in version 0.4.

Changed in version 0.4: Moved properties to keys in this dictionary

Type

dict

directories

Optional control of different directories.

Dictionary with the following fields:

datasets_source: str

Root directory for the datasets.

Interpreted as absolute path if starting with the system-specific file separator. Otherwise, interpreted as relative to the current directory. If provided, all dataset sources will be looked up relative to this path.

output: str

Directory to save output (plots, reports, …) to.

Interpreted as absolute path if starting with the system-specific file separator. Otherwise, interpreted as relative to the current directory. If provided, all output resulting from cooking a recipe will be saved to this path.

Make sure the path actually exists. Otherwise, you may run into trouble when tasks try to save their output.

Changed in version 0.4: Moved properties to keys in this dictionary

Type

dict

filename

Name of the (YAML) file the recipe was loaded from.

The filename can be used to persist the history of a cooked recipe in form of a YAML file for full reproducibility. This will be done when using the aspecd.tasks.ChefDeService class and its aspecd.tasks.ChefDeService.serve() method.

Type

str

Raises
• aspecd.tasks.MissingDictError – Raised if no dict is provided.

• aspecd.tasks.MissingImporterError – Raised if no importer is provided.

• aspecd.tasks.MissingExporterError – Raised if no exporter is provided.

• aspecd.tasks.MissingTaskFactoryError – Raised if task_factory is invalid.

Changed in version 0.4: Move properties “default_package” and “autosave_plots” to new property “settings”; move properties “output_directory” and “datasets_source” to new property “directories”.

from_dict(dict_=None)

Set attributes from dictionary.

Loads datasets and creates aspecd.tasks.Task objects, storing each in its respective list.

Parameters

dict_ (dict) – Dictionary containing information of a recipe.

Raises
• aspecd.tasks.MissingDictError – Raised if no dict is provided.

• aspecd.tasks.MissingDatasetFactoryError – Raised if dataset_factory is invalid.

to_dict(remove_empty=False)

Return dict from attributes.

Parameters

remove_empty (bool) –

Whether to remove empty fields from tasks

Default: False

Returns

dict_ – Dictionary representing a recipe.

Contains fields “format”, “settings”, “directories”, “datasets”, and “tasks”.

Return type

dict

Changed in version 0.6: New parameter remove_empty to remove empty fields from tasks.

to_yaml(remove_empty=False)

Create YAML representation of recipe.

As users will interact with recipes primarily in form of YAML files, this method conveniently returns the YAML representation of an empty recipe that can serve as a starting point for your own recipes.

Parameters

remove_empty (bool) –

Whether to remove empty fields from tasks

Default: False

Returns

yaml – YAML representation of a recipe

Return type

str

Examples

To get the YAML representation of an empty recipe, create a recipe object and call this method:

recipe = aspecd.tasks.Recipe()
print(recipe.to_yaml())


The result of the last line (the print statement from above) will look as follows:

format:
  type: ASpecD recipe
  version: '0.2'
settings:
  default_package: ''
  autosave_plots: true
  write_history: true
directories:
  output: ''
  datasets_source: ''
datasets: []


As you can see, you will get the full set of settings currently allowed within a recipe. Not all of them are strictly necessary, and for details, you still need to look into the documentation. But this can make it much easier to start with writing recipes.

Similarly, you can get YAML representations of tasks that you can add to your recipe. For details, see the documentation of the Task.to_yaml() method.

New in version 0.5.

Changed in version 0.6: Parameter remove_empty to remove empty fields from tasks.

import_from(importer=None)

Import recipe using importer.

Importers can be created to read recipes from different sources. Thus the recipe as such is entirely independent of the persistence layer.

Parameters

importer (aspecd.io.RecipeImporter) – importer used to actually import recipe

Raises

aspecd.tasks.MissingImporterError – Raised if no importer is provided

export_to(exporter=None)

Export recipe using exporter.

Exporters can be created to write recipes to different targets. Thus the recipe as such is entirely independent of the persistence layer.

Parameters

exporter (aspecd.io.RecipeExporter) – exporter used to actually export recipe

Raises

aspecd.tasks.MissingExporterError – Raised if no exporter is provided

get_dataset(identifier='')

Return dataset corresponding to given identifier.

In case of having a list of identifiers, use the similar method aspecd.tasks.Recipe.get_datasets().

Parameters

identifier (str) – Identifier matching the aspecd.dataset.Dataset.id attribute.

Returns

dataset – Dataset corresponding to given identifier

If no dataset corresponding to the given identifier could be found, None is returned.

Return type

aspecd.dataset.Dataset

Raises

aspecd.tasks.MissingDatasetIdentifierError – Raised if no identifier is provided.

get_datasets(identifiers=None)

Return datasets corresponding to given list of identifiers.

In case of having a single identifier, use the similar method aspecd.tasks.Recipe.get_dataset().

Parameters

identifiers (list) – Identifiers matching the aspecd.dataset.Dataset.id attribute.

Returns

datasets – Datasets corresponding to the given identifiers

Each dataset is an instance of aspecd.dataset.Dataset.

If no datasets corresponding to the given identifiers could be found, an empty list is returned.

Return type

list

Raises

aspecd.tasks.MissingDatasetIdentifierError – Raised if no identifiers are provided.
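
The lookup behaviour of both methods can be illustrated with a simplified, self-contained sketch. MiniDataset and MiniRecipe are hypothetical stand-ins, not part of ASpecD, and the real classes raise aspecd.tasks.MissingDatasetIdentifierError rather than ValueError:

```python
class MiniDataset:
    """Stand-in for aspecd.dataset.Dataset with only an ``id`` attribute."""

    def __init__(self, id_=""):
        self.id = id_


class MiniRecipe:
    """Stand-in for aspecd.tasks.Recipe holding a list of datasets."""

    def __init__(self, datasets=None):
        self.datasets = datasets or []

    def get_dataset(self, identifier=""):
        # ASpecD raises MissingDatasetIdentifierError here
        if not identifier:
            raise ValueError("No identifier provided")
        # Return the first matching dataset, or None if there is no match
        return next(
            (dataset for dataset in self.datasets if dataset.id == identifier),
            None,
        )

    def get_datasets(self, identifiers=None):
        if not identifiers:
            raise ValueError("No identifiers provided")
        # Return all matching datasets; an empty list if none match
        return [dataset for dataset in self.datasets if dataset.id in identifiers]


recipe = MiniRecipe([MiniDataset("loi:xxx"), MiniDataset("loi:yyy")])
print(recipe.get_dataset("loi:xxx").id)  # loi:xxx
print(recipe.get_dataset("loi:zzz"))  # None
print(len(recipe.get_datasets(["loi:xxx", "loi:yyy"])))  # 2
```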

class aspecd.tasks.Chef(recipe=None)

Bases: object

Chefs cook recipes in recipe-driven data analysis.

As a result, they create a full history of the tasks performed, including all parameters, implicit and explicit. In this respect, they make the history independent of a single dataset and allow tracing the processing and analysis of multiple datasets.

Note

One necessary prerequisite for full reproducibility is therefore some kind of persistent and unique identifier for each dataset. The "Lab Object Identifier" (LOI), as used within the LabInform framework, is one example of such an identifier.

For persisting the history of cooking a recipe, the contents of the history attribute should be saved as a YAML file. There are two ways to do that: manually or fully automated. If you manually instantiate an object of the aspecd.tasks.Chef class, you need to do it yourself, as follows:

chef = aspecd.tasks.Chef()
# ... obtaining recipe from file
chef.cook(recipe)

yaml = aspecd.utils.Yaml()
yaml.dict = chef.history
yaml.write_to(filename='<my-recipe-history>.yaml')


The other way is to use an instance of the aspecd.tasks.ChefDeService class and its aspecd.tasks.ChefDeService.serve() method:

chef_de_service = ChefDeService()
chef_de_service.serve(recipe_filename='my_recipe.yaml')


This will automatically save the recipe history for you as a YAML file with its filename derived from the original recipe name. For details, see the documentation of the aspecd.tasks.ChefDeService class.

The YAML files generated from saving the history should work as recipes themselves, therefore allowing a full turnover, as well as easy modification of a recipe.

recipe

Recipe to cook, i.e. to carry out

Type

aspecd.tasks.Recipe

history

History of cooking the recipe

Contains a complete record of each task performed, including all parameters, implicit and explicit. Additionally, contains system information as collected by the aspecd.system.SystemInfo class.

Can be exported to a YAML file that works as a recipe.

Type

collections.OrderedDict

Parameters

recipe (aspecd.tasks.Recipe) – Recipe to cook, i.e. to carry out

Raises

aspecd.tasks.MissingRecipeError – Raised if no recipe is available to be cooked

cook(recipe=None)

Cook recipe, i.e. carry out tasks contained therein.

A recipe is an object of class aspecd.tasks.Recipe and contains both a list of datasets and a list of tasks to be performed on these datasets.

Parameters

recipe (aspecd.tasks.Recipe) – Recipe to cook, i.e. tasks to carry out on particular datasets

Raises

aspecd.tasks.MissingRecipeError – Raised if no recipe is available to be cooked

class aspecd.tasks.Task(recipe=None)

Base class storing information for a single task.

Different underlying objects used to actually perform the respective task have different requirements and different signatures. In order to generically perform a task, this class needs to be subclassed for each kind of task, such as processing, analysis, or plotting. For a number of basic tasks available in the ASpecD package, this has already been done; see the concrete task classes documented below.

Note that imports of datasets are usually not handled using tasks, as this is taken care of automatically by defining a list of datasets in an aspecd.tasks.Recipe.

Usually, you need not care to instantiate objects of the correct type, as this is done automatically by the aspecd.tasks.Recipe using the aspecd.tasks.TaskFactory.

kind

Usually corresponds to the module name the type (class) is defined in. See the note below for special cases.

Type

str

type

Corresponds to the class name eventually responsible for performing the task.

Type

str

package

Name of the package the class eventually responsible for performing the task belongs to.

Type

str

properties

Properties necessary to perform the task.

Should have keys corresponding to the properties of the class given as type attribute.

Generally, all keys in aspecd.tasks.Task.properties will be mapped to the underlying object created to perform the actual task.

In contrast, all additional attributes of a given task object subclassing aspecd.tasks.Task that are specific to the task object as such and its operation, but not for the object created by the task object to perform the task, are not part of the aspecd.tasks.Task.properties dict. For a recipe, this means that these additional attributes are at the same level as aspecd.tasks.Task.properties.

Type

dict

apply_to

List of datasets the task should be applied to.

Defaults to an empty list, meaning that the task will be performed for all datasets contained in an aspecd.tasks.Recipe.

Each dataset is referred to by the value of its aspecd.dataset.Dataset.source attribute. This should be unique and can consist of a filename, path, URL/URI, LOI, or alike.

Type

list

recipe

Recipe containing the task and the list of datasets the task refers to

Type

aspecd.tasks.Recipe

comment

User-supplied comment describing intent, purpose, reason, …

Type

str

Note

A note to developers: Usually, the aspecd.tasks.Task.kind attribute is identical to the module name the respective class resides in. However, sometimes this is not the case, as with the plotters. In this case, an additional, non-public attribute aspecd.tasks.Task._module can be set in classes derived from aspecd.tasks.Task.

Raises
• aspecd.tasks.MissingDictError – Raised if no dict is provided when calling from_dict().

• aspecd.tasks.MissingRecipeError – Raised if no recipe is available upon performing the task.

Changed in version 0.6.4: New attribute comment

from_dict(dict_=None)

Set attributes from dictionary.

Parameters

dict_ (dict) – Dictionary containing information of a task.

Raises

aspecd.tasks.MissingDictError – Raised if no dict is provided.

to_dict(remove_empty=False)

Create dictionary containing public attributes of an object.

Furthermore, replace certain objects with their respective labels provided in the recipe. These objects currently include datasets, results, figures (i.e. figure records), and plotters.

Parameters

remove_empty (bool) –

Whether to remove empty fields

Default: False

Returns

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type

collections.OrderedDict

Changed in version 0.4: (Implicit) parameters of underlying task object are added

Changed in version 0.6: New parameter remove_empty

to_yaml(remove_empty=False)

Create YAML representation of a task.

As users will use tasks primarily within recipes, i.e. in YAML files, this method conveniently returns the YAML representation of a task.

Parameters

remove_empty (bool) –

Whether to remove empty fields

Default: False

Returns

yaml – YAML representation of a task

Return type

str

Examples

To get the YAML representation of a concrete task, you need to create a task object of the appropriate class and assign a type to it:

task = aspecd.tasks.ProcessingTask()
task.type = 'Normalisation'
print(task.to_yaml())


The result of the last line (the print statement from above) will look as follows:

kind: processing
type: Normalisation
properties:
  parameters:
    kind: maximum
    range: null
    range_unit: index
    noise_range: null
    noise_range_unit: percentage
apply_to: []
result: ''
comment: ''


As you can see, you will get the full set of settings for this particular task. Not all of them are strictly necessary, and for details, you still need to look into the respective documentation. But this can make it much easier to add tasks to a recipe.

Similarly, you can get the YAML representation of an empty recipe to which you can add your tasks. For details, see the documentation of the Recipe.to_yaml() method.

New in version 0.5.

Changed in version 0.6: Parameter remove_empty

perform()

Call the appropriate method of the underlying object.

The actual implementation is contained in the non-public method aspecd.tasks.Task._perform().

Different underlying objects have different methods used to actually perform the respective task. In order to generically perform a task, classes derived from the aspecd.tasks.Task base class need to override aspecd.tasks.Task._perform() accordingly.

Use aspecd.tasks.Task.get_object() to get an instance of the actual object necessary to perform the task, and afterwards call its appropriate method.

Similarly, to get the actual dataset using the dataset id stored in aspecd.tasks.Task.apply_to, use the method aspecd.tasks.Recipe.get_dataset() of the recipe stored in aspecd.tasks.Task.recipe.

Raises

aspecd.tasks.MissingRecipeError – Raised if no recipe is available.

get_object()

Return object for a particular task including all attributes.

Returns

obj – Object of a class defined in the type attribute of a task

Return type

object

class aspecd.tasks.ProcessingTask(recipe=None)

Processing step defined as task in recipe-driven data analysis.

Processing steps will always be performed individually for each dataset.

For more information on the underlying general class, see aspecd.processing.SingleProcessingStep.

For an example of how such a processing task may be included into a recipe, see the YAML listing below:

kind: processing
type: SingleProcessingStep
properties:
  parameters:
    param1: bar
    param2: foo
comment: >
  Some free text describing the processing step in more detail
apply_to:
  - loi:xxx


Note that you can refer to datasets and results created during cooking of a recipe using their respective labels. Those labels will automatically be replaced by the actual dataset/result prior to performing the task.
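
As a sketch of such label chaining (the class names here stand in for concrete processing and analysis classes): the first task stores its result under a label, and the second task refers to that label instead of an original dataset:

```yaml
- kind: processing
  type: SingleProcessingStep
  apply_to:
    - loi:xxx
  result: processed

- kind: singleanalysis
  type: SingleAnalysisStep
  apply_to:
    - processed
```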

Sometimes it can come in quite handy to compare different processing steps on the same original dataset, e.g. a series of different parameters. Think of a polynomial baseline correction where you would like to compare the effect of polynomials of different order. Here, what you are interested in is to work on copies of the original dataset and get the results stored additionally. Here you go:

kind: processing
type: SingleProcessingStep
result: label


And you can do the same for multiple datasets. However, make sure to provide as many result labels as you have datasets to perform the processing step on, as otherwise no result will be stored and the processing step will operate on the original datasets:

kind: processing
type: SingleProcessingStep
apply_to:
  - loi:xxx
  - loi:yyy
result:
  - label1
  - label2


Another thing that can be very useful for data processing is to add a comment to an individual step, e.g. with an explanation why this step has been performed:

kind: processing
type: SingleProcessingStep
comment: >
  Lorem ipsum dolor sit amet,


Note that using the > sign will replace newline characters with spaces. If you want to preserve the newline characters, use | instead.
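
The difference between the two block scalar styles in a nutshell:

```yaml
comment: >
  Folded style: these two lines
  become one line, joined by a space.
```

```yaml
comment: |
  Literal style: these two lines
  keep their line break.
```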

result

Label for the results of a processing step.

Processing steps always operate on datasets. However, sometimes it is useful to have a processing task return a copy of the processed dataset, in order to compare different processings afterwards. Therefore, you can specify a result label. In this case, the dataset will be copied first, the processing step performed on it, and afterwards the result returned as a new dataset that is accessible throughout the rest of the recipe with the label provided.

In case you perform the processing on several datasets, you may want to provide as many result labels as there are datasets. Otherwise, no result will be assigned.

Type

str

comment

Textual comment regarding the processing step

Type

str

Changed in version 0.3: New attribute comment

to_dict(remove_empty=False)

Create dictionary containing public attributes of an object.

Furthermore, replace certain objects with their respective labels provided in the recipe. These objects currently include datasets, results, figures (i.e. figure records), and plotters.

Parameters

remove_empty (bool) –

Whether to remove empty fields

Default: False

Returns

public_attributes – (List of) ordered dictionary/ies containing the public attributes of the object

In case of multiple datasets and added parameters during execution of the task, a list of dicts (one dict for each dataset).

The order of attribute definition is preserved

Return type

collections.OrderedDict | list

Changed in version 0.4: Return list of dicts in case of multiple datasets and added parameters during execution of the task

Changed in version 0.6: New parameter remove_empty

from_dict(dict_=None)

Set attributes from dictionary.

Parameters

dict_ (dict) – Dictionary containing information of a task.

Raises

aspecd.tasks.MissingDictError – Raised if no dict is provided.

get_object()

Return object for a particular task including all attributes.

Returns

obj – Object of a class defined in the type attribute of a task

Return type

object

perform()

Call the appropriate method of the underlying object.

The actual implementation is contained in the non-public method aspecd.tasks.Task._perform().

Different underlying objects have different methods used to actually perform the respective task. In order to generically perform a task, classes derived from the aspecd.tasks.Task base class need to override aspecd.tasks.Task._perform() accordingly.

Use aspecd.tasks.Task.get_object() to get an instance of the actual object necessary to perform the task, and afterwards call its appropriate method.

Similarly, to get the actual dataset using the dataset id stored in aspecd.tasks.Task.apply_to, use the method aspecd.tasks.Recipe.get_dataset() of the recipe stored in aspecd.tasks.Task.recipe.

Raises

aspecd.tasks.MissingRecipeError – Raised if no recipe is available.

to_yaml(remove_empty=False)

Create YAML representation of a task.

As users will use tasks primarily within recipes, i.e. in YAML files, this method conveniently returns the YAML representation of a task.

Parameters

remove_empty (bool) –

Whether to remove empty fields

Default: False

Returns

yaml – YAML representation of a task

Return type

str

Examples

To get the YAML representation of a concrete task, you need to create a task object of the appropriate class and assign a type to it:

task = aspecd.tasks.ProcessingTask()
task.type = 'Normalisation'
print(task.to_yaml())


The result of the last line (the print statement from above) will look as follows:

kind: processing
type: Normalisation
properties:
  parameters:
    kind: maximum
    range: null
    range_unit: index
    noise_range: null
    noise_range_unit: percentage
apply_to: []
result: ''
comment: ''


As you can see, you will get the full set of settings for this particular task. Not all of them are strictly necessary, and for details, you still need to look into the respective documentation. But this can make it much easier to add tasks to a recipe.

Similarly, you can get the YAML representation of an empty recipe to which you can add your tasks. For details, see the documentation of the Recipe.to_yaml() method.

New in version 0.5.

Changed in version 0.6: Parameter remove_empty

class aspecd.tasks.SingleprocessingTask(recipe=None)

Singleprocessing step defined as task in recipe-driven data analysis.

This is a convenience alias class for ProcessingTask. Therefore, the following two tasks are identical:

- kind: processing
  type: SingleProcessingStep

- kind: singleprocessing
  type: SingleProcessingStep

from_dict(dict_=None)

Set attributes from dictionary.

Parameters

dict_ (dict) – Dictionary containing information of a task.

Raises

aspecd.tasks.MissingDictError – Raised if no dict is provided.

get_object()

Return object for a particular task including all attributes.

Returns

obj – Object of a class defined in the type attribute of a task

Return type

object

perform()

Call the appropriate method of the underlying object.

The actual implementation is contained in the non-public method aspecd.tasks.Task._perform().

Different underlying objects have different methods used to actually perform the respective task. In order to generically perform a task, classes derived from the aspecd.tasks.Task base class need to override aspecd.tasks.Task._perform() accordingly.

Use aspecd.tasks.Task.get_object() to get an instance of the actual object necessary to perform the task, and afterwards call its appropriate method.

Similarly, to get the actual dataset using the dataset id stored in aspecd.tasks.Task.apply_to, use the method aspecd.tasks.Recipe.get_dataset() of the recipe stored in aspecd.tasks.Task.recipe.

Raises

aspecd.tasks.MissingRecipeError – Raised if no recipe is available.

to_dict(remove_empty=False)

Create dictionary containing public attributes of an object.

Furthermore, replace certain objects with their respective labels provided in the recipe. These objects currently include datasets, results, figures (i.e. figure records), and plotters.

Parameters

remove_empty (bool) –

Whether to remove empty fields

Default: False

Returns

public_attributes – (List of) ordered dictionary/ies containing the public attributes of the object

In case of multiple datasets and added parameters during execution of the task, a list of dicts (one dict for each dataset).

The order of attribute definition is preserved

Return type

collections.OrderedDict | list

Changed in version 0.4: Return list of dicts in case of multiple datasets and added parameters during execution of the task

Changed in version 0.6: New parameter remove_empty

to_yaml(remove_empty=False)

Create YAML representation of a task.

As users will use tasks primarily within recipes, i.e. in YAML files, this method conveniently returns the YAML representation of a task.

Parameters

remove_empty (bool) –

Whether to remove empty fields

Default: False

Returns

yaml – YAML representation of a task

Return type

str

Examples

To get the YAML representation of a concrete task, you need to create a task object of the appropriate class and assign a type to it:

task = aspecd.tasks.ProcessingTask()
task.type = 'Normalisation'
print(task.to_yaml())


The result of the last line (the print statement from above) will look as follows:

kind: processing
type: Normalisation
properties:
  parameters:
    kind: maximum
    range: null
    range_unit: index
    noise_range: null
    noise_range_unit: percentage
apply_to: []
result: ''
comment: ''


As you can see, you will get the full set of settings for this particular task. Not all of them are strictly necessary, and for details, you still need to look into the respective documentation. But this can make it much easier to add tasks to a recipe.

Similarly, you can get the YAML representation of an empty recipe to which you can add your tasks. For details, see the documentation of the Recipe.to_yaml() method.

New in version 0.5.

Changed in version 0.6: Parameter remove_empty

class aspecd.tasks.MultiprocessingTask(recipe=None)

Multiprocessing step defined as task in recipe-driven data analysis.

Processing steps will always be performed individually for each dataset. Nevertheless, in this particular case, the processing depends on the list of datasets provided in the apply_to field.

For more information on the underlying general class, see aspecd.processing.MultiProcessingStep.

For an example of how such a processing task may be included into a recipe, see the YAML listing below:

kind: multiprocessing
type: MultiProcessingStep
properties:
  parameters:
    param1: bar
    param2: foo
apply_to:
  - loi:xxx
  - loi:yyy


Note that you can refer to datasets and results created during cooking of a recipe using their respective labels. Those labels will automatically be replaced by the actual dataset/result prior to performing the task.

Sometimes it can come in quite handy to compare different processing steps on the same original dataset, e.g. a series of different parameters. Here, what you are interested in is to work on copies of the original dataset and get the results stored additionally. Here you go:

kind: multiprocessing
type: MultiProcessingStep
apply_to:
  - loi:xxx
  - loi:yyy
result:
  - label1
  - label2

result

Labels for the results of a processing step.

Processing steps always operate on datasets. However, sometimes it is useful to have a processing task return a copy of the processed dataset, in order to compare different processings afterwards. Therefore, you can specify a result label. In this case, the dataset will be copied first, the processing step performed on it, and afterwards the result returned as a new dataset that is accessible throughout the rest of the recipe with the label provided.

In case you perform the processing on several datasets, you may want to provide as many result labels as there are datasets. Otherwise, no result will be assigned.

Type

list

from_dict(dict_=None)

Set attributes from dictionary.

Parameters

dict_ (dict) – Dictionary containing information of a task.

Raises

aspecd.tasks.MissingDictError – Raised if no dict is provided.

get_object()

Return object for a particular task including all attributes.

Returns

obj – Object of a class defined in the type attribute of a task

Return type

object

perform()

Call the appropriate method of the underlying object.

The actual implementation is contained in the non-public method aspecd.tasks.Task._perform().

Different underlying objects have different methods used to actually perform the respective task. In order to generically perform a task, classes derived from the aspecd.tasks.Task base class need to override aspecd.tasks.Task._perform() accordingly.

Use aspecd.tasks.Task.get_object() to get an instance of the actual object necessary to perform the task, and afterwards call its appropriate method.

Similarly, to get the actual dataset using the dataset id stored in aspecd.tasks.Task.apply_to, use the method aspecd.tasks.Recipe.get_dataset() of the recipe stored in aspecd.tasks.Task.recipe.

Raises

aspecd.tasks.MissingRecipeError – Raised if no recipe is available.

to_dict(remove_empty=False)

Create dictionary containing public attributes of an object.

Furthermore, replace certain objects with their respective labels provided in the recipe. These objects currently include datasets, results, figures (i.e. figure records), and plotters.

Parameters

remove_empty (bool) –

Whether to remove empty fields

Default: False

Returns

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type

collections.OrderedDict

Changed in version 0.4: (Implicit) parameters of underlying task object are added

Changed in version 0.6: New parameter remove_empty

to_yaml(remove_empty=False)

Create YAML representation of a task.

As users will use tasks primarily within recipes, i.e. in YAML files, this method conveniently returns the YAML representation of a task.

Parameters

remove_empty (bool) –

Whether to remove empty fields

Default: False

Returns

yaml – YAML representation of a task

Return type

str

Examples

To get the YAML representation of a concrete task, you need to create a task object of the appropriate class and assign a type to it:

task = aspecd.tasks.ProcessingTask()
task.type = 'Normalisation'
print(task.to_yaml())


The result of the last line (the print statement from above) will look as follows:

kind: processing
type: Normalisation
properties:
  parameters:
    kind: maximum
    range: null
    range_unit: index
    noise_range: null
    noise_range_unit: percentage
apply_to: []
result: ''
comment: ''


As you can see, you will get the full set of settings for this particular task. Not all of them are strictly necessary, and for details, you still need to look into the respective documentation. But this can make it much easier to add tasks to a recipe.

Similarly, you can get the YAML representation of an empty recipe to which you can add your tasks. For details, see the documentation of the Recipe.to_yaml() method.

New in version 0.5.

Changed in version 0.6: Parameter remove_empty

class aspecd.tasks.AnalysisTask(recipe=None)

Analysis step defined as task in recipe-driven data analysis.

Analysis steps can be performed individually for each dataset or the results combined, depending on the type of analysis step.

Important

An AnalysisTask should not be used directly. Instead, use one of the two classes derived from it: aspecd.tasks.SingleanalysisTask and aspecd.tasks.MultianalysisTask.

For more information on the underlying general class, see aspecd.analysis.AnalysisStep.

result

Label for the result of an analysis step.

The result of an analysis step can be everything from a scalar to an entire (new) dataset.

This label will be used to refer to the result later on when further processing the recipe.

Type

str

comment

Textual comment regarding the analysis step

Type

str

Changed in version 0.3: New attribute comment

Changed in version 0.4: Raises warning if perform() is called

from_dict(dict_=None)

Set attributes from dictionary.

Parameters

dict_ (dict) – Dictionary containing information of a task.

Raises

aspecd.tasks.MissingDictError – Raised if no dict is provided.

get_object()

Return object for a particular task including all attributes.

Returns

obj – Object of a class defined in the type attribute of a task

Return type

object

perform()

Call the appropriate method of the underlying object.

The actual implementation is contained in the non-public method aspecd.tasks.Task._perform().

Different underlying objects have different methods used to actually perform the respective task. In order to generically perform a task, classes derived from the aspecd.tasks.Task base class need to override aspecd.tasks.Task._perform() accordingly.

Use aspecd.tasks.Task.get_object() to get an instance of the actual object necessary to perform the task, and afterwards call its appropriate method.

Similarly, to get the actual dataset using the dataset id stored in aspecd.tasks.Task.apply_to, use the method aspecd.tasks.Recipe.get_dataset() of the recipe stored in aspecd.tasks.Task.recipe.

Raises

aspecd.tasks.MissingRecipeError – Raised if no recipe is available.

to_dict(remove_empty=False)

Create dictionary containing public attributes of an object.

Furthermore, replace certain objects with their respective labels provided in the recipe. These objects currently include datasets, results, figures (i.e. figure records), and plotters.

Parameters

remove_empty (bool) –

Whether to remove empty fields

Default: False

Returns

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type

collections.OrderedDict

Changed in version 0.4: (Implicit) parameters of underlying task object are added

Changed in version 0.6: New parameter remove_empty

to_yaml(remove_empty=False)

Create YAML representation of a task.

As users will use tasks primarily within recipes, i.e. in YAML files, this method conveniently returns the YAML representation of a task.

Parameters

remove_empty (bool) –

Whether to remove empty fields

Default: False

Returns

yaml – YAML representation of a task

Return type

str

Examples

To get the YAML representation of a concrete task, you need to create a task object of the appropriate class and assign a type to it:

task = aspecd.tasks.ProcessingTask()
task.type = 'Normalisation'
print(task.to_yaml())


The result of the last line (the print statement from above) will look as follows:

kind: processing
type: Normalisation
properties:
  parameters:
    kind: maximum
    range: null
    range_unit: index
    noise_range: null
    noise_range_unit: percentage
apply_to: []
result: ''
comment: ''


As you can see, you will get the full set of settings for this particular task. Not all of them are strictly necessary, and for details, you still need to look into the respective documentation. But this can make it much easier to add tasks to a recipe.

Similarly, you can get the YAML representation of an empty recipe to which you can add your tasks. For details, see the documentation of the Recipe.to_yaml() method.

New in version 0.5.

Changed in version 0.6: Parameter remove_empty

class aspecd.tasks.SingleanalysisTask(recipe=None)

Analysis step defined as task in recipe-driven data analysis.

Singleanalysis steps can only be performed individually for each dataset. For analyses combining multiple datasets, see aspecd.tasks.MultianalysisTask.

For more information on the underlying general class, see aspecd.analysis.SingleAnalysisStep.

For an example of how such an analysis task may be included into a recipe, see the YAML listing below:

kind: singleanalysis
type: SingleAnalysisStep
properties:
  parameters:
    param1: bar
    param2: foo
comment: >
  Some free text describing in more details the analysis step
apply_to:
  - loi:xxx
result: label


Note that you can refer to datasets and results created during cooking of a recipe using their respective labels. Those labels will automatically be replaced by the actual dataset/result prior to performing the task.
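The label mechanism can be pictured as a simple lookup: labels used in apply_to or result act as keys into registries maintained by the recipe. The following is a schematic sketch of that substitution, not the actual ASpecD implementation:

```python
# Schematic sketch of label substitution, not actual ASpecD code.
# The recipe keeps registries mapping labels to the actual objects.
datasets = {"loi:xxx": "<dataset object>", "new_dataset": "<result object>"}


def resolve(labels, registry):
    """Replace each label with the object registered under it."""
    return [registry[label] for label in labels]


print(resolve(["loi:xxx"], datasets))  # ['<dataset object>']
```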

You can perform the same analysis step on multiple datasets as well. However, make sure to provide as many result labels as there are datasets to perform the analysis step on, as otherwise no result will be stored:

kind: singleanalysis
type: SingleAnalysisStep
apply_to:
  - loi:xxx
  - loi:yyy
result:
  - label1
  - label2



Another thing that can be very useful for data analysis is to add a comment to an individual step, e.g. with an explanation why this step has been performed:

kind: singleanalysis
type: SingleAnalysisStep
comment: >
  Lorem ipsum dolor sit amet,


Note that using the > sign will replace newline characters with spaces. If you want to preserve the newline characters, use | instead.
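The difference between the two YAML block scalar styles can be checked directly with PyYAML, the YAML library ASpecD uses for reading recipes (assuming PyYAML is installed):

```python
import yaml  # PyYAML, used by ASpecD to read recipes

# ">" (folded style) replaces newlines within the block with spaces
folded = yaml.safe_load("comment: >\n  Lorem ipsum\n  dolor sit amet\n")
# "|" (literal style) preserves the newlines
literal = yaml.safe_load("comment: |\n  Lorem ipsum\n  dolor sit amet\n")

print(repr(folded["comment"]))   # 'Lorem ipsum dolor sit amet\n'
print(repr(literal["comment"]))  # 'Lorem ipsum\ndolor sit amet\n'
```

Note that both styles retain a single trailing newline at the end of the block.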

from_dict(dict_=None)

Set attributes from dictionary.

Parameters

dict_ (dict) – Dictionary containing information of a task.

Raises

aspecd.tasks.MissingDictError – Raised if no dict is provided.

get_object()

Return object for a particular task including all attributes.

Returns

obj – Object of a class defined in the type attribute of a task

Return type

object

perform()

Call the appropriate method of the underlying object.

The actual implementation is contained in the non-public method aspecd.tasks.Task._perform().

Different underlying objects have different methods used to actually perform the respective task. In order to generically perform a task, classes derived from the aspecd.tasks.Task base class need to override aspecd.tasks.Task._perform() accordingly.

Use aspecd.tasks.Task.get_object() to get an instance of the actual object necessary to perform the task, and afterwards call its appropriate method.

Similarly, to get the actual dataset using the dataset id stored in aspecd.tasks.Task.apply_to, use the method aspecd.tasks.Recipe.get_dataset() of the recipe stored in aspecd.tasks.Task.recipe.

Raises

aspecd.tasks.MissingRecipeError – Raised if no recipe is available.

to_dict(remove_empty=False)

Create dictionary containing public attributes of an object.

Furthermore, replace certain objects with their respective labels provided in the recipe. These objects currently include datasets, results, figures (i.e. figure records), and plotters.

Parameters

remove_empty (bool) –

Whether to remove empty fields

Default: False

Returns

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type

collections.OrderedDict

Changed in version 0.4: (Implicit) parameters of underlying task object are added

Changed in version 0.6: New parameter remove_empty

to_yaml(remove_empty=False)

As users will use tasks primarily within recipes, i.e. in YAML files, this method conveniently returns the YAML representation of a task.

Parameters

remove_empty (bool) –

Whether to remove empty fields

Default: False

Returns

yaml – YAML representation of a task

Return type

str

Examples

To get the YAML representation of a concrete task, you need to create a task object of the appropriate class, assign a type to it, and print the result:

task = aspecd.tasks.ProcessingTask()
task.type = 'Normalisation'
print(task.to_yaml())


The output of the print statement in the last line will look as follows:

kind: processing
type: Normalisation
properties:
  parameters:
    kind: maximum
    range: null
    range_unit: index
    noise_range: null
    noise_range_unit: percentage
apply_to: []
result: ''
comment: ''


As you can see, you will get the full set of settings for this particular task. Not all of them are strictly necessary, and for details, you still need to look into the respective documentation. But this can make it much easier to add tasks to a recipe.

Similarly, you can get the YAML representation of an empty recipe to which you can add your tasks. For details, see the documentation of the Recipe.to_yaml() method.

New in version 0.5.

Changed in version 0.6: Parameter remove_empty

class aspecd.tasks.MultianalysisTask(recipe=None)

Analysis step defined as task in recipe-driven data analysis.

Multianalysis steps are performed on a list of datasets and combine them in one single analysis. For analyses performed on individual datasets, see aspecd.tasks.SingleanalysisTask.

For more information on the underlying general class, see aspecd.analysis.MultiAnalysisStep.

For an example of how such an analysis task may be included into a recipe, see the YAML listing below:

kind: multianalysis
type: MultiAnalysisStep
properties:
  parameters:
    param1: bar
    param2: foo
comment: >
  Some free text describing in more details the analysis step
apply_to:
  - loi:xxx
result:
  - label1
  - label2


Note that you can refer to datasets and results created during cooking of a recipe using their respective labels. Those labels will automatically be replaced by the actual dataset/result prior to performing the task.

In case such a multianalysis step results in a list of resulting datasets, result should be a list of labels, not a single label.

Raises

IndexError – Raised if list of result labels and results are not of same length
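The length constraint behind this IndexError can be illustrated with a minimal sketch. The helper below is hypothetical, not the actual ASpecD implementation; it merely demonstrates the described pairing of result labels with results:

```python
def assign_results(labels, results):
    """Pair each result with its label; raise if the lengths differ.

    Hypothetical sketch of the constraint described above, not the
    actual ASpecD implementation.
    """
    if len(labels) != len(results):
        raise IndexError("List of result labels and results differ in length")
    return dict(zip(labels, results))


print(assign_results(["label1", "label2"], [3.14, 2.71]))
```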

from_dict(dict_=None)

Set attributes from dictionary.

Parameters

dict_ (dict) – Dictionary containing information of a task.

Raises

aspecd.tasks.MissingDictError – Raised if no dict is provided.

get_object()

Return object for a particular task including all attributes.

Returns

obj – Object of a class defined in the type attribute of a task

Return type

object

perform()

Call the appropriate method of the underlying object.

The actual implementation is contained in the non-public method aspecd.tasks.Task._perform().

Different underlying objects have different methods used to actually perform the respective task. In order to generically perform a task, classes derived from the aspecd.tasks.Task base class need to override aspecd.tasks.Task._perform() accordingly.

Use aspecd.tasks.Task.get_object() to get an instance of the actual object necessary to perform the task, and afterwards call its appropriate method.

Similarly, to get the actual dataset using the dataset id stored in aspecd.tasks.Task.apply_to, use the method aspecd.tasks.Recipe.get_dataset() of the recipe stored in aspecd.tasks.Task.recipe.

Raises

aspecd.tasks.MissingRecipeError – Raised if no recipe is available.

to_dict(remove_empty=False)

Create dictionary containing public attributes of an object.

Furthermore, replace certain objects with their respective labels provided in the recipe. These objects currently include datasets, results, figures (i.e. figure records), and plotters.

Parameters

remove_empty (bool) –

Whether to remove empty fields

Default: False

Returns

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type

collections.OrderedDict

Changed in version 0.4: (Implicit) parameters of underlying task object are added

Changed in version 0.6: New parameter remove_empty

to_yaml(remove_empty=False)

As users will use tasks primarily within recipes, i.e. in YAML files, this method conveniently returns the YAML representation of a task.

Parameters

remove_empty (bool) –

Whether to remove empty fields

Default: False

Returns

yaml – YAML representation of a task

Return type

str

Examples

To get the YAML representation of a concrete task, you need to create a task object of the appropriate class, assign a type to it, and print the result:

task = aspecd.tasks.ProcessingTask()
task.type = 'Normalisation'
print(task.to_yaml())


The output of the print statement in the last line will look as follows:

kind: processing
type: Normalisation
properties:
  parameters:
    kind: maximum
    range: null
    range_unit: index
    noise_range: null
    noise_range_unit: percentage
apply_to: []
result: ''
comment: ''


As you can see, you will get the full set of settings for this particular task. Not all of them are strictly necessary, and for details, you still need to look into the respective documentation. But this can make it much easier to add tasks to a recipe.

Similarly, you can get the YAML representation of an empty recipe to which you can add your tasks. For details, see the documentation of the Recipe.to_yaml() method.

New in version 0.5.

Changed in version 0.6: Parameter remove_empty

class aspecd.tasks.AggregatedanalysisTask(recipe=None)

Analysis step defined as task in recipe-driven data analysis.

AggregatedAnalysis steps perform a given aspecd.tasks.SingleanalysisTask on a list of datasets and combine the results in an aspecd.dataset.CalculatedDataset.

For more information on the underlying general class, see aspecd.analysis.AggregatedAnalysisStep.

For an example of how such an analysis task may be included into a recipe, see the YAML listing below:

- kind: aggregatedanalysis
  type: BasicCharacteristics
  properties:
    parameters:
      kind: min
  apply_to:
    - dataset1
    - dataset2
  result: basic_characteristics


New in version 0.5.

from_dict(dict_=None)

Set attributes from dictionary.

Parameters

dict_ (dict) – Dictionary containing information of a task.

Raises

aspecd.tasks.MissingDictError – Raised if no dict is provided.

get_object()

Return object for a particular task including all attributes.

Returns

obj – Object of a class defined in the type attribute of a task

Return type

object

perform()

Call the appropriate method of the underlying object.

The actual implementation is contained in the non-public method aspecd.tasks.Task._perform().

Different underlying objects have different methods used to actually perform the respective task. In order to generically perform a task, classes derived from the aspecd.tasks.Task base class need to override aspecd.tasks.Task._perform() accordingly.

Use aspecd.tasks.Task.get_object() to get an instance of the actual object necessary to perform the task, and afterwards call its appropriate method.

Similarly, to get the actual dataset using the dataset id stored in aspecd.tasks.Task.apply_to, use the method aspecd.tasks.Recipe.get_dataset() of the recipe stored in aspecd.tasks.Task.recipe.

Raises

aspecd.tasks.MissingRecipeError – Raised if no recipe is available.

to_dict(remove_empty=False)

Create dictionary containing public attributes of an object.

Furthermore, replace certain objects with their respective labels provided in the recipe. These objects currently include datasets, results, figures (i.e. figure records), and plotters.

Parameters

remove_empty (bool) –

Whether to remove empty fields

Default: False

Returns

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type

collections.OrderedDict

Changed in version 0.4: (Implicit) parameters of underlying task object are added

Changed in version 0.6: New parameter remove_empty

to_yaml(remove_empty=False)

As users will use tasks primarily within recipes, i.e. in YAML files, this method conveniently returns the YAML representation of a task.

Parameters

remove_empty (bool) –

Whether to remove empty fields

Default: False

Returns

yaml – YAML representation of a task

Return type

str

Examples

To get the YAML representation of a concrete task, you need to create a task object of the appropriate class, assign a type to it, and print the result:

task = aspecd.tasks.ProcessingTask()
task.type = 'Normalisation'
print(task.to_yaml())


The output of the print statement in the last line will look as follows:

kind: processing
type: Normalisation
properties:
  parameters:
    kind: maximum
    range: null
    range_unit: index
    noise_range: null
    noise_range_unit: percentage
apply_to: []
result: ''
comment: ''


As you can see, you will get the full set of settings for this particular task. Not all of them are strictly necessary, and for details, you still need to look into the respective documentation. But this can make it much easier to add tasks to a recipe.

Similarly, you can get the YAML representation of an empty recipe to which you can add your tasks. For details, see the documentation of the Recipe.to_yaml() method.

New in version 0.5.

Changed in version 0.6: Parameter remove_empty

class aspecd.tasks.AnnotationTask(recipe=None)

Annotation step defined as task in recipe-driven data analysis.

Annotation steps will always be performed individually for each dataset.

For more information on the underlying general class, see aspecd.processing.Annotation.

from_dict(dict_=None)

Set attributes from dictionary.

Parameters

dict_ (dict) – Dictionary containing information of a task.

Raises

aspecd.tasks.MissingDictError – Raised if no dict is provided.

get_object()

Return object for a particular task including all attributes.

Returns

obj – Object of a class defined in the type attribute of a task

Return type

object

perform()

Call the appropriate method of the underlying object.

The actual implementation is contained in the non-public method aspecd.tasks.Task._perform().

Different underlying objects have different methods used to actually perform the respective task. In order to generically perform a task, classes derived from the aspecd.tasks.Task base class need to override aspecd.tasks.Task._perform() accordingly.

Use aspecd.tasks.Task.get_object() to get an instance of the actual object necessary to perform the task, and afterwards call its appropriate method.

Similarly, to get the actual dataset using the dataset id stored in aspecd.tasks.Task.apply_to, use the method aspecd.tasks.Recipe.get_dataset() of the recipe stored in aspecd.tasks.Task.recipe.

Raises

aspecd.tasks.MissingRecipeError – Raised if no recipe is available.

to_dict(remove_empty=False)

Create dictionary containing public attributes of an object.

Furthermore, replace certain objects with their respective labels provided in the recipe. These objects currently include datasets, results, figures (i.e. figure records), and plotters.

Parameters

remove_empty (bool) –

Whether to remove empty fields

Default: False

Returns

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type

collections.OrderedDict

Changed in version 0.4: (Implicit) parameters of underlying task object are added

Changed in version 0.6: New parameter remove_empty

to_yaml(remove_empty=False)

As users will use tasks primarily within recipes, i.e. in YAML files, this method conveniently returns the YAML representation of a task.

Parameters

remove_empty (bool) –

Whether to remove empty fields

Default: False

Returns

yaml – YAML representation of a task

Return type

str

Examples

To get the YAML representation of a concrete task, you need to create a task object of the appropriate class, assign a type to it, and print the result:

task = aspecd.tasks.ProcessingTask()
task.type = 'Normalisation'
print(task.to_yaml())


The output of the print statement in the last line will look as follows:

kind: processing
type: Normalisation
properties:
  parameters:
    kind: maximum
    range: null
    range_unit: index
    noise_range: null
    noise_range_unit: percentage
apply_to: []
result: ''
comment: ''


As you can see, you will get the full set of settings for this particular task. Not all of them are strictly necessary, and for details, you still need to look into the respective documentation. But this can make it much easier to add tasks to a recipe.

Similarly, you can get the YAML representation of an empty recipe to which you can add your tasks. For details, see the documentation of the Recipe.to_yaml() method.

New in version 0.5.

Changed in version 0.6: Parameter remove_empty

class aspecd.tasks.PlotTask

Plot step defined as task in recipe-driven data analysis.

Important

A PlotTask should not be used directly but rather one of the two classes derived from this class, namely:

aspecd.tasks.SingleplotTask

aspecd.tasks.MultiplotTask

For more information on the underlying general class, see aspecd.plotting.Plotter.

label

Label for the figure resulting from a plotting step.

This label will be used to refer to the plot later on when further processing the recipe. In the recipe’s aspecd.tasks.Recipe.figures dict, this label is used as a key, and an aspecd.tasks.FigureRecord object is stored under it containing all information necessary for further handling the results of the plot.

Type

str

result

Label for the plotter of a plotting step.

This is useful in case of CompositePlotters, where different plotters need to be defined for each of the panels.

Type

str

target

Label of an existing previous plotter the plot should be added to.

Sometimes it is desirable to add something to an already existing plot after this original plot has been created. Programmatically, this would be equivalent to setting the aspecd.plotting.Plotter.figure and aspecd.plotting.Plotter.axes attributes of the underlying plotter object.

As a result, your plot will appear in a new figure window that contains the original plot with the new plot added on top of it.

Type

str

Changed in version 0.4: Added attribute target

get_object()

Return object for a plot task including all attributes.

For plot tasks, if the label of a drawing has been replaced by a dataset item, it gets re-replaced by the original string.

Returns

obj – Object of a class defined in the type attribute of a task

Return type

object

Changed in version 0.6.3: Labels that have been replaced by datasets get re-replaced

perform()

Call the appropriate method of the underlying object.

For details, see the method aspecd.tasks.Task.perform() of the base class.

In addition to what is done in the base class, a PlotTask adds an aspecd.tasks.FigureRecord object to the aspecd.tasks.Recipe.figures property of the underlying recipe in case an aspecd.tasks.PlotTask.label has been set.

save_plot(plot=None)

Save the figure of the plot created by the task.

Parameters

plot (aspecd.plotting.Plotter) – Plot whose figure should be saved

from_dict(dict_=None)

Set attributes from dictionary.

Parameters

dict_ (dict) – Dictionary containing information of a task.

Raises

aspecd.tasks.MissingDictError – Raised if no dict is provided.

to_dict(remove_empty=False)

Create dictionary containing public attributes of an object.

Furthermore, replace certain objects with their respective labels provided in the recipe. These objects currently include datasets, results, figures (i.e. figure records), and plotters.

Parameters

remove_empty (bool) –

Whether to remove empty fields

Default: False

Returns

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type

collections.OrderedDict

Changed in version 0.4: (Implicit) parameters of underlying task object are added

Changed in version 0.6: New parameter remove_empty

to_yaml(remove_empty=False)

As users will use tasks primarily within recipes, i.e. in YAML files, this method conveniently returns the YAML representation of a task.

Parameters

remove_empty (bool) –

Whether to remove empty fields

Default: False

Returns

yaml – YAML representation of a task

Return type

str

Examples

To get the YAML representation of a concrete task, you need to create a task object of the appropriate class, assign a type to it, and print the result:

task = aspecd.tasks.ProcessingTask()
task.type = 'Normalisation'
print(task.to_yaml())


The output of the print statement in the last line will look as follows:

kind: processing
type: Normalisation
properties:
  parameters:
    kind: maximum
    range: null
    range_unit: index
    noise_range: null
    noise_range_unit: percentage
apply_to: []
result: ''
comment: ''


As you can see, you will get the full set of settings for this particular task. Not all of them are strictly necessary, and for details, you still need to look into the respective documentation. But this can make it much easier to add tasks to a recipe.

Similarly, you can get the YAML representation of an empty recipe to which you can add your tasks. For details, see the documentation of the Recipe.to_yaml() method.

New in version 0.5.

Changed in version 0.6: Parameter remove_empty

class aspecd.tasks.SingleplotTask

Singleplot step defined as task in recipe-driven data analysis.

Singleplot steps can only be performed individually for each dataset. For plots combining multiple datasets, see aspecd.tasks.MultiplotTask.

For more information on the underlying general class, see aspecd.plotting.SinglePlotter.

For an example of how such a singleplot task may be included into a recipe, see the YAML listing below:

kind: singleplot
type: SinglePlotter
properties:
  properties:
    figure:
      title: My fancy figure title
    drawing:
      color: darkorange
      label: my data
      linewidth: 4
      linestyle: dashed
    legend:
      location: northeast
  parameters:
    show_legend: True
  caption:
    title: >
      Ideally a single sentence summarising the intent of the figure
    text: >
      More text for the figure caption
    parameters:
      - a list of parameters
      - that shall (additionally) be listed
      - in the figure caption
  filename: fancyfigure.pdf
apply_to:
  - loi:xxx
label: label


Note that you can refer to datasets and results created during cooking of a recipe using their respective labels. Those labels will automatically be replaced by the actual dataset/result prior to performing the task.

Note

As soon as you provide a filename in the properties of your recipe, the resulting plot will automatically be saved to that filename, inferring the file format from the extension of the filename. For details of how the format is inferred see the documentation for the matplotlib.figure.Figure.savefig() method.

In case you apply the single plotter to more than one dataset and would like to save individual plots, you can do that by supplying a list of filenames instead of only a single filename. In this case, the plots get saved to the filenames in the list. A minimal example may look like this:

kind: singleplot
type: SinglePlotter
properties:
  filename:
    - fancyfigure1.pdf
    - fancyfigure2.pdf
apply_to:
  - loi:xxx
  - loi:yyy


Important

Make sure to provide the same number of file names in your recipe as the number of datasets you apply the plotter to. Otherwise, there is no unambiguous mapping between datasets and filenames, and saving the plots will not work as intended.

Note

If the recipe contains the output key in its directories dict, the figure(s) will be saved to this directory.

As long as autosave_plots in the recipe is set to True, the plots will be saved automatically, even if no filename is provided. These automatically generated filenames consist of the last part of the dataset source (excluding a potential file extension) and the name of the plotter used. To prevent the plotters in a recipe from automatically saving the plots, include the autosave_plots directive in the settings dict of your recipe and set it to False.
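The automatic filename scheme described above can be sketched roughly as follows. This is a simplified illustration of the stated rule, not the actual ASpecD code; details such as the separator and the output format may differ:

```python
import os


def autosave_filename(dataset_source, plotter_name, extension="pdf"):
    """Sketch of automatic plot filenames: last part of the dataset
    source without extension, plus the name of the plotter used."""
    # Take the last path component of the dataset source ...
    basename = os.path.basename(dataset_source)
    # ... strip a potential file extension ...
    basename, _ = os.path.splitext(basename)
    # ... and append the name of the plotter used.
    return f"{basename}_{plotter_name}.{extension}"


print(autosave_filename("/data/sample1.txt", "SinglePlotter1D"))
# sample1_SinglePlotter1D.pdf
```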

from_dict(dict_=None)

Set attributes from dictionary.

Parameters

dict_ (dict) – Dictionary containing information of a task.

Raises

aspecd.tasks.MissingDictError – Raised if no dict is provided.

get_object()

Return object for a plot task including all attributes.

For plot tasks, if the label of a drawing has been replaced by a dataset item, it gets re-replaced by the original string.

Returns

obj – Object of a class defined in the type attribute of a task

Return type

object

Changed in version 0.6.3: Labels that have been replaced by datasets get re-replaced

perform()

Call the appropriate method of the underlying object.

For details, see the method aspecd.tasks.Task.perform() of the base class.

In addition to what is done in the base class, a PlotTask adds an aspecd.tasks.FigureRecord object to the aspecd.tasks.Recipe.figures property of the underlying recipe in case an aspecd.tasks.PlotTask.label has been set.

save_plot(plot=None)

Save the figure of the plot created by the task.

Parameters

plot (aspecd.plotting.Plotter) – Plot whose figure should be saved

to_dict(remove_empty=False)

Create dictionary containing public attributes of an object.

Furthermore, replace certain objects with their respective labels provided in the recipe. These objects currently include datasets, results, figures (i.e. figure records), and plotters.

Parameters

remove_empty (bool) –

Whether to remove empty fields

Default: False

Returns

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type

collections.OrderedDict

Changed in version 0.4: (Implicit) parameters of underlying task object are added

Changed in version 0.6: New parameter remove_empty

to_yaml(remove_empty=False)

As users will use tasks primarily within recipes, i.e. in YAML files, this method conveniently returns the YAML representation of a task.

Parameters

remove_empty (bool) –

Whether to remove empty fields

Default: False

Returns

yaml – YAML representation of a task

Return type

str

Examples

To get the YAML representation of a concrete task, you need to create a task object of the appropriate class, assign a type to it, and print the result:

task = aspecd.tasks.ProcessingTask()
task.type = 'Normalisation'
print(task.to_yaml())


The output of the print statement in the last line will look as follows:

kind: processing
type: Normalisation
properties:
  parameters:
    kind: maximum
    range: null
    range_unit: index
    noise_range: null
    noise_range_unit: percentage
apply_to: []
result: ''
comment: ''


As you can see, you will get the full set of settings for this particular task. Not all of them are strictly necessary, and for details, you still need to look into the respective documentation. But this can make it much easier to add tasks to a recipe.

Similarly, you can get the YAML representation of an empty recipe to which you can add your tasks. For details, see the documentation of the Recipe.to_yaml() method.

New in version 0.5.

Changed in version 0.6: Parameter remove_empty

class aspecd.tasks.MultiplotTask

Multiplot step defined as task in recipe-driven data analysis.

Multiplot steps are performed on a list of datasets and combine them in one single plot. For plots performed on individual datasets, see aspecd.tasks.SingleplotTask.

For more information on the underlying general class, see aspecd.plotting.MultiPlotter.

For an example of how such a multiplot task may be included into a recipe, see the YAML listing below:

kind: multiplot
type: MultiPlotter
properties:
  parameters:
    axes:
      - quantity: wavelength
        unit: nm
      - quantity: intensity
        unit:
    show_legend: True
  caption:
    title: >
      Ideally a single sentence summarising the intent of the figure
    text: >
      More text for the figure caption
    parameters:
      - a list of parameters
      - that shall (additionally) be listed
      - in the figure caption
  filename: fancyfigure.pdf
apply_to:
  - loi:xxx
  - loi:yyy
label: label


Note that you can refer to datasets and results created during cooking of a recipe using their respective labels. Those labels will automatically be replaced by the actual dataset/result prior to performing the task.

A specialty of plots of multiple datasets is that you cannot necessarily infer the axis labels from the datasets, hence you may want to set them directly. This is done using the axes key of the parameters property of the aspecd.plotting.MultiPlotter class, as shown in the recipe example above.

Note

As soon as you provide a filename in the properties of your recipe, the resulting plot will automatically be saved to that filename, inferring the file format from the extension of the filename. For details of how the format is inferred see the documentation for the matplotlib.figure.Figure.savefig() method.

Note

If the recipe contains the output key in its directories dict, the figure(s) will be saved to this directory.

As long as autosave_plots in the recipe is set to True, the plots will be saved automatically, even if no filename is provided. These automatically generated filenames consist of the last part of the dataset source (excluding a potential file extension) and the name of the plotter used. To prevent the plotters in a recipe from automatically saving the plots, include the autosave_plots directive in the settings dict of your recipe and set it to False.

from_dict(dict_=None)

Set attributes from dictionary.

Parameters

dict_ (dict) – Dictionary containing information of a task.

Raises

aspecd.tasks.MissingDictError – Raised if no dict is provided.

get_object()

Return object for a plot task including all attributes.

For plot tasks, if the label of a drawing has been replaced by a dataset item, it gets re-replaced by the original string.

Returns

obj – Object of a class defined in the type attribute of a task

Return type

object

Changed in version 0.6.3: Labels that have been replaced by datasets get re-replaced

perform()

Call the appropriate method of the underlying object.

For details, see the method aspecd.tasks.Task.perform() of the base class.

In addition to what is done in the base class, a PlotTask adds an aspecd.tasks.FigureRecord object to the aspecd.tasks.Recipe.figures property of the underlying recipe in case an aspecd.tasks.PlotTask.label has been set.

save_plot(plot=None)

Save the figure of the plot created by the task.

Parameters

plot (aspecd.plotting.Plotter) – Plot whose figure should be saved

to_dict(remove_empty=False)

Create dictionary containing public attributes of an object.

Furthermore, replace certain objects with their respective labels provided in the recipe. These objects currently include datasets, results, figures (i.e. figure records), and plotters.

Parameters

remove_empty (bool) –

Whether to remove empty fields

Default: False

Returns

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type

collections.OrderedDict

Changed in version 0.4: (Implicit) parameters of underlying task object are added

Changed in version 0.6: New parameter remove_empty

to_yaml(remove_empty=False)

As users will use tasks primarily within recipes, i.e. in YAML files, this method conveniently returns the YAML representation of a task.

Parameters

remove_empty (bool) –

Whether to remove empty fields

Default: False

Returns

yaml – YAML representation of a task

Return type

str

Examples

To get the YAML representation of a concrete task, you need to create a task object of the appropriate class, assign a type to it, and print the result:

task = aspecd.tasks.ProcessingTask()
task.type = 'Normalisation'
print(task.to_yaml())


The output of the print statement in the last line will look as follows:

kind: processing
type: Normalisation
properties:
  parameters:
    kind: maximum
    range: null
    range_unit: index
    noise_range: null
    noise_range_unit: percentage
apply_to: []
result: ''
comment: ''


As you can see, you will get the full set of settings for this particular task. Not all of them are strictly necessary, and for details, you still need to look into the respective documentation. But this can make it much easier to add tasks to a recipe.

Similarly, you can get the YAML representation of an empty recipe to which you can add your tasks. For details, see the documentation of the Recipe.to_yaml() method.

New in version 0.5.

Changed in version 0.6: Parameter remove_empty

class aspecd.tasks.CompositeplotTask

Compositeplot step defined as task in recipe-driven data analysis.

Compositeplot steps are performed on a list of plots and combine them in one single figure. For more common plots employing only a single axes, see aspecd.tasks.SingleplotTask and aspecd.tasks.MultiplotTask.

For more information on the underlying general class, see aspecd.plotting.CompositePlotter.

For an example of how such a compositeplot task may be included into a recipe, see the YAML listing below:

- kind: singleplot
  type: SinglePlotter1D
  apply_to:
    - dataset1
  result: 1D_plot

- kind: singleplot
  type: SinglePlotter2D
  apply_to:
    - dataset2
  result: 2D_plot

- kind: compositeplot
  type: CompositePlotter
  properties:
    grid_dimensions: [1, 2]
    subplot_locations:
      - [0, 0, 1, 1]
      - [0, 1, 1, 1]
    plotter:
      - 1D_plot
      - 2D_plot
    filename: composed_plot.pdf


The crucial aspect here is to first define the individual plotters used for the respective panels of the CompositePlotter. In this particular example, two different plots of two different datasets are created and afterwards combined in the CompositePlotter. Furthermore, for a CompositePlotter you need to specify both grid dimensions and subplot locations, as by default both are set up for one single axis.

The example above creates a plot with one row and two columns (i.e., two axes side by side). The grid dimensions are given as “[number of rows, number of columns]”, and each subplot location is a list of four integer values: “[start_row, start_column, row_span, column_span]”. To obtain the same plot, but with the axes appearing in one column and two rows (i.e., stacked on top of each other), you would define the compositeplot step as follows:

- kind: compositeplot
  type: CompositePlotter
  properties:
    grid_dimensions: [2, 1]
    subplot_locations:
      - [0, 0, 1, 1]
      - [1, 0, 1, 1]
    plotter:
      - 1D_plot
      - 2D_plot
    filename: composed_plot.pdf


Of course, you can create arbitrarily complex arrangements of axes within a figure, even with one axis spanning several rows or columns. Note that the size you set for the figure of the composite plotter defines the aspect ratio and relative size of the individual axes. As an example, in a two-row layout with two axes stacked on top of each other, you may want an overall quadratic figure with a size of 6x6 inches to fit decently on a normal page:

- kind: compositeplot
  type: CompositePlotter
  properties:
    grid_dimensions: [2, 1]
    subplot_locations:
      - [0, 0, 1, 1]
      - [1, 0, 1, 1]
    plotter:
      - 1D_plot
      - 2D_plot
    properties:
      figure:
        size: [6.0, 6.0]
    filename: composed_plot.pdf


As with all the other plotters, there are many more options to control the appearance of your figures.

Note

As long as the autosave_plots setting in the recipe is set to True, the results of the individual plotters combined in the CompositePlotter will be saved to generic filenames. To prevent this from happening, include the autosave_plots directive in the settings dict of your recipe and set it to False.
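Following the note above, a recipe excerpt disabling the automatic saving of the individual plots might look like this (a sketch; the settings block sits at the top level of the recipe):

```yaml
settings:
  autosave_plots: false
```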

to_dict(remove_empty=False)

Create dictionary containing public attributes of the object.

Parameters

remove_empty (bool) –

Whether to remove empty fields

Default: False

Returns

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type

collections.OrderedDict

Changed in version 0.6: New parameter remove_empty

from_dict(dict_=None)

Set attributes from dictionary.

Parameters

dict_ (dict) – Dictionary containing information of a task.

Raises

aspecd.tasks.MissingDictError – Raised if no dict is provided.

get_object()

Return object for a plot task including all attributes.

For plot tasks, if the label of a drawing has been replaced by a dataset item, it gets re-replaced by the original string.

Returns

obj – Object of a class defined in the type attribute of a task

Return type

object

Changed in version 0.6.3: Labels that have been replaced by datasets get re-replaced

perform()

Call the appropriate method of the underlying object.

For details, see the method aspecd.tasks.Task.perform() of the base class.

In addition to what is done in the base class, a PlotTask adds an aspecd.tasks.FigureRecord object to the aspecd.tasks.Recipe.figures property of the underlying recipe in case an aspecd.tasks.PlotTask.label has been set.

save_plot(plot=None)

Save the figure of the plot created by the task.

Parameters

plot (aspecd.plotting.Plotter) – Plot whose figure should be saved

to_yaml(remove_empty=False)

As users will use tasks primarily within recipes, i.e. in YAML files, this method conveniently returns the YAML representation of a task.

Parameters

remove_empty (bool) –

Whether to remove empty fields

Default: False

Returns

yaml – YAML representation of a task

Return type

str

Examples

To get the YAML representation of a concrete task, you need to create a task object of the appropriate class and assign a type to it:

task = aspecd.tasks.ProcessingTask()
task.type = 'Normalisation'
print(task.to_yaml())


The result of the last line (the print statement from above) will look as follows:

kind: processing
type: Normalisation
properties:
  parameters:
    kind: maximum
    range: null
    range_unit: index
    noise_range: null
    noise_range_unit: percentage
apply_to: []
result: ''
comment: ''


As you can see, you will get the full set of settings for this particular task. Not all of them are strictly necessary, and for details, you still need to look into the respective documentation. But this can make it much easier to add tasks to a recipe.

Similarly, you can get the YAML representation of an empty recipe to which you can add your tasks. For details, see the documentation of the Recipe.to_yaml() method.

New in version 0.5.

Changed in version 0.6: Parameter remove_empty

class aspecd.tasks.ReportTask

Reporting step defined as task in recipe-driven data analysis.

For more information on the underlying general class, see aspecd.report.Reporter.

For an example of how such a report task may be included into a recipe, see the YAML listing below:

- kind: report
  type: LaTeXReporter
  properties:
    template: dataset.tex
    filename: report.tex
  compile: true


In this particular case, we use a LaTeX reporter and most likely one of the templates that come bundled with the ASpecD package (at least, a template with that name comes bundled with ASpecD). As the template name already suggests, this report will contain information on a dataset. Furthermore, setting compile to true will render the generated report into a PDF document.

Note that you can refer to datasets, results, and figures created during cooking of a recipe using their respective labels. Those labels will automatically be replaced by the actual dataset/result prior to performing the task.

Whatever fields you set in the context property can be accessed directly from within the template, using the usual Python syntax for accessing keys of dictionaries as well as the (more convenient) dot syntax provided by Jinja2.

Additionally, the context will contain the key dataset containing the result of the aspecd.dataset.Dataset.to_dict() method, thus the full information contained in the dataset.
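For instance, assuming a plain-text template with default Jinja2 delimiters (LaTeX reporters typically use modified delimiters to avoid clashes with LaTeX syntax), accessing fields of the dataset key might look like this; the particular fields shown are merely illustrative:

```jinja
Dataset id: {{ dataset.id }}
Number of history entries: {{ dataset.history | length }}
```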

You can, of course, apply the report task to multiple datasets individually. In this case, you most probably would like to have your reports saved to individual files. This means that the property filename needs to become a list:

datasets:
  - foo
  - bar

- kind: report
  type: LaTeXReporter
  properties:
    template: my-fancy-latex-template.tex
    filename:
      - report1.tex
      - report2.tex


Important

Make sure to provide the same number of file names in your recipe as the number of datasets you apply the report to. Otherwise you may run into trouble.

Note

If the recipe contains the output key in its directories dict, the report(s) will be saved to this directory.

In case you do not provide a filename, the report is nevertheless saved for each of the datasets, using an auto-generated filename consisting of the dataset label and the template used. Assuming a dataset “foo” and a template “dataset.tex”, the resulting report will be saved to the file “foo_report_dataset.tex”.
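The naming scheme described above can be sketched in a few lines of Python; the function below merely illustrates the documented pattern and is not the actual ASpecD implementation:

```python
def auto_report_filename(dataset_label, template):
    """Sketch of the documented naming scheme for auto-generated reports.

    Combines the dataset label with the template stem and extension:
    "foo" + "dataset.tex" -> "foo_report_dataset.tex".
    """
    stem, _, extension = template.rpartition(".")
    return f"{dataset_label}_report_{stem}.{extension}"


print(auto_report_filename("foo", "dataset.tex"))  # foo_report_dataset.tex
```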

Generally, you are entirely free to create content for your reports from within a recipe, as the following fictitious example shows:

kind: report
type: LaTeXReporter
properties:
  template: my-fancy-latex-template.tex
  filename: some-filename-for-final-report.tex
  context:
    general:
      title: Some fancy title
      author: John Doe
    free_text:
      intro: >
        Short introduction of the experiment performed
      metadata: >
        Tabular and customisable overview of the dataset's metadata
      history: >
        Presentation of all processing, analysis and representation
        steps
    figures:
      title: my_fancy_figure
compile: True
apply_to:
  - loi:xxx


The fields shown here assume a certain structure of your template, containing user-supplied free text for the introduction to several sections. Be aware that in such cases, you need to know your templates quite well, as there is a direct dependency between the keys provided in the recipe and the corresponding placeholders in your template. Therefore, much more often you will either use general reporters for datasets and alike (as shown above) or create specialised reporter classes collecting the necessary information for you.

compile

Option for compiling a template.

Some types of templates need an additional “compile” step to create output, most prominently LaTeX templates. If the Reporter class does not support compiling, but compile is set to True, it gets silently ignored.

Type

bool

to_dict(remove_empty=False)

Create dictionary containing public attributes of the object.

Parameters

remove_empty (bool) –

Whether to remove empty fields

Default: False

Returns

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type

collections.OrderedDict

Changed in version 0.6: New parameter remove_empty

from_dict(dict_=None)

Set attributes from dictionary.

Parameters

dict_ (dict) – Dictionary containing information of a task.

Raises

aspecd.tasks.MissingDictError – Raised if no dict is provided.

get_object()

Return object for a particular task including all attributes.

Returns

obj – Object of a class defined in the type attribute of a task

Return type

object

perform()

Call the appropriate method of the underlying object.

The actual implementation is contained in the non-public method aspecd.tasks.Task._perform().

Different underlying objects have different methods used to actually perform the respective task. In order to generically perform a task, classes derived from the aspecd.tasks.Task base class need to override aspecd.tasks.Task._perform() accordingly.

Use aspecd.tasks.Task.get_object() to get an instance of the actual object necessary to perform the task, and afterwards call its appropriate method.

Similarly, to get the actual dataset using the dataset id stored in aspecd.tasks.Task.apply_to, use the method aspecd.tasks.Recipe.get_dataset() of the recipe stored in aspecd.tasks.Task.recipe.

Raises

aspecd.tasks.MissingRecipeError – Raised if no recipe is available.

to_yaml(remove_empty=False)

As users will use tasks primarily within recipes, i.e. in YAML files, this method conveniently returns the YAML representation of a task.

Parameters

remove_empty (bool) –

Whether to remove empty fields

Default: False

Returns

yaml – YAML representation of a task

Return type

str

Examples

To get the YAML representation of a concrete task, you need to create a task object of the appropriate class and assign a type to it:

task = aspecd.tasks.ProcessingTask()
task.type = 'Normalisation'
print(task.to_yaml())


The result of the last line (the print statement from above) will look as follows:

kind: processing
type: Normalisation
properties:
  parameters:
    kind: maximum
    range: null
    range_unit: index
    noise_range: null
    noise_range_unit: percentage
apply_to: []
result: ''
comment: ''


As you can see, you will get the full set of settings for this particular task. Not all of them are strictly necessary, and for details, you still need to look into the respective documentation. But this can make it much easier to add tasks to a recipe.

Similarly, you can get the YAML representation of an empty recipe to which you can add your tasks. For details, see the documentation of the Recipe.to_yaml() method.

New in version 0.5.

Changed in version 0.6: Parameter remove_empty

class aspecd.tasks.FigurereportTask

Reporting step particularly for figure captions.

While the more generic ReportTask operates on datasets, this task operates on figure records stored within a recipe’s Recipe.figures attribute and is hence dedicated to creating figure captions. This is pretty useful in case you have added captions to your figures within a recipe and want to automatically include the final figure and caption into a document.

Think of a report or thesis written in LaTeX: With a series of recipes analysing your data and creating the figures, wouldn’t it be charming to have the figure captions automatically created as well from within the recipe, as there you probably know the best how to describe your data? That’s what this task is good for. Including the figure and caption in this case reduces to a mere \input statement in LaTeX.
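Using the (hypothetical) filename fig_caption.tex from the recipe example below, including the generated figure caption in the main LaTeX document then indeed reduces to a single line:

```latex
\input{fig_caption.tex}
```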

As reporters, you can use all the reporter classes provided with the ASpecD framework. For more information see the aspecd.report module.

For an example of how such a report task may be included into a recipe, see the YAML listing below:

- kind: singleplot
  type: SinglePlotter1D
  label: overview1D

- kind: figurereport
  type: LaTeXReporter
  properties:
    template: figure.tex
    filename: fig_caption.tex
  apply_to: overview1D


In this particular case, we use a LaTeX reporter and most likely one of the templates that come bundled with the ASpecD package (at least, a template with that name comes bundled with ASpecD).

You can, of course, apply the report task to multiple figures individually. In this case, you most probably would like to have your reports saved to individual files. This means that the property filename needs to become a list:

- kind: singleplot
  type: SinglePlotter1D
  label: overview1D

- kind: singleplot
  type: SinglePlotter1D
  label: overview2

- kind: figurereport
  type: LaTeXReporter
  properties:
    template: figure.tex
    filename:
      - fig1_caption.tex
      - fig2_caption.tex
  apply_to:
    - overview1D
    - overview2


Important

Make sure to provide the same number of file names in your recipe as the number of figures you apply the report to. Otherwise you may run into trouble.

Note

If the recipe contains the output key in its directories dict, the report(s) will be saved to this directory.

In case you do not provide a filename, the report is nevertheless saved for each of the figures, using an auto-generated filename consisting of the figure label and the template used. Assuming a figure “fig1” and a template “figure.tex”, the resulting report will be saved to the file “fig1_report_figure.tex”.

New in version 0.7.

from_dict(dict_=None)

Set attributes from dictionary.

Parameters

dict_ (dict) – Dictionary containing information of a task.

Raises

aspecd.tasks.MissingDictError – Raised if no dict is provided.

get_object()

Return object for a particular task including all attributes.

Returns

obj – Object of a class defined in the type attribute of a task

Return type

object

perform()

Call the appropriate method of the underlying object.

The actual implementation is contained in the non-public method aspecd.tasks.Task._perform().

Different underlying objects have different methods used to actually perform the respective task. In order to generically perform a task, classes derived from the aspecd.tasks.Task base class need to override aspecd.tasks.Task._perform() accordingly.

Use aspecd.tasks.Task.get_object() to get an instance of the actual object necessary to perform the task, and afterwards call its appropriate method.

Similarly, to get the actual dataset using the dataset id stored in aspecd.tasks.Task.apply_to, use the method aspecd.tasks.Recipe.get_dataset() of the recipe stored in aspecd.tasks.Task.recipe.

Raises

aspecd.tasks.MissingRecipeError – Raised if no recipe is available.

to_dict(remove_empty=False)

Create dictionary containing public attributes of an object.

Furthermore, replace certain objects with their respective labels provided in the recipe. These objects currently include datasets, results, figures (i.e. figure records), and plotters.

Parameters

remove_empty (bool) –

Whether to remove empty fields

Default: False

Returns

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type

collections.OrderedDict

Changed in version 0.4: (Implicit) parameters of underlying task object are added

Changed in version 0.6: New parameter remove_empty

to_yaml(remove_empty=False)

As users will use tasks primarily within recipes, i.e. in YAML files, this method conveniently returns the YAML representation of a task.

Parameters

remove_empty (bool) –

Whether to remove empty fields

Default: False

Returns

yaml – YAML representation of a task

Return type

str

Examples

To get the YAML representation of a concrete task, you need to create a task object of the appropriate class and assign a type to it:

task = aspecd.tasks.ProcessingTask()
task.type = 'Normalisation'
print(task.to_yaml())


The result of the last line (the print statement from above) will look as follows:

kind: processing
type: Normalisation
properties:
  parameters:
    kind: maximum
    range: null
    range_unit: index
    noise_range: null
    noise_range_unit: percentage
apply_to: []
result: ''
comment: ''


As you can see, you will get the full set of settings for this particular task. Not all of them are strictly necessary, and for details, you still need to look into the respective documentation. But this can make it much easier to add tasks to a recipe.

Similarly, you can get the YAML representation of an empty recipe to which you can add your tasks. For details, see the documentation of the Recipe.to_yaml() method.

New in version 0.5.

Changed in version 0.6: Parameter remove_empty

class aspecd.tasks.ModelTask(recipe=None)

Building a model defined as task in recipe-driven data analysis.

For more information on the underlying general class, see aspecd.model.Model.

For an example of how such a model task may be included into a recipe, see the YAML listing below:

kind: model
type: Model
properties:
  parameters:
    foo: 42
    bar: 21
from_dataset: dataset_label
result: foo


Note that you can refer to datasets and results created during cooking of a recipe using their respective labels. Those labels will automatically be replaced by the actual dataset/result prior to performing the task.

Here, we have used this for the parameter from_dataset in the above recipe excerpt. For an aspecd.model.Model object, you can set the variables explicitly. However, in the context of a recipe, this is rarely useful. Therefore, the from_dataset parameter lets you refer to a dataset (by the label used within the recipe) that is passed to the aspecd.model.Model.from_dataset() method to obtain the variables from this dataset.

result

Label for the dataset resulting from the model creation

The result will always be an aspecd.dataset.CalculatedDataset object.

This label will be used to refer to the result later on when further processing the recipe.

Type

str

from_dataset

Label of a dataset to obtain variables from

The label needs to be a valid label of a dataset within the given recipe. The underlying dataset is obtained from the recipe and passed to the aspecd.model.Model.from_dataset() method to obtain the variables from this dataset.

Type

str

output

Type of output returned in result if given

Can be “dataset” (default) or “model”. The latter is important in case the model as such needs to be accessed, e.g. in context of fitting models to data.

Type

str

Changed in version 0.7: Added attribute output
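As a sketch, a model task returning the model itself rather than a calculated dataset might look as follows; the model class name Gaussian and its parameters are assumptions for illustration:

```yaml
- kind: model
  type: Gaussian            # class name assumed for illustration
  properties:
    parameters:
      position: 1.5         # hypothetical parameters
      width: 0.3
  from_dataset: dataset_label
  output: model
  result: gaussian_model
```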

from_dict(dict_=None)

Set attributes from dictionary.

Parameters

dict_ (dict) – Dictionary containing information of a task.

Raises

aspecd.tasks.MissingDictError – Raised if no dict is provided.

get_object()

Return object for a particular task including all attributes.

Returns

obj – Object of a class defined in the type attribute of a task

Return type

object

perform()

Call the appropriate method of the underlying object.

The actual implementation is contained in the non-public method aspecd.tasks.Task._perform().

Different underlying objects have different methods used to actually perform the respective task. In order to generically perform a task, classes derived from the aspecd.tasks.Task base class need to override aspecd.tasks.Task._perform() accordingly.

Use aspecd.tasks.Task.get_object() to get an instance of the actual object necessary to perform the task, and afterwards call its appropriate method.

Similarly, to get the actual dataset using the dataset id stored in aspecd.tasks.Task.apply_to, use the method aspecd.tasks.Recipe.get_dataset() of the recipe stored in aspecd.tasks.Task.recipe.

Raises

aspecd.tasks.MissingRecipeError – Raised if no recipe is available.

to_dict(remove_empty=False)

Create dictionary containing public attributes of an object.

Furthermore, replace certain objects with their respective labels provided in the recipe. These objects currently include datasets, results, figures (i.e. figure records), and plotters.

Parameters

remove_empty (bool) –

Whether to remove empty fields

Default: False

Returns

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type

collections.OrderedDict

Changed in version 0.4: (Implicit) parameters of underlying task object are added

Changed in version 0.6: New parameter remove_empty

to_yaml(remove_empty=False)

As users will use tasks primarily within recipes, i.e. in YAML files, this method conveniently returns the YAML representation of a task.

Parameters

remove_empty (bool) –

Whether to remove empty fields

Default: False

Returns

yaml – YAML representation of a task

Return type

str

Examples

To get the YAML representation of a concrete task, you need to create a task object of the appropriate class and assign a type to it:

task = aspecd.tasks.ProcessingTask()
task.type = 'Normalisation'
print(task.to_yaml())


The result of the last line (the print statement from above) will look as follows:

kind: processing
type: Normalisation
properties:
  parameters:
    kind: maximum
    range: null
    range_unit: index
    noise_range: null
    noise_range_unit: percentage
apply_to: []
result: ''
comment: ''


As you can see, you will get the full set of settings for this particular task. Not all of them are strictly necessary, and for details, you still need to look into the respective documentation. But this can make it much easier to add tasks to a recipe.

Similarly, you can get the YAML representation of an empty recipe to which you can add your tasks. For details, see the documentation of the Recipe.to_yaml() method.

New in version 0.5.

Changed in version 0.6: Parameter remove_empty

class aspecd.tasks.ExportTask

Export of datasets.

Sometimes, datasets need to be exported and stored as a file.

For more information on the underlying general class, see aspecd.io.DatasetExporter.

For an example of how such an export task may be included into a recipe, see the YAML listing below:

kind: export
properties:
  target:
apply_to:
  - loi:xxx


Note that you can refer to datasets and results created during cooking of a recipe using their respective labels. Those labels will automatically be replaced by the actual dataset/result prior to performing the task.

In case you apply the task to more than one dataset, you will need to supply a list of filenames instead of only a single filename. A minimal example may look like this:

kind: export
properties:
  target:
apply_to:
  - loi:xxx
  - loi:yyy


Important

Make sure to provide the same number of file names in your recipe as the number of datasets you apply the exporter to. Otherwise you may run into trouble.

Note

If the recipe contains the output key in its directories dict, the datasets will be saved to this directory.
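A filled-in version of the first example above might hence look as follows; the exporter class name and the target filename are assumptions for illustration:

```yaml
kind: export
type: AdfExporter           # exporter class name assumed for illustration
properties:
  target: dataset.adf       # hypothetical target filename
apply_to:
  - loi:xxx
```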

from_dict(dict_=None)

Set attributes from dictionary.

Parameters

dict_ (dict) – Dictionary containing information of a task.

Raises

aspecd.tasks.MissingDictError – Raised if no dict is provided.

get_object()

Return object for a particular task including all attributes.

Returns

obj – Object of a class defined in the type attribute of a task

Return type

object

perform()

Call the appropriate method of the underlying object.

The actual implementation is contained in the non-public method aspecd.tasks.Task._perform().

Different underlying objects have different methods used to actually perform the respective task. In order to generically perform a task, classes derived from the aspecd.tasks.Task base class need to override aspecd.tasks.Task._perform() accordingly.

Use aspecd.tasks.Task.get_object() to get an instance of the actual object necessary to perform the task, and afterwards call its appropriate method.

Similarly, to get the actual dataset using the dataset id stored in aspecd.tasks.Task.apply_to, use the method aspecd.tasks.Recipe.get_dataset() of the recipe stored in aspecd.tasks.Task.recipe.

Raises

aspecd.tasks.MissingRecipeError – Raised if no recipe is available.

to_dict(remove_empty=False)

Create dictionary containing public attributes of an object.

Furthermore, replace certain objects with their respective labels provided in the recipe. These objects currently include datasets, results, figures (i.e. figure records), and plotters.

Parameters

remove_empty (bool) –

Whether to remove empty fields

Default: False

Returns

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type

collections.OrderedDict

Changed in version 0.4: (Implicit) parameters of underlying task object are added

Changed in version 0.6: New parameter remove_empty

to_yaml(remove_empty=False)

As users will use tasks primarily within recipes, i.e. in YAML files, this method conveniently returns the YAML representation of a task.

Parameters

remove_empty (bool) –

Whether to remove empty fields

Default: False

Returns

yaml – YAML representation of a task

Return type

str

Examples

To get the YAML representation of a concrete task, you need to create a task object of the appropriate class and assign a type to it:

task = aspecd.tasks.ProcessingTask()
task.type = 'Normalisation'
print(task.to_yaml())


The result of the last line (the print statement from above) will look as follows:

kind: processing
type: Normalisation
properties:
  parameters:
    kind: maximum
    range: null
    range_unit: index
    noise_range: null
    noise_range_unit: percentage
apply_to: []
result: ''
comment: ''


As you can see, you will get the full set of settings for this particular task. Not all of them are strictly necessary, and for details, you still need to look into the respective documentation. But this can make it much easier to add tasks to a recipe.

Similarly, you can get the YAML representation of an empty recipe to which you can add your tasks. For details, see the documentation of the Recipe.to_yaml() method.

New in version 0.5.

Changed in version 0.6: Parameter remove_empty

class aspecd.tasks.TabulateTask

Tabulate step defined as task in recipe-driven data analysis.

Tables will always be created individually for each dataset.

For more information on the underlying general class, see aspecd.table.Table.

For an example of how such a tabulating task may be included into a recipe, see the YAML listing below:

kind: tabulate
type: Table
properties:
  caption:
    title: >
      Ideally a single sentence summarising the intent of the table
    text: >
      More text for the table caption
  filename: fancytable.txt
apply_to:
  - loi:xxx


Note that you can refer to datasets and results created during cooking of a recipe using their respective labels. Those labels will automatically be replaced by the actual dataset/result prior to performing the task.

Note

As soon as you provide a filename in the properties of your recipe, the resulting table will automatically be saved to that filename.

In case you apply the task to more than one dataset and would like to save individual tables, you can do that by supplying a list of filenames instead of only a single filename. In this case, the tables get saved to the filenames in the list. A minimal example may look like this:

kind: tabulate
type: Table
properties:
  filename:
    - fancytable1.pdf
    - fancytable2.pdf
apply_to:
  - loi:xxx
  - loi:yyy


Important

Make sure to provide the same number of file names in your recipe as the number of datasets you create the tables for. Otherwise you may run into trouble.

Note

If the recipe contains the output key in its directories dict, the table(s) will be saved to this directory.

New in version 0.5.

save_table(table=None)

Save the table created by the task.

Parameters

table (aspecd.table.Table) – Table that should be saved

from_dict(dict_=None)

Set attributes from dictionary.

Parameters

dict_ (dict) – Dictionary containing information of a task.

Raises

aspecd.tasks.MissingDictError – Raised if no dict is provided.

get_object()

Return object for a particular task including all attributes.

Returns

obj – Object of a class defined in the type attribute of a task

Return type

object

perform()

Call the appropriate method of the underlying object.

The actual implementation is contained in the non-public method aspecd.tasks.Task._perform().

Different underlying objects have different methods used to actually perform the respective task. In order to generically perform a task, classes derived from the aspecd.tasks.Task base class need to override aspecd.tasks.Task._perform() accordingly.

Use aspecd.tasks.Task.get_object() to get an instance of the actual object necessary to perform the task, and afterwards call its appropriate method.

Similarly, to get the actual dataset using the dataset id stored in aspecd.tasks.Task.apply_to, use the method aspecd.tasks.Recipe.get_dataset() of the recipe stored in aspecd.tasks.Task.recipe.

Raises

aspecd.tasks.MissingRecipeError – Raised if no recipe is available.

to_dict(remove_empty=False)

Create dictionary containing public attributes of an object.

Furthermore, replace certain objects with their respective labels provided in the recipe. These objects currently include datasets, results, figures (i.e. figure records), and plotters.

Parameters

remove_empty (bool) –

Whether to remove empty fields

Default: False

Returns

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type

collections.OrderedDict

Changed in version 0.4: (Implicit) parameters of underlying task object are added

Changed in version 0.6: New parameter remove_empty

to_yaml(remove_empty=False)

As users will use tasks primarily within recipes, i.e. in YAML files, this method conveniently returns the YAML representation of a task.

Parameters

remove_empty (bool) –

Whether to remove empty fields

Default: False

Returns

yaml – YAML representation of a task

Return type

str

Examples

To get the YAML representation of a concrete task, you need to create a task object of the appropriate class and assign a type to it:

task = aspecd.tasks.ProcessingTask()
task.type = 'Normalisation'
print(task.to_yaml())


The result of the last line (the print statement from above) will look as follows:

kind: processing
type: Normalisation
properties:
  parameters:
    kind: maximum
    range: null
    range_unit: index
    noise_range: null
    noise_range_unit: percentage
apply_to: []
result: ''
comment: ''


As you can see, you will get the full set of settings for this particular task. Not all of them are strictly necessary, and for details, you still need to look into the respective documentation. But this can make it much easier to add tasks to a recipe.

Similarly, you can get the YAML representation of an empty recipe to which you can add your tasks. For details, see the documentation of the Recipe.to_yaml() method.

New in version 0.5.

Changed in version 0.6: Parameter remove_empty

class aspecd.tasks.TaskFactory

Bases: object

Factory for creating task objects based on the kind provided.

The kind reflects the name of the module the actual object required for performing the task resides in. Two ways are available for specifying the kind: either directly as argument provided to aspecd.tasks.TaskFactory.get_task(), or as key in a dict used as argument for aspecd.tasks.TaskFactory.get_task_from_dict().

The classes for the different tasks follow a simple convention: “<Module>Task”, with “<Module>” being the capitalised name of the module the actual class necessary for performing the task resides in. Therefore, for each new module for which tasks should be available, you will need to create an appropriate task class deriving from aspecd.tasks.Task.

Raises

• KeyError – Raised if dict with task description does not contain “kind” key.

get_task(kind=None)

Return task object specified by its kind.

Parameters

kind (str) –

Reflects the name of the module the actual object required for performing the task resides in.

Returns

The actual subclass depends on the kind.

Return type

aspecd.tasks.Task

get_task_from_dict(dict_=None)

Return task object specified by the “kind” key in the dict.

Parameters

dict_ (dict) –

Dictionary containing “kind” key

The “kind” key reflects the name of the module the actual object required for performing the task resides in.

Returns

The actual subclass depends on the kind.

Return type

aspecd.tasks.Task

Raises

• KeyError – Raised if dict does not contain “kind” key.
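The naming convention described above can be sketched in plain Python. This is a simplified, hypothetical stand-in rather than the actual aspecd code (aspecd resolves classes within its own modules, and the class names here are invented for illustration): the kind is mapped to a class name of the form “<Module>Task”, and dict-based dispatch raises KeyError when the “kind” key is missing:

```python
class Task:
    """Base class for all task stand-ins."""


class ProcessingTask(Task):
    """Stand-in for a task operating on the 'processing' module."""


class AnalysisTask(Task):
    """Stand-in for a task operating on the 'analysis' module."""


def get_task(kind=''):
    """Map a kind such as 'processing' to the class ProcessingTask."""
    class_name = kind.capitalize() + 'Task'
    # In aspecd, the lookup would use the aspecd.tasks module namespace;
    # here we simply look the class up in the current global namespace.
    return globals()[class_name]()


def get_task_from_dict(dict_=None):
    """Dispatch on the 'kind' key; raises KeyError if the key is missing."""
    return get_task(kind=dict_['kind'])


task = get_task_from_dict({'kind': 'processing'})
print(type(task).__name__)  # ProcessingTask
```

Adding support for tasks of a new kind then amounts to adding one appropriately named class, without touching the factory itself.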

class aspecd.tasks.FigureRecord

Bases: object

Figures created during recipe-driven data analysis may need to be added, e.g., to a report. Therefore, the information contained in the PlotTask needs to be accessible by the recipe and other tasks in turn.

caption

User-supplied information for the figure caption.

Has three fields: “title”, “text”, and “parameters”.

“title” is usually one sentence describing the intent of the figure and often typeset in bold face at the beginning of a figure caption.

“text” contains the actual caption text following the title.

“parameters” is a list of parameter names that should be included in the figure caption, usually at the very end.

Type

dict

parameters

All parameters necessary for the plot, implicit and explicit

Type

dict

label

Label the figure should be referred to from within the recipe

Similar to the aspecd.tasks.SingleanalysisTask.result attribute of the aspecd.tasks.SingleanalysisTask class.

Type

str

filename

Name of file to save the plot to

Type

str

Raises

aspecd.tasks.MissingPlotterError – Raised if no plotter is provided

from_plotter(plotter=None)

Set attributes from plotter

Usually, a plotter contains all information necessary for an aspecd.tasks.FigureRecord object.

Parameters

plotter (aspecd.plotting.Plotter) – Plotter the figure record should be created for.

Raises

aspecd.tasks.MissingPlotterError – Raised if no plotter is provided

to_dict(remove_empty=False)

Create dictionary containing public attributes of an object.

Parameters

remove_empty (bool) –

Whether to remove keys with empty values

Default: False

Returns

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type

collections.OrderedDict

Changed in version 0.6: New parameter remove_empty

class aspecd.tasks.ChefDeService

Bases: object

Wrapper for serving the results of recipes given a recipe file name.

In recipe-driven data analysis, a recipe of class aspecd.tasks.Recipe gets cooked by a chef of class aspecd.tasks.Chef. However, this requires obtaining the appropriate dataset factory of class aspecd.dataset.DatasetFactory, or a class inheriting from it, depending on the package actually used.

However, the (end) user would rather not care about those details, but simply provide a recipe filename (of a YAML file) to an instance of a class and get the results back. This is where the ChefDeService comes in.

Obtaining the results of a recipe will become as simple as:

chef_de_service = ChefDeService()
chef_de_service.serve(recipe_filename='my_recipe.yaml')


Furthermore, the ChefDeService takes care of persisting the history of the cooked recipe in the form of a YAML file. To this end, an additional file gets created whose name consists of the filename of the recipe provided, extended by the timestamp of serving the results. These history files can be used as recipes again, allowing for a full reproduction of the analysis.
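The history file naming described above could be sketched as follows. Note that this is a hedged illustration of the idea, not the exact format aspecd uses for its timestamps; the helper function name is invented:

```python
import datetime
import os


def history_filename(recipe_filename):
    """Derive a history filename from a recipe filename plus a timestamp.

    The recipe's base name is extended by the time of serving, keeping
    the original extension, so the history file sits next to the recipe
    and can itself be used as a recipe again.
    """
    basename, extension = os.path.splitext(recipe_filename)
    timestamp = datetime.datetime.now().strftime('%Y%m%dT%H%M%S')
    return f'{basename}-{timestamp}{extension}'


print(history_filename('my_recipe.yaml'))
# e.g. my_recipe-20240101T120000.yaml
```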

recipe_filename

Name of the recipe file to serve the cooked results for

Type

str

Raises

aspecd.tasks.MissingRecipeError – Raised if no recipe filename is provided upon trying to serve

serve(recipe_filename='')

Serve the results of cooking a recipe

All you need to do is to provide the filename of a recipe YAML file. Additionally, the history will be served in a YAML file consisting of the name of the filename provided as recipe, with the timestamp of serving added to it.

Parameters

recipe_filename (str) – Name of the recipe YAML file to cook

Raises

aspecd.tasks.MissingRecipeError – Raised if no recipe filename is provided upon trying to serve

aspecd.tasks.serve()

Serve the results of cooking a recipe

All you need to do is to provide the filename of a recipe YAML file.

The ASpecD framework creates a console script entry point named “serve” that allows you to simply type serve <recipe_name.yaml> on the command line after installing the ASpecD framework.

Thanks to the modular nature of the ASpecD framework, if your recipe contains the default_package key followed by the name of a package based on the ASpecD framework, the correct instance of the aspecd.dataset.DatasetFactory class will be created and added to the recipe automatically, allowing you to serve recipes that rely on functionality of ASpecD-derived packages.

Raises

aspecd.exceptions.MissingRecipeError – Raised if no recipe filename is provided upon trying to serve
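The default_package mechanism described above can be illustrated with a minimal recipe. Treat this as a hedged sketch: the exact placement of the default_package key may vary between recipe format versions, and mypackage stands for a hypothetical ASpecD-derived package:

```yaml
format:
  type: ASpecD recipe
  version: '0.2'

default_package: mypackage

datasets:
  - loi:xxx

tasks:
  - kind: processing
    type: SingleProcessingStep
```

Cooking and serving such a recipe from the command line is then as simple as typing serve my_recipe.yaml.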