aspecd.dataset module

Datasets: units containing data and metadata.

The dataset is one key concept of the ASpecD framework, consisting of the data as well as the corresponding metadata. Storing metadata in a structured way is a prerequisite for a semantic understanding within the routines. Furthermore, a history of every processing, analysis and annotation step is recorded as well, aiming at a maximum of reproducibility. This is part of how the ASpecD framework tries to support good scientific practice.

Therefore, each processing and analysis step of data should always be performed using the respective methods of a dataset, at least as long as it can be performed on a single dataset.

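Generally, there are two types of datasets: those containing experimental data and those containing calculated data. Therefore, two corresponding subclasses exist, and packages building upon the ASpecD framework should inherit from one of them:

  • aspecd.dataset.ExperimentalDataset

    Base class for experimental datasets.

  • aspecd.dataset.CalculatedDataset

    Base class for datasets containing calculated data.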

Additional classes used within the dataset, which packages building upon the ASpecD framework normally do not need to implement on their own, are:

  • aspecd.dataset.Data

    Unit containing both numeric data and corresponding axes.

    The data class ensures consistency in terms of dimensions between numerical data and axes.

  • aspecd.dataset.Axis

    Axis for data in a dataset.

    An axis always contains both numerical values and the metadata necessary to create axis labels and to make sense of the numerical information.

  • aspecd.dataset.DatasetReference

    Reference to a dataset.

    Often, one dataset needs to reference other datasets. A typical example would be a simulation stored in a dataset of class aspecd.dataset.CalculatedDataset that needs to reference the corresponding experimental data, stored in a dataset of class aspecd.dataset.ExperimentalDataset. Vice versa, the experimental dataset might want to store a reference to one (or more) simulations.
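The consistency between numerical data and axes mentioned for the Data and Axis classes above can be illustrated with a minimal, one-dimensional sketch. The classes SketchAxis and SketchData, their attributes, and the "quantity / unit" label format are assumptions made for this illustration only and are not the ASpecD implementation.

```python
class SketchAxis:
    """Hypothetical axis: numerical values plus metadata for a label."""

    def __init__(self, values, quantity="", unit=""):
        self.values = list(values)
        self.quantity = quantity
        self.unit = unit

    def label(self):
        # The "quantity / unit" format is an assumption of this sketch.
        return f"{self.quantity} / {self.unit}"


class SketchData:
    """Hypothetical 1D data container checking data against its axis."""

    def __init__(self, data, axis):
        data = list(data)
        if len(axis.values) != len(data):
            raise ValueError("axis values inconsistent with data")
        self.data = data
        self.axis = axis


field = SketchAxis([340.0, 341.0, 342.0], quantity="magnetic field", unit="mT")
spectrum = SketchData([0.1, 0.5, 0.2], field)
axis_label = field.label()

try:
    SketchData([0.1, 0.5], field)  # two data points, three axis values
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```

Checking dimensions at construction time means that an inconsistent pair of data and axis never becomes part of a dataset in the first place.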

In addition, to handle the history contained within a dataset, there is a series of classes for storing history records:

  • aspecd.dataset.HistoryRecord

    Generic base class for all kinds of history records.

    For all classes operating on datasets, such as aspecd.processing.SingleProcessingStep, aspecd.analysis.SingleAnalysisStep and others, there exist at least two “representations”: (i) the generic one not (necessarily) tied to any concrete dataset, thus portable, and (ii) a concrete one that has operated on a dataset and is thus accompanied by information about who did what, when, how, and to which dataset.

    For this second type, a history class derived from aspecd.dataset.HistoryRecord gets used, and it is this second type that is stored inside the Dataset object.

  • aspecd.dataset.ProcessingHistoryRecord

    History record for processing steps on datasets.

  • aspecd.dataset.AnalysisHistoryRecord

    History record for analysis steps on datasets.

  • aspecd.dataset.AnnotationHistoryRecord

    History record for annotations of datasets.

  • aspecd.dataset.PlotHistoryRecord

    History record for plots of datasets.

class aspecd.dataset.Dataset

Bases: aspecd.utils.ToDictMixin

Base class for all kinds of datasets.

The dataset is one of the core elements of the ASpecD framework, basically containing both (numeric) data and the corresponding metadata, i.e., the information available about the data.

Generally, there are two types of datasets: those containing experimental data and those containing calculated data. Therefore, two corresponding subclasses exist, and packages building upon the ASpecD framework should inherit from one of them:

  • aspecd.dataset.ExperimentalDataset

  • aspecd.dataset.CalculatedDataset

The public attributes of a dataset can be converted to a dict via aspecd.utils.ToDictMixin.to_dict().

id

(unique) identifier of the dataset (e.g., path, LOI, or else)

Type

str

label

Short description of the dataset

Can be set by the user, defaults to the value set as aspecd.dataset.Dataset.id by the importer.

Type

str

data

numeric data and axes

Type

aspecd.dataset.Data

metadata

hierarchical key-value store of metadata

Type

aspecd.metadata.DatasetMetadata

history

processing steps performed on the numeric data

For a full list of tasks performed on a dataset in chronological order see the aspecd.dataset.Dataset.tasks attribute.

Type

list

analyses

analysis steps performed on the dataset

For a full list of tasks performed on a dataset in chronological order see the aspecd.dataset.Dataset.tasks attribute.

Type

list

annotations

annotations of the dataset

For a full list of tasks performed on a dataset in chronological order see the aspecd.dataset.Dataset.tasks attribute.

Type

list

representations

representations of the dataset, e.g., plots

For a full list of tasks performed on a dataset in chronological order see the aspecd.dataset.Dataset.tasks attribute.

Type

list

references

references to other datasets

Each reference is an object of type aspecd.dataset.DatasetReference.

Type

list

tasks

tasks performed on the dataset in chronological order

Each entry in the list is a dict containing information about the type of task (i.e., processing, analysis, annotation, representation) and a reference to the object containing more information about the respective task.

Tasks come in quite handy in cases where the exact chronological order of the steps performed on a dataset is of relevance, regardless of their particular type, e.g., in the context of reports.

Type

list
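The relation between the type-specific lists and the chronological tasks list can be sketched as follows. The class SketchDataset and the dictionary keys "kind" and "task" are assumptions chosen for this illustration; the text above only states that each entry holds the type of task and a reference to the corresponding object.

```python
class SketchDataset:
    """Hypothetical stand-in for a dataset; not the ASpecD implementation."""

    def __init__(self):
        self.history = []
        self.analyses = []
        self.annotations = []
        self.representations = []
        self.tasks = []

    def _record(self, kind, target_list, record):
        target_list.append(record)
        # One dict per task: its kind plus a reference to the record itself.
        self.tasks.append({"kind": kind, "task": record})

    def process(self, record):
        self._record("processing", self.history, record)

    def analyse(self, record):
        self._record("analysis", self.analyses, record)

    def annotate(self, record):
        self._record("annotation", self.annotations, record)

    def plot(self, record):
        self._record("representation", self.representations, record)


dataset = SketchDataset()
dataset.process("baseline correction")
dataset.analyse("peak finding")
dataset.plot("overview plot")
dataset.process("normalisation")

kinds_in_order = [task["kind"] for task in dataset.tasks]
```

Because every entry only references the record, the tasks list adds chronological order without duplicating the information stored in the other lists.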

Raises
  • aspecd.dataset.UndoWithEmptyHistoryError – Raised when trying to undo with empty history

  • aspecd.dataset.UndoAtBeginningOfHistoryError – Raised when trying to undo with history pointer at zero

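  • aspecd.dataset.UndoStepUndoableError – Raised when trying to undo a step of history marked as undoable, i.e., one that cannot be undone

  • aspecd.dataset.RedoAlreadyAtLatestChangeError – Raised when trying to redo although the history pointer already points to the latest change (as is the case with an empty history)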

  • aspecd.dataset.ProcessingWithLeadingHistoryError – Raised when trying to process with leading history

property package_name

Return package name.

The name of the package the dataset is implemented in is a crucial detail for writing the history. The value is set automatically and is read-only.

process(processing_step=None)

Apply processing step to dataset.

Every processing step is an object of type aspecd.processing.SingleProcessingStep and is passed as argument to process().

Calling this method ensures that the history record is added to the dataset and that a few basic checks are performed, such as the check for leading history. Leading history means that the _history_pointer is not set to the current tip of the history of the dataset. In this case, an error is raised.

Note

If processing_step is undoable (i.e., cannot be undone), all previous plots stored in the list of representations will be removed, as these plots cannot be reproduced due to a change in _origdata.

Parameters

processing_step (aspecd.processing.SingleProcessingStep) – processing step to apply to the dataset

Returns

processing_step – processing step applied to the dataset

Return type

aspecd.processing.SingleProcessingStep

Raises

aspecd.dataset.ProcessingWithLeadingHistoryError – Raised when trying to process with leading history

undo()

Revert last processing step.

Internally, the history pointer is decremented and, starting from _origdata, all processing steps up to this point in history are reapplied to the data.

Raises
  • aspecd.dataset.UndoWithEmptyHistoryError – Raised when trying to undo with empty history

  • aspecd.dataset.UndoAtBeginningOfHistoryError – Raised when trying to undo with history pointer at zero

  • aspecd.dataset.UndoStepUndoableError – Raised when trying to undo a step of history marked as undoable, i.e., one that cannot be undone

redo()

Reapply previously undone processing step.

Raises

aspecd.dataset.RedoAlreadyAtLatestChangeError – Raised when trying to redo although the history pointer already points to the latest change (as is the case with an empty history)
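The interplay of undo() and redo() described above (moving a history pointer and replaying steps from the original data) can be sketched as follows. The class SketchDataset, the use of plain callables as processing steps, the generic IndexError, and the convention that a pointer of -1 means "no step applied" are assumptions of this illustration, not the ASpecD implementation.

```python
import copy


class SketchDataset:
    """Hypothetical stand-in for a dataset; not the ASpecD implementation."""

    def __init__(self, data):
        self._origdata = copy.deepcopy(data)  # untouched original data
        self.data = copy.deepcopy(data)       # current, processed data
        self.history = []                     # applied steps (callables here)
        self._history_pointer = -1            # index of last applied step

    def process(self, step):
        self.data = step(self.data)
        self.history.append(step)
        self._history_pointer = len(self.history) - 1

    def _replay(self):
        # Start from the original data and reapply steps up to the pointer.
        self.data = copy.deepcopy(self._origdata)
        for step in self.history[: self._history_pointer + 1]:
            self.data = step(self.data)

    def undo(self):
        if not self.history:
            raise IndexError("empty history")
        if self._history_pointer < 0:
            raise IndexError("already at beginning of history")
        self._history_pointer -= 1
        self._replay()

    def redo(self):
        if self._history_pointer >= len(self.history) - 1:
            raise IndexError("already at latest change")
        self._history_pointer += 1
        self._replay()


dataset = SketchDataset([1.0, 2.0, 3.0])
dataset.process(lambda values: [v * 2 for v in values])
dataset.process(lambda values: [v + 1 for v in values])
after_two_steps = list(dataset.data)

dataset.undo()
after_undo = list(dataset.data)

dataset.redo()
after_redo = list(dataset.data)
```

Replaying from the original data instead of inverting a step means that processing steps do not need to provide an inverse operation.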

append_history_record(history_record)

Append history record to dataset history.

This method should never be called manually, but only from within classes of the ASpecD framework, at least as long as you are not interested in Orwellian History.

Parameters

history_record (aspecd.history.HistoryRecord) – History record (of a processing step) to be appended.

Changed in version 0.2: Converted into a public method, due to needs of aspecd.processing.MultiProcessingStep

strip_history()

Remove leading history, if any.

If a dataset has a leading history, i.e., its history pointer does not point to the last entry of the history, and you want to perform a processing step on this very dataset, you first need to strip its history, as otherwise a ProcessingWithLeadingHistoryError will be raised.
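What "leading history" means, and why stripping it makes further processing possible, can be sketched with a small model. The class SketchHistory, its method names, and the exception LeadingHistoryError are hypothetical and only mirror the behaviour described above.

```python
class LeadingHistoryError(Exception):
    """Hypothetical analogue of ProcessingWithLeadingHistoryError."""


class SketchHistory:
    """Hypothetical history with a pointer; not the ASpecD implementation."""

    def __init__(self):
        self.records = []
        self.pointer = -1  # index of the last record currently applied

    def has_leading_history(self):
        return self.pointer < len(self.records) - 1

    def append(self, record):
        if self.has_leading_history():
            raise LeadingHistoryError("strip history before processing")
        self.records.append(record)
        self.pointer = len(self.records) - 1

    def undo(self):
        self.pointer -= 1

    def strip(self):
        # Drop every record after the pointer.
        del self.records[self.pointer + 1:]


history = SketchHistory()
history.append("baseline correction")
history.append("normalisation")
history.undo()  # pointer now lags behind the last record

try:
    history.append("filtering")
    raised = False
except LeadingHistoryError:
    raised = True

history.strip()
history.append("filtering")
```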

analyse(analysis_step=None)

Apply analysis to dataset.

Every analysis step is an object of type aspecd.analysis.SingleAnalysisStep and is passed as an argument to analyse().

The information necessary to reproduce an analysis is stored in the analyses attribute as an object of class aspecd.dataset.AnalysisHistoryRecord. This record also contains a (deep) copy of the complete history of the dataset stored in history.

Parameters

analysis_step (aspecd.analysis.SingleAnalysisStep) – analysis step to apply to the dataset

Returns

analysis_step – analysis step applied to the dataset

Return type

aspecd.analysis.SingleAnalysisStep
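Why the record of an analysis carries a deep copy of the dataset history, rather than a reference to it, can be illustrated with a minimal sketch. SketchAnalysisRecord and SketchDataset are hypothetical classes, and history entries are plain dictionaries here purely for illustration.

```python
import copy


class SketchAnalysisRecord:
    """Hypothetical record; not aspecd.dataset.AnalysisHistoryRecord itself."""

    def __init__(self, analysis_name, dataset_history):
        self.analysis_name = analysis_name
        # Deep copy: later changes to the dataset history leave this untouched.
        self.history = copy.deepcopy(dataset_history)


class SketchDataset:
    """Hypothetical stand-in for a dataset."""

    def __init__(self):
        self.history = []
        self.analyses = []

    def process(self, description):
        self.history.append({"step": description})

    def analyse(self, analysis_name):
        record = SketchAnalysisRecord(analysis_name, self.history)
        self.analyses.append(record)
        return record


dataset = SketchDataset()
dataset.process("baseline correction")
record = dataset.analyse("peak finding")

# Changes made after the analysis do not leak into its record.
dataset.process("normalisation")
dataset.history[0]["step"] = "changed afterwards"
```

The record thus documents the state of the dataset at the time of the analysis, regardless of what happens to the dataset later.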

analyze(analysis_step=None)

Apply analysis to dataset.

Same method as analyse(), but for those preferring AE over BE.

delete_analysis(index=None)

Remove analysis step record from dataset.

Parameters

index (int) – Number of analysis in analyses to delete

annotate(annotation_=None)

Add annotation to dataset.

Parameters

annotation_ (aspecd.annotation.Annotation) – annotation to add to the dataset

delete_annotation(index=None)

Remove annotation record from dataset.

Parameters

index (int) – Number of annotation in annotations to delete

plot(plotter=None)

Perform plot with data of current dataset.

Every plotter is an object of type aspecd.plotting.Plotter and is passed as an argument to plot().

The information necessary to reproduce a plot is stored in the representations attribute as an object of class aspecd.dataset.PlotHistoryRecord. This record also contains a (deep) copy of the complete history of the dataset stored in history. Besides being a necessary prerequisite to reproduce a plot, this makes it possible to automatically recreate plots requiring different, incompatible preprocessing steps in arbitrary order.

Parameters

plotter (aspecd.plotting.Plotter) – plot to perform with data of current dataset

Returns

plotter – plot performed on the current dataset

Return type

aspecd.plotting.Plotter

Raises

aspecd.dataset.MissingPlotterError – Raised when trying to plot without plotter

delete_representation(index=None)

Remove representation record from dataset.

Parameters

index (int) – Number of representation in representations to delete

load(filename=None)

Load dataset object from persistence layer.

The dataset will be loaded from a file conforming to the ASpecD dataset format (adf). For details, see the aspecd.io.AdfExporter class.

save(filename=None)

Save dataset to persistence layer.

The dataset will be saved in ASpecD dataset format (adf). For details, see the aspecd.io.AdfExporter class.

import_from(importer=None)

Import data and metadata contained in importer object.

This requires initialising an aspecd.io.DatasetImporter object first that is provided as an argument for this method.

Note

The same operation can be performed by calling the import_into() method of an aspecd.io.DatasetImporter object taking an aspecd.dataset.Dataset object as argument.

However, as one usually wants to continue working with a dataset, first creating an instance of a dataset and a respective importer and then calling import_from() of the dataset is the preferred way.

Parameters

importer (aspecd.io.DatasetImporter) – Importer containing data and metadata read from some source
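The two equivalent entry points described in the note can be sketched as follows: the dataset method simply hands itself to the importer. SketchImporter, SketchDataset, the protected method _import(), and the fixed values are assumptions made for this illustration and do not reproduce the actual aspecd.io classes.

```python
class SketchImporter:
    """Hypothetical importer; not aspecd.io.DatasetImporter itself."""

    def __init__(self, source=""):
        self.source = source
        self.dataset = None

    def import_into(self, dataset=None):
        if dataset is None:
            raise ValueError("no dataset provided")
        self.dataset = dataset
        self._import()

    def _import(self):
        # A real importer would read self.source; here the values are fixed.
        self.dataset.data = [0.1, 0.2, 0.3]
        self.dataset.id = self.source


class SketchDataset:
    """Hypothetical stand-in for a dataset."""

    def __init__(self):
        self.id = ""
        self.data = []

    def import_from(self, importer=None):
        if importer is None:
            raise ValueError("no importer provided")
        importer.import_into(self)


dataset = SketchDataset()
dataset.import_from(SketchImporter(source="/path/to/measurement"))
```

Starting from the dataset keeps the object one wants to continue working with in hand, which matches the preference stated in the note.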

export_to(exporter=None)

Export data and metadata.

This requires initialising an aspecd.io.DatasetExporter object first that is provided as an argument for this method.

Note

The same operation can be performed by calling the export_from() method of an aspecd.io.DatasetExporter object taking an aspecd.dataset.Dataset object as argument.

However, as the dataset is usually already at hand, first creating an instance of a respective exporter and then calling export_to() of the dataset is the preferred way.

Parameters

exporter (aspecd.io.DatasetExporter) – Exporter writing data and metadata to specific output format

add_reference(dataset=None)

Add a reference to another dataset to the list of references.

A reference is always an object of type aspecd.dataset.DatasetReference that will be automatically created from the dataset provided.

Parameters

dataset (aspecd.dataset.Dataset) – dataset a reference for should be added to the list of references

Raises

aspecd.dataset.MissingDatasetError – Raised if no dataset was provided

remove_reference(dataset_id=None)

Remove a reference to another dataset from the list of references.

A reference is always an object of type aspecd.dataset.DatasetReference that was automatically created from the respective dataset when adding the reference.

Parameters

dataset_id (string) – ID of the dataset the reference should be removed for

Raises

aspecd.dataset.MissingDatasetError – Raised if no dataset ID was provided

from_dict(dict_=None)

Set properties from dictionary.

Only parameters in the dictionary that are valid properties of the class are set accordingly.

Note

In conjunction with the aspecd.dataset.Dataset.to_dict() method, this method makes it possible to serialise and deserialise dataset objects, i.e., to store them in and restore them from the persistence layer.

Parameters

dict_ (dict) – Dictionary containing properties to set

to_dict()

Create dictionary containing public attributes of an object.

Returns

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type

collections.OrderedDict
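The round trip via to_dict() and from_dict() can be sketched with a small mixin. SketchToDictMixin and SketchDataset are hypothetical; treating attributes with a leading underscore as non-public and silently ignoring unknown keys are assumptions of this sketch, not a description of aspecd.utils.ToDictMixin.

```python
from collections import OrderedDict


class SketchToDictMixin:
    """Hypothetical mixin; not the aspecd.utils.ToDictMixin implementation."""

    def to_dict(self):
        # Public attributes only, in order of definition.
        return OrderedDict(
            (key, value)
            for key, value in vars(self).items()
            if not key.startswith("_")
        )

    def from_dict(self, dict_=None):
        # Ignore keys that are not existing public attributes.
        for key, value in (dict_ or {}).items():
            if key in vars(self) and not key.startswith("_"):
                setattr(self, key, value)


class SketchDataset(SketchToDictMixin):
    """Hypothetical stand-in for a dataset."""

    def __init__(self):
        self.id = ""
        self.label = ""
        self._history_pointer = -1


original = SketchDataset()
original.id = "/path/to/measurement"
original.label = "first sample"

serialised = original.to_dict()

restored = SketchDataset()
restored.from_dict(dict(serialised, unknown_key="ignored"))
```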

class aspecd.dataset.ExperimentalDataset

Bases: aspecd.dataset.Dataset

Base class for experimental datasets.

The dataset is one of the core elements of the ASpecD framework, basically containing both (numeric) data and the corresponding metadata, i.e., the information available about the data.

The public attributes of a dataset can be converted to a dict via aspecd.utils.ToDictMixin.to_dict().

metadata

hierarchical key-value store of metadata

Type

aspecd.metadata.ExperimentalDatasetMetadata

add_reference(dataset=None)

Add a reference to another dataset to the list of references.

A reference is always an object of type aspecd.dataset.DatasetReference that will be automatically created from the dataset provided.

Parameters

dataset (aspecd.dataset.Dataset) – dataset a reference for should be added to the list of references

Raises

aspecd.dataset.MissingDatasetError – Raised if no dataset was provided

analyse(analysis_step=None)

Apply analysis to dataset.

Every analysis step is an object of type aspecd.analysis.SingleAnalysisStep and is passed as an argument to analyse().

The information necessary to reproduce an analysis is stored in the analyses attribute as an object of class aspecd.dataset.AnalysisHistoryRecord. This record also contains a (deep) copy of the complete history of the dataset stored in history.

Parameters

analysis_step (aspecd.analysis.SingleAnalysisStep) – analysis step to apply to the dataset

Returns

analysis_step – analysis step applied to the dataset

Return type

aspecd.analysis.SingleAnalysisStep

analyze(analysis_step=None)

Apply analysis to dataset.

Same method as analyse(), but for those preferring AE over BE.

annotate(annotation_=None)

Add annotation to dataset.

Parameters

annotation_ (aspecd.annotation.Annotation) – annotation to add to the dataset

append_history_record(history_record)

Append history record to dataset history.

This method should never be called manually, but only from within classes of the ASpecD framework, at least as long as you are not interested in Orwellian History.

Parameters

history_record (aspecd.history.HistoryRecord) – History record (of a processing step) to be appended.

Changed in version 0.2: Converted into a public method, due to needs of aspecd.processing.MultiProcessingStep

delete_analysis(index=None)

Remove analysis step record from dataset.

Parameters

index (int) – Number of analysis in analyses to delete

delete_annotation(index=None)

Remove annotation record from dataset.

Parameters

index (int) – Number of annotation in annotations to delete

delete_representation(index=None)

Remove representation record from dataset.

Parameters

index (int) – Number of representation in representations to delete

export_to(exporter=None)

Export data and metadata.

This requires initialising an aspecd.io.DatasetExporter object first that is provided as an argument for this method.

Note

The same operation can be performed by calling the export_from() method of an aspecd.io.DatasetExporter object taking an aspecd.dataset.Dataset object as argument.

However, as the dataset is usually already at hand, first creating an instance of a respective exporter and then calling export_to() of the dataset is the preferred way.

Parameters

exporter (aspecd.io.DatasetExporter) – Exporter writing data and metadata to specific output format

from_dict(dict_=None)

Set properties from dictionary.

Only parameters in the dictionary that are valid properties of the class are set accordingly.

Note

In conjunction with the aspecd.dataset.Dataset.to_dict() method, this method makes it possible to serialise and deserialise dataset objects, i.e., to store them in and restore them from the persistence layer.

Parameters

dict_ (dict) – Dictionary containing properties to set

import_from(importer=None)

Import data and metadata contained in importer object.

This requires initialising an aspecd.io.DatasetImporter object first that is provided as an argument for this method.

Note

The same operation can be performed by calling the import_into() method of an aspecd.io.DatasetImporter object taking an aspecd.dataset.Dataset object as argument.

However, as one usually wants to continue working with a dataset, first creating an instance of a dataset and a respective importer and then calling import_from() of the dataset is the preferred way.

Parameters

importer (aspecd.io.DatasetImporter) – Importer containing data and metadata read from some source

load(filename=None)

Load dataset object from persistence layer.

The dataset will be loaded from a file conforming to the ASpecD dataset format (adf). For details, see the aspecd.io.AdfExporter class.

property package_name

Return package name.

The name of the package the dataset is implemented in is a crucial detail for writing the history. The value is set automatically and is read-only.

plot(plotter=None)

Perform plot with data of current dataset.

Every plotter is an object of type aspecd.plotting.Plotter and is passed as an argument to plot().

The information necessary to reproduce a plot is stored in the representations attribute as an object of class aspecd.dataset.PlotHistoryRecord. This record also contains a (deep) copy of the complete history of the dataset stored in history. Besides being a necessary prerequisite to reproduce a plot, this makes it possible to automatically recreate plots requiring different, incompatible preprocessing steps in arbitrary order.

Parameters

plotter (aspecd.plotting.Plotter) – plot to perform with data of current dataset

Returns

plotter – plot performed on the current dataset

Return type

aspecd.plotting.Plotter

Raises

aspecd.dataset.MissingPlotterError – Raised when trying to plot without plotter

process(processing_step=None)

Apply processing step to dataset.

Every processing step is an object of type aspecd.processing.SingleProcessingStep and is passed as argument to process().

Calling this method ensures that the history record is added to the dataset and that a few basic checks are performed, such as the check for leading history. Leading history means that the _history_pointer is not set to the current tip of the history of the dataset. In this case, an error is raised.

Note

If processing_step is undoable (i.e., cannot be undone), all previous plots stored in the list of representations will be removed, as these plots cannot be reproduced due to a change in _origdata.

Parameters

processing_step (aspecd.processing.SingleProcessingStep) – processing step to apply to the dataset

Returns

processing_step – processing step applied to the dataset

Return type

aspecd.processing.SingleProcessingStep

Raises

aspecd.dataset.ProcessingWithLeadingHistoryError – Raised when trying to process with leading history

redo()

Reapply previously undone processing step.

Raises

aspecd.dataset.RedoAlreadyAtLatestChangeError – Raised when trying to redo although the history pointer already points to the latest change (as is the case with an empty history)

remove_reference(dataset_id=None)

Remove a reference to another dataset from the list of references.

A reference is always an object of type aspecd.dataset.DatasetReference that was automatically created from the respective dataset when adding the reference.

Parameters

dataset_id (string) – ID of the dataset the reference should be removed for

Raises

aspecd.dataset.MissingDatasetError – Raised if no dataset ID was provided

save(filename=None)

Save dataset to persistence layer.

The dataset will be saved in ASpecD dataset format (adf). For details, see the aspecd.io.AdfExporter class.

strip_history()

Remove leading history, if any.

If a dataset has a leading history, i.e., its history pointer does not point to the last entry of the history, and you want to perform a processing step on this very dataset, you first need to strip its history, as otherwise a ProcessingWithLeadingHistoryError will be raised.

to_dict()

Create dictionary containing public attributes of an object.

Returns

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type

collections.OrderedDict

undo()

Revert last processing step.

Internally, the history pointer is decremented and, starting from _origdata, all processing steps up to this point in history are reapplied to the data.

Raises
  • aspecd.dataset.UndoWithEmptyHistoryError – Raised when trying to undo with empty history

  • aspecd.dataset.UndoAtBeginningOfHistoryError – Raised when trying to undo with history pointer at zero

  • aspecd.dataset.UndoStepUndoableError – Raised when trying to undo a step of history marked as undoable, i.e., one that cannot be undone

class aspecd.dataset.CalculatedDataset

Bases: aspecd.dataset.Dataset

Base class for datasets containing calculated data.

The dataset is one of the core elements of the ASpecD framework, basically containing both (numeric) data and the corresponding metadata, i.e., the information available about the data.

The public attributes of a dataset can be converted to a dict via aspecd.utils.ToDictMixin.to_dict().

metadata

hierarchical key-value store of metadata

Type

aspecd.metadata.CalculatedDatasetMetadata

add_reference(dataset=None)

Add a reference to another dataset to the list of references.

A reference is always an object of type aspecd.dataset.DatasetReference that will be automatically created from the dataset provided.

Parameters

dataset (aspecd.dataset.Dataset) – dataset a reference for should be added to the list of references

Raises

aspecd.dataset.MissingDatasetError – Raised if no dataset was provided

analyse(analysis_step=None)

Apply analysis to dataset.

Every analysis step is an object of type aspecd.analysis.SingleAnalysisStep and is passed as an argument to analyse().

The information necessary to reproduce an analysis is stored in the analyses attribute as an object of class aspecd.dataset.AnalysisHistoryRecord. This record also contains a (deep) copy of the complete history of the dataset stored in history.

Parameters

analysis_step (aspecd.analysis.SingleAnalysisStep) – analysis step to apply to the dataset

Returns

analysis_step – analysis step applied to the dataset

Return type

aspecd.analysis.SingleAnalysisStep

analyze(analysis_step=None)

Apply analysis to dataset.

Same method as analyse(), but for those preferring AE over BE.

annotate(annotation_=None)

Add annotation to dataset.

Parameters

annotation_ (aspecd.annotation.Annotation) – annotation to add to the dataset

append_history_record(history_record)

Append history record to dataset history.

This method should never be called manually, but only from within classes of the ASpecD framework, at least as long as you are not interested in Orwellian History.

Parameters

history_record (aspecd.history.HistoryRecord) – History record (of a processing step) to be appended.

Changed in version 0.2: Converted into a public method, due to needs of aspecd.processing.MultiProcessingStep

delete_analysis(index=None)

Remove analysis step record from dataset.

Parameters

index (int) – Number of analysis in analyses to delete

delete_annotation(index=None)

Remove annotation record from dataset.

Parameters

index (int) – Number of annotation in annotations to delete

delete_representation(index=None)

Remove representation record from dataset.

Parameters

index (int) – Number of representation in representations to delete

export_to(exporter=None)

Export data and metadata.

This requires initialising an aspecd.io.DatasetExporter object first that is provided as an argument for this method.

Note

The same operation can be performed by calling the export_from() method of an aspecd.io.DatasetExporter object taking an aspecd.dataset.Dataset object as argument.

However, as the dataset is usually already at hand, first creating an instance of a respective exporter and then calling export_to() of the dataset is the preferred way.

Parameters

exporter (aspecd.io.DatasetExporter) – Exporter writing data and metadata to specific output format

from_dict(dict_=None)

Set properties from dictionary.

Only parameters in the dictionary that are valid properties of the class are set accordingly.

Note

In conjunction with the aspecd.dataset.Dataset.to_dict() method, this method makes it possible to serialise and deserialise dataset objects, i.e., to store them in and restore them from the persistence layer.

Parameters

dict_ (dict) – Dictionary containing properties to set

import_from(importer=None)

Import data and metadata contained in importer object.

This requires initialising an aspecd.io.DatasetImporter object first that is provided as an argument for this method.

Note

The same operation can be performed by calling the import_into() method of an aspecd.io.DatasetImporter object taking an aspecd.dataset.Dataset object as argument.

However, as one usually wants to continue working with a dataset, first creating an instance of a dataset and a respective importer and then calling import_from() of the dataset is the preferred way.

Parameters

importer (aspecd.io.DatasetImporter) – Importer containing data and metadata read from some source

load(filename=None)

Load dataset object from persistence layer.

The dataset will be loaded from a file conforming to the ASpecD dataset format (adf). For details, see the aspecd.io.AdfExporter class.

property package_name

Return package name.

The name of the package the dataset is implemented in is a crucial detail for writing the history. The value is set automatically and is read-only.

plot(plotter=None)

Perform plot with data of current dataset.

Every plotter is an object of type aspecd.plotting.Plotter and is passed as an argument to plot().

The information necessary to reproduce a plot is stored in the representations attribute as an object of class aspecd.dataset.PlotHistoryRecord. This record also contains a (deep) copy of the complete history of the dataset stored in history. Besides being a necessary prerequisite to reproduce a plot, this makes it possible to automatically recreate plots requiring different, incompatible preprocessing steps in arbitrary order.

Parameters

plotter (aspecd.plotting.Plotter) – plot to perform with data of current dataset

Returns

plotter – plot performed on the current dataset

Return type

aspecd.plotting.Plotter

Raises

aspecd.dataset.MissingPlotterError – Raised when trying to plot without plotter

process(processing_step=None)

Apply processing step to dataset.

Every processing step is an object of type aspecd.processing.SingleProcessingStep and is passed as argument to process().

Calling this method ensures that the history record is added to the dataset and that a few basic checks are performed, such as the check for leading history. Leading history means that the _history_pointer is not set to the current tip of the history of the dataset. In this case, an error is raised.

Note

If processing_step is undoable (i.e., cannot be undone), all previous plots stored in the list of representations will be removed, as these plots cannot be reproduced due to a change in _origdata.

Parameters

processing_step (aspecd.processing.SingleProcessingStep) – processing step to apply to the dataset

Returns

processing_step – processing step applied to the dataset

Return type

aspecd.processing.SingleProcessingStep

Raises

aspecd.dataset.ProcessingWithLeadingHistoryError – Raised when trying to process with leading history

redo()

Reapply previously undone processing step.

Raises

aspecd.dataset.RedoAlreadyAtLatestChangeError – Raised when trying to redo although the history pointer already points to the latest change (as is the case with an empty history)

remove_reference(dataset_id=None)

Remove a reference to another dataset from the list of references.

A reference is always an object of type aspecd.dataset.DatasetReference that was automatically created from the respective dataset when adding the reference.

Parameters

dataset_id (string) – ID of the dataset the reference should be removed for

Raises

aspecd.dataset.MissingDatasetError – Raised if no dataset ID was provided

save(filename=None)

Save dataset to persistence layer.

The dataset will be saved in ASpecD dataset format (adf). For details, see the aspecd.io.AdfExporter class.

strip_history()

Remove leading history, if any.

If a dataset has a leading history, i.e., its history pointer does not point to the last entry of the history, and you want to perform a processing step on this very dataset, you first need to strip its history, as otherwise a ProcessingWithLeadingHistoryError will be raised.

to_dict()

Create dictionary containing public attributes of an object.

Returns

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type

collections.OrderedDict

undo()

Revert last processing step.

Internally, the history pointer is decremented and, starting from _origdata, all processing steps up to this point in history are reapplied to the data.

Raises
  • aspecd.dataset.UndoWithEmptyHistoryError – Raised when trying to undo with empty history

  • aspecd.dataset.UndoAtBeginningOfHistoryError – Raised when trying to undo with history pointer at zero

  • aspecd.dataset.UndoStepUndoableError – Raised when trying to undo a step of history marked as undoable, i.e., one that cannot be undone

class aspecd.dataset.DatasetReference

Bases: aspecd.utils.ToDictMixin

Reference to a given dataset.

Often, one dataset needs to reference other datasets. A typical example would be a simulation stored in a dataset of class aspecd.dataset.CalculatedDataset that needs to reference the corresponding experimental data, stored in a dataset of class aspecd.dataset.ExperimentalDataset. Vice versa, the experimental dataset might want to store a reference to one (or more) simulations.

As the dataset ID alone is not sufficient, both the ID and the history of the dataset at the time the reference was created are stored in the reference and restored upon creating a (new) dataset. Hence, at least the data of the dataset returned should be identical to the data of the original dataset the reference was created for.

type

type of dataset

Will be inferred directly from dataset when creating a reference from a given dataset and is used to return a dataset of same type.

Type

str

id

(unique) id of the dataset, e.g. path, LOI, or similar

Type

str

history

history of processing steps performed on the dataset to be referenced

Type

list

Raises

aspecd.dataset.MissingDatasetError – Raised if no dataset was provided when calling from_dataset()

from_dataset(dataset=None)

Create dataset reference from dataset.

Parameters

dataset (aspecd.dataset.Dataset) – Dataset the reference should be created for

Raises

aspecd.dataset.MissingDatasetError – Raised if no dataset was provided

to_dataset()

Create (new) dataset from reference.

The history stored will be applied to the newly created dataset, hence the dataset should be in the same state with respect to processing steps as the original dataset was upon creating the reference.

Returns

dataset – Dataset with identical data to the one the reference has been created for

Return type

aspecd.dataset.Dataset
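The round trip from_dataset() / to_dataset() can be illustrated with a stand-alone sketch. Nothing here is ASpecD code: SketchReference, the dictionary-based "datasets", the RAW_DATA storage and the two processing functions are hypothetical and only serve to show why storing the history together with the ID restores the state at the time the reference was created.

```python
def double(values):
    return [2 * value for value in values]


def add_one(values):
    return [value + 1 for value in values]


# Hypothetical storage: raw data that can be retrieved by dataset id
RAW_DATA = {"measurement-001": [1.0, 2.0, 3.0]}


class SketchReference:
    """Hypothetical reference storing id and history of a dataset."""

    def __init__(self):
        self.id = ""
        self.history = []

    def from_dataset(self, dataset=None):
        if dataset is None:
            raise ValueError("no dataset provided")
        self.id = dataset["id"]
        # Copy the list, so steps added to the dataset later are not included
        self.history = list(dataset["history"])

    def to_dataset(self):
        # Load the raw data by id and reapply the stored history
        data = list(RAW_DATA[self.id])
        for step in self.history:
            data = step(data)
        return {"id": self.id, "data": data, "history": list(self.history)}


original = {
    "id": "measurement-001",
    "data": double(RAW_DATA["measurement-001"]),
    "history": [double],
}
reference = SketchReference()
reference.from_dataset(original)

# Further processing of the original does not affect the reference
original["data"] = add_one(original["data"])
original["history"].append(add_one)

restored = reference.to_dataset()
```

The restored dataset reflects only the single step recorded when the reference was created, although the original has been processed further in the meantime.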

from_dict(dict_=None)

Set properties from dictionary.

Only parameters in the dictionary that are valid properties of the class are set accordingly.

Parameters

dict_ (dict) – Dictionary containing properties to set

to_dict()

Create dictionary containing public attributes of an object.

Returns

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type

collections.OrderedDict
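The behaviour described for from_dict() and to_dict() can be sketched as follows. This is a simplified, stand-alone illustration and not the code of aspecd.utils.ToDictMixin: the class names ToDictSketch and ReferenceSketch are hypothetical, and "public" is assumed here to mean "name does not start with an underscore".

```python
import collections


class ToDictSketch:
    """Hypothetical mixin mimicking the described behaviour."""

    def to_dict(self):
        # vars() preserves the order in which attributes were defined
        output = collections.OrderedDict()
        for key, value in vars(self).items():
            if not key.startswith("_"):
                output[key] = value
        return output

    def from_dict(self, dict_=None):
        # Only keys matching existing public attributes are set
        for key, value in (dict_ or {}).items():
            if not key.startswith("_") and hasattr(self, key):
                setattr(self, key, value)


class ReferenceSketch(ToDictSketch):
    def __init__(self):
        self.type = ""
        self.id = ""
        self.history = []
        self._internal = "not exported"


reference = ReferenceSketch()
reference.from_dict({"id": "/path/to/dataset", "unknown": 42})
exported = reference.to_dict()
```

In this sketch, the key "unknown" is silently ignored, and the exported dictionary lists type, id and history in the order of their definition.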

class aspecd.dataset.DatasetFactory

Bases: object

Factory for creating dataset objects based on the source provided.

Particularly in case of recipe-driven data analysis (cf. tasks), there is a need to automatically retrieve datasets using nothing more than a source string that can be, e.g., a path or LOI.

Packages derived from ASpecD should implement a DatasetFactory inheriting from aspecd.dataset.DatasetFactory and overriding the protected method _create_dataset(). The only task of this protected method is to provide the correct dataset object, in most cases an instance of a class inheriting from aspecd.dataset.ExperimentalDataset.

importer_factory

ImporterFactory instance used for importing datasets

Type

aspecd.io.DatasetImporterFactory

Raises
  • aspecd.dataset.MissingSourceError – Raised if no source is provided

  • aspecd.dataset.MissingImporterFactoryError – Raised if no ImporterFactory is available

get_dataset(source='', importer='', parameters=None)

Return dataset object for dataset specified by its source.

The import of data into the dataset is handled using an instance of aspecd.io.DatasetImporterFactory.

The actual code for deciding which type of dataset to return in what case should be implemented in the non-public method _create_dataset() in any package based on the ASpecD framework.

Parameters
  • source (str) –

    string describing the source of the dataset

    May be a filename or path, a URL/URI, a LOI, or similar

  • importer (str) –

    Name of the importer to use for importing the dataset

    Default: ''

    New in version 0.2.

  • parameters (dict) –

    Additional parameters for controlling the import

    Default: None

    New in version 0.2.

Returns

dataset – Dataset object of appropriate class

Return type

aspecd.dataset.Dataset

Raises
  • aspecd.dataset.MissingSourceError – Raised if no source is provided

  • aspecd.dataset.MissingImporterFactoryError – Raised if no ImporterFactory is available
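The division of labour between get_dataset() and the protected method _create_dataset() can be sketched with stand-alone classes. None of the names below belong to ASpecD: the Sketch* classes, MyDatasetFactory and dummy_importer_factory are hypothetical, the importer factory is reduced to a plain callable, and generic exceptions replace the ones listed above. In a real package, the factory would inherit from aspecd.dataset.DatasetFactory instead.

```python
class SketchDataset:
    """Hypothetical generic dataset."""

    def __init__(self):
        self.id = ""


class SketchExperimentalDataset(SketchDataset):
    """Hypothetical dataset type for experimental data."""


class SketchDatasetFactory:
    """Hypothetical base factory mirroring only the described control flow."""

    def __init__(self):
        self.importer_factory = None

    def get_dataset(self, source="", importer="", parameters=None):
        if not source:
            raise ValueError("no source provided")
        if self.importer_factory is None:
            raise RuntimeError("no importer factory available")
        # The subclass decides which type of dataset to return
        dataset = self._create_dataset(source=source)
        # The importer obtained from the factory fills the dataset
        import_into = self.importer_factory(source, importer, parameters)
        import_into(dataset)
        return dataset

    def _create_dataset(self, source=""):
        return SketchDataset()


class MyDatasetFactory(SketchDatasetFactory):
    """Package-specific factory: only _create_dataset() is overridden."""

    def _create_dataset(self, source=""):
        return SketchExperimentalDataset()


def dummy_importer_factory(source, importer="", parameters=None):
    """Hypothetical importer factory returning a trivial 'importer'."""

    def import_into(dataset):
        dataset.id = source

    return import_into


factory = MyDatasetFactory()
factory.importer_factory = dummy_importer_factory
dataset = factory.get_dataset(source="path/to/measurement")
```

The derived factory overrides nothing but the choice of dataset class. Checking the source and delegating the import remain in the base class.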

class aspecd.dataset.Data(data=array([], dtype=float64), axes=None, calculated=False)

Bases: aspecd.utils.ToDictMixin

Unit containing both numeric data and corresponding axes.

The data class ensures consistency in terms of dimensions between numerical data and axes.

Parameters
  • data (numpy.ndarray) – Numerical data

  • axes (list) –

    List of objects of type aspecd.dataset.Axis

    The number of axes needs to be consistent with the dimensions of data.

    Axes will be set automatically when setting data. Hence, it is easiest to first set data and only afterwards set axis values.

  • calculated (bool) – Indicator for the origin of the numerical data (calculation or experiment).

calculated

Indicate whether numeric data are calculated rather than experimentally recorded

Type

bool

Raises
  • aspecd.dataset.AxesCountError – Raised if number of axes is inconsistent with data dimensions

  • aspecd.dataset.AxesValuesInconsistentWithDataError – Raised if axes values are inconsistent with data

property data

Get or set (numeric) data.

Note

If you set data that have different dimensions to the data previously stored in the dataset, the axes values will be set to an array with indices corresponding to the size of the respective data dimension. You will most probably assign proper axis values afterwards. On the other hand, all other information stored in the axis object will be retained, namely quantity, unit, and label.
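The behaviour described in this note can be pictured with a minimal sketch. It is restricted to one-dimensional data stored in plain lists, and the classes SketchAxis and SketchData are hypothetical, not the ASpecD classes. Only the rule itself comes from the note: on a change of size, axis values are reset to indices, whereas the remaining axis information is retained.

```python
class SketchAxis:
    """Hypothetical axis with values and descriptive metadata."""

    def __init__(self):
        self.values = []
        self.quantity = ""
        self.unit = ""
        self.label = ""


class SketchData:
    """Hypothetical container for one-dimensional data and one axis."""

    def __init__(self):
        self._data = []
        self.axes = [SketchAxis()]

    @property
    def data(self):
        return self._data

    @data.setter
    def data(self, data):
        old_length = len(self._data)
        self._data = list(data)
        if len(self._data) != old_length:
            # Size changed: reset axis values to indices, keep the metadata
            self.axes[0].values = list(range(len(self._data)))


data = SketchData()
data.data = [0.1, 0.4, 0.2]
data.axes[0].values = [340.0, 345.0, 350.0]
data.axes[0].quantity = "magnetic field"
data.axes[0].unit = "mT"

# Data of a different size: axis values are reset, quantity and unit survive
data.data = [0.1, 0.4, 0.2, 0.3, 0.0]
```

After the second assignment, proper axis values would typically be assigned again by hand.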

property axes

Get or set axes.

If you set axes, they will be checked for consistency with the data. Therefore, first set the data and only afterwards the axes, with values corresponding to the dimensions of the data.

Raises
  • aspecd.dataset.AxesCountError – Raised if number of axes is inconsistent with data dimensions

  • aspecd.dataset.AxesValuesInconsistentWithDataError – Raised if axes values are inconsistent with data dimensions

from_dict(dict_=None)

Set properties from dictionary, e.g., from serialised dataset.

Only parameters in the dictionary that are valid properties of the class are set accordingly.

The list of axes is handled appropriately.

Parameters

dict_ (dict) – Dictionary containing properties to set

to_dict()

Create dictionary containing public attributes of an object.

Returns

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type

collections.OrderedDict

class aspecd.dataset.Axis

Bases: aspecd.utils.ToDictMixin

Axis for data in a dataset.

An axis always contains both numerical values and the metadata necessary to create axis labels and to make sense of the numerical information.

quantity

quantity of the numerical data, usually used as first part of an automatically generated axis label

Type

string

unit

unit of the numerical data, usually used as second part of an automatically generated axis label

Type

string

symbol

symbol for the quantity of the numerical data, usually used as first part of an automatically generated axis label

Type

string

label

manual label for the axis, particularly useful in cases where no quantity and unit are provided or should be overwritten.

Type

string

Note

There are three alternative ways of writing axis labels: using the quantity name and the unit, using the quantity symbol and the unit, or using both quantity name and symbol, usually separated by a comma. Quantity and unit shall always be separated by a slash. Which way you prefer is a matter of personal taste and context.
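The three label styles can be illustrated with a short helper. The function axis_label() is hypothetical and not part of ASpecD, and the example quantity, symbol and unit are arbitrary. It merely assembles the strings according to the conventions stated in the note.

```python
def axis_label(quantity="", symbol="", unit=""):
    """Assemble an axis label from quantity name, symbol and unit."""
    # Name and symbol, if both given, are separated by a comma
    name = ", ".join(part for part in (quantity, symbol) if part)
    # Quantity and unit are always separated by a slash
    return f"{name} / {unit}" if unit else name


by_name = axis_label(quantity="magnetic field", unit="mT")
by_symbol = axis_label(symbol="B0", unit="mT")
by_both = axis_label(quantity="magnetic field", symbol="B0", unit="mT")
```

Here, by_name is "magnetic field / mT", by_symbol is "B0 / mT", and by_both is "magnetic field, B0 / mT".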

Raises
  • aspecd.dataset.AxisValuesTypeError – Raised when trying to set axis values to a type other than a numpy array

  • aspecd.dataset.AxisValuesDimensionError – Raised when trying to set axis values to an array with more than one dimension.

to_dict()

Create dictionary containing public attributes of an object.

Returns

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type

collections.OrderedDict

property values

Get or set the numerical axis values.

Values need to be a one-dimensional numpy array. Trying to set values to either a type that cannot be converted to a numpy array or a numpy array with more than one dimension will raise a corresponding error.

Raises
  • aspecd.dataset.AxisValuesTypeError – Raised if axis values are of wrong type

  • aspecd.dataset.AxisValuesDimensionError – Raised if axis values are of wrong dimension, i.e. not a vector

property equidistant

Return whether the axis values are equidistant.

True if the axis values are equidistant, False otherwise. None in case of no axis values.

The property is set automatically whenever axis values are set and is therefore read-only.

While simple plotting of data values against non-uniform axes with non-equidistant values is usually straightforward, many processing steps rely on equidistant axis values in their simplest possible implementation.
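What "equidistant" means in practice can be shown with a stand-alone check. The function is_equidistant() and its tolerances are assumptions made for this sketch and do not reproduce the ASpecD implementation. It returns None for missing values, in line with the description above, and compares all increments with the first one using a tolerance, as floating-point axis values rarely have exactly identical spacing.

```python
import math


def is_equidistant(values, rel_tol=1e-9, abs_tol=1e-12):
    """Return True/False for (non-)equidistant values, None if empty."""
    if len(values) == 0:
        return None
    # Increments between neighbouring axis values
    differences = [after - before for before, after in zip(values, values[1:])]
    return all(
        math.isclose(difference, differences[0], rel_tol=rel_tol, abs_tol=abs_tol)
        for difference in differences
    )


no_values = is_equidistant([])
uniform = is_equidistant([0.0, 0.1, 0.2, 0.3])
non_uniform = is_equidistant([0.0, 1.0, 3.0])
```

A processing step relying on equidistant axes could use such a check to refuse non-uniform axes, or to interpolate the data first.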

from_dict(dict_=None)

Set properties from dictionary, e.g., from serialised dataset.

Only parameters in the dictionary that are valid properties of the class are set accordingly.

Parameters

dict_ (dict) – Dictionary containing properties to set