aspecd.dataset module
Datasets: units containing data and metadata.
The dataset is one key concept of the ASpecD framework, consisting of the data as well as the corresponding metadata. Storing metadata in a structured way is a prerequisite for a semantic understanding within the routines. In addition, a history of every processing, analysis and annotation step is recorded, aiming at a maximum of reproducibility. This is part of how the ASpecD framework tries to support good scientific practice.
Therefore, each processing and analysis step of data should always be performed using the respective methods of a dataset, at least as long as it can be performed on a single dataset.
Types of datasets
Generally, there are two types of datasets: those containing experimental data and those containing calculated data. Therefore, two corresponding subclasses exist, and packages building upon the ASpecD framework should inherit from either of them:
aspecd.dataset.ExperimentalDataset
aspecd.dataset.CalculatedDataset
Calculated datasets can either be the result of actual simulations or dummy datasets used for testing purposes.
Classes used by the dataset
Additional classes used within the dataset that packages building upon the ASpecD framework normally do not need to implement on their own are:
aspecd.dataset.Data
Unit containing both numeric data and corresponding axes.
The data class ensures consistency in terms of dimensions between numerical data and axes.
aspecd.dataset.Axis
Axis for data in a dataset.
An axis always contains both numerical values and the metadata necessary to create axis labels and to make sense of the numerical information.
aspecd.dataset.DeviceData
Additional data from devices recorded parallel to the actual data.
The dataset concept (see aspecd.dataset.Dataset) rests on the assumption that there is one particular set of data that can be regarded as the actual or primary data of the dataset. However, in many cases, parallel to these actual data, other data are recorded as well, be it readouts from monitors or the like.
aspecd.dataset.DatasetReference
Reference to a dataset.
Often, one dataset needs to reference other datasets. A typical example would be a simulation stored in a dataset of class aspecd.dataset.CalculatedDataset that needs to reference the corresponding experimental data, stored in a dataset of class aspecd.dataset.ExperimentalDataset. Vice versa, the experimental dataset might want to store a reference to one (or more) simulations.
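The dimensional consistency between numerical data and axes that the data class is described as ensuring can be pictured with a small, self-contained sketch. ToyAxis and ToyData are hypothetical stand-ins written for this illustration. They are not the ASpecD classes, and the rule that the first axis needs one value per data point is an assumption made for the example.

```python
class ToyAxis:
    """Hypothetical axis: numerical values plus metadata for a label."""

    def __init__(self, values=(), quantity="", unit=""):
        self.values = list(values)
        self.quantity = quantity
        self.unit = unit

    @property
    def label(self):
        # Metadata is what turns bare numbers into a meaningful axis label.
        if self.unit:
            return f"{self.quantity} / {self.unit}"
        return self.quantity


class ToyData:
    """Hypothetical 1D data unit: values plus an x axis and an intensity axis."""

    def __init__(self, data, axes):
        # Assumed consistency rule: the first axis has one value per data point.
        if len(axes[0].values) != len(data):
            raise ValueError("axis values and data differ in length")
        self.data = list(data)
        self.axes = list(axes)


field = ToyAxis([340.0, 341.0, 342.0], quantity="magnetic field", unit="mT")
intensity = ToyAxis(quantity="intensity")
spectrum = ToyData([0.1, 0.5, 0.2], axes=[field, intensity])
print(spectrum.axes[0].label)
```

Checking consistency at construction time means that an inconsistent combination of data and axes is rejected before any processing step can operate on it.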
Device data
The dataset concept (see aspecd.dataset.Dataset) rests on the assumption that there is one particular set of data that can be regarded as the actual or primary data of the dataset. However, in many cases, parallel to these actual data, other data are recorded as well, be it readouts from monitors or the like.
Usually, these additional data will share one axis with the primary data of the dataset. However, this is not necessarily the case. Furthermore, one dataset may contain an arbitrary number of additional device data entries.
Technically speaking, aspecd.dataset.DeviceData is a special or extended type of aspecd.dataset.Data, i.e. a unit containing both numerical data and corresponding axes. However, this class extends that with metadata specific to the device the additional data have been recorded with. Why store metadata here and not in the aspecd.dataset.Dataset.metadata property? The latter is more concerned with an overall description of the experimental setup in sufficient detail, while the metadata contained in this class are more device-specific. Potential contents of the metadata here are internal device IDs, addresses for communication, and the like. In essence, the metadata contained herein are those that are relevant mainly for debugging purposes or sanity checks of experiments.
Example
A real-world example of additional data recorded in spectroscopy comes from time-resolved EPR (tr-EPR) spectroscopy: Here, you usually record 2D data as a function of magnetic field and time, i.e. a full time profile per magnetic field point. As this is a non-standard method, setups are often controlled by lab-written software and allow for monitoring parameters not usually recorded with commercial setups. In this particular case, these can be the time stamp and the microwave frequency for each individual recorded time trace, and the Python trEPR package not only handles tr-EPR data, but is also capable of dealing with both additional types of data for analysis purposes.
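To make the idea of device data more tangible, here is a minimal sketch of how time stamps and microwave frequencies recorded per magnetic field point could be held next to the primary data. ToyData and ToyDeviceData are hypothetical classes for illustration only, not the ASpecD implementation, and the device names, values and metadata keys are made up.

```python
from dataclasses import dataclass, field


@dataclass
class ToyData:
    """Hypothetical unit of numerical values and the axis they belong to."""
    values: list
    axis: list


@dataclass
class ToyDeviceData(ToyData):
    """Same as ToyData, extended by device-specific metadata."""
    metadata: dict = field(default_factory=dict)


# Shared axis: one entry per magnetic field point of the primary 2D data.
field_points = [340.0, 340.5, 341.0]

# The key identifies the device (type), the value holds the recorded data.
device_data = {
    "timestamp": ToyDeviceData(
        values=[0.0, 1.2, 2.4], axis=field_points,
        metadata={"unit": "s"},
    ),
    "microwave_frequency": ToyDeviceData(
        values=[9.6801, 9.6802, 9.6801], axis=field_points,
        metadata={"unit": "GHz", "device_id": "counter-1"},
    ),
}

for name, entry in device_data.items():
    print(name, len(entry.values), entry.metadata)
```

Keeping the device-specific metadata with each entry, rather than in the metadata of the dataset as a whole, mirrors the separation described above.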
Module documentation
- class aspecd.dataset.Dataset
Bases: aspecd.utils.ToDictMixin
Base class for all kinds of datasets.
The dataset is one of the core elements of the ASpecD framework, containing both (numeric) data and the corresponding metadata, i.e. the information available about the data.
Generally, there are two types of datasets: those containing experimental data and those containing calculated data. Therefore, two corresponding subclasses exist, and packages building upon the ASpecD framework should inherit from either of them:
aspecd.dataset.ExperimentalDataset
aspecd.dataset.CalculatedDataset
The public attributes of a dataset can be converted to a dict via aspecd.utils.ToDictMixin.to_dict().
- label
Short description of the dataset
Can be set by the user, defaults to the value set as aspecd.dataset.id by the importer.
- data
numeric data and axes
- device_data
Additional data from devices recorded parallel to the actual data.
For details and a bit of background, see aspecd.dataset.DeviceData.
In a real dataset with actual device data, the key typically identifies the device (type) and the value is of type aspecd.dataset.DeviceData.
Note
To further process and analyse these device data, the most general way is to extract each of them as an individual dataset and perform all further tasks on that dataset. See aspecd.analysis.DeviceDataExtraction for details.
New in version 0.9.
- metadata
hierarchical key-value store of metadata
- history
processing steps performed on the numeric data
For a full list of tasks performed on a dataset in chronological order, see the aspecd.dataset.Dataset.tasks attribute.
- analyses
analysis steps performed on the dataset
For a full list of tasks performed on a dataset in chronological order, see the aspecd.dataset.Dataset.tasks attribute.
- annotations
annotations of the dataset
For a full list of tasks performed on a dataset in chronological order, see the aspecd.dataset.Dataset.tasks attribute.
- representations
representations of the dataset, e.g., plots
For a full list of tasks performed on a dataset in chronological order, see the aspecd.dataset.Dataset.tasks attribute.
- references
references to other datasets
Each reference is an object of type aspecd.dataset.DatasetReference.
- tasks
tasks performed on the dataset in chronological order
Each entry in the list is a dict containing information about the type of task (i.e., processing, analysis, annotation, representation) and a reference to the object containing more information about the respective task.
Tasks come in quite handy in cases where the exact chronological order of steps performed on a dataset is of relevance, regardless of their particular type, e.g., in the context of reports.
- Raises
aspecd.exceptions.UndoWithEmptyHistoryError – Raised when trying to undo with empty history
aspecd.exceptions.UndoAtBeginningOfHistoryError – Raised when trying to undo with history pointer at zero
aspecd.exceptions.UndoStepUndoableError – Raised when trying to undo a step of history that cannot be undone
aspecd.exceptions.RedoAlreadyAtLatestChangeError – Raised when trying to redo while already at the latest change, e.g. with empty history
aspecd.exceptions.ProcessingWithLeadingHistoryError – Raised when trying to process with leading history
- property package_name
Return package name.
The name of the package the dataset is implemented in is a crucial detail for writing the history. The value is set automatically and is read-only.
- process(processing_step=None)
Apply processing step to dataset.
Every processing step is an object of type aspecd.processing.SingleProcessingStep and is passed as an argument to process().
Calling this method ensures that the history record is added to the dataset and that a few basic checks are performed, such as for leading history, meaning that the _history_pointer is not set to the current tip of the history of the dataset. In this case, an error is raised.
Note
If processing_step is undoable (in the sense that it cannot be undone), all previous plots stored in the list of representations will be removed, as these plots cannot be reproduced due to a change in _origdata.
- Parameters
processing_step (aspecd.processing.SingleProcessingStep) – processing step to apply to the dataset
- Returns
processing_step – processing step applied to the dataset
- Raises
aspecd.exceptions.ProcessingWithLeadingHistoryError – Raised when trying to process with leading history
- undo()
Revert last processing step.
Technically, the history pointer is decremented and, starting from the _origdata, all processing steps are reapplied to the data up to this point in history.
- Raises
aspecd.exceptions.UndoWithEmptyHistoryError – Raised when trying to undo with empty history
aspecd.exceptions.UndoAtBeginningOfHistoryError – Raised when trying to undo with history pointer at zero
aspecd.exceptions.UndoStepUndoableError – Raised when trying to undo a step of history that cannot be undone
- redo()
Reapply previously undone processing step.
- Raises
aspecd.exceptions.RedoAlreadyAtLatestChangeError – Raised when trying to redo while already at the latest change, e.g. with empty history
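The interplay of history pointer, original data, undo and redo described above can be sketched with a toy class. This is a simplified model under stated assumptions, not the ASpecD implementation: ToyDataset is hypothetical, processing steps are plain functions, the pointer starts at -1 for an empty history, and generic RuntimeError exceptions stand in for the ASpecD exception classes.

```python
class ToyDataset:
    """Hypothetical dataset keeping original data and a replayable history."""

    def __init__(self, data):
        self.data = list(data)
        self._origdata = list(data)      # never modified by processing
        self.history = []                # processing steps in order
        self._history_pointer = -1       # index of the last step in effect

    def process(self, step):
        if self._history_pointer < len(self.history) - 1:
            raise RuntimeError("leading history, strip it first")
        self.data = step(self.data)
        self.history.append(step)
        self._history_pointer += 1

    def _replay(self):
        # Start from the original data and reapply all steps up to the pointer.
        self.data = list(self._origdata)
        for step in self.history[: self._history_pointer + 1]:
            self.data = step(self.data)

    def undo(self):
        if not self.history:
            raise RuntimeError("empty history")
        if self._history_pointer < 0:
            raise RuntimeError("already at beginning of history")
        self._history_pointer -= 1
        self._replay()

    def redo(self):
        if self._history_pointer >= len(self.history) - 1:
            raise RuntimeError("already at latest change")
        self._history_pointer += 1
        self._replay()


def double(values):
    return [2 * value for value in values]


def shift(values):
    return [value + 1 for value in values]


dataset = ToyDataset([1, 2, 3])
dataset.process(double)
dataset.process(shift)
dataset.undo()    # data are rebuilt from _origdata with "double" only
print(dataset.data)
dataset.redo()    # "shift" is in effect again
print(dataset.data)
```

Replaying from the original data, rather than inverting each step, works for any processing step, at the cost of recomputing the steps that remain in effect.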
- append_history_record(history_record)
Append history record to dataset history.
This method should never be called manually, but only from within classes of the ASpecD framework, at least as long as you are not interested in Orwellian History.
- Parameters
history_record (aspecd.history.HistoryRecord) – History record (of a processing step) to be appended.
Changed in version 0.2: Converted into a public method, due to needs of aspecd.processing.MultiProcessingStep
- strip_history()
Remove leading history, if any.
If a dataset has a leading history, i.e., its history pointer does not point to the last entry of the history, and you want to perform a processing step on this very dataset, you first need to strip its history, as otherwise, a ProcessingWithLeadingHistoryError will be raised.
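What "leading history" means and what stripping it amounts to can be shown with two small helper functions. Both functions and the list-plus-pointer representation are assumptions made for this sketch. They are not part of ASpecD.

```python
def has_leading_history(history, pointer):
    """True if the pointer does not point to the last entry of the history."""
    return pointer < len(history) - 1


def strip_history(history, pointer):
    """Return the history with all entries after the pointer removed."""
    return history[: pointer + 1]


history = ["baseline correction", "normalisation", "smoothing"]
pointer = 0  # state after two undo operations

print(has_leading_history(history, pointer))
history = strip_history(history, pointer)
print(history, has_leading_history(history, pointer))
```

After stripping, the pointer is at the tip of the history again, which is the precondition for applying a further processing step.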
- analyse(analysis_step=None)
Apply analysis to dataset.
Every analysis step is an object of type aspecd.analysis.SingleAnalysisStep and is passed as an argument to analyse().
The information necessary to reproduce an analysis is stored in the analyses attribute as an object of class aspecd.dataset.AnalysisHistoryRecord. This record also contains a (deep) copy of the complete history of the dataset stored in history.
- Parameters
analysis_step (aspecd.analysis.SingleAnalysisStep) – analysis step to apply to the dataset
- Returns
analysis_step – analysis step applied to the dataset
- analyze(analysis_step=None)
Apply analysis to dataset.
Same method as analyse(), but for those preferring American English (AE) over British English (BE).
- delete_analysis(index=None)
Remove analysis step record from dataset.
- Parameters
index (int) – Number of analysis in analyses to delete
- annotate(annotation_=None)
Add annotation to dataset.
- Parameters
annotation_ (aspecd.annotation.DatasetAnnotation) – annotation to add to the dataset
- delete_annotation(index=None)
Remove annotation record from dataset.
- Parameters
index (int) – Number of annotation in annotations to delete
- plot(plotter=None)
Perform plot with data of current dataset.
Every plotter is an object of type aspecd.plotting.Plotter and is passed as an argument to plot().
The information necessary to reproduce a plot is stored in the representations attribute as an object of class aspecd.dataset.PlotHistoryRecord. This record also contains a (deep) copy of the complete history of the dataset stored in history. Besides being a necessary prerequisite to reproduce a plot, this makes it possible to automatically recreate plots requiring different incompatible preprocessing steps in arbitrary order.
- Parameters
plotter (aspecd.plotting.Plotter) – plot to perform with data of current dataset
- Returns
plotter – plot performed on the current dataset
- Raises
aspecd.exceptions.MissingPlotterError – Raised when trying to plot without plotter
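Why a record of a plot benefits from a deep copy of the history, rather than a reference to it, can be illustrated as follows. ToyPlotRecord and ToyDataset are hypothetical and only mimic the behaviour described above. The structure of the history entries is an assumption for this sketch.

```python
import copy


class ToyPlotRecord:
    """Hypothetical record of a plot, including the history at plot time."""

    def __init__(self, plotter_name, history):
        self.plotter_name = plotter_name
        # Deep copy: later changes to the dataset must not alter the record.
        self.history = copy.deepcopy(history)


class ToyDataset:
    def __init__(self):
        self.history = []
        self.representations = []

    def process(self, name, parameters):
        self.history.append({"step": name, "parameters": parameters})

    def plot(self, plotter_name):
        self.representations.append(ToyPlotRecord(plotter_name, self.history))


dataset = ToyDataset()
dataset.process("baseline correction", {"order": 0})
dataset.plot("overview plot")
dataset.process("normalisation", {"kind": "maximum"})
dataset.history[0]["parameters"]["order"] = 2  # mutate the live history

record = dataset.representations[0]
print(len(record.history), record.history[0]["parameters"])
```

The record still describes the state of the dataset at the time of plotting, which is what recreating the plot later requires.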
- tabulate(table=None)
Create table from data of current dataset.
Every table is an object of type aspecd.table.Table and is passed as an argument to tabulate().
The information necessary to reproduce a table is stored in the representations attribute as an object of class aspecd.dataset.TableHistoryRecord.
- Parameters
table (aspecd.table.Table) – table created from the data of the current dataset
- Returns
table – table created from the data of the current dataset
- Return type
aspecd.table.Table
- Raises
TypeError – Raised when trying to tabulate without table
- delete_representation(index=None)
Remove representation record from dataset.
- Parameters
index (int) – Number of representation in representations to delete
- load(filename=None)
Load dataset object from persistence layer.
The dataset will be loaded from a file conforming to the ASpecD dataset format (adf). For details, see the aspecd.io.AdfExporter class.
- save(filename=None)
Save dataset to persistence layer.
The dataset will be saved in ASpecD dataset format (adf). For details, see the aspecd.io.AdfExporter class.
- import_from(importer=None)
Import data and metadata contained in importer object.
This requires initialising an aspecd.io.DatasetImporter object first that is provided as an argument for this method.
Note
The same operation can be performed by calling the import_into() method of an aspecd.io.DatasetImporter object taking an aspecd.dataset.Dataset object as argument.
However, as usually one wants to continue working with a dataset, first creating an instance of a dataset and a respective importer and then calling import_from() of the dataset is the preferred way.
- Parameters
importer (aspecd.io.DatasetImporter) – Importer containing data and metadata read from some source
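The two equivalent ways of importing mentioned in the note can be sketched with two toy classes. This is only an illustration of the calling pattern: ToyDataset and ToyImporter are hypothetical, the comma-separated string source is invented for the example, and the delegation from import_from() to import_into() is an assumption about how such a pair of methods could cooperate.

```python
class ToyDataset:
    def __init__(self):
        self.id = ""
        self.data = []

    def import_from(self, importer):
        # The dataset delegates the actual work to the importer.
        importer.import_into(self)


class ToyImporter:
    """Hypothetical importer reading comma-separated numbers from a string."""

    def __init__(self, source=""):
        self.source = source

    def import_into(self, dataset):
        dataset.data = [float(item) for item in self.source.split(",")]
        dataset.id = "in-memory string"


importer = ToyImporter(source="1.0, 2.5, 4.0")

first = ToyDataset()
first.import_from(importer)     # preferred: start from the dataset

second = ToyDataset()
importer.import_into(second)    # equivalent: start from the importer

print(first.data, first.data == second.data)
```

Starting from the dataset keeps the object one continues to work with at the centre of the code, which matches the preference stated in the note.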
- export_to(exporter=None)
Export data and metadata.
This requires initialising an aspecd.io.DatasetExporter object first that is provided as an argument for this method.
Note
The same operation can be performed by calling the export_from() method of an aspecd.io.DatasetExporter object taking an aspecd.dataset.Dataset object as argument.
However, as usually the dataset is already at hand, first creating an instance of a respective exporter and then calling export_to() of the dataset is the preferred way.
- Parameters
exporter (aspecd.io.DatasetExporter) – Exporter writing data and metadata to specific output format
- add_reference(dataset=None)
Add a reference to another dataset to the list of references.
A reference is always an object of type aspecd.dataset.DatasetReference that will be automatically created from the dataset provided.
- Parameters
dataset (aspecd.dataset.Dataset) – dataset for which a reference should be added to the list of references
- Raises
aspecd.exceptions.MissingDatasetError – Raised if no dataset was provided
- remove_reference(dataset_id=None)
Remove a reference to another dataset from the list of references.
A reference is always an object of type aspecd.dataset.DatasetReference that was automatically created from the respective dataset when adding the reference.
- Parameters
dataset_id (string) – ID of the dataset whose reference should be removed
- Raises
aspecd.exceptions.MissingDatasetError – Raised if no dataset ID was provided
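Mutual references between an experimental dataset and a simulation, as well as the removal of a reference by dataset ID, might look like this in a simplified model. ToyReference and ToyDataset are hypothetical, the attributes stored in the reference are an assumption, and ValueError stands in for the ASpecD exception.

```python
import copy


class ToyReference:
    """Hypothetical reference: just enough to identify another dataset."""

    def __init__(self, dataset):
        self.id = dataset.id
        self.history = copy.deepcopy(dataset.history)


class ToyDataset:
    def __init__(self, id_=""):
        self.id = id_
        self.history = []
        self.references = []

    def add_reference(self, dataset=None):
        if dataset is None:
            raise ValueError("no dataset provided")
        # The reference object is created from the dataset, not passed in.
        self.references.append(ToyReference(dataset))

    def remove_reference(self, dataset_id=None):
        if not dataset_id:
            raise ValueError("no dataset ID provided")
        self.references = [
            ref for ref in self.references if ref.id != dataset_id
        ]


experiment = ToyDataset(id_="measurement-01")
simulation = ToyDataset(id_="simulation-01")

simulation.add_reference(experiment)
experiment.add_reference(simulation)
print([ref.id for ref in simulation.references])

experiment.remove_reference("simulation-01")
print(experiment.references)
```

Storing a lightweight reference instead of the dataset itself avoids circular object graphs when two datasets point at each other.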
- from_dict(dict_=None)
Set properties from dictionary.
Only parameters in the dictionary that are valid properties of the class are set accordingly.
Note
In conjunction with the aspecd.dataset.Dataset.to_dict() method, this method allows dataset objects to be serialised and deserialised, i.e. it supports all kinds of storage to the persistence layer.
- Parameters
dict_ (dict) – Dictionary containing properties to set
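The rule that only keys matching valid properties are set can be sketched in a few lines. ToyDataset is a hypothetical class for illustration. Treating "valid property" as "existing instance attribute" is an assumption of this sketch, not a statement about the ASpecD implementation.

```python
class ToyDataset:
    def __init__(self):
        self.id = ""
        self.label = ""
        self.metadata = {}

    def from_dict(self, dict_=None):
        for key, value in (dict_ or {}).items():
            # Keys that are not attributes of the object are silently skipped.
            if key in vars(self):
                setattr(self, key, value)


dataset = ToyDataset()
dataset.from_dict({"label": "tr-EPR test", "not_an_attribute": 42})
print(dataset.label, hasattr(dataset, "not_an_attribute"))
```

Skipping unknown keys makes deserialisation tolerant of dictionaries that carry additional entries, for example from a newer file format version.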
- to_dict(remove_empty=False)
Create dictionary containing public attributes of an object.
- Parameters
remove_empty (bool) – Whether to remove keys with empty values
Default: False
- Returns
public_attributes – Ordered dictionary containing the public attributes of the object
The order of attribute definition is preserved
Changed in version 0.6: New parameter remove_empty
Changed in version 0.9: Settings for properties to exclude and include are not traversed
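A rough idea of what collecting public attributes with an optional remove_empty switch involves is given below. The function is a hypothetical stand-alone sketch, not the mixin from aspecd.utils. It uses a plain dict (which preserves insertion order in current Python versions), and it assumes that "empty" means an empty string, None, an empty list or an empty dict.

```python
def to_dict(obj, remove_empty=False):
    """Collect the public attributes of obj in order of definition."""
    result = {}
    for key, value in vars(obj).items():
        if key.startswith("_"):
            continue  # private attributes are not exported
        if remove_empty and value in ("", None, [], {}):
            continue  # assumed notion of "empty" for this sketch
        result[key] = value
    return result


class ToyDataset:
    def __init__(self):
        self.id = "measurement-01"
        self.label = ""
        self.metadata = {}
        self._origdata = [1.0, 2.0]


print(to_dict(ToyDataset()))
print(to_dict(ToyDataset(), remove_empty=True))
```

Removing empty values can keep serialised output compact, while the default keeps every public attribute so that the full structure remains visible.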
- class aspecd.dataset.ExperimentalDataset
Bases: aspecd.dataset.Dataset
Base class for experimental datasets.
The dataset is one of the core elements of the ASpecD framework, containing both (numeric) data and the corresponding metadata, i.e. the information available about the data.
The public attributes of a dataset can be converted to a dict via aspecd.utils.ToDictMixin.to_dict().
- metadata
hierarchical key-value store of metadata
- add_reference(dataset=None)
Add a reference to another dataset to the list of references.
A reference is always an object of type aspecd.dataset.DatasetReference that will be automatically created from the dataset provided.
- Parameters
dataset (aspecd.dataset.Dataset) – dataset for which a reference should be added to the list of references
- Raises
aspecd.exceptions.MissingDatasetError – Raised if no dataset was provided
- analyse(analysis_step=None)
Apply analysis to dataset.
Every analysis step is an object of type aspecd.analysis.SingleAnalysisStep and is passed as an argument to analyse().
The information necessary to reproduce an analysis is stored in the analyses attribute as an object of class aspecd.dataset.AnalysisHistoryRecord. This record also contains a (deep) copy of the complete history of the dataset stored in history.
- Parameters
analysis_step (aspecd.analysis.SingleAnalysisStep) – analysis step to apply to the dataset
- Returns
analysis_step – analysis step applied to the dataset
- analyze(analysis_step=None)
Apply analysis to dataset.
Same method as analyse(), but for those preferring American English (AE) over British English (BE).
- annotate(annotation_=None)
Add annotation to dataset.
- Parameters
annotation_ (aspecd.annotation.DatasetAnnotation) – annotation to add to the dataset
- append_history_record(history_record)
Append history record to dataset history.
This method should never be called manually, but only from within classes of the ASpecD framework, at least as long as you are not interested in Orwellian History.
- Parameters
history_record (aspecd.history.HistoryRecord) – History record (of a processing step) to be appended.
Changed in version 0.2: Converted into a public method, due to needs of aspecd.processing.MultiProcessingStep
- delete_analysis(index=None)
Remove analysis step record from dataset.
- Parameters
index (int) – Number of analysis in analyses to delete
- delete_annotation(index=None)
Remove annotation record from dataset.
- Parameters
index (int) – Number of annotation in annotations to delete
- delete_representation(index=None)
Remove representation record from dataset.
- Parameters
index (int) – Number of representation in representations to delete
- export_to(exporter=None)
Export data and metadata.
This requires initialising an aspecd.io.DatasetExporter object first that is provided as an argument for this method.
Note
The same operation can be performed by calling the export_from() method of an aspecd.io.DatasetExporter object taking an aspecd.dataset.Dataset object as argument.
However, as usually the dataset is already at hand, first creating an instance of a respective exporter and then calling export_to() of the dataset is the preferred way.
- Parameters
exporter (aspecd.io.DatasetExporter) – Exporter writing data and metadata to specific output format
- from_dict(dict_=None)
Set properties from dictionary.
Only parameters in the dictionary that are valid properties of the class are set accordingly.
Note
In conjunction with the aspecd.dataset.Dataset.to_dict() method, this method allows dataset objects to be serialised and deserialised, i.e. it supports all kinds of storage to the persistence layer.
- Parameters
dict_ (dict) – Dictionary containing properties to set
- import_from(importer=None)
Import data and metadata contained in importer object.
This requires initialising an aspecd.io.DatasetImporter object first that is provided as an argument for this method.
Note
The same operation can be performed by calling the import_into() method of an aspecd.io.DatasetImporter object taking an aspecd.dataset.Dataset object as argument.
However, as usually one wants to continue working with a dataset, first creating an instance of a dataset and a respective importer and then calling import_from() of the dataset is the preferred way.
- Parameters
importer (aspecd.io.DatasetImporter) – Importer containing data and metadata read from some source
- load(filename=None)
Load dataset object from persistence layer.
The dataset will be loaded from a file conforming to the ASpecD dataset format (adf). For details, see the aspecd.io.AdfExporter class.
- property package_name
Return package name.
The name of the package the dataset is implemented in is a crucial detail for writing the history. The value is set automatically and is read-only.
- plot(plotter=None)
Perform plot with data of current dataset.
Every plotter is an object of type aspecd.plotting.Plotter and is passed as an argument to plot().
The information necessary to reproduce a plot is stored in the representations attribute as an object of class aspecd.dataset.PlotHistoryRecord. This record also contains a (deep) copy of the complete history of the dataset stored in history. Besides being a necessary prerequisite to reproduce a plot, this makes it possible to automatically recreate plots requiring different incompatible preprocessing steps in arbitrary order.
- Parameters
plotter (aspecd.plotting.Plotter) – plot to perform with data of current dataset
- Returns
plotter – plot performed on the current dataset
- Raises
aspecd.exceptions.MissingPlotterError – Raised when trying to plot without plotter
- process(processing_step=None)
Apply processing step to dataset.
Every processing step is an object of type aspecd.processing.SingleProcessingStep and is passed as an argument to process().
Calling this method ensures that the history record is added to the dataset and that a few basic checks are performed, such as for leading history, meaning that the _history_pointer is not set to the current tip of the history of the dataset. In this case, an error is raised.
Note
If processing_step is undoable (in the sense that it cannot be undone), all previous plots stored in the list of representations will be removed, as these plots cannot be reproduced due to a change in _origdata.
- Parameters
processing_step (aspecd.processing.SingleProcessingStep) – processing step to apply to the dataset
- Returns
processing_step – processing step applied to the dataset
- Raises
aspecd.exceptions.ProcessingWithLeadingHistoryError – Raised when trying to process with leading history
- redo()
Reapply previously undone processing step.
- Raises
aspecd.exceptions.RedoAlreadyAtLatestChangeError – Raised when trying to redo while already at the latest change, e.g. with empty history
- remove_reference(dataset_id=None)
Remove a reference to another dataset from the list of references.
A reference is always an object of type aspecd.dataset.DatasetReference that was automatically created from the respective dataset when adding the reference.
- Parameters
dataset_id (string) – ID of the dataset whose reference should be removed
- Raises
aspecd.exceptions.MissingDatasetError – Raised if no dataset ID was provided
- save(filename=None)
Save dataset to persistence layer.
The dataset will be saved in ASpecD dataset format (adf). For details, see the aspecd.io.AdfExporter class.
- strip_history()
Remove leading history, if any.
If a dataset has a leading history, i.e., its history pointer does not point to the last entry of the history, and you want to perform a processing step on this very dataset, you first need to strip its history, as otherwise, a ProcessingWithLeadingHistoryError will be raised.
- tabulate(table=None)
Create table from data of current dataset.
Every table is an object of type aspecd.table.Table and is passed as an argument to tabulate().
The information necessary to reproduce a table is stored in the representations attribute as an object of class aspecd.dataset.TableHistoryRecord.
- Parameters
table (aspecd.table.Table) – table created from the data of the current dataset
- Returns
table – table created from the data of the current dataset
- Return type
aspecd.table.Table
- Raises
TypeError – Raised when trying to tabulate without table
- to_dict(remove_empty=False)
Create dictionary containing public attributes of an object.
- Parameters
remove_empty (bool) – Whether to remove keys with empty values
Default: False
- Returns
public_attributes – Ordered dictionary containing the public attributes of the object
The order of attribute definition is preserved
Changed in version 0.6: New parameter remove_empty
Changed in version 0.9: Settings for properties to exclude and include are not traversed
- undo()
Revert last processing step.
Technically, the history pointer is decremented and, starting from the _origdata, all processing steps are reapplied to the data up to this point in history.
- Raises
aspecd.exceptions.UndoWithEmptyHistoryError – Raised when trying to undo with empty history
aspecd.exceptions.UndoAtBeginningOfHistoryError – Raised when trying to undo with history pointer at zero
aspecd.exceptions.UndoStepUndoableError – Raised when trying to undo a step of history that cannot be undone
- class aspecd.dataset.CalculatedDataset
Bases: aspecd.dataset.Dataset
Base class for datasets containing calculated data.
The dataset is one of the core elements of the ASpecD framework, containing both (numeric) data and the corresponding metadata, i.e. the information available about the data.
The public attributes of a dataset can be converted to a dict via aspecd.utils.ToDictMixin.to_dict().
- metadata
hierarchical key-value store of metadata
- add_reference(dataset=None)
Add a reference to another dataset to the list of references.
A reference is always an object of type aspecd.dataset.DatasetReference that will be automatically created from the dataset provided.
- Parameters
dataset (aspecd.dataset.Dataset) – dataset for which a reference should be added to the list of references
- Raises
aspecd.exceptions.MissingDatasetError – Raised if no dataset was provided
- analyse(analysis_step=None)
Apply analysis to dataset.
Every analysis step is an object of type aspecd.analysis.SingleAnalysisStep and is passed as an argument to analyse().
The information necessary to reproduce an analysis is stored in the analyses attribute as an object of class aspecd.dataset.AnalysisHistoryRecord. This record also contains a (deep) copy of the complete history of the dataset stored in history.
- Parameters
analysis_step (aspecd.analysis.SingleAnalysisStep) – analysis step to apply to the dataset
- Returns
analysis_step – analysis step applied to the dataset
- analyze(analysis_step=None)
Apply analysis to dataset.
Same method as analyse(), but for those preferring American English (AE) over British English (BE).
- annotate(annotation_=None)
Add annotation to dataset.
- Parameters
annotation_ (aspecd.annotation.DatasetAnnotation) – annotation to add to the dataset
- append_history_record(history_record)
Append history record to dataset history.
This method should never be called manually, but only from within classes of the ASpecD framework, at least as long as you are not interested in Orwellian History.
- Parameters
history_record (aspecd.history.HistoryRecord) – History record (of a processing step) to be appended.
Changed in version 0.2: Converted into a public method, due to needs of aspecd.processing.MultiProcessingStep
- delete_analysis(index=None)
Remove analysis step record from dataset.
- Parameters
index (int) – Number of analysis in analyses to delete
- delete_annotation(index=None)
Remove annotation record from dataset.
- Parameters
index (int) – Number of annotation in annotations to delete
- delete_representation(index=None)
Remove representation record from dataset.
- Parameters
index (int) – Number of representation in representations to delete
- export_to(exporter=None)
Export data and metadata.
This requires initialising an aspecd.io.DatasetExporter object first that is provided as an argument for this method.
Note
The same operation can be performed by calling the export_from() method of an aspecd.io.DatasetExporter object taking an aspecd.dataset.Dataset object as argument.
However, as usually the dataset is already at hand, first creating an instance of a respective exporter and then calling export_to() of the dataset is the preferred way.
- Parameters
exporter (aspecd.io.DatasetExporter) – Exporter writing data and metadata to specific output format
- from_dict(dict_=None)
Set properties from dictionary.
Only parameters in the dictionary that are valid properties of the class are set accordingly.
Note
In conjunction with the aspecd.dataset.Dataset.to_dict() method, this method allows dataset objects to be serialised and deserialised, i.e. it supports all kinds of storage to the persistence layer.
- Parameters
dict_ (dict) – Dictionary containing properties to set
- import_from(importer=None)
Import data and metadata contained in importer object.
This requires initialising an aspecd.io.DatasetImporter object first that is provided as an argument for this method.
Note
The same operation can be performed by calling the import_into() method of an aspecd.io.DatasetImporter object taking an aspecd.dataset.Dataset object as argument.
However, as usually one wants to continue working with a dataset, first creating an instance of a dataset and a respective importer and then calling import_from() of the dataset is the preferred way.
- Parameters
importer (aspecd.io.DatasetImporter) – Importer containing data and metadata read from some source
- load(filename=None)
Load dataset object from persistence layer.
The dataset will be loaded from a file conforming to the ASpecD dataset format (adf). For details, see the aspecd.io.AdfExporter class.
- property package_name¶
Return package name.
The name of the package the dataset is implemented in is a crucial detail for writing the history. The value is set automatically and is read-only.
- plot(plotter=None)¶
Perform plot with data of current dataset.
Every plotter is an object of type
aspecd.plotting.Plotter
and is passed as an argument toplot()
.The information necessary to reproduce a plot is stored in the
representations
attribute as object of classaspecd.dataset.PlotHistoryRecord
. This record contains as well a (deep) copy of the complete history of the dataset stored inhistory
. Besides being a necessary prerequisite to reproduce a plot, this allows to automatically recreate plots requiring different incompatible preprocessing steps in arbitrary order.- Parameters
plotter (
aspecd.plotting.Plotter
) – plot to perform with data of current dataset- Returns
plotter – plot performed on the current dataset
- Return type
- Raises
aspecd.exceptions.MissingPlotterError – Raised when trying to plot without plotter
- process(processing_step=None)¶
Apply processing step to dataset.
Every processing step is an object of type
aspecd.processing.SingleProcessingStep
and is passed as argument toprocess()
.Calling this function ensures that the history record is added to the dataset as well as a few basic checks are performed such as for leading history, meaning that the
_history_pointer
is not set to the current tip of the history of the dataset. In this case, an error is raised.Note
If processing_step is undoable, all previous plots stored in the list of representations will be removed, as these plots cannot be reproduced due to a change in
_origdata
.- Parameters
processing_step (
aspecd.processing.SingleProcessingStep
) – processing step to apply to the dataset- Returns
processing_step – processing step applied to the dataset
- Return type
aspecd.processing.SingleProcessingStep
- Raises
aspecd.exceptions.ProcessingWithLeadingHistoryError – Raised when trying to process with leading history
- redo()¶
Reapply previously undone processing step.
- Raises
aspecd.exceptions.RedoAlreadyAtLatestChangeError – Raised when trying to redo with the history already at its latest change, e.g. with an empty history
- remove_reference(dataset_id=None)¶
Remove a reference to another dataset from the list of references.
A reference is always an object of type aspecd.dataset.DatasetReference that was automatically created from the respective dataset when adding the reference.
- Parameters
dataset_id (string) – ID of the dataset the reference should be removed for
- Raises
aspecd.exceptions.MissingDatasetError – Raised if no dataset ID was provided
- save(filename=None)¶
Save dataset to persistence layer.
The dataset will be saved in ASpecD dataset format (adf). For details, see the aspecd.io.AdfExporter class.
- strip_history()¶
Remove leading history, if any.
If a dataset has a leading history, i.e., its history pointer does not point to the last entry of the history, and you want to perform a processing step on this very dataset, you first need to strip its history. Otherwise, a ProcessingWithLeadingHistoryError will be raised.
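The interplay of history pointer, leading history, and stripping can be illustrated with a minimal sketch using only the standard library. ToyHistory and LeadingHistoryError are hypothetical stand-ins and not part of ASpecD; the sketch only mirrors the behaviour described above and is not the actual implementation.

```python
class LeadingHistoryError(Exception):
    """Hypothetical stand-in for ProcessingWithLeadingHistoryError."""


class ToyHistory:
    """Toy model (assumption): a history list plus a pointer to the current state."""

    def __init__(self):
        self.entries = []
        self.pointer = -1  # -1 means: no processing step applied yet

    def has_leading_history(self):
        # Leading history: entries exist beyond the current pointer.
        return self.pointer < len(self.entries) - 1

    def process(self, step):
        if self.has_leading_history():
            raise LeadingHistoryError("strip the history first")
        self.entries.append(step)
        self.pointer += 1

    def undo(self):
        if self.pointer < 0:
            raise IndexError("nothing to undo")
        self.pointer -= 1

    def strip_history(self):
        # Drop all entries after the pointer.
        del self.entries[self.pointer + 1:]


history = ToyHistory()
history.process("step A")
history.process("step B")
history.undo()  # pointer now at "step A"; "step B" is leading history

try:
    history.process("step C")
    blocked = False
except LeadingHistoryError:
    blocked = True

history.strip_history()
history.process("step C")  # works after stripping
```

In this toy model, stripping discards the undone step for good, which is why it is kept as an explicit call rather than done silently when processing.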
- tabulate(table=None)¶
Create table from data of current dataset.
Every table is an object of type aspecd.table.Table and is passed as an argument to tabulate().
The information necessary to reproduce a table is stored in the representations attribute as an object of class aspecd.dataset.TableHistoryRecord.
- Parameters
table (aspecd.table.Table) – table created from the data of the current dataset
- Returns
table – table created from the data of the current dataset
- Return type
aspecd.table.Table
- Raises
TypeError – Raised when trying to tabulate without table
- to_dict(remove_empty=False)¶
Create dictionary containing public attributes of an object.
- Parameters
remove_empty (bool) – Whether to remove keys with empty values
Default: False
- Returns
public_attributes – Ordered dictionary containing the public attributes of the object
The order of attribute definition is preserved
- Return type
collections.OrderedDict
Changed in version 0.6: New parameter remove_empty
Changed in version 0.9: Settings for properties to exclude and include are not traversed
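The general idea of collecting public attributes in their order of definition, optionally dropping empty ones, can be sketched as follows. The function public_attributes and the class ToyAxis are hypothetical, and what counts as "empty" (None or an empty string, list, dict, or tuple) is an assumption made for this sketch, not the rule used by ASpecD.

```python
from collections import OrderedDict


def public_attributes(obj, remove_empty=False):
    """Collect attributes without leading underscore in order of definition."""
    result = OrderedDict()
    for key, value in vars(obj).items():
        if key.startswith("_"):
            continue  # non-public attribute
        # Assumption: "empty" means None or a container/string of length zero.
        is_empty = value is None or (
            isinstance(value, (str, list, dict, tuple)) and len(value) == 0
        )
        if remove_empty and is_empty:
            continue
        result[key] = value
    return result


class ToyAxis:
    """Hypothetical class with public and non-public attributes."""

    def __init__(self):
        self.quantity = "magnetic field"
        self.unit = "mT"
        self.label = ""
        self._cache = None


full = public_attributes(ToyAxis())
compact = public_attributes(ToyAxis(), remove_empty=True)
```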
- undo()¶
Revert last processing step.
Actually, the history pointer is decremented and, starting from the _origdata, all processing steps are reapplied to the data up to this point in history.
- Raises
aspecd.exceptions.UndoWithEmptyHistoryError – Raised when trying to undo with empty history
aspecd.exceptions.UndoAtBeginningOfHistoryError – Raised when trying to undo with history pointer at zero
aspecd.exceptions.UndoStepUndoableError – Raised when trying to undo an undoable step of history, i.e. a step that cannot be undone
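Undoing by replaying the history from the original data, as described above, can be sketched in a few lines. The function replay and the two toy processing steps are hypothetical; plain lists stand in for the numerical data, and the sketch makes no claim about how ASpecD implements this internally.

```python
import copy


def replay(origdata, steps, pointer):
    """Reapply steps[0] to steps[pointer] to a copy of the original data."""
    data = copy.deepcopy(origdata)
    for step in steps[: pointer + 1]:
        data = step(data)
    return data


def subtract_offset(values):
    return [value - 1.0 for value in values]


def double(values):
    return [value * 2.0 for value in values]


origdata = [1.0, 2.0, 3.0]
steps = [subtract_offset, double]
pointer = len(steps) - 1

current = replay(origdata, steps, pointer)  # both steps applied
pointer -= 1                                # "undo": one step back in history
undone = replay(origdata, steps, pointer)   # only the first step applied
```

Because the replay always starts from an untouched copy of the original data, no processing step needs to know how to invert itself.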
- class aspecd.dataset.DatasetReference¶
Bases:
aspecd.utils.ToDictMixin
Reference to a given dataset.
Often, one dataset needs to reference other datasets. A typical example would be a simulation stored in a dataset of class aspecd.dataset.CalculatedDataset that needs to reference the corresponding experimental data, stored in a dataset of class aspecd.dataset.ExperimentalDataset. Vice versa, the experimental dataset might want to store a reference to one (or more) simulations.
As the dataset ID alone is not sufficient, both the ID and the history of the dataset at the time the reference was created are stored in the reference and restored upon creating a (new) dataset. Hence, at least the data of the dataset returned should be identical to the data of the original dataset the reference was created for.
- type¶
type of dataset
Will be inferred directly from dataset when creating a reference from a given dataset and is used to return a dataset of same type.
- Type
- Raises
aspecd.exceptions.MissingDatasetError – Raised if no dataset was provided when calling from_dataset()
- from_dataset(dataset=None)¶
Create dataset reference from dataset.
- Parameters
dataset (aspecd.dataset.Dataset) – Dataset the reference should be created for
- Raises
aspecd.exceptions.MissingDatasetError – Raised if no dataset was provided
- to_dataset()¶
Create (new) dataset from reference.
The history stored will be applied to the newly created dataset, hence the dataset should be in the same state with respect to processing steps as the original dataset was upon creating the reference.
- Returns
dataset – Dataset with identical data to the one the reference has been created for
- Return type
aspecd.dataset.Dataset
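The round trip from dataset to reference and back can be sketched as follows. ToyDataset and ToyDatasetReference are hypothetical stand-ins with only an ID and a history list; the real classes carry more information, and the use of ValueError instead of the ASpecD exception is a simplification of this sketch.

```python
import copy


class ToyDataset:
    """Hypothetical dataset with an ID and a list of history entries."""

    def __init__(self, id_=""):
        self.id = id_
        self.history = []


class ToyDatasetReference:
    """Hypothetical reference storing ID and history of a dataset."""

    def __init__(self):
        self.id = ""
        self.history = []

    def from_dataset(self, dataset=None):
        if dataset is None:
            raise ValueError("no dataset provided")
        self.id = dataset.id
        # Freeze the history as it is at the time the reference is created.
        self.history = copy.deepcopy(dataset.history)

    def to_dataset(self):
        dataset = ToyDataset(self.id)
        dataset.history = copy.deepcopy(self.history)
        return dataset


original = ToyDataset("measurement-1")
original.history.append("baseline correction")

reference = ToyDatasetReference()
reference.from_dataset(original)

original.history.append("normalisation")  # later change to the original
restored = reference.to_dataset()
```

The restored dataset reflects the state at the time the reference was created, not the later state of the original.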
- from_dict(dict_=None)¶
Set properties from dictionary.
Only parameters in the dictionary that are valid properties of the class are set accordingly.
- Parameters
dict_ (dict) – Dictionary containing properties to set
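Setting only those dictionary entries that correspond to existing properties can be sketched like this. ToyReference and its attributes are hypothetical and chosen for illustration only; the sketch is not the ASpecD implementation.

```python
class ToyReference:
    """Hypothetical class with a few public properties."""

    def __init__(self):
        self.type = ""
        self.id = ""
        self.history = []

    def from_dict(self, dict_=None):
        if not dict_:
            return
        for key, value in dict_.items():
            # Keys that are no attributes of the object are skipped silently.
            if key in vars(self):
                setattr(self, key, value)


reference = ToyReference()
reference.from_dict({"id": "sample-42", "unknown_key": "ignored"})
```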
- to_dict(remove_empty=False)¶
Create dictionary containing public attributes of an object.
- Parameters
remove_empty (bool) – Whether to remove keys with empty values
Default: False
- Returns
public_attributes – Ordered dictionary containing the public attributes of the object
The order of attribute definition is preserved
- Return type
collections.OrderedDict
Changed in version 0.6: New parameter remove_empty
Changed in version 0.9: Settings for properties to exclude and include are not traversed
- class aspecd.dataset.DatasetFactory¶
Bases:
object
Factory for creating dataset objects based on the source provided.
Particularly in case of recipe-driven data analysis (cf. tasks), there is a need to automatically retrieve datasets using nothing more than a source string that can be, e.g., a path or LOI.
Packages derived from ASpecD should implement a DatasetFactory inheriting from aspecd.dataset.DatasetFactory and overriding the protected method _create_dataset(). The only task of this protected method is to provide the correct dataset object, in most cases an instance of a class inheriting from aspecd.dataset.ExperimentalDataset.
- importer_factory¶
ImporterFactory instance used for importing datasets
- Raises
aspecd.exceptions.MissingSourceError – Raised if no source is provided
aspecd.exceptions.MissingImporterFactoryError – Raised if no ImporterFactory is available
- get_dataset(source='', importer='', parameters=None)¶
Return dataset object for dataset specified by its source.
The import of data into the dataset is handled using an instance of aspecd.io.DatasetImporterFactory.
The actual code for deciding which type of dataset to return in what case should be implemented in the non-public method _create_dataset() in any package based on the ASpecD framework.
- Parameters
source (str) – String describing the source of the dataset
May be a filename or path, a URL/URI, a LOI, or similar
importer (str) – Name of the importer to use for importing the dataset
Default: ''
New in version 0.2.
parameters (dict) – Additional parameters for controlling the import
Default: None
New in version 0.2.
- Returns
dataset – Dataset object of appropriate class
- Return type
aspecd.dataset.Dataset
- Raises
aspecd.exceptions.MissingSourceError – Raised if no source is provided
aspecd.exceptions.MissingImporterFactoryError – Raised if no ImporterFactory is available
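The division of labour between get_dataset() and the overridden _create_dataset() can be illustrated with a self-contained sketch of the pattern. All Toy* classes, MyPackageDatasetFactory, and the method import_into() are hypothetical and do not reproduce the ASpecD API; built-in exceptions stand in for the ASpecD exceptions listed above.

```python
class ToyDataset:
    """Hypothetical base dataset."""

    def __init__(self):
        self.id = ""


class ToyExperimentalDataset(ToyDataset):
    """Hypothetical dataset class a derived package would provide."""


class ToyImporterFactory:
    """Hypothetical importer factory; the 'import' only records the source."""

    def import_into(self, dataset, source):
        dataset.id = source


class ToyDatasetFactory:
    """Generic part: checks, dataset creation, and import."""

    def __init__(self):
        self.importer_factory = None

    def get_dataset(self, source=""):
        if not source:
            raise ValueError("no source provided")
        if self.importer_factory is None:
            raise RuntimeError("no importer factory available")
        dataset = self._create_dataset(source=source)
        self.importer_factory.import_into(dataset, source)
        return dataset

    def _create_dataset(self, source=""):
        return ToyDataset()


class MyPackageDatasetFactory(ToyDatasetFactory):
    """What a derived package adds: only the choice of the dataset class."""

    def _create_dataset(self, source=""):
        return ToyExperimentalDataset()


factory = MyPackageDatasetFactory()
factory.importer_factory = ToyImporterFactory()
dataset = factory.get_dataset(source="/path/to/measurement")
```

Keeping the decision about the dataset class in one small method means a derived package does not have to touch the import logic at all.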
- class aspecd.dataset.Data(data=array([], dtype=float64), axes=None, calculated=False)¶
Bases:
aspecd.utils.ToDictMixin
Unit containing both, numeric data and corresponding axes.
The data class ensures consistency in terms of dimensions between numerical data and axes.
- Parameters
data (numpy.array) – Numerical data
axes (list) – List of objects of type aspecd.dataset.Axis
The number of axes needs to be consistent with the dimensions of data.
Axes will be set automatically when setting data. Hence, the easiest way is to first set data and only afterwards set axis values.
calculated (bool) – Indicator for the origin of the numerical data (calculation or experiment).
- calculated¶
Indicate whether numeric data are calculated rather than experimentally recorded
- Type
bool
- Raises
aspecd.exceptions.AxesCountError – Raised if number of axes is inconsistent with data dimensions
aspecd.exceptions.AxesValuesInconsistentWithDataError – Raised if axes values are inconsistent with data
- property data¶
Get or set (numeric) data.
Note
If you set data that have different dimensions to the data previously stored in the dataset, the axes values will be set to an array with indices corresponding to the size of the respective data dimension. You will most probably assign proper axis values afterwards. On the other hand, all other information stored in the axis object will be retained, namely quantity, unit, and label.
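The behaviour described in the note can be mimicked with a one-dimensional toy model. ToyData and ToyAxis are hypothetical, plain lists replace numpy arrays, and a single axis is used for simplicity; this is a sketch of the described behaviour, not of the actual class.

```python
class ToyAxis:
    """Hypothetical axis with values and descriptive metadata."""

    def __init__(self):
        self.values = []
        self.quantity = ""
        self.unit = ""
        self.label = ""


class ToyData:
    """One-dimensional toy version with a single axis (assumption)."""

    def __init__(self):
        self._data = []
        self.axis = ToyAxis()

    @property
    def data(self):
        return self._data

    @data.setter
    def data(self, values):
        old_length = len(self._data)
        self._data = list(values)
        if len(self._data) != old_length:
            # Size changed: fall back to plain indices as axis values,
            # but leave quantity, unit, and label untouched.
            self.axis.values = list(range(len(self._data)))


toy = ToyData()
toy.data = [0.1, 0.4, 0.2]
toy.axis.values = [340.0, 341.0, 342.0]
toy.axis.quantity = "magnetic field"
toy.axis.unit = "mT"

toy.data = [0.1, 0.4, 0.2, 0.3, 0.5]  # different size than before
```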
- property axes¶
Get or set axes.
If you set axes, they will be checked for consistency with the data. Therefore, first set the data and only afterwards the axes, with values corresponding to the dimensions of the data.
- Raises
aspecd.exceptions.AxesCountError – Raised if number of axes is inconsistent with data dimensions
aspecd.exceptions.AxesValuesInconsistentWithDataError – Raised if axes values are inconsistent with data dimensions
- from_dict(dict_=None)¶
Set properties from dictionary, e.g., from serialised dataset.
Only parameters in the dictionary that are valid properties of the class are set accordingly.
The list of axes is handled appropriately.
- Parameters
dict_ (dict) – Dictionary containing properties to set
- to_dict(remove_empty=False)¶
Create dictionary containing public attributes of an object.
- Parameters
remove_empty (bool) – Whether to remove keys with empty values
Default: False
- Returns
public_attributes – Ordered dictionary containing the public attributes of the object
The order of attribute definition is preserved
- Return type
collections.OrderedDict
Changed in version 0.6: New parameter remove_empty
Changed in version 0.9: Settings for properties to exclude and include are not traversed
- class aspecd.dataset.Axis¶
Bases:
aspecd.utils.ToDictMixin
Axis for data in a dataset.
An axis always contains both the numerical values and the metadata necessary to create axis labels and to make sense of the numerical information.
- quantity¶
quantity of the numerical data, usually used as first part of an automatically generated axis label
- Type
string
- unit¶
unit of the numerical data, usually used as second part of an automatically generated axis label
- Type
string
- symbol¶
symbol for the quantity of the numerical data, usually used as first part of an automatically generated axis label
- Type
string
- label¶
manual label for the axis, particularly useful in cases where no quantity and unit are provided or should be overwritten.
- Type
string
Note
There are three alternative ways of writing axis labels: one using the quantity name and the unit, one using the quantity symbol and the unit, and one using both quantity name and symbol, usually separated by a comma. Quantity and unit shall always be separated by a slash. Which way you prefer is a matter of personal taste and the given context.
- Raises
ValueError – Raised when trying to set axis values to another type than numpy array
IndexError – Raised when trying to set axis values to an array with more than one dimension. Raised if index does not have the same length as values.
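The three ways of writing axis labels mentioned in the note above can be sketched with a small helper. The function axis_label and the example values are hypothetical; how ASpecD plotters actually assemble and format labels is not covered here.

```python
def axis_label(quantity="", symbol="", unit=""):
    """Build an axis label; name part and unit are separated by a slash."""
    # Quantity name and symbol, if both given, are separated by a comma.
    name = ", ".join(part for part in (quantity, symbol) if part)
    if name and unit:
        return f"{name} / {unit}"
    return name or unit


by_name = axis_label(quantity="magnetic field", unit="mT")
by_symbol = axis_label(symbol="B0", unit="mT")
by_both = axis_label(quantity="magnetic field", symbol="B0", unit="mT")
```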
- property values¶
Get or set the numerical axis values.
Values need to be a one-dimensional numpy array. Trying to set values to either a different type that cannot be converted to a numpy array or a numpy array with more than one dimension will raise a corresponding error.
- Raises
ValueError – Raised if axis values are of wrong type
IndexError – Raised if axis values are of wrong dimension, i.e. not a vector
- property index¶
Get or set the index corresponding to the axis values.
The index is a list of data labels for each element in the axis values, similar to the index in a pandas.Series. However, in contrast to pandas, usually the index is a list of empty strings.
The main reason for introducing the index is to allow for tabular representation of (calculated) datasets, e.g. as a result of an analysis, either including multiple values or spanning multiple datasets.
- Raises
IndexError – Raised if index does not have the same length as values
New in version 0.5.
- property equidistant¶
Return whether the axis values are equidistant.
True if the axis values are equidistant, False otherwise. None in case of no axis values.
The property is set automatically if axis values are set and therefore read-only.
While simple plotting of data values against non-uniform axes with non-equidistant values is usually straightforward, many processing steps rely on equidistant axis values in their simplest possible implementation.
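A check for equidistant values could look like the following sketch. The function is_equidistant is hypothetical; the tolerances and the treatment of fewer than three values are assumptions of this sketch, as the criterion used by ASpecD is not stated here.

```python
import math


def is_equidistant(values, rel_tol=1e-9):
    """Return None for no values, else whether all spacings are equal."""
    if len(values) == 0:
        return None
    if len(values) < 3:
        # Assumption: one or two values are trivially equidistant.
        return True
    first_step = values[1] - values[0]
    # Compare every further spacing with the first one, within tolerance.
    return all(
        math.isclose(b - a, first_step, rel_tol=rel_tol, abs_tol=1e-12)
        for a, b in zip(values[1:], values[2:])
    )


linear = is_equidistant([0.0, 0.5, 1.0, 1.5])
logarithmic = is_equidistant([1, 10, 100])
empty = is_equidistant([])
```

Comparing with a tolerance rather than for exact equality matters in practice, because axis values computed in floating-point arithmetic rarely have bitwise identical spacings.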
- from_dict(dict_=None)¶
Set properties from dictionary, e.g., from serialised dataset.
Only parameters in the dictionary that are valid properties of the class are set accordingly.
- Parameters
dict_ (dict) – Dictionary containing properties to set
- to_dict(remove_empty=False)¶
Create dictionary containing public attributes of an object.
- Parameters
remove_empty (bool) – Whether to remove keys with empty values
Default: False
- Returns
public_attributes – Ordered dictionary containing the public attributes of the object
The order of attribute definition is preserved
- Return type
collections.OrderedDict
Changed in version 0.6: New parameter remove_empty
Changed in version 0.9: Settings for properties to exclude and include are not traversed
- class aspecd.dataset.DeviceData¶
Bases:
aspecd.dataset.Data
Additional data from devices recorded parallel to the actual data.
The dataset concept (see aspecd.dataset.Dataset) rests on the assumption that there is one particular set of data that can be regarded as the actual or primary data of the dataset. However, in many cases, parallel to these actual data, other data are recorded as well, be it readouts from monitors or the like.
Usually, these additional data will share one axis with the primary data of the dataset. However, this is not necessarily the case. Furthermore, one dataset may contain an arbitrary number of additional device data entries.
Technically speaking, aspecd.dataset.DeviceData are a special or extended type of aspecd.dataset.Data, i.e. a unit containing both numerical data and corresponding axes. However, this class extends that with metadata specific to the device the additional data have been recorded with. Why store metadata here and not in the aspecd.dataset.Dataset.metadata property? The latter is more concerned with an overall description of the experimental setup in sufficient detail, while the metadata contained in this class are more device-specific. Potential contents of the metadata here are internal device IDs, addresses for communication, and the like. Ultimately, the metadata contained herein are those that can be relevant mainly for debugging purposes or sanity checks of experiments.
Example
A real example for additional data recorded in spectroscopy comes from time-resolved EPR (tr-EPR) spectroscopy: Here, you usually record 2D data as function of magnetic field and time, i.e. a full time profile per magnetic field point. As this is a non-standard method, often setups are controlled by lab-written software and allow for monitoring parameters not usually recorded with commercial setups. In this particular case, this can be the time stamp and microwave frequency for each individual recorded time trace, and the Python trEPR package not only handles tr-EPR data, but is capable of dealing with both additional types of data for analysis purposes.
- metadata¶
Metadata of the device used to record the additional data
Note
For actual devices you will want to create dedicated classes inheriting from aspecd.metadata.Device and extending the available attributes.
- calculated¶
Indicator for the origin of the numerical data (calculation or experiment).
Default: False
- Type
bool
New in version 0.9.
- property axes¶
Get or set axes.
If you set axes, they will be checked for consistency with the data. Therefore, first set the data and only afterwards the axes, with values corresponding to the dimensions of the data.
- Raises
aspecd.exceptions.AxesCountError – Raised if number of axes is inconsistent with data dimensions
aspecd.exceptions.AxesValuesInconsistentWithDataError – Raised if axes values are inconsistent with data dimensions
- property data¶
Get or set (numeric) data.
Note
If you set data that have different dimensions to the data previously stored in the dataset, the axes values will be set to an array with indices corresponding to the size of the respective data dimension. You will most probably assign proper axis values afterwards. On the other hand, all other information stored in the axis object will be retained, namely quantity, unit, and label.
- from_dict(dict_=None)¶
Set properties from dictionary, e.g., from serialised dataset.
Only parameters in the dictionary that are valid properties of the class are set accordingly.
The list of axes is handled appropriately.
- Parameters
dict_ (dict) – Dictionary containing properties to set
- to_dict(remove_empty=False)¶
Create dictionary containing public attributes of an object.
- Parameters
remove_empty (bool) – Whether to remove keys with empty values
Default: False
- Returns
public_attributes – Ordered dictionary containing the public attributes of the object
The order of attribute definition is preserved
- Return type
collections.OrderedDict
Changed in version 0.6: New parameter remove_empty
Changed in version 0.9: Settings for properties to exclude and include are not traversed