You're reading the documentation for a development version. For the latest released version, please have a look at v0.11.
aspecd.metadata module
Metadata: Information on numeric data stored in a structured way.
Metadata are one key concept of the ASpecD framework, and they come in different flavours. Perhaps the easiest to grasp is metadata that accompany measurements—and are often stored separately from the data in metadata files. Other types of metadata are those of processing steps or representations. This module is only concerned with the first kind of metadata, information accompanying data and usually recorded at the same time as the numeric data.
Each aspecd.dataset.Dataset contains an aspecd.dataset.Dataset.metadata attribute that is in turn an object of class aspecd.metadata.DatasetMetadata. This latter object is composed of different metadata objects, each inheriting from aspecd.metadata.Metadata. Upon import of a dataset, the importer class needs to make sure that as many metadata as possible are read and imported into the dataset as well.
Generally speaking, metadata can be thought of as key–value stores that might be hierarchically structured and thus cascaded. Nevertheless, classes have some advantages over using simple dictionaries, as there are certain operations that are common to some or all types of metadata.
Metadata classes
The classes implemented in this module can be grouped into general metadata classes, concrete metadata classes, and metadata classes for datasets. Each group is described briefly below.
General metadata classes
The most basic class is aspecd.metadata.PhysicalQuantity, storing all relevant information about a physical quantity in an easily accessible way. Eventually, this should allow testing for commensurable quantities and converting between units.
Next is aspecd.metadata.Metadata as a generic class for all metadata containers. All other classes storing metadata, particularly those storing metadata accompanying measurements and therefore ending up in the metadata of an aspecd.dataset.Dataset, should inherit from this class.
Concrete metadata classes
Currently, the ASpecD framework contains three classes for actual metadata of experimental datasets and one class for calculated datasets. For experimental datasets, these are aspecd.metadata.Measurement for storing general information about a given measurement, aspecd.metadata.Sample for all information regarding the sample investigated, and aspecd.metadata.TemperatureControl for information about the temperature control (including whether temperature has been actively controlled at all during the measurement). For calculated datasets, aspecd.metadata.Calculation stores details about the calculation underlying the (numeric) data.
Metadata classes for datasets
The attribute metadata in the aspecd.dataset.ExperimentalDataset is of type aspecd.metadata.ExperimentalDatasetMetadata and contains the three metadata classes for experimental datasets named above. Derived packages should extend this class accordingly.
Similarly, the attribute metadata in the aspecd.dataset.CalculatedDataset is of type aspecd.metadata.CalculatedDatasetMetadata and contains the respective metadata class named above. Derived packages should extend this class accordingly wherever necessary.
Converting metadata from and to dictionaries
All classes inheriting from aspecd.metadata.Metadata provide a method from_dict() for setting the attributes of the objects. This allows for easy use with metadata read from a file into a dict.
Similarly, all classes inheriting from aspecd.metadata.Metadata as well as aspecd.metadata.PhysicalQuantity provide a method to_dict() that returns a dictionary of all public attributes of the respective object. This makes it possible to write metadata to a file.
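The round trip between dictionaries and metadata objects can be illustrated with a small stand-alone sketch. The class SketchMetadata and its attributes are hypothetical and not part of ASpecD. The sketch only mimics the behaviour described above and makes no claim about how aspecd.metadata.Metadata is actually implemented.

```python
class SketchMetadata:
    """Hypothetical stand-in for a metadata class (not the ASpecD code)."""

    def __init__(self, dict_=None):
        self.name = ""
        self.purpose = ""
        self._internal = "not exported"  # private, hence skipped by to_dict()
        if dict_:
            self.from_dict(dict_)

    def from_dict(self, dict_=None):
        # Only keys matching an existing public attribute are used.
        for key, value in (dict_ or {}).items():
            if not key.startswith("_") and key in vars(self):
                setattr(self, key, value)

    def to_dict(self):
        # Return all public (non-underscore) attributes.
        return {
            key: value
            for key, value in vars(self).items()
            if not key.startswith("_")
        }


metadata = SketchMetadata({"name": "sample 42", "unknown": "ignored"})
as_dict = metadata.to_dict()
```

Keys without a matching attribute (here "unknown") are dropped, so a dictionary read from a file can contain more entries than the class knows about.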
Mapping metadata read from external sources
Generally, the representation and structure of metadata within the dataset of the ASpecD framework and each application derived from it is separate from the way the very same metadata are organised in files written mostly during data acquisition. To map the structure obtained by reading a metadata file to the internal representation within the dataset, as given by the aspecd.metadata.ExperimentalDatasetMetadata class, there exists a generic mapper class aspecd.metadata.MetadataMapper. This way, you can separate the representations of metadata and support mapping for different versions of metadata files.
Note
As mappings can become quite complicated and specifying lists of mappings for the aspecd.metadata.MetadataMapper by hand can become quite tedious, you can specify metadata mapping recipes in YAML files in a rather simple syntax. See the documentation of the aspecd.metadata.MetadataMapper class and its aspecd.metadata.MetadataMapper.create_mappings() method for details.
This method and the underlying ideas are heavily based on concepts and code developed by J. Popp for use within the trEPR Python package.
Metadata in packages based on the ASpecD framework
The dataset as unit of (numerical) data and metadata is a key concept of the ASpecD framework and a necessary prerequisite for a semantic understanding within the routines. Every measurement (or calculation) produces (raw) data that are useless without additional information, such as experimental parameters. This additional information is termed “metadata” within the ASpecD framework.
In addition to combining numerical data and metadata, a dataset provides a common structure, unifying the different file formats used as sources for both data and metadata. Hence, the actual data format does not matter, greatly facilitating dealing with data from different sources (and even different kinds of data).
Therefore, if you develop a new package based on the ASpecD framework, one of the first and most important steps is to create a (hierarchical) metadata structure for your datasets. This requires a thorough understanding of the spectroscopic method you develop the package for and most probably several years of practical experience in the lab. Good sources of inspiration are the vendor file formats, which usually store instrument parameters and the like in some form. If you are lucky, you can actually access this information. If not, you may need to store these metadata in an additional external file that gets written manually during data recording.
Some basic metadata that are rarely contained within vendor file formats, as they concern the actual sample measured, as well as metadata for calculated datasets can be found in the aspecd.metadata.ExperimentalDatasetMetadata and aspecd.metadata.CalculatedDatasetMetadata.
Module documentation
- class aspecd.metadata.PhysicalQuantity(string=None, value=None, unit=None)
Bases:
ToDictMixin
Class for storing all relevant information about a physical quantity.
A physical quantity, Q, always consists of a value, {Q}, and a corresponding unit, [Q], hence:
Q = {Q} [Q] .
See, e.g., the “IUPAC Green Book” for further details.
To get a string representation of a physical quantity, i.e., its value and unit separated by a single space, simply use str().
To set value (and unit) from a string, either use from_string() or supply the string as argument while instantiating the object. Make sure the value part of the string is convertible to float. Otherwise, a ValueError will be raised.
- Parameters:
string (str) – String containing value and unit, separated by whitespace
Will be used to set value and unit correspondingly.
If no second element separated by whitespace is present, only value will be set.
value (float) – Numerical value
unit (str) – String containing the unit of the corresponding value.
SI units are preferred, and their abbreviations should be used.
- Raises:
ValueError – Raised if value is not a float
- property value
Get or set the value of a physical quantity.
A value is always a float.
- Raises:
ValueError – Raised if value is not a (scalar) float
- commensurable(physical_quantity=None)
Check whether two physical quantities are commensurable.
There are two criteria for physical quantities to be commensurable. Either they have the same unit, or they have the same dimension. In the latter case, a unit conversion is generally possible.
- Parameters:
physical_quantity (aspecd.metadata.PhysicalQuantity) – physical quantity to test commensurability with
- Returns:
commensurable – True if both physical quantities have the same unit or dimension, False otherwise
- Return type:
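The two criteria can be sketched as follows. This is a stand-alone illustration only: the lookup table DIMENSIONS and the function sketch_commensurable are hypothetical, and nothing here is meant to describe how ASpecD determines the dimension of a unit.

```python
# Hypothetical lookup table from unit symbol to dimension; illustrative only.
DIMENSIONS = {
    "m": "length",
    "mm": "length",
    "s": "time",
    "ms": "time",
    "K": "temperature",
}


def sketch_commensurable(unit_a, unit_b):
    """Return True if both units are identical or share a dimension."""
    if unit_a == unit_b:
        return True  # first criterion: same unit
    dimension_a = DIMENSIONS.get(unit_a)
    dimension_b = DIMENSIONS.get(unit_b)
    # Second criterion: same (known) dimension.
    return dimension_a is not None and dimension_a == dimension_b
```

With this table, "mm" and "m" are commensurable (a unit conversion is possible), whereas "m" and "s" are not.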
- from_string(string)
Set value and unit from string.
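As a rough sketch of the string convention described for this class (value and unit separated by whitespace, value convertible to float), consider the following stand-alone helpers. The function names are hypothetical and the code is not taken from ASpecD. In particular, ignoring anything after the second element is an assumption of this sketch.

```python
def sketch_parse_quantity(string):
    """Split 'value unit' into a float value and a unit string."""
    parts = string.split()
    if not parts:
        raise ValueError("Empty string, no value to convert")
    value = float(parts[0])  # raises ValueError if not convertible
    # Without a second element, only the value is set.
    unit = parts[1] if len(parts) > 1 else ""
    return value, unit


def sketch_format_quantity(value, unit):
    """Return value and unit separated by a single space."""
    return f"{value} {unit}"
```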
- from_dict(dict_=None)
Set properties from dictionary.
Only parameters in the dictionary that are valid properties of the class are set accordingly.
- Parameters:
dict_ (dict) – Dictionary containing properties to set
- to_dict(remove_empty=False)
Create dictionary containing public attributes of an object.
- Parameters:
remove_empty (bool) – Whether to remove keys with empty values
Default: False
- Returns:
public_attributes – Ordered dictionary containing the public attributes of the object
The order of attribute definition is preserved
- Return type:
Changed in version 0.6: New parameter remove_empty
Changed in version 0.9: Settings for properties to exclude and include are not traversed
Changed in version 0.9.1: Dictionaries get copied before traversing, as otherwise, the special variables __dict__ and __0dict__ are modified, which may result in strange behaviour.
Changed in version 0.9.2: Dictionaries do not get copied by default, but there is a private method that can be overridden in derived classes to copy the dictionary.
- class aspecd.metadata.Metadata(dict_=None)
Bases:
ToDictMixin
General metadata class.
Metadata can be set from dict upon initialisation.
Metadata can be converted to dict via aspecd.utils.ToDictMixin.to_dict().
- from_dict(dict_=None)
Set properties from dictionary, e.g., from metadata.
Only parameters in the dictionary that are valid properties of the class are set accordingly.
Keys in the dictionary are converted to lower case and spaces converted to underscores to fit the naming scheme of attributes.
If a property is of class aspecd.metadata.PhysicalQuantity, it is set accordingly.
- Parameters:
dict_ (dict) – Dictionary containing properties to set
- to_dict(remove_empty=False)
Create dictionary containing public attributes of an object.
- Parameters:
remove_empty (bool) – Whether to remove keys with empty values
Default: False
- Returns:
public_attributes – Ordered dictionary containing the public attributes of the object
The order of attribute definition is preserved
- Return type:
Changed in version 0.6: New parameter remove_empty
Changed in version 0.9: Settings for properties to exclude and include are not traversed
Changed in version 0.9.1: Dictionaries get copied before traversing, as otherwise, the special variables __dict__ and __0dict__ are modified, which may result in strange behaviour.
Changed in version 0.9.2: Dictionaries do not get copied by default, but there is a private method that can be overridden in derived classes to copy the dictionary.
- class aspecd.metadata.TemperatureControl(dict_=None, temperature='')
Bases:
Metadata
Temperature control is very often found in spectroscopy.
This class provides general means of storing relevant parameters for temperature control.
- temperature
value and unit of the temperature set
- property controlled
Has temperature been actively controlled during measurement?
Read-only property automatically set when setting a temperature value.
- from_dict(dict_=None)
Set properties from dictionary, e.g., from metadata.
Only parameters in the dictionary that are valid properties of the class are set accordingly.
If “controlled” is set to False in the dictionary, the temperature value and unit will be cleared.
The value of the temperature key needs to be a string.
- Parameters:
dict_ (dict) – Dictionary with keys corresponding to properties of the class.
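The interplay between the temperature value and the read-only controlled property can be sketched with a hypothetical stand-in class. SketchTemperatureControl is not the ASpecD implementation. As a simplifying assumption, it stores the temperature as a plain string rather than as a physical quantity.

```python
class SketchTemperatureControl:
    """Hypothetical stand-in mimicking the documented behaviour."""

    def __init__(self, temperature=""):
        self.temperature = temperature  # e.g. "80 K"; empty if not set

    @property
    def controlled(self):
        # Read-only: derived from whether a temperature has been set.
        return bool(self.temperature)

    def from_dict(self, dict_=None):
        dict_ = dict_ or {}
        if "temperature" in dict_:
            self.temperature = dict_["temperature"]
        # An explicit "controlled: False" clears value and unit.
        if dict_.get("controlled") is False:
            self.temperature = ""


temperature_control = SketchTemperatureControl()
temperature_control.from_dict({"temperature": "80 K"})
```

After the call, temperature_control.controlled is True without ever having been set explicitly.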
- to_dict(remove_empty=False)
Create dictionary containing public attributes of an object.
- Parameters:
remove_empty (bool) – Whether to remove keys with empty values
Default: False
- Returns:
public_attributes – Ordered dictionary containing the public attributes of the object
The order of attribute definition is preserved
- Return type:
Changed in version 0.6: New parameter remove_empty
Changed in version 0.9: Settings for properties to exclude and include are not traversed
Changed in version 0.9.1: Dictionaries get copied before traversing, as otherwise, the special variables __dict__ and __0dict__ are modified, which may result in strange behaviour.
Changed in version 0.9.2: Dictionaries do not get copied by default, but there is a private method that can be overridden in derived classes to copy the dictionary.
- class aspecd.metadata.Measurement(dict_=None)
Bases:
Metadata
General information available for each type of measurement.
- start
Date and time of start of measurement
- Type:
datetime
- end
Date and time of end of measurement
- Type:
datetime
- operator
Name of the operator performing the measurement. Beware of the implications for privacy protection.
- Type:
- Parameters:
dict_ (dict) – Dictionary containing fields corresponding to attributes of the class
- from_dict(dict_=None)
Set properties from dictionary, e.g., from metadata.
Only parameters in the dictionary that are valid properties of the class are set accordingly.
For the “start” and “end” items, there are two different conventions for how the dictionary can be structured. Either those fields are dictionaries themselves, with fields “date” and “time” accordingly, such as:
{"start": {"date": "yyyy-mm-dd", "time": "HH:MM:SS"}, "end": {"date": "yyyy-mm-dd", "time": "HH:MM:SS"}}
Alternatively, those fields can be strings containing a representation of both date and time:
{"start": "yyyy-mm-dd HH:MM:SS", "end": "yyyy-mm-dd HH:MM:SS"}
Use whichever is more appropriate for you.
- Parameters:
dict_ (dict) – Dictionary with keys corresponding to properties of the class.
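Both conventions for the “start” and “end” items can be reduced to the same datetime object. The following is a stand-alone sketch using only the standard library. The helper sketch_parse_timestamp is hypothetical and does not claim to reproduce how ASpecD parses these fields.

```python
from datetime import datetime


def sketch_parse_timestamp(entry):
    """Accept either {'date': ..., 'time': ...} or 'yyyy-mm-dd HH:MM:SS'."""
    if isinstance(entry, dict):
        # Nested convention: join date and time into a single string.
        entry = f"{entry['date']} {entry['time']}"
    return datetime.strptime(entry, "%Y-%m-%d %H:%M:%S")


nested = {"start": {"date": "2024-01-31", "time": "13:45:00"}}
flat = {"start": "2024-01-31 13:45:00"}

start_from_nested = sketch_parse_timestamp(nested["start"])
start_from_flat = sketch_parse_timestamp(flat["start"])
```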
- to_dict(remove_empty=False)
Create dictionary containing public attributes of an object.
- Parameters:
remove_empty (bool) – Whether to remove keys with empty values
Default: False
- Returns:
public_attributes – Ordered dictionary containing the public attributes of the object
The order of attribute definition is preserved
- Return type:
Changed in version 0.6: New parameter remove_empty
Changed in version 0.9: Settings for properties to exclude and include are not traversed
Changed in version 0.9.1: Dictionaries get copied before traversing, as otherwise, the special variables __dict__ and __0dict__ are modified, which may result in strange behaviour.
Changed in version 0.9.2: Dictionaries do not get copied by default, but there is a private method that can be overridden in derived classes to copy the dictionary.
- class aspecd.metadata.Sample(dict_=None)
Bases:
Metadata
Information on the sample measured.
- Parameters:
dict_ (dict) – Dictionary containing fields corresponding to attributes of the class
- from_dict(dict_=None)
Set properties from dictionary, e.g., from metadata.
Only parameters in the dictionary that are valid properties of the class are set accordingly.
Keys in the dictionary are converted to lower case and spaces converted to underscores to fit the naming scheme of attributes.
If a property is of class aspecd.metadata.PhysicalQuantity, it is set accordingly.
- Parameters:
dict_ (dict) – Dictionary containing properties to set
- to_dict(remove_empty=False)
Create dictionary containing public attributes of an object.
- Parameters:
remove_empty (bool) – Whether to remove keys with empty values
Default: False
- Returns:
public_attributes – Ordered dictionary containing the public attributes of the object
The order of attribute definition is preserved
- Return type:
Changed in version 0.6: New parameter remove_empty
Changed in version 0.9: Settings for properties to exclude and include are not traversed
Changed in version 0.9.1: Dictionaries get copied before traversing, as otherwise, the special variables __dict__ and __0dict__ are modified, which may result in strange behaviour.
Changed in version 0.9.2: Dictionaries do not get copied by default, but there is a private method that can be overridden in derived classes to copy the dictionary.
- class aspecd.metadata.Calculation(dict_=None)
Bases:
Metadata
Information on the calculation.
- Parameters:
dict_ (dict) – Dictionary containing fields corresponding to attributes of the class
- from_dict(dict_=None)
Set properties from dictionary, e.g., from metadata.
Only parameters in the dictionary that are valid properties of the class are set accordingly.
Keys in the dictionary are converted to lower case and spaces converted to underscores to fit the naming scheme of attributes.
If a property is of class aspecd.metadata.PhysicalQuantity, it is set accordingly.
- Parameters:
dict_ (dict) – Dictionary containing properties to set
- to_dict(remove_empty=False)
Create dictionary containing public attributes of an object.
- Parameters:
remove_empty (bool) – Whether to remove keys with empty values
Default: False
- Returns:
public_attributes – Ordered dictionary containing the public attributes of the object
The order of attribute definition is preserved
- Return type:
Changed in version 0.6: New parameter remove_empty
Changed in version 0.9: Settings for properties to exclude and include are not traversed
Changed in version 0.9.1: Dictionaries get copied before traversing, as otherwise, the special variables __dict__ and __0dict__ are modified, which may result in strange behaviour.
Changed in version 0.9.2: Dictionaries do not get copied by default, but there is a private method that can be overridden in derived classes to copy the dictionary.
- class aspecd.metadata.Device(dict_=None)
Bases:
Metadata
Information on the device contributing device data.
The dataset concept (see aspecd.dataset.Dataset) rests on the assumption that there is one particular set of data that can be regarded as the actual or primary data of the dataset. However, in many cases, parallel to these actual data, other data are recorded as well, be it readouts from monitors or alike.
This class contains the metadata of the corresponding devices whose data are of type aspecd.dataset.DeviceData. That class contains an attribute aspecd.dataset.DeviceData.metadata.
Note
You will usually need to implement derived classes for concrete devices, as this class only contains a minimum set of attributes.
- Parameters:
dict_ (dict) – Dictionary containing fields corresponding to attributes of the class
New in version 0.9.
- from_dict(dict_=None)
Set properties from dictionary, e.g., from metadata.
Only parameters in the dictionary that are valid properties of the class are set accordingly.
Keys in the dictionary are converted to lower case and spaces converted to underscores to fit the naming scheme of attributes.
If a property is of class aspecd.metadata.PhysicalQuantity, it is set accordingly.
- Parameters:
dict_ (dict) – Dictionary containing properties to set
- to_dict(remove_empty=False)
Create dictionary containing public attributes of an object.
- Parameters:
remove_empty (bool) – Whether to remove keys with empty values
Default: False
- Returns:
public_attributes – Ordered dictionary containing the public attributes of the object
The order of attribute definition is preserved
- Return type:
Changed in version 0.6: New parameter remove_empty
Changed in version 0.9: Settings for properties to exclude and include are not traversed
Changed in version 0.9.1: Dictionaries get copied before traversing, as otherwise, the special variables __dict__ and __0dict__ are modified, which may result in strange behaviour.
Changed in version 0.9.2: Dictionaries do not get copied by default, but there is a private method that can be overridden in derived classes to copy the dictionary.
- class aspecd.metadata.DatasetMetadata
Bases:
ToDictMixin
Metadata for dataset.
This class contains the minimal set of metadata for a dataset.
Metadata of actual datasets should extend this class by adding properties that are themselves classes inheriting from aspecd.metadata.Metadata.
Metadata can be converted to dict via aspecd.utils.ToDictMixin.to_dict(), e.g., for generating reports using templates and template engines.
- from_dict(dict_=None)
Set properties from dictionary, e.g., from metadata.
Only parameters in the dictionary that are valid properties of the class are set accordingly.
Keys in the dictionary are converted to lower case and spaces converted to underscores to fit the naming scheme of attributes.
- Parameters:
dict_ (dict) – Dictionary with metadata.
Each key of this dictionary corresponds to a class attribute, and its value is in turn a dictionary with the correct set of attributes for the particular class.
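The nested structure expected here, with one sub-dictionary per metadata attribute, can be sketched as follows. Both classes are hypothetical stand-ins rather than ASpecD code. They only illustrate how a container might hand each sub-dictionary to the matching attribute after normalising the key.

```python
class SketchSample:
    """Hypothetical metadata class with a single attribute."""

    def __init__(self):
        self.name = ""

    def from_dict(self, dict_):
        for key, value in dict_.items():
            if key in vars(self):
                setattr(self, key, value)


class SketchDatasetMetadata:
    """Hypothetical container delegating sub-dictionaries to attributes."""

    def __init__(self):
        self.sample = SketchSample()

    def from_dict(self, dict_=None):
        for key, value in (dict_ or {}).items():
            # Lower case, spaces to underscores, as described above.
            attribute = key.lower().replace(" ", "_")
            if attribute in vars(self):
                getattr(self, attribute).from_dict(value)


dataset_metadata = SketchDatasetMetadata()
dataset_metadata.from_dict({"Sample": {"name": "abc"}, "Unknown": {"x": 1}})
```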
- to_dict(remove_empty=False)
Create dictionary containing public attributes of an object.
- Parameters:
remove_empty (bool) – Whether to remove keys with empty values
Default: False
- Returns:
public_attributes – Ordered dictionary containing the public attributes of the object
The order of attribute definition is preserved
- Return type:
Changed in version 0.6: New parameter remove_empty
Changed in version 0.9: Settings for properties to exclude and include are not traversed
Changed in version 0.9.1: Dictionaries get copied before traversing, as otherwise, the special variables __dict__ and __0dict__ are modified, which may result in strange behaviour.
Changed in version 0.9.2: Dictionaries do not get copied by default, but there is a private method that can be overridden in derived classes to copy the dictionary.
- class aspecd.metadata.ExperimentalDatasetMetadata
Bases:
DatasetMetadata
Metadata for an experimental dataset.
This class contains the minimal set of metadata for an experimental dataset, i.e., aspecd.dataset.ExperimentalDataset.
Metadata of actual datasets should extend this class by adding properties that are themselves classes inheriting from aspecd.metadata.Metadata.
Metadata can be converted to dict via aspecd.utils.ToDictMixin.to_dict(), e.g., for generating reports using templates and template engines.
- measurement
Metadata of measurement
- sample
Metadata of sample
- Type:
- temperature_control
Metadata of temperature control
- from_dict(dict_=None)
Set properties from dictionary, e.g., from metadata.
Only parameters in the dictionary that are valid properties of the class are set accordingly.
Keys in the dictionary are converted to lower case and spaces converted to underscores to fit the naming scheme of attributes.
- Parameters:
dict_ (dict) – Dictionary with metadata.
Each key of this dictionary corresponds to a class attribute, and its value is in turn a dictionary with the correct set of attributes for the particular class.
- to_dict(remove_empty=False)
Create dictionary containing public attributes of an object.
- Parameters:
remove_empty (bool) – Whether to remove keys with empty values
Default: False
- Returns:
public_attributes – Ordered dictionary containing the public attributes of the object
The order of attribute definition is preserved
- Return type:
Changed in version 0.6: New parameter remove_empty
Changed in version 0.9: Settings for properties to exclude and include are not traversed
Changed in version 0.9.1: Dictionaries get copied before traversing, as otherwise, the special variables __dict__ and __0dict__ are modified, which may result in strange behaviour.
Changed in version 0.9.2: Dictionaries do not get copied by default, but there is a private method that can be overridden in derived classes to copy the dictionary.
- class aspecd.metadata.CalculatedDatasetMetadata
Bases:
DatasetMetadata
Metadata for a dataset with calculated data.
This class contains the minimal set of metadata for a dataset consisting of calculated data, i.e., aspecd.dataset.CalculatedDataset.
Metadata of actual datasets should extend this class by adding properties that are themselves classes inheriting from aspecd.metadata.Metadata.
Metadata can be converted to dict via aspecd.utils.ToDictMixin.to_dict(), e.g., for generating reports using templates and template engines.
- calculation
Metadata of calculation underlying the numeric data
- from_dict(dict_=None)
Set properties from dictionary, e.g., from metadata.
Only parameters in the dictionary that are valid properties of the class are set accordingly.
Keys in the dictionary are converted to lower case and spaces converted to underscores to fit the naming scheme of attributes.
- Parameters:
dict_ (dict) – Dictionary with metadata.
Each key of this dictionary corresponds to a class attribute, and its value is in turn a dictionary with the correct set of attributes for the particular class.
- to_dict(remove_empty=False)
Create dictionary containing public attributes of an object.
- Parameters:
remove_empty (bool) – Whether to remove keys with empty values
Default: False
- Returns:
public_attributes – Ordered dictionary containing the public attributes of the object
The order of attribute definition is preserved
- Return type:
Changed in version 0.6: New parameter remove_empty
Changed in version 0.9: Settings for properties to exclude and include are not traversed
Changed in version 0.9.1: Dictionaries get copied before traversing, as otherwise, the special variables __dict__ and __0dict__ are modified, which may result in strange behaviour.
Changed in version 0.9.2: Dictionaries do not get copied by default, but there is a private method that can be overridden in derived classes to copy the dictionary.
- class aspecd.metadata.MetadataMapper
Bases:
object
Mapper for metadata.
Converts a dictionary containing metadata read, e.g., from a metadata file into a dictionary that corresponds to the internal structure of the metadata in a dataset stored in aspecd.metadata.ExperimentalDatasetMetadata.
If all you need is to convert the dictionary keys to proper variable names conforming to the naming scheme proposed by PEP 8, you may simply use the method keys_to_variable_names().
Tasks that can currently be performed to map a dictionary to the internal structure of the metadata representation in a dataset include renaming keys via rename_key() and combining items via combine_items() as well as copying keys via copy_key() and moving items via move_item().
Rather than performing the mappings by hand, calling these methods repeatedly, you may use a mapping table contained in the mappings attribute. If you pre-define such mapping tables, you can easily apply different mappings depending on the version of your original metadata structure. Once you have assigned the appropriate mapping table to the mappings attribute, simply call map(). If everything turns out well, this should map your metadata contained in metadata according to the mapping table contained in mappings. Finally, you may want to assign this converted data structure to your dataset’s metadata attribute, using ExperimentalDatasetMetadata.from_dict().
As it is often tedious to manually create the entries of the mapping table residing in mappings, you can use mapping recipes stored in YAML files together with the create_mappings() method. For details of the structure of the mapping recipe YAML files, see the documentation of the create_mappings() method. Note that you need to specify the filename for the mapping recipe used in recipe_filename as well as the version of the metadata file format in version to get this to work. The create_mappings() method and the underlying ideas are heavily based on concepts and code developed by J. Popp for use within the trEPR Python package.
Note
The mapping recipes should be stored within the package. Accessing files from within packages should not be done using regular file-system paths, but rather using the respective functionality of the pkgutil package. Therefore, internally, the aspecd.utils.get_package_data() function is used. When using the MetadataMapper class from a package derived from ASpecD, prefix the name of the recipe file with the package name, followed by the ‘@’ character. As an example, if you want to use the recipe ‘mappings.yaml’ from within the package ‘trepr’, you need to specify trepr@mappings.yaml as recipe_filename.
- mappings
Tasks to perform to map dictionary
Each task is a list containing three entries:
an optional key of a “sub-dictionary” to operate on
the action to carry out
a list containing the necessary parameters to carry out the action
For examples, see the documentation of the map() method.
- Type:
- version
Version of the metadata to map
Particularly important when you use create_mappings() to create the mappings from mapping recipes stored in a YAML file.
- Type:
- recipe_filename
Name of the YAML file containing the mapping recipes
Needs to be specified when you use create_mappings() to create the mappings from mapping recipes stored in a YAML file.
- Type:
Examples
To actually use the mapper, you will usually create a file (in YAML format) containing the mappings. For details of what this file may look like, see the create_mappings() method. Suppose you have saved your mappings to the file mappings.yaml, with different mappings for the different versions of your formats. In this case, using the mapper may look similar to the following:
mapper = aspecd.metadata.MetadataMapper()
mapper.version = version_string
mapper.metadata = dict_to_be_mapped
mapper.recipe_filename = 'mappings.yaml'
mapper.map()
modified_dict = mapper.metadata
As you can see, modified_dict contains the dictionary to which the mappings from mappings.yaml have been applied.
Changed in version 0.6: Recipe files are now retrieved from package data via aspecd.utils.get_package_data()
- rename_key(old_key='', new_key='')
Rename key in dictionary.
Note that this method does not preserve the order of keys in an ordered dictionary.
- combine_items(old_keys=None, new_key='', pattern='')
Combine two items in a dictionary.
- keys_to_variable_names()
Convert keys in metadata to proper variable names.
Variable names in Python should be all lower case, with words joined by underscores.
As the metadata dictionary is traversed recursively, conversion is performed to (nearly) arbitrary depth.
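The recursive key conversion can be sketched in a few lines. The function below is a hypothetical stand-alone helper, not the ASpecD method. It returns a converted copy rather than modifying an attribute in place, and it assumes that lower-casing and replacing spaces with underscores is all that is needed.

```python
def sketch_keys_to_variable_names(dict_):
    """Return a copy with keys lower-cased and spaces replaced by underscores."""
    converted = {}
    for key, value in dict_.items():
        new_key = key.strip().lower().replace(" ", "_")
        if isinstance(value, dict):
            # Recurse into sub-dictionaries of any depth.
            value = sketch_keys_to_variable_names(value)
        converted[new_key] = value
    return converted


converted = sketch_keys_to_variable_names(
    {"Temperature Control": {"Set Point": "80 K"}}
)
```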
- copy_key(old_key='', new_key='')
Copy key in dictionary to new key.
This method is particularly useful in cases where keys need to be combined using combine_items(), but where one of the keys should be combined several times with another key.
- move_item(key='', source_dict_name='', target_dict_name='', create_target_dict=False)
Move item (i.e., key-value pair) between dictionaries.
Note
If the target dictionary does not exist, the method will by default not create it, but raise an appropriate exception. However, if explicitly told to create the target dictionary, it will do so. This prevents accidental typos from messing up the dictionary and causing hard-to-track bugs.
- map()
Map according to mappings in mappings.
Each mapping is defined as a list containing optionally a key for a sub-dictionary as first element, the method to be performed as second element, and the parameters for this method as third element.
An example for a mapping may look like this:
mapping = [['', 'rename_key', ['old', 'new']]]
This would rename the key old in metadata to new.
To do the same for a key in a “sub-dictionary”, you may provide a mapping similar to the following:
mapping = [['test', 'rename_key', ['old', 'new']]]
This would rename the key old in the dictionary test in metadata to new. The same pattern optionally specifying a dictionary to operate on can be applied to all the other mappings detailed below.
Similarly, you can join two items to a new item. In this case, a mapping may look like this:
mapping = [['', 'combine_items', [['key1', 'key2'], 'new']]]
This would join the values corresponding to the two keys key1 and key2 and assign them to the new key new. If you would like to join the values with a particular string, this can be done as well:
mapping = [['', 'combine_items', [['key1', 'key2'], 'new', ' ']]]
Here, the two values will be joined using a space.
Sometimes you want to combine keys, but need one of the two keys several times. Hence, you would like to first copy this key to another one. This can be done in the following way:
mapping = [['', 'copy_key', ['old', 'new']]]
And finally, there are cases where you want to move an item from one dictionary to another. This can be done using the following mapping:
mapping = [['', 'move_item', ['key', 'source', 'target']]]
Here, “source” and “target” are the names of the respective dictionaries the item should be moved between. If the target dictionary does not exist, by default, the method will raise an exception. If, however, you are sure you know what you are doing, you can pass an additional parameter explicitly telling the method to create the target dictionary:
mapping = [['', 'move_item', ['key', 'source', 'target', True]]]
In this particular case, however, you are solely responsible for any typos when specifying the name of the target dictionary, as this will most probably mess up your dictionaries and result in hard to track bugs.
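How a mapping table of this shape might be applied can be sketched with a simplified, stand-alone dispatcher. The class SketchMapper and its private helper methods are hypothetical and deliberately minimal. They follow the three-element structure described above (sub-dictionary key, action, parameters), but are not the ASpecD implementation, and the example keys and values are made up.

```python
class SketchMapper:
    """Hypothetical, simplified mapper; not the ASpecD implementation."""

    def __init__(self, metadata=None, mappings=None):
        self.metadata = metadata or {}
        self.mappings = mappings or []

    def map(self):
        for dict_key, action, parameters in self.mappings:
            # An empty first element means: operate on the top level.
            target = self.metadata[dict_key] if dict_key else self.metadata
            getattr(self, "_" + action)(target, *parameters)

    @staticmethod
    def _rename_key(target, old_key, new_key):
        target[new_key] = target.pop(old_key)

    @staticmethod
    def _copy_key(target, old_key, new_key):
        target[new_key] = target[old_key]

    @staticmethod
    def _combine_items(target, old_keys, new_key, pattern=""):
        target[new_key] = pattern.join(str(target.pop(key)) for key in old_keys)

    @staticmethod
    def _move_item(target, key, source, destination, create=False):
        if destination not in target:
            if not create:
                raise KeyError(f"Target dictionary {destination!r} missing")
            target[destination] = {}
        target[destination][key] = target[source].pop(key)


mapper = SketchMapper(
    metadata={
        "GENERAL": {
            "Date start": "2024-01-31",
            "Time start": "13:45:00",
            "Model": "X1",
        },
    },
    mappings=[
        ["GENERAL", "combine_items", [["Date start", "Time start"], "start", " "]],
        ["", "rename_key", ["GENERAL", "measurement"]],
        ["", "move_item", ["Model", "measurement", "spectrometer", True]],
    ],
)
mapper.map()
```

Note that the order of the entries matters in this sketch: the rename of GENERAL has to come after the mapping that still addresses the sub-dictionary by its old name.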
- create_mappings()
Create mappings from mapping recipe stored in YAML file.
Mapping recipes are stored in an external file (currently a YAML file whose filename is stored in recipe_filename) in their own format described hereafter. From this file, the recipes are read and converted into mappings in the mappings attribute.
Based on the version number of the format the metadata from an external source are stored in, the correct recipe is selected.
Following is an example of a YAML file containing recipes. Each map can contain several types of mappings and the latter can contain several entries:
---

format:
  type: metadata mapper
  version: '0.1'

map 1:
  metadata file versions:
    - 0.1.6
    - 0.1.5
  combine items:
    - old keys: ['Date start', 'Time start']
      new key: start
      pattern: ' '
      in dict: GENERAL
  rename key:
    - old key: GENERAL
      new key: measurement
      in dict:

map 2:
  metadata file versions:
    - 0.1.4
  copy key:
    - old key: Date
      new key: Date end
      in dict: GENERAL
  move item:
    - key: model
      source dict: measurement
      target dict: spectrometer
    - key: Runs
      source dict: measurement
      target dict: experiment
      create target: True
Unknown mappings are silently ignored. The difference between the two entries in move item is that in the latter case, the target dictionary will be created. Be careful with this option, as typos introduced in your mapping recipe will lead to hard-to-debug behaviour of your application. See move_item() for details.
Important
If you have version numbers with only one dot, you need to explicitly mark this as string in YAML, as otherwise, it will automatically be converted into a float and hence your version lookup will fail.
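Why an unquoted version with a single dot breaks the lookup can be illustrated without any YAML library. In the sketch below, the float values merely imitate what a YAML parser would typically return for unquoted entries such as 0.10. The variable names and version numbers are hypothetical, and the membership test is a simplified stand-in for the actual lookup.

```python
version = "0.10"  # version string of the metadata file, as set on the mapper

# Unquoted YAML entries such as 0.9 and 0.10 end up as floats ...
unquoted_entries = [float("0.9"), float("0.10")]
# ... whereas quoted entries such as '0.9' and '0.10' remain strings.
quoted_entries = ["0.9", "0.10"]

# A string never compares equal to a float, so this lookup fails.
found_unquoted = version in unquoted_entries
found_quoted = version in quoted_entries
```

Version numbers with two dots, such as 0.1.6, are not valid floats and hence stay strings, which is why only the single-dot case needs explicit quoting.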
Generally, the YAML file should be pretty self-explanatory. For details of the different mappings, see the documentation of the respective methods of the class, namely combine_items(), rename_key(), copy_key(), and move_item().
Important
The sequence of operations can sometimes be crucial. They are called as follows: “copy key” -> “combine items” -> “rename key” -> “remove items”
Note that you can name the mappings called here map 1 and map 2 as you like. Use descriptive names wherever possible.
A hint on the filenames for metadata recipe YAML files: Use descriptive names containing the format of the metadata files. For info files, something like infofile_metadata_mappings.yaml may be reasonable.
This method and the underlying ideas are heavily based on concepts and code developed by J. Popp for use within the trEPR Python package.