
aspecd.utils module

General purpose functions and classes used in other modules.

To avoid circular dependencies, this module does not depend on any other modules of the ASpecD package, but it can be imported into every other module.

aspecd.utils.full_class_name(object_)

Return full class name of an object including packages and modules.

Parameters:: object_ (object) – object the class name should be inferred for
Returns:: class_name – string with full class name of object
Return type:: str
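
As a rough illustration of what such a lookup involves, the following minimal sketch builds the dotted name from the class of an object using only the standard library. The helper name full_class_name_sketch is hypothetical, and the actual ASpecD implementation may differ.

```python
import fractions


def full_class_name_sketch(object_):
    """Return '<module>.<class>' for the class of the given object."""
    cls = type(object_)
    return cls.__module__ + '.' + cls.__qualname__


name = full_class_name_sketch(fractions.Fraction(1, 2))
print(name)  # fractions.Fraction
```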

aspecd.utils.object_from_class_name(full_class_name_string)

Create object from full class name.

To obtain the full class name of an object, you might want to use the function full_class_name().

Parameters:: full_class_name_string (str) – string with full class name of an object that shall be instantiated
Returns:: object_ – object instantiated from the class given in full_class_name_string
Return type:: object
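
One common way to instantiate a class from its dotted name is to import the module part and look up the class on it. The sketch below shows this approach with the standard library; the helper name object_from_class_name_sketch is hypothetical, it assumes the class can be instantiated without arguments, and it is not taken from the ASpecD source.

```python
import importlib


def object_from_class_name_sketch(full_class_name_string):
    """Instantiate a class given as 'package.module.ClassName'."""
    # Everything before the last dot is the module, the rest the class
    module_name, class_name = full_class_name_string.rsplit('.', 1)
    module = importlib.import_module(module_name)
    return getattr(module, class_name)()


obj = object_from_class_name_sketch('collections.OrderedDict')
print(type(obj).__name__)  # OrderedDict
```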

class aspecd.utils.ToDictMixin

Bases: object

Mixin class for returning all public attributes as dict.

Sometimes it is necessary either to exclude public attributes (for example, to avoid infinite loops that applying to_dict would otherwise create) or to add (public) attributes, particularly those accessed via getters and setters, which are otherwise not included.

To this end, the class has two non-public attributes that every class inheriting from it can set as well:

_exclude_from_to_dict
_include_in_to_dict

The names should be rather telling. For details, see below.

__odict__

Dictionary of attributes preserving the order of their definition

Type:: collections.OrderedDict

_exclude_from_to_dict

Names of (public) attributes to exclude from dictionary

Usually, the reason to exclude public attributes from being added to the dictionary is to avoid infinite loops, as sometimes an object may contain a reference to another object that in turn references back.

Type:: list

_include_in_to_dict

Names of (public) attributes to include in the dictionary

The usual reason for actively including (public) attributes in the dictionary is that they are accessed by getters and setters and hence would not be included automatically.

Type:: list

to_dict(remove_empty=False)

Create dictionary containing public attributes of an object.

Parameters:

remove_empty (bool) –

Whether to remove keys with empty values

Default: False

Returns:

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type:

collections.OrderedDict

Changed in version 0.6: New parameter remove_empty

Changed in version 0.9: Settings for properties to exclude and include are not traversed

Changed in version 0.9.1: Dictionaries get copied before traversing, as otherwise, the special variables __dict__ and __odict__ are modified, which may result in strange behaviour.

Changed in version 0.9.2: Dictionaries do not get copied by default, but there is a private method that can be overridden in derived classes to copy the dictionary.
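
The interplay of the two non-public attributes can be sketched with a strongly simplified stand-in. The class ToDictSketch and the example class Sample are hypothetical; the sketch does not traverse nested objects and is not the ASpecD implementation.

```python
import collections


class ToDictSketch:
    """Simplified stand-in for a to_dict mixin (not the ASpecD code)."""

    def __init__(self):
        self._exclude_from_to_dict = []
        self._include_in_to_dict = []

    def to_dict(self):
        result = collections.OrderedDict()
        # Public instance attributes, in order of definition
        for name, value in vars(self).items():
            if name.startswith('_') or name in self._exclude_from_to_dict:
                continue
            result[name] = value
        # Attributes exposed via properties are added explicitly
        for name in self._include_in_to_dict:
            result[name] = getattr(self, name)
        return result


class Sample(ToDictSketch):
    def __init__(self):
        super().__init__()
        self.name = 'sample'
        self.parent = self  # back-reference that could cause a loop
        self._mass = 1.5
        self._exclude_from_to_dict = ['parent']
        self._include_in_to_dict = ['mass']

    @property
    def mass(self):
        return self._mass


sample_dict = Sample().to_dict()
print(list(sample_dict.keys()))  # ['name', 'mass']
```

Here, parent is excluded because it refers back to the object itself, whereas mass is only reachable through a property and therefore has to be included explicitly.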

aspecd.utils.get_aspecd_version()

Get version of ASpecD package.

The function directly reads the contents of the file VERSION in the root directory of the package installation, not relying on introspection.

Returns:: version – Version number as string
Return type:: str

aspecd.utils.package_version(name='')

Get version of arbitrary package.

The function relies on introspection using pkg_resources.

Parameters:: name (str) – Name of package the version should be obtained for
Returns:: version – Version number as string
Return type:: str

aspecd.utils.package_name(obj=None)

Get name of package an object resides in.

Parameters:

obj (object) –

Object the package it resides in should be returned.

If no object is given, the name of the package this function is defined in (“aspecd”) will be returned.

Returns:

package_name – Name of the package

Return type:

str
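
For an instance of a class, the package name can be derived from the module its class is defined in. The following sketch illustrates this idea; the helper name package_name_sketch is hypothetical, the sketch does not cover the case of no object being given, and the actual implementation may differ.

```python
import email.message


def package_name_sketch(obj):
    """Return the top-level package of the module defining obj's class."""
    return type(obj).__module__.split('.')[0]


pkg = package_name_sketch(email.message.Message())
print(pkg)  # email
```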

aspecd.utils.config_dir()

Get config directory for per-user configurations.

Configuration on a per-user level should be stored within a directory (only) readable by the currently logged-in user.

Returns:: config_dir – Path to config directory, usually in the user’s directory
Return type:: str

class aspecd.utils.Yaml

Bases: object

Handle reading from and writing to YAML files.

YAML file contents are read into an ordered dict, making use of the oyaml package. This preserves the order of the entries of any dict.

Note

The PyYAML package cannot handle NumPy arrays by default. Hence, there are two methods in this class for serialising and deserialising NumPy arrays:

aspecd.utils.Yaml.serialise_numpy_arrays() and
aspecd.utils.Yaml.deserialise_numpy_arrays().

Have a look at their documentation for details of the implementation and how to use them. In particular, larger NumPy arrays are saved to files using the NumPy binary format.

dict

Contents read from/written to a YAML file

Type:: collections.OrderedDict

numpy_array_size_threshold

Maximum size of NumPy array that is converted into a list using aspecd.utils.Yaml.serialise_numpy_arrays().

Larger NumPy arrays are saved to a file in NumPy format using numpy.save().

Default: 100 – i.e., arrays with >100 elements are stored in files.

Type:: int

numpy_array_to_list

Whether to convert (small) NumPy arrays to a list or to a dict

Default: False

Type:: bool

binary_files

List of names of the binary files containing large arrays

Type:: list

binary_directory

Directory the binary files should be stored in

Default: ‘’

Type:: str

loader

Type of loader used for loading the YAML file

The type of loader used is crucial for the safety of your application. See the documentation of the PyYAML package for details.

Default: yaml.SafeLoader

Type:: yaml.loader

dumper

Type of dumper used for writing the YAML file

The type of dumper used should be compatible with the type of loader.

Default: yaml.SafeDumper

Type:: yaml.Dumper

Raises:: aspecd.utils.MissingFilenameError – Raised if no filename is given to read from/write to

Changed in version 0.4: Added numpy_array_to_list

Changed in version 0.5: Added dumper and set dumper to SafeDumper

read_from(filename='')

Read from YAML file.

Parameters:: filename (str) – Name of the YAML file to read from.
Raises:: aspecd.processing.MissingFilenameError – Raised if no filename is given to read from.

write_to(filename='')

Write to YAML file.

Parameters:: filename (str) – Name of the YAML file to write to.
Raises:: aspecd.processing.MissingFilenameError – Raised if no filename is given to write to.

read_stream(stream=None)

Read from stream.

Parameters:: stream (bytes) – binary stream to read from

write_stream()

Write to stream.

Returns:: stream – string representation of YAML file
Return type:: str

serialise_numpy_arrays()

Serialise numpy arrays in a simple form, using a dict.

Background: The PyYAML package cannot easily handle NumPy arrays out of the box. However, datasets and the like sometimes need to be serialised in the form of YAML files, hence the need for a simple method of serialising NumPy arrays. Furthermore, larger NumPy arrays should not be serialised in text form in YAML directly, but rather be stored in binary form, probably in a separate file.

The reason for saving (larger) NumPy arrays in binary rather than text form is twofold: The size of a binary file is much smaller, and the original precision is retained. Those interested in the representation of floating-point values in computers should consult:

David Goldberg. What every computer scientist should know about floating-point arithmetic. ACM Comput. Surv. 23 (1991):5–48. DOI:10.1145/103162.103163

As binary format, the NumPy format (see numpy.lib.format) gets used.

The SHA256 hash of the array is used as the filename, making the names unique with respect to the content. Several identical arrays within one dict that gets serialised therefore pose no problem: identical content yields an identical hash and hence the same file. As a consequence, identical arrays lead to fewer files being written, saving space.
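
The two ideas described above, a size threshold and content-derived filenames, can be sketched with the standard library alone. This is an illustration under assumptions: array.array stands in for a NumPy array, the function name storage_for and the dict fields 'type', 'name', and 'values' are hypothetical, and the actual format written by ASpecD is not reproduced here.

```python
import array
import hashlib

SIZE_THRESHOLD = 100  # mirrors the documented default threshold


def storage_for(values):
    """Decide how a numeric sequence would be stored (illustrative only)."""
    data = array.array('d', values)
    if len(data) > SIZE_THRESHOLD:
        # Content-derived filename: identical data gives identical names
        digest = hashlib.sha256(data.tobytes()).hexdigest()
        return {'type': 'file', 'name': digest}
    return {'type': 'inline', 'values': data.tolist()}


small = storage_for(range(10))
large_a = storage_for(range(500))
large_b = storage_for(range(500))
print(small['type'], large_a['type'])  # inline file
print(large_a['name'] == large_b['name'])  # True
```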

serialize_numpy_arrays()

Serialise numpy arrays for our AE speaking friends.

See aspecd.utils.Yaml.serialise_numpy_arrays() for details.

deserialise_numpy_arrays()

Deserialise specially crafted dicts into NumPy arrays.

For reasons given in the documentation of aspecd.utils.Yaml.serialise_numpy_arrays(), NumPy arrays are handled separately and converted into dicts with some special fields. This method deserialises them back into NumPy arrays, as they were originally.

deserialize_numpy_arrays()

Deserialise numpy arrays for our AE speaking friends.

See aspecd.utils.Yaml.deserialise_numpy_arrays() for details.

aspecd.utils.replace_value_in_dict(replacement=None, target=None)

Replace value for given key in a dictionary, traversing recursively.

Each key in replacement needs to correspond to a value in target that should be replaced. Keys in replacement that have no corresponding value in target are silently ignored.

Parameters:

replacement (dict) – dict whose keys correspond to the values in target that should be replaced, each by the value associated with that key
target (dict) – dict containing the values to be replaced; a value is replaced if it is identical to a key in replacement

Returns:

target – dict in which each matching value has been replaced by the value of the corresponding key in replacement

Return type:

dict
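
Because the direction of the mapping is easy to confuse, here is a minimal sketch of the behaviour as described: values in the target are looked up as keys in the replacement dict. The helper name replace_values_sketch is hypothetical, only string values are considered, and the real function may handle further cases.

```python
def replace_values_sketch(replacement, target):
    """Replace values in target that appear as keys in replacement."""
    for key, value in target.items():
        if isinstance(value, dict):
            replace_values_sketch(replacement, value)
        elif isinstance(value, str) and value in replacement:
            target[key] = replacement[value]
    return target


replacement = {'placeholder': 'result.txt', 'unused': 'ignored'}
target = {
    'output': 'placeholder',
    'nested': {'file': 'placeholder', 'other': 'keep'},
}
replaced = replace_values_sketch(replacement, target)
print(replaced)
```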

aspecd.utils.copy_values_between_dicts(source=None, target=None)

Copy values between two dicts in case of identical keys.

Each value in source is copied to target in case of matching keys in a recursive manner. Non-matching keys in source are silently ignored.

If source and target share the same overall structure, both are traversed recursively in parallel.

Parameters:

source (dict) – Dictionary the values should be copied from in case of matching keys
target (dict) – Dictionary the values should be copied to in case of matching keys

Returns:

target – Dictionary the values have been copied to in case of matching keys

Return type:

dict
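
A minimal sketch of the described behaviour, assuming that nested dicts are only descended into when both sides hold a dict for the same key. The helper name copy_values_sketch is hypothetical and the sketch is not taken from the ASpecD source.

```python
def copy_values_sketch(source, target):
    """Copy values from source to target for keys present in both."""
    for key in target:
        if key not in source:
            continue
        if isinstance(source[key], dict) and isinstance(target[key], dict):
            copy_values_sketch(source[key], target[key])
        else:
            target[key] = source[key]
    return target


source = {'a': 1, 'b': {'c': 2, 'x': 9}, 'z': 0}
target = {'a': None, 'b': {'c': None, 'd': 4}}
copied = copy_values_sketch(source, target)
print(copied)  # {'a': 1, 'b': {'c': 2, 'd': 4}}
```

The keys 'x' and 'z' exist only in source and are therefore ignored, while 'd' exists only in target and keeps its value.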

aspecd.utils.copy_keys_between_dicts(source=None, target=None)

Copy keys between two dicts.

Each key in source is copied to target in a recursive manner. If the key in source is a dict and exists in target, the two dicts will be joined, not losing keys in target.

If, however, the key in source is not a dict, but the corresponding key in target is a dict, the corresponding value in target will be replaced with that from source.

Parameters:

source (dict) – Dictionary the keys should be copied from
target (dict) – Dictionary the keys should be copied to

Returns:

target – Dictionary the keys have been copied to

Return type:

dict
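
The two cases described above, joining nested dicts and replacing a dict with a non-dict value, can be illustrated as follows. The helper name copy_keys_sketch is hypothetical, and the sketch only covers the behaviour stated in the text.

```python
def copy_keys_sketch(source, target):
    """Copy keys (with their values) from source to target recursively."""
    for key, value in source.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            # Both sides are dicts: join them, keeping keys of target
            copy_keys_sketch(value, target[key])
        else:
            target[key] = value
    return target


source = {'a': {'x': 1}, 'b': 2}
target = {'a': {'y': 3}, 'b': {'z': 4}}
merged = copy_keys_sketch(source, target)
print(merged)  # {'a': {'y': 3, 'x': 1}, 'b': 2}
```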

aspecd.utils.remove_empty_values_from_dict(dict_)

Remove empty values and their keys from dict recursively.

Parameters:: dict_ (dict) – Dictionary the keys with empty values should be removed from
Returns:: dict_ – Dictionary the keys with empty values have been removed from
Return type:: dict

New in version 0.2.1.
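
What counts as "empty" is not spelled out here. The sketch below assumes that None, empty strings, and empty containers are empty, while numbers such as 0 are kept; both this definition and the helper name remove_empty_sketch are assumptions made for illustration.

```python
def remove_empty_sketch(dict_):
    """Return a copy of dict_ without keys that have empty values."""
    cleaned = {}
    for key, value in dict_.items():
        if isinstance(value, dict):
            value = remove_empty_sketch(value)
        if value is None or value in ('', [], {}, ()):
            continue
        cleaned[key] = value
    return cleaned


data = {'a': '', 'b': {'c': None, 'd': 0}, 'e': {'f': []}}
cleaned = remove_empty_sketch(data)
print(cleaned)  # {'b': {'d': 0}}
```

Note that the nested dict under 'e' becomes empty once its only key has been removed, and is therefore removed as well.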

aspecd.utils.convert_keys_to_variable_names(dict_)

Change keys in dict to comply with PEP8 for variable names.

Keys are converted to all lower case and spaces replaced with underscores.

Parameters:: dict_ (dict) – Dictionary the keys should be renamed in
Returns:: new_dict – Dictionary the keys have been renamed in
Return type:: dict

New in version 0.2.1.
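
The conversion rule stated above (lower case, spaces replaced with underscores) can be sketched in a few lines. The helper name convert_keys_sketch is hypothetical, and descending into nested dicts is an assumption of this sketch.

```python
def convert_keys_sketch(dict_):
    """Return a new dict with lower-case, underscore-separated keys."""
    new_dict = {}
    for key, value in dict_.items():
        if isinstance(value, dict):
            value = convert_keys_sketch(value)
        new_dict[key.lower().replace(' ', '_')] = value
    return new_dict


metadata = {'Sample Name': 'x', 'Measurement Details': {'Start Time': 1}}
converted = convert_keys_sketch(metadata)
print(converted)
```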

aspecd.utils.all_equal(list_=None)

Check whether all elements of a list are equal.

Parameters:: list_ (list) – List whose elements should be checked for being equal
Returns:: result
Return type:: bool

class aspecd.utils.Properties

Bases: ToDictMixin

General properties class allowing to set properties from dict.

Properties classes often need to set their properties from dicts, and additionally, they should be able to convert themselves into dicts for persistence.

from_dict(dict_=None)

Set attributes from dictionary.

Parameters:: dict_ (dict) – Dictionary containing information of a task.
Raises:: aspecd.plotting.MissingDictError – Raised if no dict is provided.

get_properties()

Return (public) properties, i.e. attributes that are not methods.

Returns:: properties – public properties
Return type:: list

to_dict(remove_empty=False)

Create dictionary containing public attributes of an object.

Parameters:

remove_empty (bool) –

Whether to remove keys with empty values

Default: False

Returns:

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type:

collections.OrderedDict

Changed in version 0.6: New parameter remove_empty

Changed in version 0.9: Settings for properties to exclude and include are not traversed

Changed in version 0.9.1: Dictionaries get copied before traversing, as otherwise, the special variables __dict__ and __odict__ are modified, which may result in strange behaviour.

Changed in version 0.9.2: Dictionaries do not get copied by default, but there is a private method that can be overridden in derived classes to copy the dictionary.

aspecd.utils.basename(filename)

Return basename of given filename.

Parameters:: filename (str) – Name of the file (may contain absolute path and extension) the basename should be returned for.
Returns:: basename – Basename corresponding to the filename provided as input.
Return type:: str

aspecd.utils.path(filename)

Return path of given filename, with trailing separator.

Parameters:: filename (str) – Name of the file (may contain absolute path and extension) the path should be returned for.
Returns:: path – path corresponding to the filename provided as input.
Return type:: str
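
The behaviour of basename() and path() as described can be approximated with os.path. The helper names below are hypothetical, the sketch assumes that the basename excludes the file extension and that the filename contains a directory part, and it is not the ASpecD code.

```python
import os


def basename_sketch(filename):
    """Strip directory and extension from a filename."""
    return os.path.splitext(os.path.basename(filename))[0]


def path_sketch(filename):
    """Return the directory part of a filename with a trailing separator."""
    return os.path.dirname(filename) + os.sep


example = os.path.join('data', 'measurement.txt')
base = basename_sketch(example)
directory = path_sketch(example)
print(base)  # measurement
```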

aspecd.utils.not_zero(value)

Return a value that is not zero to prevent division-by-zero errors.

Dividing by zero results in NaN values and often hinders evaluating mathematical models. A solution adopted from the lmfit Python package (https://doi.org/10.5281/zenodo.598352) returns a value equivalent to the resolution of a numpy float.

Note

If you use this function excessively within a module, mostly within rather complicated mathematical equations, it might be a good idea to import this function explicitly, to shorten the code, such as: from aspecd.utils import not_zero. As usual, readability is king.

Parameters:: value (float) – Value that can become (too close to) zero to trigger NaN values
Returns:: value – Value guaranteed not to be zero
Return type:: float

New in version 0.3.
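
The idea can be sketched without NumPy: values whose magnitude is below a small threshold are replaced by that threshold, keeping the sign. The threshold of 1e-15 corresponds to the resolution NumPy reports for 64-bit floats; using it here, as well as the helper name not_zero_sketch, is an assumption of this sketch rather than a statement about the ASpecD source.

```python
import math

TINY = 1e-15  # assumed threshold (resolution of a 64-bit float in NumPy)


def not_zero_sketch(value):
    """Return value, or a tiny number of the same sign if too close to zero."""
    return math.copysign(max(abs(value), TINY), value)


safe = not_zero_sketch(0.0)
ratio = 1.0 / safe  # finite instead of raising ZeroDivisionError
print(safe)  # 1e-15
```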

aspecd.utils.isiterable(variable)

Check whether the given variable is iterable.

Lists, tuples, and NumPy arrays are iterable, and so are strings. Integers, however, are not.

Parameters:: variable – variable to check for being iterable
Returns:: answer – Whether the given variable is iterable
Return type:: bool
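
A common way to perform such a check is to ask Python for an iterator and catch the resulting TypeError. The sketch below illustrates this with a hypothetical helper isiterable_sketch; whether ASpecD implements the check in the same way is not stated here.

```python
def isiterable_sketch(variable):
    """Return True if an iterator can be obtained from variable."""
    try:
        iter(variable)
    except TypeError:
        return False
    return True


print(isiterable_sketch([1, 2, 3]))  # True
print(isiterable_sketch('abc'))      # True
print(isiterable_sketch(42))         # False
```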

aspecd.utils.get_package_data(name='', directory='')

Obtain contents from a non-code file (“package data”).

A note on obtaining data from the distributed package: Rather than manually playing around with paths relative to the package root directory, contents of non-code files need to be obtained in a way that works with different kinds of package installation.

Note that in Python, only files within the package, i.e. within the directory where all the modules are located, can be accessed, not files that reside in the root directory of the package.

Note

If you want to get package data from a package other than aspecd, you can prefix the name of the file to retrieve with the package name, using ‘@’ as a separator.

Suppose you want to retrieve the file __main__.py from the “pip” package (why should you?):

get_package_data('pip@__main__.py', directory='')

This would return the contents of this file.

Parameters:

name (str) –
Name of the file whose contents should be accessed.

In case the file should be retrieved from a different package, the package name can be prefixed, using ‘@’ as a separator.
directory (str) –
Directory within the package where the files are located.

Default: ‘’

Returns:

contents – String containing the contents of the non-code file.

Return type:

str

New in version 0.5.

aspecd.utils.change_working_dir(path='')

Context manager for temporarily changing the working directory.

Sometimes it is necessary to temporarily change the working directory, but one would like to ensure that the directory is reverted even in case an exception is raised.

Due to its nature as a context manager, this function can be used with a with statement. See below for an example.

Parameters:: path (str) – Path the current working directory should be changed to.

Examples

To temporarily change the working directory:

import os
from aspecd.utils import change_working_dir
with change_working_dir(os.path.join('some', 'path')):
    pass  # Do something that may raise an exception

This can come in quite handy in case of tests.

New in version 0.6.
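
How such a context manager guarantees that the directory is restored can be sketched with contextlib. The function change_working_dir_sketch is a hypothetical stand-in written for illustration, not the ASpecD implementation.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def change_working_dir_sketch(path):
    """Temporarily change the working directory, restoring it afterwards."""
    old_dir = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Runs even if the body of the with statement raises
        os.chdir(old_dir)


start_dir = os.getcwd()
with tempfile.TemporaryDirectory() as tmp:
    try:
        with change_working_dir_sketch(tmp):
            changed = os.path.realpath(os.getcwd()) == os.path.realpath(tmp)
            raise RuntimeError('simulated failure')
    except RuntimeError:
        pass
    restored_dir = os.getcwd()

print(changed, restored_dir == start_dir)  # True True
```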

aspecd.utils.get_logger(name='')

Get logger object for a given module.

Logging in libraries is slightly different from standard logging with respect to the handler attached to the logger. The general advice from the Python Logging HOWTO is to explicitly add the logging.NullHandler as a handler.

Additionally, if you want to add loggers for a library that inherits from/builds upon a framework, in this particular case a library/package built atop the ASpecD framework, you want to have loggers being children of the framework logger in order to have the framework catch the log messages your library/package creates.

Why does this matter, particularly for the ASpecD framework? If you want to have your log messages in a package based on the ASpecD framework appear when using recipe-driven data analysis, you need to have your package loggers to be in the hierarchy below the root logger of the ASpecD framework. For convenience and in order not to make any informed guesses on how the ASpecD framework root logger is named, simply use this function to create the loggers in the modules of your package.

Parameters:

name (str) –

Name of the module to get the logger for.

Usually, this will be set to __name__, as this expands to the name of the current module (including the package name, separated by a dot, if present).

Returns:

logger – Logger object for the module

The logger will have a logging.NullHandler handler attached, in line with the advice from the Python Logging HOWTO.

Return type:

logging.Logger

Examples

To add a logger to a module in your library/package based on the ASpecD framework, add something like this to the top of your module:

import aspecd.utils

logger = aspecd.utils.get_logger(__name__)

The important aspect here is to use __name__ as the name of the logger. The reason is simple: __name__ gets automatically expanded to the name of the current module, with the name of all parent modules/the package prefixed, using dot notation. The resulting logger will be situated in the hierarchy below the aspecd package logger. Suppose you have added the above lines to the processing module of your package mypackage, i.e. mypackage.processing. This will result in a logger named aspecd.mypackage.processing.

New in version 0.9.
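
The resulting logger hierarchy can be reproduced with the standard logging module. In the sketch below, the framework logger name 'aspecd' is taken from the example name given above, and get_logger_sketch is a hypothetical stand-in rather than the ASpecD implementation.

```python
import logging

FRAMEWORK_LOGGER_NAME = 'aspecd'  # taken from the example name in the text


def get_logger_sketch(name=''):
    """Return a logger below the framework logger with a NullHandler."""
    if name:
        full_name = '.'.join([FRAMEWORK_LOGGER_NAME, name])
    else:
        full_name = FRAMEWORK_LOGGER_NAME
    logger = logging.getLogger(full_name)
    logger.addHandler(logging.NullHandler())
    return logger


logger = get_logger_sketch('mypackage.processing')
print(logger.name)  # aspecd.mypackage.processing
```

Since logger names are hierarchical, any handler attached to the 'aspecd' logger also receives the records emitted by this child logger.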