aspecd.utils module

General purpose functions and classes used in other modules.

To avoid circular dependencies, this module does not depend on any other modules of the ASpecD package, but it can be imported into every other module.

aspecd.utils.full_class_name(object_)

Return full class name of an object including packages and modules.

Parameters

object_ (object) – object the class name should be inferred for

Returns

class_name – string with full class name of object

Return type

str

aspecd.utils.object_from_class_name(full_class_name_string)

Create object from full class name.

To obtain the full class name of an object, you might want to use the function full_class_name().

Parameters

full_class_name_string (str) – string with full class name of an object that shall be instantiated

Returns

object_ – object instantiated from the class given in full_class_name_string

Return type

object
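The two functions above form a round trip: one turns an object into a dotted class name, the other turns such a name back into an instance. The following is a minimal sketch of how such a pair could work using only the standard library. The helper names sketch_full_class_name and sketch_object_from_class_name are hypothetical, and the code is not taken from the ASpecD sources.

```python
import collections
import importlib


def sketch_full_class_name(object_):
    """Return '<module>.<class>' for the class of the given object."""
    cls = type(object_)
    return f"{cls.__module__}.{cls.__qualname__}"


def sketch_object_from_class_name(full_class_name_string):
    """Import the module part of the name and instantiate the class.

    Assumption: the class can be instantiated without arguments.
    """
    module_name, _, class_name = full_class_name_string.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)()


# Round trip: object -> name -> new object of the same class
name = sketch_full_class_name(collections.OrderedDict())
new_object = sketch_object_from_class_name(name)
print(name, type(new_object))
```

The sketch only works for classes that take no constructor arguments and that live at module level.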

class aspecd.utils.ToDictMixin

Bases: object

Mixin class for returning all public attributes as dict.

Sometimes there is the need to either exclude public attributes (in case of infinite loops created by trying to apply to_dict in this case) or to add (public) attributes, particularly those used by getters and setters that are otherwise not included.

To do so, there are two non-public attributes of this class that each class inheriting from it can set as well:

  • _exclude_from_to_dict

  • _include_in_to_dict

The names should be rather telling. For details, see below.

__odict__

Dictionary of attributes preserving the order of their definition

Type

collections.OrderedDict

_exclude_from_to_dict

Names of (public) attributes to exclude from dictionary

Usually, the reason to exclude public attributes from being added to the dictionary is to avoid infinite loops, as sometimes an object may contain a reference to another object that in turn references back.

Type

list

_include_in_to_dict

Names of (public) attributes to include into dictionary

Usual reasons for actively including (public) attributes into the dictionary are those attributes accessed by getters and setters and hence not automatically included in the list otherwise.

Type

list

to_dict(remove_empty=False)

Create dictionary containing public attributes of an object.

Parameters

remove_empty (bool) –

Whether to remove keys with empty values

Default: False

Returns

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type

collections.OrderedDict

Changed in version 0.6: New parameter remove_empty

Changed in version 0.9: Settings for properties to exclude and include are not traversed

Changed in version 0.9.1: Dictionaries get copied before traversing, as otherwise, the special variables __dict__ and __odict__ are modified, which may result in strange behaviour.

Changed in version 0.9.2: Dictionaries do not get copied by default, but there is a private method that can be overridden in derived classes to copy the dictionary.
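To illustrate how the exclude and include lists interact, here is a heavily simplified stand-in for the mixin. The classes SketchToDictMixin and Sample are hypothetical. Unlike the real class, the sketch does not traverse nested objects and returns a plain dict, and its notion of an "empty" value is an assumption.

```python
class SketchToDictMixin:
    """Simplified stand-in for illustration, not the ASpecD class."""

    def __init__(self):
        self._exclude_from_to_dict = []
        self._include_in_to_dict = []

    def to_dict(self, remove_empty=False):
        result = {}
        # Public attributes are those not starting with an underscore
        for key, value in vars(self).items():
            if key.startswith("_") or key in self._exclude_from_to_dict:
                continue
            result[key] = value
        # Properties are not part of vars(), hence added explicitly
        for key in self._include_in_to_dict:
            result[key] = getattr(self, key)
        if remove_empty:
            # Assumption: None, empty strings, lists and dicts are "empty"
            result = {
                key: value for key, value in result.items()
                if value not in (None, "", [], {})
            }
        return result


class Sample(SketchToDictMixin):
    def __init__(self):
        super().__init__()
        self.name = "sample 1"
        self.comment = ""
        self.parent = self  # back-reference a recursive traversal would follow endlessly
        self._temperature = 293.0  # only exposed through the property below
        self._exclude_from_to_dict = ["parent"]
        self._include_in_to_dict = ["temperature"]

    @property
    def temperature(self):
        return self._temperature


print(Sample().to_dict())
print(Sample().to_dict(remove_empty=True))
```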

aspecd.utils.get_aspecd_version()

Get version of ASpecD package.

The function directly reads the contents of the file VERSION in the root directory of the package installation, not relying on introspection.

Returns

version – Version number as string

Return type

str

aspecd.utils.package_version(name='')

Get version of arbitrary package.

The function relies on introspection using pkg_resources.

Parameters

name (str) – Name of package the version should be obtained for

Returns

version – Version number as string

Return type

str

aspecd.utils.package_name(obj=None)

Get name of package an object resides in.

Parameters

obj (object) –

Object the package it resides in should be returned.

If no object is given, the name of the package this function is defined in (“aspecd”) will be returned.

Returns

package_name – Name of the package

Return type

str
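One way to derive a package name from an object is to take the first component of the module its class is defined in. The sketch below assumes this approach; sketch_package_name is a hypothetical helper, and it does not cover the case of no object being given.

```python
import email.message


def sketch_package_name(obj):
    """Return the top-level package of the module defining obj's class."""
    return type(obj).__module__.split(".")[0]


# The class Message is defined in the module "email.message"
print(sketch_package_name(email.message.Message()))
```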

aspecd.utils.config_dir()

Get config directory for per-user configurations.

Configuration on a per-user level should be stored within a directory (only) readable by the currently logged-in user.

Returns

config_dir – Path to config directory, usually in the user’s directory

Return type

str

class aspecd.utils.Yaml

Bases: object

Handle reading from and writing to YAML files.

YAML file contents are read into an ordered dict, making use of the oyaml package. This preserves the order of the entries of any dict.

Note

The PyYAML package cannot handle NumPy arrays by default. Hence, there are two methods in this class for serialising and deserialising NumPy arrays:

  • serialise_numpy_arrays()

  • deserialise_numpy_arrays()

Have a look at their documentation for details of the implementation and how to use them. In particular, larger NumPy arrays are saved to files using the NumPy binary format.

dict

Contents read from/written to a YAML file

Type

collections.OrderedDict

numpy_array_size_threshold

Maximum size of NumPy array that is converted into a list using aspecd.utils.Yaml.serialise_numpy_arrays().

Larger NumPy arrays are saved to a file in NumPy format using numpy.save().

Default: 100 – i.e., arrays with >100 elements are stored in files.

Type

int

numpy_array_to_list

Whether to convert (small) NumPy arrays to a list or to a dict

Default: False

Type

bool

binary_files

List of names of the binary files containing large arrays

Type

list

binary_directory

Directory the binary files should be stored in

Default: ‘’

Type

str

loader

Type of loader used for loading the YAML file

The type of loader used is crucial for the safety of your application. See the documentation of the PyYAML package for details.

Default: yaml.SafeLoader

Type

yaml.loader

dumper

Type of dumper used for writing the YAML file

The type of dumper used should be compatible with the type of loader.

Default: yaml.SafeDumper

Type

yaml.Dumper

Raises

aspecd.utils.MissingFilenameError – Raised if no filename is given to read from/write to

Changed in version 0.4: Added numpy_array_to_list

Changed in version 0.5: Added dumper and set dumper to SafeDumper

read_from(filename='')

Read from YAML file.

Parameters

filename (str) – Name of the YAML file to read from.

Raises

aspecd.processing.MissingFilenameError – Raised if no filename is given to read from.

write_to(filename='')

Write to YAML file.

Parameters

filename (str) – Name of the YAML file to write to.

Raises

aspecd.processing.MissingFilenameError – Raised if no filename is given to write to.

read_stream(stream=None)

Read from stream.

Parameters

stream (bytes) – binary stream to read from

write_stream()

Write to stream.

Returns

stream – string representation of YAML file

Return type

str

serialise_numpy_arrays()

Serialise numpy arrays in a simple form, using a dict.

Background: The PyYAML package cannot easily handle NumPy arrays out of the box. However, datasets and the like should sometimes be serialised in the form of YAML files. Hence the need for a simple method of serialising NumPy arrays. Furthermore, larger NumPy arrays should not be serialised in text form in YAML directly, but rather be stored in binary form, probably in a separate file.

The reason for saving (larger) NumPy arrays in binary rather than text form is twofold: The size of a binary file is much smaller, and the original precision is retained. Those interested in the representation of floating-point values in computers should consult:

  • David Goldberg. What every computer scientist should know about floating-point arithmetic. ACM Comput. Surv. 23 (1991):5–48. DOI:10.1145/103162.103163

As binary format, the NumPy format (see numpy.lib.format) gets used.

As filename, the SHA256 hash of the array will be used. Thus, the names are unique with respect to the content. Having several identical arrays within one dict that gets serialised is not a problem: identical arrays yield identical hashes and hence the same file with identical content. As a result, several identical arrays lead to fewer files being written, saving space.
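The idea of naming a file after a hash of its content can be sketched with the standard library alone. To stay free of dependencies, the example uses array.array as a stand-in for a NumPy array; with NumPy, the bytes could be obtained via ndarray.tobytes(). Exactly which bytes ASpecD hashes is an assumption here, and content_filename is a hypothetical helper.

```python
import hashlib
from array import array


def content_filename(data):
    """Return the SHA256 hex digest of the given bytes as a file name."""
    return hashlib.sha256(data).hexdigest()


# Two arrays with identical content, one with different content
first = array("d", [0.1, 0.2, 0.3]).tobytes()
second = array("d", [0.1, 0.2, 0.3]).tobytes()
other = array("d", [0.1, 0.2, 0.4]).tobytes()

# Identical content maps to the same file name, so only one file is needed
print(content_filename(first) == content_filename(second))
print(content_filename(first) == content_filename(other))
```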

serialize_numpy_arrays()

Serialise NumPy arrays for our AE-speaking friends.

See aspecd.utils.Yaml.serialise_numpy_arrays() for details.

deserialise_numpy_arrays()

Deserialise specially crafted dicts into NumPy arrays.

For reasons given in the documentation of aspecd.utils.Yaml.serialise_numpy_arrays(), NumPy arrays are handled separately and converted into dicts with some special fields. This method deserialises them back into NumPy arrays, as they were originally.

deserialize_numpy_arrays()

Deserialise NumPy arrays for our AE-speaking friends.

See aspecd.utils.Yaml.deserialise_numpy_arrays() for details.

aspecd.utils.replace_value_in_dict(replacement=None, target=None)

Replace value for given key in a dictionary, traversing recursively.

The key in replacement needs to correspond to the value in target that should be replaced. Keys in the dict replacement that have no corresponding value in target are silently ignored.

Parameters
  • replacement (dict) – dict containing key corresponding to the value in target that should be replaced by the associated value

  • target (dict) – dict whose values are replaced: a value that matches a key in replacement is replaced by the value stored under that key

Returns

target – dict containing the key whose value has been replaced by the value of the corresponding key in replacement

Return type

dict
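The lookup direction is easy to get wrong: values in target are matched against keys in replacement. The following sketch re-implements the described behaviour in plain Python. The function sketch_replace_value_in_dict is hypothetical, and restricting the lookup to string values is an assumption made to avoid unhashable values.

```python
def sketch_replace_value_in_dict(replacement, target):
    """Replace values in target that appear as keys in replacement."""
    for key, value in target.items():
        if isinstance(value, dict):
            # Traverse nested dicts recursively
            sketch_replace_value_in_dict(replacement, value)
        elif isinstance(value, str) and value in replacement:
            target[key] = replacement[value]
    return target


replacement = {"dataset_label": "measurement-01", "unused": "ignored"}
target = {
    "title": "dataset_label",
    "axes": {"source": "dataset_label", "unit": "mT"},
}
result = sketch_replace_value_in_dict(replacement, target)
print(result)
```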

aspecd.utils.copy_values_between_dicts(source=None, target=None)

Copy values between two dicts in case of identical keys.

Each value in source is copied to target in case of matching keys in a recursive manner. Non-matching keys in source are silently ignored.

Both source and target are traversed recursively in parallel if they share the same overall structure.

Parameters
  • source (dict) – Dictionary the values should be copied from in case of matching keys

  • target (dict) – Dictionary the values should be copied to in case of matching keys

Returns

target – Dictionary the values have been copied to in case of matching keys

Return type

dict
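A minimal sketch of the described behaviour, assuming that only keys already present in target receive values. The function sketch_copy_values is hypothetical and not taken from the ASpecD sources.

```python
def sketch_copy_values(source, target):
    """Copy values from source to target for keys present in both."""
    for key, value in source.items():
        if key not in target:
            continue  # non-matching keys in source are ignored
        if isinstance(value, dict) and isinstance(target[key], dict):
            sketch_copy_values(value, target[key])
        else:
            target[key] = value
    return target


source = {"a": 1, "nested": {"b": 2, "x": 9}, "extra": 3}
target = {"a": 0, "nested": {"b": 0, "c": 0}}
copied = sketch_copy_values(source, target)
print(copied)
```

Note that neither "extra" nor the nested key "x" ends up in the result, as target has no such keys.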

aspecd.utils.copy_keys_between_dicts(source=None, target=None)

Copy keys between two dicts.

Each key in source is copied to target in a recursive manner. If the key in source is a dict and exists in target, the two dicts will be joined, not losing keys in target.

If, however, the key in source is not a dict, but the corresponding key in target is a dict, the corresponding value in target will be replaced with that from source.

Parameters
  • source (dict) – Dictionary the keys should be copied from

  • target (dict) – Dictionary the keys should be copied to

Returns

target – Dictionary the keys have been copied to

Return type

dict
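The two cases described above (joining dicts versus replacing a value) can be sketched as follows. The function sketch_copy_keys is hypothetical. That keys are copied together with their values, and that plain values present in both dicts are overwritten, are assumptions of this sketch.

```python
def sketch_copy_keys(source, target):
    """Copy all keys (with their values) from source to target."""
    for key, value in source.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            # Both sides are dicts: join them, keeping keys in target
            sketch_copy_keys(value, target[key])
        else:
            # New key, or source value is not a dict: take value from source
            target[key] = value
    return target


source = {"a": {"x": 1}, "b": 5, "c": "new"}
target = {"a": {"y": 2}, "b": {"z": 3}}
joined = sketch_copy_keys(source, target)
print(joined)
```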

aspecd.utils.remove_empty_values_from_dict(dict_)

Remove empty values and their keys from dict recursively.

Parameters

dict_ (dict) – Dictionary the keys with empty values should be removed from

Returns

dict_ – Dictionary the keys with empty values have been removed from

Return type

dict

New in version 0.2.1.
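What counts as "empty" is not spelled out above. The sketch below assumes that None, empty strings, and empty containers are empty, while 0 and False are not. The function sketch_remove_empty is hypothetical and, unlike the real function might, returns a new dict rather than modifying its argument.

```python
def sketch_remove_empty(dict_):
    """Return a copy of dict_ without keys that have empty values."""
    cleaned = {}
    for key, value in dict_.items():
        if isinstance(value, dict):
            value = sketch_remove_empty(value)
        # Assumption: these values count as empty; 0 and False do not
        if value in (None, "", [], {}, ()):
            continue
        cleaned[key] = value
    return cleaned


metadata = {
    "name": "sample 1",
    "comment": "",
    "operator": {"name": None},
    "scans": 0,
}
print(sketch_remove_empty(metadata))
```

A nested dict that becomes empty after cleaning is removed as well, as "operator" shows.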

aspecd.utils.convert_keys_to_variable_names(dict_)

Change keys in dict to comply with PEP8 for variable names.

Keys are converted to all lower case and spaces replaced with underscores.

Parameters

dict_ (dict) – Dictionary the keys should be renamed in

Returns

new_dict – Dictionary the keys have been renamed in

Return type

dict

New in version 0.2.1.
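The conversion rule (lower case, spaces to underscores) is simple enough to sketch directly. The function sketch_convert_keys is hypothetical, and descending into nested dicts is an assumption.

```python
def sketch_convert_keys(dict_):
    """Return a new dict with keys in lower case and spaces as underscores."""
    new_dict = {}
    for key, value in dict_.items():
        if isinstance(value, dict):
            value = sketch_convert_keys(value)  # assumption: nested dicts too
        new_dict[key.lower().replace(" ", "_")] = value
    return new_dict


raw = {"Sample Name": "TEMPO", "Magnetic Field": {"Start Value": 330}}
print(sketch_convert_keys(raw))
```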

aspecd.utils.all_equal(list_=None)

Check whether all elements of a list are equal.

Parameters

list_ (list) – List whose elements should be checked for being equal

Returns

result – Whether all elements of the list are equal

Return type

bool
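A minimal sketch of such a check, comparing every element with the first one. The function sketch_all_equal is hypothetical, and treating an empty list as "all equal" is an assumption of the sketch, not documented behaviour.

```python
def sketch_all_equal(list_):
    """Return True if every element equals the first element."""
    return all(element == list_[0] for element in list_)


print(sketch_all_equal([1, 1, 1]))
print(sketch_all_equal([1, 2, 1]))
```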

class aspecd.utils.Properties

Bases: aspecd.utils.ToDictMixin

General properties class allowing to set properties from dict.

Properties classes often need to set their properties from dicts, and additionally, they should be able to convert themselves into dicts for persistence.

from_dict(dict_=None)

Set attributes from dictionary.

Parameters

dict_ (dict) – Dictionary containing information of a task.

Raises

aspecd.plotting.MissingDictError – Raised if no dict is provided.

get_properties()

Return (public) properties, i.e. attributes that are not methods.

Returns

properties – public properties

Return type

list

to_dict(remove_empty=False)

Create dictionary containing public attributes of an object.

Parameters

remove_empty (bool) –

Whether to remove keys with empty values

Default: False

Returns

public_attributes – Ordered dictionary containing the public attributes of the object

The order of attribute definition is preserved

Return type

collections.OrderedDict

Changed in version 0.6: New parameter remove_empty

Changed in version 0.9: Settings for properties to exclude and include are not traversed

Changed in version 0.9.1: Dictionaries get copied before traversing, as otherwise, the special variables __dict__ and __odict__ are modified, which may result in strange behaviour.

Changed in version 0.9.2: Dictionaries do not get copied by default, but there is a private method that can be overridden in derived classes to copy the dictionary.

aspecd.utils.basename(filename)

Return basename of given filename.

Parameters

filename (str) – Name of the file (may contain absolute path and extension) the basename should be returned for.

Returns

basename – Basename corresponding to the filename provided as input.

Return type

str

aspecd.utils.path(filename)

Return path of given filename, with trailing separator.

Parameters

filename (str) – Name of the file (may contain absolute path and extension) the path should be returned for.

Returns

path – path corresponding to the filename provided as input.

Return type

str
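Both helpers can be approximated with os.path. The sketch below assumes that basename() strips both the directory and the extension, as the parameter description suggests; sketch_basename and sketch_path are hypothetical names.

```python
import os


def sketch_basename(filename):
    """Return the file name without directory and without extension."""
    return os.path.splitext(os.path.basename(filename))[0]


def sketch_path(filename):
    """Return the directory part of filename with a trailing separator."""
    return os.path.dirname(filename) + os.path.sep


filename = os.path.join("data", "raw", "spectrum.txt")
print(sketch_basename(filename))
print(sketch_path(filename))
```

For a bare file name without any directory, this sketch would return just the separator, a case a real implementation would have to handle separately.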

aspecd.utils.not_zero(value)

Return a value that is not zero to prevent DivisionByZero errors.

Dividing by zero results in NaN values and often hinders evaluating mathematical models. A solution adopted from the lmfit Python package (https://doi.org/10.5281/zenodo.598352) returns a value equivalent to the resolution of a numpy float.

Note

If you use this function excessively within a module, mostly within rather complicated mathematical equations, it might be a good idea to import this function explicitly, to shorten the code, such as: from aspecd.utils import not_zero. As usual, readability is king.

Parameters

value (float) – Value that can become (too close to) zero to trigger NaN values

Returns

value – Value guaranteed not to be zero

Return type

float

New in version 0.3.
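The idea can be sketched without NumPy: values whose magnitude falls below a small threshold are replaced by that threshold, preserving the sign. The threshold of 1e-15 is an assumption, chosen because it is the resolution NumPy reports for 64-bit floats; sketch_not_zero is a hypothetical name.

```python
import math

TINY = 1e-15  # assumption: resolution of a 64-bit NumPy float


def sketch_not_zero(value):
    """Return value, or a tiny number of the same sign if too close to zero."""
    return math.copysign(max(abs(value), TINY), value)


# Safe to use as a denominator even if the input is exactly zero
print(1.0 / sketch_not_zero(0.0))
print(sketch_not_zero(2.5))
```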

aspecd.utils.isiterable(variable)

Check whether the given variable is iterable.

Lists, tuples, and NumPy arrays are iterable, and so are strings. Integers, however, are not.

Parameters

variable – variable to check for being iterable

Returns

answer – Whether the given variable is iterable

Return type

bool
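A common way to implement such a check is to ask Python for an iterator and catch the resulting error. The sketch below assumes this approach; sketch_isiterable is a hypothetical name.

```python
def sketch_isiterable(variable):
    """Return True if iter() accepts the variable, False otherwise."""
    try:
        iter(variable)
    except TypeError:
        return False
    return True


for candidate in ([1, 2], (1, 2), "abc", 42):
    print(repr(candidate), sketch_isiterable(candidate))
```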

aspecd.utils.get_package_data(name='', directory='')

Obtain contents from a non-code file (“package data”).

A note on obtaining data from the distributed package: Rather than manually playing around with paths relative to the package root directory, contents of non-code files need to be obtained in a way that works with different kinds of package installation.

Note that in Python, only files within the package, i.e. within the directory where all the modules are located, can be accessed, not files that reside in the root directory of the package.

Note

In case you want to get package data from a package other than aspecd, you can prefix the name of the file to retrieve with the package name, using the ‘@’ as a separator.

Suppose you want to retrieve the file __main__.py from the “pip” package (why should you?):

get_package_data('pip@__main__.py', directory='')

This would return the contents of this file.

Parameters
  • name (str) –

    Name of the file whose contents should be accessed.

    In case the file should be retrieved from a different package, the package name can be prefixed, using ‘@’ as a separator.

  • directory (str) –

    Directory within the package where the files are located.

    Default: ‘’

Returns

contents – String containing the contents of the non-code file.

Return type

str

New in version 0.5.
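The ‘@’ convention can be sketched as a small parsing step in front of a resource lookup. The helper names below are hypothetical, and using pkgutil.get_data() for the lookup is an assumption about one possible implementation, not a statement about the ASpecD sources.

```python
import pkgutil


def split_package_and_file(name, default_package="aspecd"):
    """Split 'package@file' into its parts, falling back to the default."""
    package, separator, filename = name.rpartition("@")
    if not separator:
        package = default_package
    return package, filename


def sketch_get_package_data(name, directory=""):
    """Return the contents of a package data file as string, or None."""
    package, filename = split_package_and_file(name)
    # Resource names always use "/" as separator, regardless of platform
    resource = "/".join(part for part in (directory, filename) if part)
    data = pkgutil.get_data(package, resource)
    return None if data is None else data.decode("utf-8")


print(split_package_and_file("pip@__main__.py"))
print(split_package_and_file("template.yaml"))
```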

aspecd.utils.change_working_dir(path='')

Context manager for temporarily changing the working directory.

Sometimes it is necessary to temporarily change the working directory, but one would like to ensure that the directory is reverted even in case an exception is raised.

Due to its nature as a context manager, this function can be used with a with statement. See below for an example.

Parameters

path (str) – Path the current working directory should be changed to.

Examples

To temporarily change the working directory:

import os

with change_working_dir(os.path.join('some', 'path')):
    pass  # Do something that may raise an exception

This can come in quite handy in case of tests.
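How such a context manager guarantees that the directory is restored can be sketched with contextlib. The function sketch_change_working_dir is a hypothetical stand-in, not the ASpecD implementation.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def sketch_change_working_dir(path):
    """Change to path and always change back, even on exceptions."""
    old_dir = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(old_dir)


start = os.getcwd()
with tempfile.TemporaryDirectory() as tmp:
    try:
        with sketch_change_working_dir(tmp):
            raise RuntimeError("something went wrong")
    except RuntimeError:
        pass
# The exception did not prevent the working directory from being restored
restored = os.getcwd() == start
print(restored)
```

The try/finally around the yield is what makes the restore unconditional.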

New in version 0.6.

aspecd.utils.get_logger(name='')

Get logger object for a given module.

Logging in libraries is slightly different from standard logging with respect to the handler attached to the logger. The general advice from the Python Logging HOWTO is to explicitly add the logging.NullHandler as a handler.

Additionally, if you want to add loggers for a library that inherits from/builds upon a framework, in this particular case a library/package built atop the ASpecD framework, you want to have loggers being children of the framework logger in order to have the framework catch the log messages your library/package creates.

Why does this matter, particularly for the ASpecD framework? If you want to have your log messages in a package based on the ASpecD framework appear when using recipe-driven data analysis, you need to have your package loggers to be in the hierarchy below the root logger of the ASpecD framework. For convenience and in order not to make any informed guesses on how the ASpecD framework root logger is named, simply use this function to create the loggers in the modules of your package.

Parameters

name (str) –

Name of the module to get the logger for.

Usually, this will be set to __name__, as this returns the current module (including the package name and separated with a . if present).

Returns

logger – Logger object for the module

The logger will have a logging.NullHandler handler attached, in line with the advice from the Python Logging HOWTO.

Return type

logging.Logger

Examples

To add a logger to a module in your library/package based on the ASpecD framework, add something like this to the top of your module:

import aspecd.utils

logger = aspecd.utils.get_logger(__name__)

The important aspect here is to use __name__ as the name of the logger. The reason is simple: __name__ gets automatically expanded to the name of the current module, with the name of all parent modules/the package prefixed, using dot notation. The resulting logger will be situated in the hierarchy below the aspecd package logger. Suppose you have added the above lines to the processing module of your package mypackage. This will result in a logger aspecd.mypackage.processing.
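The effect of placing a logger below the framework logger can be sketched with the standard logging module alone. The function sketch_get_logger and the class ListHandler are hypothetical, and prefixing the name with "aspecd" is an assumption derived from the resulting logger name given above.

```python
import logging


def sketch_get_logger(name=""):
    """Return a logger below the 'aspecd' logger with a NullHandler attached."""
    full_name = ".".join(part for part in ("aspecd", name) if part)
    logger = logging.getLogger(full_name)
    logger.addHandler(logging.NullHandler())
    return logger


class ListHandler(logging.Handler):
    """Collect log messages in a list, standing in for a framework handler."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


logger = sketch_get_logger("mypackage.processing")

# A handler attached to the framework logger receives the child's messages
framework_handler = ListHandler()
logging.getLogger("aspecd").addHandler(framework_handler)
logger.warning("baseline correction skipped")
print(logger.name, framework_handler.messages)
```

Because log records propagate upwards through the dotted hierarchy, a handler on the framework logger sees messages from every package logger created this way.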

New in version 0.9.