aspecd.utils module
General purpose functions and classes used in other modules.
To avoid circular dependencies, this module does not depend on any other modules of the ASpecD package, but it can be imported into every other module.
- aspecd.utils.full_class_name(object_)
Return full class name of an object including packages and modules.
- Parameters
object_ (object) – object the class name should be inferred for
- Returns
class_name – string with full class name of object
- Return type
str
- aspecd.utils.object_from_class_name(full_class_name_string)
Create object from full class name.
To obtain the full class name of an object, you might want to use the function full_class_name().
- Parameters
full_class_name_string (str) – string with full class name of an object that shall be instantiated
- Returns
object_ – object instantiated from the class given in full_class_name_string
- Return type
object
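The round trip between an object and its full class name can be illustrated with a minimal sketch. It is not the ASpecD implementation: the two helpers below are re-implemented with the standard library under the assumption that the full class name has the form module.ClassName and that the class can be instantiated without arguments.

```python
import importlib
from collections import OrderedDict


def full_class_name(object_):
    """Return 'module.ClassName' for the class of the given object."""
    cls = object_.__class__
    return f"{cls.__module__}.{cls.__name__}"


def object_from_class_name(full_class_name_string):
    """Instantiate an object (without arguments) from 'module.ClassName'."""
    module_name, class_name = full_class_name_string.rsplit(".", 1)
    module = importlib.import_module(module_name)
    return getattr(module, class_name)()


name = full_class_name(OrderedDict())
clone = object_from_class_name(name)
```

Here, name holds the dotted path of the class and clone is a fresh, empty instance of that class.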
- class aspecd.utils.ToDictMixin
Bases: object
Mixin class for returning all public attributes as dict.
Sometimes there is the need to either exclude public attributes (to avoid the infinite loops that trying to apply to_dict would create in this case) or to add (public) attributes, particularly those used by getters and setters that are otherwise not included.
To do so, there are two non-public attributes of this class that each class inheriting from it will be able to set as well:
- _exclude_from_to_dict
- _include_in_to_dict
The names should be rather telling. For details, see below.
- __odict__
Dictionary of attributes preserving the order of their definition
- _exclude_from_to_dict
Names of (public) attributes to exclude from dictionary
Usually, the reason to exclude public attributes from being added to the dictionary is to avoid infinite loops, as sometimes an object may contain a reference to another object that in turn references back.
- Type
- _include_in_to_dict
Names of (public) attributes to include into dictionary
The usual reason for actively including (public) attributes in the dictionary is that attributes accessed by getters and setters are not automatically included otherwise.
- Type
- to_dict(remove_empty=False)
Create dictionary containing public attributes of an object.
- Parameters
remove_empty (bool) – Whether to remove keys with empty values
Default: False
- Returns
public_attributes – Ordered dictionary containing the public attributes of the object
The order of attribute definition is preserved
- Return type
Changed in version 0.6: New parameter remove_empty
Changed in version 0.9: Settings for properties to exclude and include are not traversed
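How the two non-public lists interact with to_dict can be sketched as follows. This is a simplified, hypothetical mixin (ToDictSketch) written for illustration only: unlike the real class, it does not recurse into nested objects and has no remove_empty parameter. The class Node and its attributes are made up.

```python
from collections import OrderedDict


class ToDictSketch:
    """Hypothetical mixin: collect public attributes into an ordered dict."""

    def __init__(self):
        self._exclude_from_to_dict = []
        self._include_in_to_dict = []

    def to_dict(self):
        result = OrderedDict()
        for key, value in vars(self).items():
            # Skip non-public attributes and explicitly excluded ones
            if key.startswith("_") or key in self._exclude_from_to_dict:
                continue
            result[key] = value
        # Properties are not part of vars(self), hence add them explicitly
        for key in self._include_in_to_dict:
            result[key] = getattr(self, key)
        return result


class Node(ToDictSketch):
    def __init__(self, label, parent=None):
        super().__init__()
        self.label = label
        self.parent = parent  # back-reference that might lead to a loop
        self._size = 3
        self._exclude_from_to_dict = ["parent"]
        self._include_in_to_dict = ["size"]

    @property
    def size(self):
        return self._size


node_dict = Node("child", parent=object()).to_dict()
```

The resulting dict contains label and size, but neither parent nor any attribute starting with an underscore.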
- aspecd.utils.get_aspecd_version()
Get version of ASpecD package.
The function directly reads the contents of the file VERSION in the root directory of the package installation, not relying on introspection.
- Returns
version – Version number as string
- Return type
str
- aspecd.utils.package_version(name='')
Get version of arbitrary package.
The function relies on introspection using pkg_resources.
- aspecd.utils.package_name(obj=None)
Get name of package an object resides in.
- aspecd.utils.config_dir()
Get config directory for per-user configurations.
Configuration on a per-user level should be stored within a directory (only) readable by the currently logged-in user.
- Returns
config_dir – Path to config directory, usually in the user’s directory
- Return type
- class aspecd.utils.Yaml
Bases: object
Handle reading from and writing to YAML files.
YAML file contents are read into an ordered dict, making use of the oyaml package. This preserves the order of the entries of any dict.
Note
The PyYAML package cannot handle NumPy arrays by default. Hence, there are two methods in this class for serialising and deserialising NumPy arrays:
- serialise_numpy_arrays()
- deserialise_numpy_arrays()
Have a look at their documentation for details of the implementation and how to use them. In particular, larger NumPy arrays are saved to files using the NumPy binary format.
- dict
Contents read from/written to a YAML file
- numpy_array_size_threshold
Maximum size of NumPy array that is converted into a list using aspecd.utils.Yaml.serialise_numpy_arrays().
Larger NumPy arrays are saved to a file in NumPy format using numpy.save().
Default: 100 – i.e., arrays with >100 elements are stored in files.
- Type
- numpy_array_to_list
Whether to convert (small) NumPy arrays to a list or to a dict
Default: False
- Type
bool
- loader
Type of loader used for loading the YAML file
The type of loader used is crucial for the safety of your application. See the documentation of the PyYAML package for details.
Default: yaml.SafeLoader
- Type
yaml.loader
- dumper
Type of dumper used for writing the YAML file
The type of dumper used should be compatible with the type of loader.
Default: yaml.SafeDumper
- Type
yaml.Dumper
- Raises
aspecd.utils.MissingFilenameError – Raised if no filename is given to read from/write to
Changed in version 0.4: Added numpy_array_to_list
Changed in version 0.5: Added dumper and set dumper to SafeDumper
- read_from(filename='')
Read from YAML file.
- Parameters
filename (str) – Name of the YAML file to read from.
- Raises
aspecd.processing.MissingFilenameError – Raised if no filename is given to read from.
- write_to(filename='')
Write to YAML file.
- Parameters
filename (str) – Name of the YAML file to write to.
- Raises
aspecd.processing.MissingFilenameError – Raised if no filename is given to write to.
- write_stream()
Write to stream.
- Returns
stream – string representation of YAML file
- Return type
str
- serialise_numpy_arrays()
Serialise numpy arrays in a simple form, using a dict.
Background: The PyYAML package cannot easily handle NumPy arrays out of the box. However, datasets and the like should sometimes be serialised in form of YAML files. Hence the need for a simple method of serialising NumPy arrays. Furthermore, larger NumPy arrays should not be serialised in text form in YAML directly, but rather be stored in binary form, probably in a separate file.
The reason for saving (larger) NumPy arrays in binary rather than text form is twofold: The size of a binary file is much smaller, and the original precision is retained. Those interested in the representation of floating-point values in computers should consult:
David Goldberg. What every computer scientist should know about floating-point arithmetic. ACM Comput. Surv. 23 (1991):5–48. DOI:10.1145/103162.103163
As binary format, the NumPy format (see numpy.lib.format) is used.
As filename, the SHA256 hash of the array is used. Thus, the names are unique with respect to the content. Having several identical arrays within one dict that gets serialised is not a problem: identical arrays yield identical hashes and hence the same filename, and as the hashes are reasonably unique, a file with a given name always has the same content. Thus, having several identical arrays leads to fewer files written, eventually saving space.
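The idea of naming files after the SHA256 hash of their content can be sketched with the standard library alone. This is an illustration, not the ASpecD code: raw bytes stand in for the array data to keep the example free of dependencies, and the helper save_by_content_hash is a hypothetical name.

```python
import hashlib
import os
import tempfile


def save_by_content_hash(data, directory):
    """Write bytes to a file named after their SHA256 hash; return the name."""
    filename = hashlib.sha256(data).hexdigest()
    filepath = os.path.join(directory, filename)
    if not os.path.exists(filepath):  # identical content is written only once
        with open(filepath, "wb") as file:
            file.write(data)
    return filename


with tempfile.TemporaryDirectory() as tmpdir:
    first = save_by_content_hash(b"\x00\x01\x02", tmpdir)
    second = save_by_content_hash(b"\x00\x01\x02", tmpdir)
    third = save_by_content_hash(b"\x03\x04", tmpdir)
    files_written = sorted(os.listdir(tmpdir))
```

Saving the same content twice yields the same filename, so only two files exist for the three calls.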
- serialize_numpy_arrays()
Serialise numpy arrays for our AE-speaking friends.
See aspecd.utils.Yaml.serialise_numpy_arrays() for details.
- deserialise_numpy_arrays()
Deserialise specially crafted dicts into NumPy arrays.
For reasons given in the documentation of aspecd.utils.Yaml.serialise_numpy_arrays(), NumPy arrays are handled separately and converted into dicts with some special fields. This method deserialises them back into NumPy arrays, as they were originally.
- deserialize_numpy_arrays()
Deserialise numpy arrays for our AE-speaking friends.
See aspecd.utils.Yaml.deserialise_numpy_arrays() for details.
- aspecd.utils.replace_value_in_dict(replacement=None, target=None)
Replace value for given key in a dictionary, traversing recursively.
The key in replacement needs to correspond to the value in target_dict that should be replaced. Keys in the dict replacement that have no corresponding value in target_dict are silently ignored.
- Parameters
- Returns
target – dict containing the key whose value has been replaced by the value of the corresponding key in replacement
- Return type
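One possible reading of this behaviour is sketched below. The function replace_values is a hypothetical stand-in written from the description above, not the ASpecD implementation: every value in the target that equals a key of the replacement dict gets replaced by the corresponding value, nested dicts are traversed, and unmatched keys are ignored.

```python
from collections.abc import Hashable


def replace_values(replacement, target):
    """Replace values in target (in place) that match a key in replacement."""
    for key, value in target.items():
        if isinstance(value, dict):
            replace_values(replacement, value)  # traverse nested dicts
        elif isinstance(value, Hashable) and value in replacement:
            target[key] = replacement[value]
    return target


replaced = replace_values(
    {"old.csv": "new.csv", "unused": "x"},
    {"source": "old.csv", "nested": {"copy": "old.csv", "other": 42},
     "items": [1, 2]},
)
```

Both occurrences of "old.csv" are replaced, whereas the key "unused" has no effect.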
- aspecd.utils.copy_values_between_dicts(source=None, target=None)
Copy values between two dicts in case of identical keys.
Each value in source is copied to target in case of matching keys in a recursive manner. Non-matching keys in source are silently ignored.
Both source and target are traversed recursively in parallel in case of identical overall structure.
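The described behaviour can be sketched as follows. The function copy_values is a hypothetical re-implementation based solely on the description above and may differ from the ASpecD code in edge cases.

```python
def copy_values(source, target):
    """Copy values from source to target (in place) for matching keys only."""
    for key, value in source.items():
        if key not in target:
            continue  # non-matching keys in source are silently ignored
        if isinstance(value, dict) and isinstance(target[key], dict):
            copy_values(value, target[key])  # traverse both dicts in parallel
        else:
            target[key] = value
    return target


copied = copy_values(
    {"a": 1, "b": {"c": 2, "x": 9}, "z": 0},
    {"a": None, "b": {"c": None, "d": 4}},
)
```

Only the keys already present in the target receive new values; the keys x and z of the source do not appear in the result.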
- aspecd.utils.copy_keys_between_dicts(source=None, target=None)
Copy keys between two dicts.
Each key in source is copied to target in a recursive manner. If the key in source is a dict and exists in target, the two dicts will be joined, not losing keys in target.
If, however, the key in source is not a dict, but the corresponding key in target is a dict, the corresponding value in target will be replaced with that from source.
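A minimal sketch of the joining behaviour, assuming the semantics given above. The function copy_keys is hypothetical, and the use of copy.deepcopy (to avoid sharing nested objects between source and target) is a design choice of this sketch, not something the documentation states.

```python
import copy


def copy_keys(source, target):
    """Copy keys from source to target (in place), joining nested dicts."""
    for key, value in source.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            copy_keys(value, target[key])  # join, keeping keys in target
        else:
            # New key, or a value replacing whatever target had before
            target[key] = copy.deepcopy(value)
    return target


joined = copy_keys(
    {"a": {"b": 1}, "c": "plain", "e": 5},
    {"a": {"keep": 0}, "c": {"d": 2}},
)
```

The nested dict under a is joined, the dict under c is replaced by the plain value from the source, and e is added as a new key.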
- aspecd.utils.remove_empty_values_from_dict(dict_)
Remove empty values and their keys from dict recursively.
- Parameters
dict_ (dict) – Dictionary the keys with empty values should be removed from
- Returns
dict_ – Dictionary the keys with empty values have been removed from
- Return type
dict
New in version 0.2.1.
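What counts as "empty" is not spelled out here. The following sketch assumes that None, empty strings, and empty containers are empty, while numeric zero and False are kept. The function remove_empty_values is a hypothetical illustration, not the ASpecD implementation.

```python
def remove_empty_values(dict_):
    """Return a copy of dict_ without keys whose values are empty."""
    cleaned = {}
    for key, value in dict_.items():
        if isinstance(value, dict):
            value = remove_empty_values(value)  # clean nested dicts first
        if value in (None, "", [], {}, ()):  # assumed notion of "empty"
            continue
        cleaned[key] = value
    return cleaned


cleaned = remove_empty_values(
    {"a": 1, "b": "", "c": {"d": None, "e": []}, "f": 0}
)
```

Note that the nested dict under c becomes empty after cleaning and is therefore removed as well.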
- aspecd.utils.convert_keys_to_variable_names(dict_)
Change keys in dict to comply with PEP8 for variable names.
Keys are converted to all lower case and spaces replaced with underscores.
- Parameters
dict_ (dict) – Dictionary the keys should be renamed in
- Returns
new_dict – Dictionary the keys have been renamed in
- Return type
dict
New in version 0.2.1.
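The key conversion described above can be sketched in a few lines. The function convert_keys is hypothetical; whether the real function also converts keys of nested dicts is not stated here, so this sketch only handles the top level and assumes string keys.

```python
def convert_keys(dict_):
    """Return a new dict with lower-case keys and underscores for spaces."""
    return {
        key.lower().replace(" ", "_"): value for key, value in dict_.items()
    }


converted = convert_keys({"Sample Name": "x", "Temperature": 3})
```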
- aspecd.utils.all_equal(list_=None)
Check whether all elements of a list are equal.
- class aspecd.utils.Properties
Bases: aspecd.utils.ToDictMixin
General properties class allowing to set properties from dict.
Properties classes often need to set their properties from dicts, and additionally, they should be able to convert themselves into dicts for persistence.
- from_dict(dict_=None)
Set attributes from dictionary.
- Parameters
dict_ (dict) – Dictionary containing information of a task.
- Raises
aspecd.plotting.MissingDictError – Raised if no dict is provided.
- get_properties()
Return (public) properties, i.e. attributes that are not methods.
- Returns
properties – public properties
- Return type
- to_dict(remove_empty=False)
Create dictionary containing public attributes of an object.
- Parameters
remove_empty (bool) – Whether to remove keys with empty values
Default: False
- Returns
public_attributes – Ordered dictionary containing the public attributes of the object
The order of attribute definition is preserved
- Return type
Changed in version 0.6: New parameter remove_empty
Changed in version 0.9: Settings for properties to exclude and include are not traversed
- aspecd.utils.basename(filename)
Return basename of given filename.
- aspecd.utils.path(filename)
Return path of given filename, with trailing separator.
- aspecd.utils.not_zero(value)
Return a value that is not zero to prevent DivisionByZero errors.
Dividing by zero results in NaN values and often hinders evaluating mathematical models. A solution adopted from the lmfit Python package (https://doi.org/10.5281/zenodo.598352) returns a value equivalent to the resolution of a numpy float.
Note
If you use this function extensively within a module, mostly within rather complicated mathematical equations, it might be a good idea to import this function explicitly to shorten the code, such as: from aspecd.utils import not_zero. As usual, readability is king.
- Parameters
value (float) – Value that can become (too close to) zero to trigger NaN values
- Returns
value – Value guaranteed not to be zero
- Return type
float
New in version 0.3.
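The underlying idea can be sketched with the standard library. This is an assumption-laden illustration, not the ASpecD code: the threshold TINY = 1e-15 is assumed to correspond to the resolution of a double-precision float, and not_zero_sketch is a hypothetical name. Values closer to zero than the threshold are replaced by the threshold, keeping the sign.

```python
import math

TINY = 1e-15  # assumed threshold: resolution of a double-precision float


def not_zero_sketch(value):
    """Return value, or a tiny number of the same sign if too close to zero."""
    return math.copysign(max(TINY, abs(value)), value)


safe_quotient = 1.0 / not_zero_sketch(0.0)  # no ZeroDivisionError raised
```

Values sufficiently far from zero are returned unchanged, so the function can wrap any denominator without altering regular results.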
- aspecd.utils.isiterable(variable)
Check whether the given variable is iterable.
Lists, tuples, and NumPy arrays, but also strings, are iterable. Integers, however, are not.
- Parameters
variable – variable to check for being iterable
- Returns
answer – Whether the given variable is iterable
- Return type
bool
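A common way to perform such a check is to try obtaining an iterator and catch the resulting TypeError. The sketch below shows this approach under the hypothetical name is_iterable; whether ASpecD implements the check this way is an assumption.

```python
def is_iterable(variable):
    """Return True if an iterator can be obtained from variable."""
    try:
        iter(variable)
    except TypeError:
        return False
    return True


checks = [is_iterable(item) for item in ([1, 2], (1, 2), "abc", 42)]
```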
- aspecd.utils.get_package_data(name='', directory='')
Obtain contents from a non-code file (“package data”).
A note on obtaining data from the distributed package: Rather than manually playing around with paths relative to the package root directory, contents of non-code files need to be obtained in a way that works with different kinds of package installation.
Note that in Python, only files within the package, i.e. within the directory where all the modules are located, can be accessed, not files that reside on the root directory of the package.
Note
In case you want to get package data from a package other than aspecd, you can prefix the name of the file to retrieve with the package name, using ‘@’ as a separator.
Suppose you want to retrieve the file __main__.py from the “pip” package (why should you?):
get_package_data('pip@__main__.py', directory='')
This would return the contents of this file.
- Parameters
- Returns
contents – String containing the contents of the non-code file.
- Return type
New in version 0.5.
- aspecd.utils.change_working_dir(path='')
Context manager for temporarily changing the working directory.
Sometimes it is necessary to temporarily change the working directory, but one would like to ensure that the directory is reverted even in case an exception is raised.
Due to its nature as a context manager, this function can be used with a with statement. See below for an example.
- Parameters
path (str) – Path the current working directory should be changed to.
Examples
To temporarily change the working directory:
    with change_working_dir(os.path.join('some', 'path')):
        ...  # Do something that may raise an exception
This can come in quite handy in case of tests.
New in version 0.6.
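How such a context manager can guarantee that the directory is reverted may be sketched with contextlib. The function working_dir below is a hypothetical stand-in, not the ASpecD implementation: the try/finally block ensures the original directory is restored even if the body of the with statement raises.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def working_dir(path):
    """Temporarily change the working directory, restoring it afterwards."""
    old_dir = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(old_dir)  # executed even if an exception was raised


start_dir = os.getcwd()
with tempfile.TemporaryDirectory() as tmpdir:
    try:
        with working_dir(tmpdir):
            changed = os.path.samefile(os.getcwd(), tmpdir)
            raise RuntimeError("simulated failure")
    except RuntimeError:
        pass
    after_dir = os.getcwd()
```

Despite the simulated failure inside the with block, the working directory afterwards is the same as before.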
- aspecd.utils.get_logger(name='')
Get logger object for a given module.
Logging in libraries is slightly different from standard logging with respect to the handler attached to the logger. The general advice from the Python Logging HOWTO is to explicitly add the logging.NullHandler as a handler.
Additionally, if you want to add loggers for a library that inherits from/builds upon a framework, in this particular case a library/package built atop the ASpecD framework, you want to have loggers being children of the framework logger in order to have the framework catch the log messages your library/package creates.
Why does this matter, particularly for the ASpecD framework? If you want to have your log messages in a package based on the ASpecD framework appear when using recipe-driven data analysis, you need to have your package loggers be in the hierarchy below the root logger of the ASpecD framework. For convenience and in order not to make any informed guesses on how the ASpecD framework root logger is named, simply use this function to create the loggers in the modules of your package.
- Parameters
name (str) – Name of the module to get the logger for.
Usually, this will be set to __name__, as this returns the current module (including the package name, separated with a . if present).
- Returns
logger – Logger object for the module
The logger will have a logging.NullHandler handler attached, in line with the advice from the Python Logging HOWTO.
- Return type
logging.Logger
Examples
To add a logger to a module in your library/package based on the ASpecD framework, add something like this to the top of your module:
    import aspecd.utils
    logger = aspecd.utils.get_logger(__name__)
The important aspect here is to use __name__ as the name of the logger. The reason is simple: __name__ gets automatically expanded to the name of the current module, with the names of all parent modules/the package prefixed, using dot notation. The resulting logger will be situated in the hierarchy below the aspecd package logger. Suppose you have added the above lines to the processing module of your package mypackage. This will result in a logger aspecd.mypackage.processing.
New in version 0.9.
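The logger hierarchy described above can be reproduced with the standard logging module. The sketch below is an assumption about how such a function might work, not the ASpecD code: get_logger_sketch is a hypothetical name, and the framework root logger is assumed to be named aspecd, in line with the example above.

```python
import logging


def get_logger_sketch(name=""):
    """Return a logger below the (assumed) 'aspecd' framework logger."""
    root_name = "aspecd"  # assumed name of the framework's root logger
    logger_name = f"{root_name}.{name}" if name else root_name
    logger = logging.getLogger(logger_name)
    logger.addHandler(logging.NullHandler())  # advice for library logging
    return logger


logger = get_logger_sketch("mypackage.processing")
framework_logger = logging.getLogger("aspecd")
```

As the logging module derives the hierarchy from the dots in the logger name, messages sent to logger propagate up to framework_logger, where the framework can attach its handlers.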