aspecd.table module

Tabular representation of datasets.

While spectroscopic data are usually presented graphically (see the aspecd.plotting module for details), there are cases where a tabular representation is useful or even necessary.

One prime example of a situation where you would want a tabular representation of a (calculated) dataset is the result of an aspecd.analysis.AggregatedAnalysisStep. Here, you perform an aspecd.analysis.SingleAnalysisStep on a series of datasets and collect the results in an aspecd.dataset.CalculatedDataset. Of course, there are situations where you can simply plot this dataset. However, while graphical representations are often helpful for spotting trends, a table is often much more useful if the exact numbers are relevant.

New in version 0.5.

Why this module?

While there are several Python packages available that are capable of formatting tables (PrettyTable, Tablib, pandas, to name but a few), all of these do much more than format tables: they are designed to work with tables as well, i.e. to modify and filter the table contents. This is, however, not needed in the given context, hence the attempt to create a rather lightweight implementation.

The implementation focuses on the following aspects:

  • Tabulation of 1D and 2D datasets

  • Primarily textual output

  • Control over formatting (necessary for different output formats)

  • Control over formatting of numbers

  • Automatic column headers and row indices if present in the dataset axes.

Currently, the module consists of two types of classes:

  • Table

    The actual class for tabulating data of a dataset

  • Format

    A general class controlling the format of the tables created by Table

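There is a list of formatters for different purposes:

  • TextFormat

    Grid layout for plain-text output

  • RstFormat

    Simple style of reStructuredText (rst) tables

  • DokuwikiFormat

    DokuWiki table syntax, e.g. for a wiki used as an electronic lab notebook (ELN)

  • LatexFormat

    LaTeX table with horizontal rules typeset using the booktabs package

For simple output, you can use the basic formatter, Format, as well. As this is the default in the Table class, nothing needs to be done in this case.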

Basic usage

To give you an idea of what working with this module may look like, have a look at the following examples:

import numpy as np
from aspecd import dataset, table

ds = dataset.Dataset()
ds.data.data = np.random.random([5,3])

tab = table.Table()
tab = ds.tabulate(tab)

print(tab.table)

The last line will produce an output similar to the following – of course the numbers will be different in your case, as we are using random numbers:

0.6457921026722823  0.5634217835847304 0.16339715303360636
0.1946206354990324  0.7901047968358327 0.16098166185006968
0.9898725675813765  0.8892801098024301 0.9657653854952412
0.38858973357936466 0.5818405189808569 0.03264142581790075
0.9391951330574149  0.5412481787012977 0.9171357572017617

Note that even though the number of digits in the individual cells is not always identical, the columns are nicely aligned.

If we wanted to reduce the number of digits shown, we could use the Table.column_format attribute, like so:

tab.column_format = ['8.6f']
tab = ds.tabulate(tab)

print(tab.table)

Note

Two things are relevant here: Table.column_format is a list, and you can provide fewer format strings than columns in your table. In this case, the last format will be used for all remaining columns.

The last line will in this case produce an output similar to the following – again with different numbers in your case:

0.645792 0.563422 0.163397
0.194621 0.790105 0.160982
0.989873 0.889280 0.965765
0.388590 0.581841 0.032641
0.939195 0.541248 0.917136
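The fallback rule described in the note can be sketched in plain Python. This only illustrates the documented behaviour and is not the code used by Table; the helper name expand_formats is hypothetical, and the empty format spec for a missing list is an assumption.

```python
def expand_formats(column_format, n_columns):
    """Return one format spec per column, reusing the last one if needed."""
    if not column_format:
        # Assumption: without any format, fall back to plain string conversion.
        return [''] * n_columns
    formats = list(column_format[:n_columns])
    # Pad with the last given format until every column has one.
    formats += [column_format[-1]] * (n_columns - len(formats))
    return formats


row = [0.6457921026722823, 0.5634217835847304, 0.16339715303360636]
formats = expand_formats(['8.6f'], len(row))
cells = [format(value, spec) for value, spec in zip(row, formats)]
print(' '.join(cells))
```

With a single format string and three columns, every column ends up with the same format, which is how the output above comes about.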

So far, you could get pretty much the same using an ASCII exporter for your dataset. So what is special about Table? A few things: You have much more control over the output, and you can have column headers and row indices included automatically if these are present in your dataset.

Let’s look at a dataset with information on the different columns set in the second axis. A full example could look like this:

ds = dataset.Dataset()
ds.data.data = np.random.random([5,3])
ds.data.axes[1].index = ['foo', 'bar', 'baz']

tab = table.Table()
tab.column_format = ['8.6f']
tab = ds.tabulate(tab)

print(tab.table)

And the result of the print statement would show you the column headers added:

foo      bar      baz
0.645792 0.563422 0.163397
0.194621 0.790105 0.160982
0.989873 0.889280 0.965765
0.388590 0.581841 0.032641
0.939195 0.541248 0.917136

Of course, the same works if you provide row indices, and it even works if indices are provided for both axes. To demonstrate the latter (admittedly again with an artificial example):

ds = dataset.Dataset()
ds.data.data = np.random.random([5,3])
ds.data.axes[0].index = ['a', 'b', 'c', 'd', 'e']
ds.data.axes[1].index = ['foo', 'bar', 'baz']

tab = table.Table()
tab.column_format = ['8.6f']
tab = ds.tabulate(tab)

print(tab.table)

And the result of the print statement would show you both the column headers and the row indices added:

  foo      bar      baz
a 0.645792 0.563422 0.163397
b 0.194621 0.790105 0.160982
c 0.989873 0.889280 0.965765
d 0.388590 0.581841 0.032641
e 0.939195 0.541248 0.917136
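To make the layout of such a plain table more tangible, here is a minimal sketch in pure Python that reproduces it for a list of lists. It is not the implementation used by Table; the function tabulate_plain and its parameters are hypothetical, and stripping trailing spaces is an assumption.

```python
def tabulate_plain(rows, column_headers=None, row_index=None, spec='8.6f'):
    """Format a 2D list of numbers as a plain, column-aligned table."""
    # Format the numbers first, then prepend indices and headers as strings.
    cells = [[format(value, spec) for value in row] for row in rows]
    if row_index:
        cells = [[str(label)] + row for label, row in zip(row_index, cells)]
    if column_headers:
        header = [str(label) for label in column_headers]
        if row_index:
            header = [''] + header  # empty corner cell above the row indices
        cells = [header] + cells
    # Each column is as wide as its widest cell.
    widths = [max(len(cell) for cell in column) for column in zip(*cells)]
    lines = [
        ' '.join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in cells
    ]
    return '\n'.join(lines)


data = [[0.645792, 0.563422, 0.163397], [0.194621, 0.790105, 0.160982]]
print(tabulate_plain(data, column_headers=['foo', 'bar', 'baz'],
                     row_index=['a', 'b']))
```

The empty corner cell is what shifts the column headers to the right so that they line up with the data columns rather than with the row indices.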

Output formats

Tables can be output using different formats, and if you need a special format, you can of course implement your own by subclassing Format. However, a number of formats are already available out of the box: plain (default, shown above), text, reStructuredText (rst), DokuWiki, and LaTeX. To give you a quick overview, we will create a dataset with both row indices and column headers, and show the different formats.

ds = dataset.Dataset()
ds.data.data = np.random.random([5,3])
ds.data.axes[0].index = ['a', 'b', 'c', 'd', 'e']
ds.data.axes[1].index = ['foo', 'bar', 'baz']

tab = table.Table()
tab.column_format = ['8.6f']
tab = ds.tabulate(tab)

print(tab.table)

The result is the same as already shown above, just a plain table, though already quite useful:

  foo      bar      baz
a 0.689140 0.775321 0.657159
b 0.315142 0.412736 0.580745
c 0.116352 0.807541 0.410055
d 0.226994 0.715985 0.967606
e 0.532774 0.620670 0.745630

Now, let’s see what the text format looks like:

# Same as above
tab.format = 'text'
tab = ds.tabulate(tab)

print(tab.table)

And here you go:

+---+----------+----------+----------+
|   | foo      | bar      | baz      |
+---+----------+----------+----------+
| a | 0.689140 | 0.775321 | 0.657159 |
| b | 0.315142 | 0.412736 | 0.580745 |
| c | 0.116352 | 0.807541 | 0.410055 |
| d | 0.226994 | 0.715985 | 0.967606 |
| e | 0.532774 | 0.620670 | 0.745630 |
+---+----------+----------+----------+

Next is reStructuredText:

# Same as above
tab.format = 'rst'
tab = ds.tabulate(tab)

print(tab.table)

As you can see, this format outputs the “simple” rst style, which can be used as well for an easy-to-read text-only output:

= ======== ======== ========
  foo      bar      baz
= ======== ======== ========
a 0.689140 0.775321 0.657159
b 0.315142 0.412736 0.580745
c 0.116352 0.807541 0.410055
d 0.226994 0.715985 0.967606
e 0.532774 0.620670 0.745630
= ======== ======== ========

Another format that may be useful is DokuWiki, as this kind of lightweight wiki can be used as an electronic lab notebook (ELN):

# Same as above
tab.format = 'dokuwiki'
tab = ds.tabulate(tab)

print(tab.table)

This will even correctly highlight the column headers and row indices as “headers”:

|   ^ foo      ^ bar      ^ baz      ^
^ a | 0.689140 | 0.775321 | 0.657159 |
^ b | 0.315142 | 0.412736 | 0.580745 |
^ c | 0.116352 | 0.807541 | 0.410055 |
^ d | 0.226994 | 0.715985 | 0.967606 |
^ e | 0.532774 | 0.620670 | 0.745630 |

And finally, LaTeX, as this is of great use in the scientific world, and honestly, manually formatting LaTeX tables can be quite tedious.

# Same as above
tab.format = 'latex'
tab = ds.tabulate(tab)

print(tab.table)

As you can see, the details of the formatting are left to you, but at least you get valid LaTeX code and a table layout according to typesetting standards, i.e. only horizontal lines. Note that the horizontal lines (“rules”) are typeset using the booktabs package, which should always be used:

\begin{tabular}{llll}
\toprule
  & foo      & bar      & baz      \\
\midrule
a & 0.689140 & 0.775321 & 0.657159 \\
b & 0.315142 & 0.412736 & 0.580745 \\
c & 0.116352 & 0.807541 & 0.410055 \\
d & 0.226994 & 0.715985 & 0.967606 \\
e & 0.532774 & 0.620670 & 0.745630 \\
\bottomrule
\end{tabular}

Captions

Tables can and should have captions that describe the content, as the numbers (and row indices and column headers) rarely stand on their own. Hence, you can add a caption to a table. As writing a caption is necessarily a manual task, it is only fair that the table output includes this caption. For formats such as DokuWiki and LaTeX, it is fairly obvious how to add the table caption. For the other formats, the caption is added as plain text above the actual table, wrapped to avoid overly long lines.

ds = dataset.Dataset()
ds.data.data = np.random.random([5,5])
ds.data.axes[0].index = ['a', 'b', 'c', 'd', 'e']
ds.data.axes[1].index = ['foo', 'bar', 'baz', 'foobar', 'frob']

caption = table.Caption()
caption.title = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit.'
caption.text = 'Quisque varius tortor ac faucibus posuere. In hac ' \
               'habitasse platea dictumst. Morbi rutrum felis vitae '\
               'tristique accumsan. Sed est nisl, auctor a metus a, ' \
               'elementum cursus velit. Proin et rutrum erat. ' \
               'Praesent id urna odio. Duis quis augue ac nunc commodo' \
               ' euismod quis id orci.'

tab = table.Table()
tab.caption = caption
tab.column_format = ['8.6f']
tab = ds.tabulate(tab)

print(tab.table)

The result of the print statement above would output something like this:

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Quisque
varius tortor ac faucibus posuere. In hac habitasse platea dictumst.
Morbi rutrum felis vitae tristique accumsan. Sed est nisl, auctor a
metus a, elementum cursus velit. Proin et rutrum erat. Praesent id
urna odio. Duis quis augue ac nunc commodo euismod quis id orci.

  foo      bar      baz      foobar   frob
a 0.162747 0.620320 0.677983 0.719360 0.426734
b 0.342259 0.907828 0.252471 0.987115 0.563511
c 0.253853 0.752020 0.277696 0.479128 0.929410
d 0.768840 0.220356 0.247271 0.379556 0.231765
e 0.113655 0.725631 0.098438 0.753049 0.363572

Note that the caption has been wrapped for better readability, and that an empty line is inserted between caption and table. Of course, you can combine the caption with the other textual formats (“text”, “rst”) as well, and it will be output in the same way. The formats “dokuwiki” and “latex” are special, see the respective format class definitions (DokuwikiFormat, LatexFormat) for details.

Module documentation

class aspecd.table.Table

Bases: ToDictMixin

Tabular representation of datasets.

Formatting of a table can be controlled by the formatter class defined by format. See the documentation of the Format class and its subclasses for details. Furthermore, the individual columns containing numerical data can be formatted as well, specifying formats for the individual columns in column_format.

In case the axes of a dataset contain values in their aspecd.dataset.Axis.index attribute, these values will be used as row indices (first column) for the first axis and as column headers for the second axis. This works if indices are present in either axis or in both. One particular use case is in combination with the results of an aspecd.analysis.AggregatedAnalysisStep operating on a series of datasets and combining the result in an aspecd.dataset.CalculatedDataset with the row indices being the labels of the corresponding datasets.

Note

For obvious reasons, only 1D and 2D datasets can be tabulated. Therefore, if you try to tabulate an ND dataset with N>2, this will raise an exception.

dataset

Dataset containing numerical data to tabulate

Type:

aspecd.dataset.Dataset

table

Final table ready to be output

Type:

str

format

Identifier for output format.

Valid identifiers are either the empty string or the first part of the name of any subclass of Format, i.e. the part before “Format”.

Examples for currently valid identifiers: text, rst, dokuwiki, latex

See Format and the respective subclasses for details on the formats available and what kind of output they create.

Type:

str
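The naming rule for identifiers can be illustrated with a short sketch. How Table resolves the identifier internally is not stated here, so the mapping below (capitalise the identifier and append “Format”) is an assumption derived from the identifiers and class names listed in this documentation; the function name is hypothetical.

```python
def formatter_class_name(identifier=''):
    """Map a format identifier to the name of the matching formatter class."""
    # Assumption: the empty string selects the base class, "Format".
    return identifier.capitalize() + 'Format'


for identifier in ('', 'text', 'rst', 'dokuwiki', 'latex'):
    print(repr(identifier), '->', formatter_class_name(identifier))
```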

column_format

(Optional) formats for the data

The format strings are used by str.format(), see there for details.

If the list is shorter than the number of columns, the last element will be used for the remaining columns.

Type:

list

filename

Name of the file to save the table to.

When calling save(), the table contained in table will be saved to this file.

Type:

str

New in version 0.5.

tabulate(dataset=None, from_dataset=False)

Create tabular representation of the numerical data of a dataset.

The result is stored in table.

In case of an empty dataset, a warning is logged and no further action taken.

Parameters:
  • dataset (aspecd.dataset.Dataset) – Dataset to create the tabular representation for

  • from_dataset (boolean) –

    whether we are called from within a dataset

    Defaults to “False” and shall never be set manually.

save()

Save table to file.

The filename is set in filename.

If no table exists, i.e. tabulate() has not yet been called, the method will silently return.

create_history_record()

Create history record to be added to the dataset.

Usually, this method gets called from within the aspecd.dataset.Dataset.tabulate() method of the aspecd.dataset.Dataset class and ensures that the history of each tabulating step gets written properly.

Returns:

history_record – history record for tabulating step

Return type:

aspecd.history.TableHistoryRecord

class aspecd.table.Format

Bases: object

Base class for settings for formatting tables.

The formatter is used by Table to control the output.

Different formats can be implemented by subclassing this class. Currently, the following subclasses are available:

  • TextFormat

  • RstFormat

  • DokuwikiFormat

  • LatexFormat

For simple output, you can use the basic formatter, Format, as well. As this is the default in the Table class, nothing needs to be done in this case.

padding

Number of spaces left and right of a field

Type:

int

column_separator

String used to separate columns in a row

Type:

str

column_prefix

String used to prefix the first column in a row

Type:

str

column_postfix

String used to postfix the last column in a row

Type:

str

header_separator

String used to separate columns in the header (if present)

Type:

str

header_prefix

String used to prefix the first column in the header (if present)

Type:

str

header_postfix

String used to postfix the last column in the header (if present)

Type:

str

New in version 0.5.

top_rule(column_widths=None)

Create top rule for table.

Tables usually have three types of rules: top rule, middle rule, and bottom rule. The middle rule gets used to separate column headers from the actual tabular data.

If your format in a class inheriting from Format does not need this rule, don’t override this method, as it will by default return the empty string, and hence no rule gets added to the table.

Parameters:

column_widths (list) – (optional) list of column widths

Returns:

rule – Actual rule that gets added to the table output

Default: ‘’

Return type:

str

middle_rule(column_widths=None)

Create middle rule for table.

Tables usually have three types of rules: top rule, middle rule, and bottom rule. The middle rule gets used to separate column headers from the actual tabular data.

If your format in a class inheriting from Format does not need this rule, don’t override this method, as it will by default return the empty string, and hence no rule gets added to the table.

Parameters:

column_widths (list) – (optional) list of column widths

Returns:

rule – Actual rule that gets added to the table output

Default: ‘’

Return type:

str

bottom_rule(column_widths=None)

Create bottom rule for table.

Tables usually have three types of rules: top rule, middle rule, and bottom rule. The middle rule gets used to separate column headers from the actual tabular data.

If your format in a class inheriting from Format does not need this rule, don’t override this method, as it will by default return the empty string, and hence no rule gets added to the table.

Parameters:

column_widths (list) – (optional) list of column widths

Returns:

rule – Actual rule that gets added to the table output

Default: ‘’

Return type:

str

opening(columns=None, caption=None)

Create opening code.

Some formats have opening (and closing, see closing()) parts, e.g. opening and closing tags in XML and related languages, but in LaTeX as well.

Furthermore, table captions are usually set above the table, and if your table has a caption with content, this caption will be output as well. In its simplest form, as implemented here, caption title and caption text will be concatenated and wrapped using textwrap.wrap(), and an empty line added after the caption to separate it from the actual table. Thus, your table captions are output together with your table in simple text format.

Override this method according to your needs for your particular format.

Parameters:
  • columns (int) – (optional) number of columns of the table

  • caption (Caption) –

    (optional) table caption

    For details, see the Caption class documentation.

    Only if one of the properties of Caption contains content, the caption will be considered.

Returns:

opening – Code for opening the environment

Default: ‘’

Return type:

str
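The plain-text caption handling described for opening() can be sketched with the standard library as follows. This is an illustration under assumptions, not the aspecd code: the function name plain_caption is hypothetical, and the line width of 70 characters is simply the default of textwrap.wrap().

```python
import textwrap


def plain_caption(title='', text='', width=70):
    """Return the caption as wrapped lines followed by one empty line."""
    # Concatenate title and text, skipping whichever part is empty.
    caption = ' '.join(part for part in (title, text) if part)
    if not caption:
        return []  # no caption content: nothing is added to the table
    # The trailing empty string separates the caption from the table.
    return textwrap.wrap(caption, width=width) + ['']


lines = plain_caption(
    title='Lorem ipsum dolor sit amet, consectetur adipiscing elit.',
    text='Quisque varius tortor ac faucibus posuere.',
)
print('\n'.join(lines))
```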

closing(caption=None)

Create closing code.

Some formats have opening (see opening()) and closing parts, e.g. opening and closing tags in XML and related languages, but in LaTeX as well.

If your format in a class inheriting from Format does not need this code, don’t override this method, as it will by default return the empty string, and hence no code gets added to the table.

Parameters:

caption (Caption) –

(optional) table caption

For details, see the Caption class documentation.

Only if one of the properties of Caption contains content, the caption will be considered.

Having a caption requires some formats to create an additional container surrounding the actual table.

Returns:

closing – Code for closing the environment

Default: ‘’

Return type:

str
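How the prefix, separator, postfix, and padding attributes might work together when a row is assembled can be sketched with a small stand-in class. The class SketchFormat below is not aspecd.table.Format: it only borrows the attribute names documented above, its default values and the row() helper are assumptions, and PipeFormat is a hypothetical subclass that overrides the attributes.

```python
class SketchFormat:
    """Stand-in with the attributes documented for Format (not aspecd code)."""

    padding = 0
    column_separator = ' '
    column_prefix = ''
    column_postfix = ''
    header_separator = ' '
    header_prefix = ''
    header_postfix = ''

    def row(self, cells, header=False):
        """Join already formatted cells into one line of the table."""
        pad = ' ' * self.padding
        if header:
            prefix, separator, postfix = (
                self.header_prefix, self.header_separator, self.header_postfix)
        else:
            prefix, separator, postfix = (
                self.column_prefix, self.column_separator, self.column_postfix)
        padded = [pad + cell + pad for cell in cells]
        return prefix + separator.join(padded) + postfix


class PipeFormat(SketchFormat):
    """Hypothetical format: pipe-delimited cells with one space of padding."""

    padding = 1
    column_separator = header_separator = '|'
    column_prefix = header_prefix = '|'
    column_postfix = header_postfix = '|'


print(PipeFormat().row(['foo', 'bar', 'baz'], header=True))
print(PipeFormat().row(['1.0', '1.1', '1.2']))
```

Keeping the header attributes separate from the column attributes is what allows formats such as DokuWiki to use different delimiters for header cells and data cells.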

class aspecd.table.TextFormat

Bases: Format

Table formatter for textual output.

With its default settings, the table would be surrounded by a grid, such as:

+-----+-----+-----+
| foo | bar | baz |
+-----+-----+-----+
| 1.0 | 1.1 | 1.2 |
| 2.0 | 2.1 | 2.2 |
+-----+-----+-----+

rule_character

Character used for drawing horizontal lines (rules)

Type:

str

rule_edge_character

Character used for the edges of horizontal lines (rules)

Type:

str

rule_separator_character

Character used for the column separators of horizontal lines (rules)

Type:

str

New in version 0.5.

top_rule(column_widths=None)

Create top rule for table.

Tables usually have three types of rules: top rule, middle rule, and bottom rule. The middle rule gets used to separate column headers from the actual tabular data.

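The rule gets constructed according to this overall scheme: rule_edge_character at both ends, rule_character repeated across the width of each column, and rule_separator_character at each boundary between two columns.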

Parameters:

column_widths (list) – List of column widths

Returns:

rule – Actual rule that gets added to the table output

Return type:

str
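A rule of the kind shown in the grid example for this class can be built from the three rule characters and the column widths. The following is a minimal sketch, not the aspecd implementation: the function name text_rule is hypothetical, and the default characters and the padding of one space are assumptions inferred from the example output.

```python
def text_rule(column_widths, padding=1, rule_character='-',
              rule_edge_character='+', rule_separator_character='+'):
    """Build a horizontal rule such as '+-----+-----+-----+'."""
    # Each segment spans the column width plus the padding on both sides.
    segments = [rule_character * (width + 2 * padding)
                for width in column_widths]
    return (rule_edge_character
            + rule_separator_character.join(segments)
            + rule_edge_character)


print(text_rule([3, 3, 3]))
```

As the rule depends only on the column widths, the same function can serve for top, middle, and bottom rule alike.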

middle_rule(column_widths=None)

Create middle rule for table.

Here, the middle rule is identical to the top_rule(). See there for details on how the rule is constructed.

Parameters:

column_widths (list) – List of column widths

Returns:

rule – Actual rule that gets added to the table output

Return type:

str

bottom_rule(column_widths=None)

Create bottom rule for table.

Here, the bottom rule is identical to the top_rule(). See there for details on how the rule is constructed.

Parameters:

column_widths (list) – List of column widths

Returns:

rule – Actual rule that gets added to the table output

Return type:

str

class aspecd.table.RstFormat

Bases: TextFormat

Table formatter for reStructuredText (rst) output.

This formatter actually uses the simple format for rst tables, such as:

=== === ===
foo bar baz
=== === ===
1.0 1.1 1.2
2.0 2.1 2.2
=== === ===

The above code would result in:

foo  bar  baz
1.0  1.1  1.2
2.0  2.1  2.2

New in version 0.5.
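The simple rst table layout shown above needs nothing more than rules of “=” characters matching the column widths. A minimal sketch in pure Python, assuming already formatted string cells (the function rst_simple_table is hypothetical and not part of aspecd):

```python
def rst_simple_table(header, rows):
    """Format string cells as a reStructuredText simple table."""
    cells = [header] + rows
    # Each column is as wide as its widest cell.
    widths = [max(len(cell) for cell in column) for column in zip(*cells)]
    rule = ' '.join('=' * width for width in widths)

    def line(row):
        padded = (cell.ljust(width) for cell, width in zip(row, widths))
        return ' '.join(padded).rstrip()

    body = [line(row) for row in rows]
    return '\n'.join([rule, line(header), rule] + body + [rule])


print(rst_simple_table(['foo', 'bar', 'baz'],
                       [['1.0', '1.1', '1.2'], ['2.0', '2.1', '2.2']]))
```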

class aspecd.table.DokuwikiFormat

Bases: Format

Table formatter for DokuWiki output.

For details about the syntax, see the DokuWiki syntax documentation.

An example of a table in DokuWiki syntax could look like this:

^ foo ^ bar ^ baz ^
| 1.0 | 1.1 | 1.2 |
| 2.0 | 2.1 | 2.2 |

And in case of both, column headers and row indices, this would even convert to:

|     ^ foo ^ bar ^ baz ^
^ foo | 1.0 | 1.0 | 1.0 |
^ bar | 1.0 | 1.0 | 1.0 |
^ baz | 1.0 | 1.0 | 1.0 |

New in version 0.5.
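The two examples above follow a simple pattern: header cells are introduced by “^”, data cells by “|”, and each line is closed by the marker of its last cell. A minimal sketch of this pattern in pure Python, assuming already formatted string cells (the function dokuwiki_table is hypothetical and not part of aspecd):

```python
def dokuwiki_table(rows, column_headers=None, row_index=None):
    """Format string cells as a DokuWiki table ('^' header, '|' data)."""
    body = [list(row) for row in rows]
    if row_index:
        body = [[label] + row for label, row in zip(row_index, body)]
    header = []
    if column_headers:
        # The corner cell above the row indices stays an ordinary, empty cell.
        header = ([''] if row_index else []) + list(column_headers)
    all_rows = ([header] if header else []) + body
    widths = [max(len(cell) for cell in column) for column in zip(*all_rows)]
    n_index = 1 if row_index else 0

    def line(cells, markers):
        parts = [f'{marker} {cell.ljust(width)} '
                 for marker, cell, width in zip(markers, cells, widths)]
        # A line is closed by the marker of its last cell.
        return ''.join(parts) + markers[-1]

    lines = []
    if header:
        markers = ['|'] * n_index + ['^'] * len(column_headers)
        lines.append(line(header, markers))
    for cells in body:
        markers = ['^'] * n_index + ['|'] * (len(cells) - n_index)
        lines.append(line(cells, markers))
    return '\n'.join(lines)


print(dokuwiki_table([['1.0', '1.1', '1.2'], ['2.0', '2.1', '2.2']],
                     column_headers=['foo', 'bar', 'baz']))
```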

opening(columns=None, caption=None)

Create opening code.

In case of DokuWiki, this is usually empty, except in cases where you have added a caption. In the latter case, code consistent with the DokuWiki caption plugin will be output, like so:

<table>
<caption>*Caption title* Caption text</caption>

To make this work in your DokuWiki, make sure to have the caption plugin installed.

Parameters:
  • columns (int) – Number of columns of the table

  • caption (Caption) –

    (optional) table caption

    For details, see the Caption class documentation.

    Only if one of the properties of Caption contains content, the caption will be considered.

    Having a caption requires DokuWiki to create an additional table environment surrounding the actual table. As this needs to be closed, the closing needs to have the information regarding the caption.

Returns:

opening – Code for opening the environment

Return type:

str

closing(caption=None)

Create closing code.

In case of DokuWiki, this is usually empty, except in cases where you have added a caption. In the latter case, code consistent with the DokuWiki caption plugin will be output, like so:

</table>

To make this work in your DokuWiki, make sure to have the caption plugin installed.

Parameters:

caption (Caption) –

(optional) table caption

For details, see the Caption class documentation.

Only if one of the properties of Caption contains content, the caption will be considered.

Having a caption requires DokuWiki to create an additional table environment surrounding the actual table. As this needs to be closed, the closing needs to have the information regarding the caption.

Returns:

closing – Code for closing the environment

Return type:

str

class aspecd.table.LatexFormat

Bases: Format

Table formatter for LaTeX output.

Results in a rather generic LaTeX table, and the goal of this formatter is to provide valid LaTeX code without trying to go into too many details of all the possibilities of LaTeX table formatting.

Note

The format requires the package “booktabs” to be loaded, as the horizontal rules defined by this package are automatically added to the LaTeX output.

An example of the LaTeX code of a table may look as follows:

\begin{tabular}{lll}
\toprule
foo & bar & baz \\
\midrule
1.0 & 1.1 & 1.2 \\
2.0 & 2.1 & 2.2 \\
\bottomrule
\end{tabular}

New in version 0.5.

top_rule(column_widths=None)

Create top rule for table.

Tables usually have three types of rules: top rule, middle rule, and bottom rule. The middle rule gets used to separate column headers from the actual tabular data.

Parameters:

column_widths (list) – Ignored in this particular case

Returns:

rule – Actual rule that gets added to the table output

Return type:

str

middle_rule(column_widths=None)

Create middle rule for table.

Tables usually have three types of rules: top rule, middle rule, and bottom rule. The middle rule gets used to separate column headers from the actual tabular data.

Parameters:

column_widths (list) – Ignored in this particular case

Returns:

rule – Actual rule that gets added to the table output

Return type:

str

bottom_rule(column_widths=None)

Create bottom rule for table.

Tables usually have three types of rules: top rule, middle rule, and bottom rule. The middle rule gets used to separate column headers from the actual tabular data.

Parameters:

column_widths (list) – Ignored in this particular case

Returns:

rule – Actual rule that gets added to the table output

Return type:

str

opening(columns=None, caption=None)

Create opening code.

In case of LaTeX, this is usually:

\begin{tabular}{<column-specification>}

As this class strives for a rather generic, though valid LaTeX code, the column specification is simply ‘l’ times the number of columns (for exclusively left-aligned columns).

Parameters:
  • columns (int) – Number of columns of the table

  • caption (Caption) –

    (optional) table caption

    For details, see the Caption class documentation.

    Only if one of the properties of Caption contains content, the caption will be considered.

    Having a caption requires LaTeX to create an additional table environment surrounding the actual table. As this needs to be closed, the closing needs to have the information regarding the caption.

Returns:

opening – Code for opening the environment

Return type:

str

closing(caption=None)

Create closing code.

In case of LaTeX, this is usually:

\end{tabular}

Parameters:

caption (Caption) –

(optional) table caption

For details, see the Caption class documentation.

Only if one of the properties of Caption contains content, the caption will be considered.

Having a caption requires LaTeX to create an additional table environment surrounding the actual table. As this needs to be closed, the closing needs to have the information regarding the caption.

Returns:

closing – Code for closing the environment

Return type:

str
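Putting opening, rules, and closing together, the LaTeX output for a table without caption can be sketched as follows. This is an illustration in pure Python, not the aspecd code: the function latex_table is hypothetical, it assumes already formatted string cells, and it does not cover the additional table environment needed for a caption.

```python
def latex_table(header, rows):
    """Format string cells as a LaTeX tabular using booktabs rules."""
    cells = [header] + rows
    # Each column is as wide as its widest cell.
    widths = [max(len(cell) for cell in column) for column in zip(*cells)]

    def line(row):
        padded = (cell.ljust(width) for cell, width in zip(row, widths))
        return ' & '.join(padded) + ' \\\\'

    # Column specification: one left-aligned column ('l') per column.
    lines = ['\\begin{tabular}{' + 'l' * len(header) + '}', '\\toprule']
    lines += [line(header), '\\midrule']
    lines += [line(row) for row in rows]
    lines += ['\\bottomrule', '\\end{tabular}']
    return '\n'.join(lines)


print(latex_table(['foo', 'bar', 'baz'],
                  [['1.0', '1.1', '1.2'], ['2.0', '2.1', '2.2']]))
```

To compile the resulting code, the booktabs package needs to be loaded in the preamble of your LaTeX document.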

class aspecd.table.Caption

Bases: Properties

Caption for tables.

title

Usually one sentence describing the intent of the table

Often set in bold face in a table caption.

Type:

str

text

Additional text directly following the title

Contains more information about the table. Ideally, a table caption is self-contained such that it explains the table sufficiently to understand its intent and content without needing to read all the surrounding text.

Type:

str

New in version 0.5.