aspecd.table module
Tabular representation of datasets.
While spectroscopic data are usually presented graphically (see the aspecd.plotting module for details), there are cases where a tabular representation is useful or even necessary.
One prime example of a situation where you would want a tabular representation of a (calculated) dataset is the result of an aspecd.analysis.AggregatedAnalysisStep. Here, you perform an aspecd.analysis.SingleAnalysisStep on a series of datasets and collect the results in an aspecd.dataset.CalculatedDataset. Of course, there are situations where you can simply plot this dataset, but while graphical representations are often helpful for spotting trends, a table is often much more useful if the exact numbers are relevant.
New in version 0.5.
Why this module?
While several Python packages can format tables (PrettyTable, Tablib, pandas, to name but a few), all of them do much more than format tables: they are designed to work with tables as well, i.e. to modify and filter the table contents. This is not needed in the given context, hence the attempt to create a rather lightweight implementation.
The implementation focuses on the following aspects:
Tabulation of 1D and 2D datasets
Primarily textual output
Control over formatting (necessary for different output formats)
Control over formatting of numbers
Automatic column headers and row indices if present in the dataset axes.
Currently, the module consists of two types of classes:
- Table
The actual class for tabulating data of a dataset
- Format
A general class controlling the format of the tables created by Table
There is a list of formatters for different purposes:
- TextFormat
Grid layout for text output
- RstFormat
Simple layout for reStructuredText (rst)
- DokuwikiFormat
DokuWiki table syntax
- LatexFormat
LaTeX table syntax
For simple output, you can use the basic formatter, Format, as well. As this is the default in the Table class, nothing needs to be done in this case.
Basic usage
To give you an idea of what working with this module may look like, have a look at the following examples:
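import numpy as np
from aspecd import dataset, table
# Create a dataset containing random 5x3 data
ds = dataset.Dataset()
ds.data.data = np.random.random([5, 3])
# Tabulate the dataset using the default (plain) format
tab = table.Table()
tab = ds.tabulate(tab)
print(tab.table)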
The last line will produce an output similar to the following – of course the numbers will be different in your case, as we are using random numbers:
0.6457921026722823  0.5634217835847304 0.16339715303360636
0.1946206354990324  0.7901047968358327 0.16098166185006968
0.9898725675813765  0.8892801098024301 0.9657653854952412
0.38858973357936466 0.5818405189808569 0.03264142581790075
0.9391951330574149  0.5412481787012977 0.9171357572017617
Note that even though the number of digits of the individual cells is not always identical, the columns are nicely aligned.
If we wanted to reduce the number of digits shown, we could use the Table.column_format attribute, like so:
tab.column_format = ['8.6f']
tab = ds.tabulate(tab)
print(tab.table)
Note
Two things are relevant here: Table.column_format is a list, and you can provide fewer format strings than columns in your table. In this case, the last format will be used for all remaining columns.
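The behaviour described in this note can be illustrated with a minimal sketch in plain Python. It does not use aspecd, and the helper names expand_formats and format_row are hypothetical, chosen only for this illustration; the actual implementation in the Table class may differ.

```python
def expand_formats(column_format, n_columns):
    """Pad a list of format specs to n_columns by repeating the last one.

    Hypothetical helper, not part of aspecd.
    """
    if not column_format:
        # No format given: fall back to the default string conversion
        return [""] * n_columns
    formats = list(column_format[:n_columns])
    formats += [formats[-1]] * (n_columns - len(formats))
    return formats


def format_row(row, column_format):
    """Format each value of a row with the spec belonging to its column."""
    formats = expand_formats(column_format, len(row))
    return [format(value, spec) for value, spec in zip(row, formats)]


# A single format string is applied to all three columns
row = [0.6457921026722823, 0.5634217835847304, 0.16339715303360636]
print(" ".join(format_row(row, ["8.6f"])))
```

Providing two format strings for four columns would, in this sketch, apply the second one to columns two to four.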
The last line will in this case produce an output similar to the following – again with different numbers in your case:
0.645792 0.563422 0.163397
0.194621 0.790105 0.160982
0.989873 0.889280 0.965765
0.388590 0.581841 0.032641
0.939195 0.541248 0.917136
So far, you could get pretty much the same using an ASCII exporter for your dataset. So what is special about Table? A few things: you have much more control over the output, and you can have column headers and row indices included automatically if these are present in your dataset.
Let’s look at a dataset with information on the different columns set in the second axis. A full example could look like this:
ds = dataset.Dataset()
ds.data.data = np.random.random([5,3])
ds.data.axes[1].index = ['foo', 'bar', 'baz']
tab = table.Table()
tab.column_format = ['8.6f']
tab = ds.tabulate(tab)
print(tab.table)
And the result of the print statement would show you the column headers added:
foo      bar      baz
0.645792 0.563422 0.163397
0.194621 0.790105 0.160982
0.989873 0.889280 0.965765
0.388590 0.581841 0.032641
0.939195 0.541248 0.917136
Of course, the same works if row indices are provided, and it even works if indices are provided for both axes. To demonstrate the latter (admittedly again with an artificial example):
ds = dataset.Dataset()
ds.data.data = np.random.random([5,3])
ds.data.axes[0].index = ['a', 'b', 'c', 'd', 'e']
ds.data.axes[1].index = ['foo', 'bar', 'baz']
tab = table.Table()
tab.column_format = ['8.6f']
tab = ds.tabulate(tab)
print(tab.table)
And the result of the print statement would show you both the column headers and the row indices added:
  foo      bar      baz
a 0.645792 0.563422 0.163397
b 0.194621 0.790105 0.160982
c 0.989873 0.889280 0.965765
d 0.388590 0.581841 0.032641
e 0.939195 0.541248 0.917136
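To make explicit how column headers and row indices end up in the output, here is a self-contained sketch in plain Python that mimics the plain layout shown above. It is not the aspecd implementation: the function name plain_table, its parameters, and the left-aligned, space-separated layout are assumptions made for this illustration.

```python
def plain_table(rows, column_format=None, column_headers=None, row_indices=None):
    """Create an aligned plain-text table from a list of rows.

    Hypothetical stand-in for illustration, not part of aspecd.
    """
    specs = list(column_format or [""])
    cells = []
    for row in rows:
        # Reuse the last format spec for all remaining columns
        row_specs = specs + [specs[-1]] * (len(row) - len(specs))
        cells.append([format(value, spec) for value, spec in zip(row, row_specs)])
    if row_indices:
        # Row indices become the first column
        cells = [[index] + row for index, row in zip(row_indices, cells)]
    if column_headers:
        header = list(column_headers)
        if row_indices:
            # Empty cell above the column of row indices
            header = [""] + header
        cells.insert(0, header)
    # Each column is as wide as its widest cell
    widths = [max(len(cell) for cell in column) for column in zip(*cells)]
    lines = [
        " ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in cells
    ]
    return "\n".join(lines)


print(plain_table([[0.5, 0.25], [1.0, 2.0]], ["8.6f"], ["foo", "bar"], ["a", "b"]))
```

Omitting column_headers or row_indices in this sketch simply leaves out the header line or the first column, respectively.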
Output formats
Tables can be output using different formats, and if you need a special format, you can of course implement one on your own by subclassing Format. However, out of the box there are already a number of formats, from plain (default, shown above) and text to reStructuredText (rst), DokuWiki, and LaTeX. To give you a quick overview, we will create a dataset with both row indices and column headers, and show the different formats.
ds = dataset.Dataset()
ds.data.data = np.random.random([5,3])
ds.data.axes[0].index = ['a', 'b', 'c', 'd', 'e']
ds.data.axes[1].index = ['foo', 'bar', 'baz']
tab = table.Table()
tab.column_format = ['8.6f']
tab = ds.tabulate(tab)
print(tab.table)
The result is the same as already shown above, just a plain table, though already quite useful:
  foo      bar      baz
a 0.689140 0.775321 0.657159
b 0.315142 0.412736 0.580745
c 0.116352 0.807541 0.410055
d 0.226994 0.715985 0.967606
e 0.532774 0.620670 0.745630
Now, let’s see what the text format looks like:
# Same as above
tab.format = 'text'
tab = ds.tabulate(tab)
print(tab.table)
And here you go:
+---+----------+----------+----------+
|   | foo      | bar      | baz      |
+---+----------+----------+----------+
| a | 0.689140 | 0.775321 | 0.657159 |
| b | 0.315142 | 0.412736 | 0.580745 |
| c | 0.116352 | 0.807541 | 0.410055 |
| d | 0.226994 | 0.715985 | 0.967606 |
| e | 0.532774 | 0.620670 | 0.745630 |
+---+----------+----------+----------+
Next is reStructuredText:
# Same as above
tab.format = 'rst'
tab = ds.tabulate(tab)
print(tab.table)
As you can see, this format outputs the “simple” rst style, which can be used for easy-to-read text-only output as well:
= ======== ======== ========
  foo      bar      baz
= ======== ======== ========
a 0.689140 0.775321 0.657159
b 0.315142 0.412736 0.580745
c 0.116352 0.807541 0.410055
d 0.226994 0.715985 0.967606
e 0.532774 0.620670 0.745630
= ======== ======== ========
Another format that may be useful is DokuWiki, as this kind of lightweight wiki can be used as an electronic lab notebook (ELN):
# Same as above
tab.format = 'dokuwiki'
tab = ds.tabulate(tab)
print(tab.table)
This will even correctly highlight the column headers and row indices as “headers”:
|   ^ foo      ^ bar      ^ baz      ^
^ a | 0.689140 | 0.775321 | 0.657159 |
^ b | 0.315142 | 0.412736 | 0.580745 |
^ c | 0.116352 | 0.807541 | 0.410055 |
^ d | 0.226994 | 0.715985 | 0.967606 |
^ e | 0.532774 | 0.620670 | 0.745630 |
And finally, LaTeX, as this is of great use in the scientific world, and honestly, manually formatting LaTeX tables can be quite tedious:
# Same as above
tab.format = 'latex'
tab = ds.tabulate(tab)
print(tab.table)
As you can see, the details of the formatting are left to you, but at least you get valid LaTeX code and a table layout according to typesetting standards, i.e. only horizontal lines. Note that the horizontal lines (“rules”) are typeset using the booktabs package, which should always be used:
\begin{tabular}{llll}
\toprule
& foo & bar & baz \\
\midrule
a & 0.689140 & 0.775321 & 0.657159 \\
b & 0.315142 & 0.412736 & 0.580745 \\
c & 0.116352 & 0.807541 & 0.410055 \\
d & 0.226994 & 0.715985 & 0.967606 \\
e & 0.532774 & 0.620670 & 0.745630 \\
\bottomrule
\end{tabular}
Module documentation
- class aspecd.table.Table
Bases:
ToDictMixin
Tabular representation of datasets.
Formatting of a table can be controlled by the formatter class defined by format. See the documentation of the Format class and its subclasses for details. Furthermore, the individual columns containing numerical data can be formatted as well, specifying formats for the individual columns in column_format.
In case the axes of a dataset contain values in their aspecd.dataset.Axis.index attribute, these values will be used as row indices (first column) and column headers, respectively. This works for indices present in either the first or the second axis, or in both.
One particular use case is in combination with the results of an aspecd.analysis.AggregatedAnalysisStep operating on a series of datasets and combining the result in an aspecd.dataset.CalculatedDataset with the row indices being the labels of the corresponding datasets.
Note
For obvious reasons, only 1D and 2D datasets can be tabulated. Therefore, if you try to tabulate an ND dataset with N>2, this will raise an exception.
- dataset
Dataset containing numerical data to tabulate
- Type:
- format
Identifier for output format.
Valid identifiers are either the empty string or any first part of a subclass of Format, i.e. the part before “Format”.
Examples for currently valid identifiers: text, rst, dokuwiki, latex
See Format and the respective subclasses for details on the formats available and what kind of output they create.
- Type:
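The relation between identifier and formatter class name described above can be sketched as follows. This is only an illustration of the naming convention; the helper formatter_class_name is hypothetical, and how Table actually resolves the identifier is not shown here.

```python
def formatter_class_name(identifier=""):
    """Map a format identifier to the name of the formatter class.

    Hypothetical helper: the identifier is the part of the class name
    before "Format", and the empty string refers to the base class.
    """
    return identifier.capitalize() + "Format"


for identifier in ["", "text", "rst", "dokuwiki", "latex"]:
    print(repr(identifier), "->", formatter_class_name(identifier))
```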
- column_format
(Optional) formats for the data
The format strings are used by str.format(), see there for details.
If the list is shorter than the number of columns, the last element will be used for the remaining columns.
- Type:
- filename
Name of the file to save the table to.
If calling save(), the table contained in table will be saved to this file.
- Type:
New in version 0.5.
- tabulate(dataset=None, from_dataset=False)
Create tabular representation of the numerical data of a dataset.
The result is stored in table.
In case of an empty dataset, a warning is logged and no further action taken.
- Parameters:
dataset (aspecd.dataset.Dataset) – Dataset to create the tabular representation for
from_dataset (bool) – Whether we are called from within a dataset
Defaults to False and shall never be set manually.
- save()
Save table to file.
The filename is set in filename.
If no table exists, i.e. tabulate() has not yet been called, the method will silently return.
- create_history_record()
Create history record to be added to the dataset.
Usually, this method gets called from within the aspecd.dataset.Dataset.tabulate() method of the aspecd.dataset.Dataset class and ensures that the history of each tabulating step gets written properly.
- Returns:
history_record – history record for tabulating step
- Return type:
- class aspecd.table.Format
Bases:
object
Base class for settings for formatting tables.
The formatter is used by Table to control the output.
Different formats can be implemented by subclassing this class. Currently, the following subclasses are available:
- TextFormat
Grid layout for text output
- RstFormat
Simple layout for reStructuredText (rst)
- DokuwikiFormat
DokuWiki table syntax
- LatexFormat
LaTeX table syntax
For simple output, you can use the basic formatter, Format, as well. As this is the default in the Table class, nothing needs to be done in this case.
New in version 0.5.
- top_rule(column_widths=None)
Create top rule for table.
Tables usually have three types of rules: top rule, middle rule, and bottom rule. The middle rule gets used to separate column headers from the actual tabular data.
If your format in a class inheriting from Format does not need this rule, don’t override this method, as it will by default return the empty string, and hence no rule gets added to the table.
- Parameters:
column_widths (list) – (optional) list of column widths
- Returns:
rule – Actual rule that gets added to the table output
Default: ‘’
- Return type:
str
- middle_rule(column_widths=None)
Create middle rule for table.
Tables usually have three types of rules: top rule, middle rule, and bottom rule. The middle rule gets used to separate column headers from the actual tabular data.
If your format in a class inheriting from Format does not need this rule, don’t override this method, as it will by default return the empty string, and hence no rule gets added to the table.
- Parameters:
column_widths (list) – (optional) list of column widths
- Returns:
rule – Actual rule that gets added to the table output
Default: ‘’
- Return type:
str
- bottom_rule(column_widths=None)
Create bottom rule for table.
Tables usually have three types of rules: top rule, middle rule, and bottom rule. The middle rule gets used to separate column headers from the actual tabular data.
If your format in a class inheriting from Format does not need this rule, don’t override this method, as it will by default return the empty string, and hence no rule gets added to the table.
- Parameters:
column_widths (list) – (optional) list of column widths
- Returns:
rule – Actual rule that gets added to the table output
Default: ‘’
- Return type:
str
- opening(columns=None, caption=None)
Create opening code.
Some formats have opening (and closing, see closing()) parts, e.g. opening and closing tags in XML and related languages, but in LaTeX as well.
Furthermore, table captions are usually set above the table, and if your table has a caption with content, this caption will be output as well. In its simplest form, as implemented here, caption title and caption text will be concatenated and wrapped using textwrap.wrap(), and an empty line added after the caption to separate it from the actual table. Thus, your table captions are output together with your table in simple text format.
Override this method according to your needs for your particular format.
- Parameters:
- Returns:
opening – Code for opening the environment
Default: ‘’
- Return type:
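The caption handling described for opening() can be sketched with the standard library alone. The function name plain_opening, its parameters, and the line width of 70 characters are assumptions for this illustration; only the use of textwrap.wrap() and the empty line after the caption are taken from the description above.

```python
import textwrap


def plain_opening(title="", text="", width=70):
    """Return the wrapped caption followed by an empty line.

    Hypothetical stand-in for illustration, not part of aspecd. Returns
    the empty string if neither title nor text has any content.
    """
    caption = " ".join(part for part in (title, text) if part)
    if not caption:
        return ""
    lines = textwrap.wrap(caption, width=width)
    # The trailing empty element separates caption and table
    lines.append("")
    return "\n".join(lines)


print(plain_opening("Fit results.", "Values are given in millitesla.", width=20))
```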
- closing(caption=None)
Create closing code.
Some formats have opening (see opening()) and closing parts, e.g. opening and closing tags in XML and related languages, but in LaTeX as well.
If your format in a class inheriting from Format does not need this code, don’t override this method, as it will by default return the empty string, and hence no code gets added to the table.
- Parameters:
caption (Caption) – (optional) table caption
For details, see the Caption class documentation.
Only if one of the properties of Caption contains content, the caption will be considered.
Having a caption requires some formats to create an additional container surrounding the actual table.
- Returns:
closing – Code for closing the environment
Default: ‘’
- Return type:
- class aspecd.table.TextFormat
Bases:
Format
Table formatter for textual output.
With its default settings, the table would be surrounded by a grid, such as:
+-----+-----+-----+
| foo | bar | baz |
+-----+-----+-----+
| 1.0 | 1.1 | 1.2 |
| 2.0 | 2.1 | 2.2 |
+-----+-----+-----+
- rule_separator_character
Character used for the column separators of horizontal lines (rules)
- Type:
New in version 0.5.
- top_rule(column_widths=None)
Create top rule for table.
Tables usually have three types of rules: top rule, middle rule, and bottom rule. The middle rule gets used to separate column headers from the actual tabular data.
The rule gets constructed according to this overall scheme:
Use the rule_character for the rule
Use the rule_separator_character for the gaps between columns
Use the rule_edge_character for beginning and end of the rule
Use the padding information to add horizontal space in a cell
- Parameters:
column_widths (list) – List of column widths
- Returns:
rule – Actual rule that gets added to the table output
- Return type:
str
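The scheme for constructing a rule can be sketched as a small stand-alone function. The function text_rule is hypothetical and not part of aspecd; the default values ("-" for the rule, "+" for separators and edges, a padding of one space) are assumptions inferred from the grid example shown for this class.

```python
def text_rule(column_widths, rule_character="-", rule_separator_character="+",
              rule_edge_character="+", padding=1):
    """Build a horizontal rule for a grid table from the column widths.

    Hypothetical stand-in for illustration; defaults are assumptions.
    """
    # Each column segment spans the cell content plus padding on both sides
    segments = [rule_character * (width + 2 * padding) for width in column_widths]
    return (rule_edge_character
            + rule_separator_character.join(segments)
            + rule_edge_character)


# Column widths of a table with one-character row indices and 8-character cells
print(text_rule([1, 8, 8, 8]))
```

With these assumed defaults, the column widths 1, 8, 8, 8 reproduce the rules of the text format example in the section on output formats.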
- middle_rule(column_widths=None)
Create middle rule for table.
Here, the middle rule is identical to the top_rule(). See there for details how the rule is constructed.
- Parameters:
column_widths (list) – List of column widths
- Returns:
rule – Actual rule that gets added to the table output
- Return type:
str
- bottom_rule(column_widths=None)
Create bottom rule for table.
Here, the bottom rule is identical to the top_rule(). See there for details how the rule is constructed.
- Parameters:
column_widths (list) – List of column widths
- Returns:
rule – Actual rule that gets added to the table output
- Return type:
str
- class aspecd.table.RstFormat
Bases:
TextFormat
Table formatter for reStructuredText (rst) output.
This formatter actually uses the simple format for rst tables, such as:
=== === ===
foo bar baz
=== === ===
1.0 1.1 1.2
2.0 2.1 2.2
=== === ===
The above code would result in:
foo bar baz
1.0 1.1 1.2
2.0 2.1 2.2
New in version 0.5.
- class aspecd.table.DokuwikiFormat
Bases:
Format
Table formatter for DokuWiki output.
For details about the syntax, see the DokuWiki syntax documentation.
An example of a table in DokuWiki syntax could look like this:
^ foo ^ bar ^ baz ^
| 1.0 | 1.1 | 1.2 |
| 2.0 | 2.1 | 2.2 |
And in case of both column headers and row indices, this would even convert to:
|     ^ foo ^ bar ^ baz ^
^ foo | 1.0 | 1.0 | 1.0 |
^ bar | 1.0 | 1.0 | 1.0 |
^ baz | 1.0 | 1.0 | 1.0 |
New in version 0.5.
- opening(columns=None, caption=None)
Create opening code.
In case of DokuWiki, this is usually empty, except in cases where you have added a caption. In the latter case, code consistent with the DokuWiki caption plugin will be output, like so:
<table>
<caption>*Caption title* Caption text</caption>
To make this work in your DokuWiki, make sure to have the caption plugin installed.
- Parameters:
columns (int) – Number of columns of the table
caption (Caption) – (optional) table caption
For details, see the Caption class documentation.
Only if one of the properties of Caption contains content, the caption will be considered.
Having a caption requires DokuWiki to create an additional table environment surrounding the actual table. As this needs to be closed, the closing needs to have the information regarding the caption.
- Returns:
opening – Code for opening the environment
- Return type:
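How opening and closing code depend on the caption can be sketched in plain Python. The functions dokuwiki_opening and dokuwiki_closing are hypothetical stand-ins that take title and text as plain strings instead of a Caption object; the exact whitespace of the real output is an assumption.

```python
def dokuwiki_opening(title="", text=""):
    """Return the opening code, empty unless the caption has content.

    Hypothetical stand-in for illustration, not part of aspecd.
    """
    if not (title or text):
        return ""
    # The title is wrapped in asterisks, followed by the caption text
    parts = [f"*{title}*" if title else "", text]
    caption = " ".join(part for part in parts if part)
    return f"<table>\n<caption>{caption}</caption>"


def dokuwiki_closing(title="", text=""):
    """Close the surrounding table environment only if it was opened."""
    return "</table>" if (title or text) else ""


print(dokuwiki_opening("Caption title", "Caption text"))
print(dokuwiki_closing("Caption title", "Caption text"))
```

The sketch shows why the closing code needs the caption information as well: without a caption, neither the opening nor the closing element is output.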
- closing(caption=None)
Create closing code.
In case of DokuWiki, this is usually empty, except in cases where you have added a caption. In the latter case, code consistent with the DokuWiki caption plugin will be output, like so:
</table>
To make this work in your DokuWiki, make sure to have the caption plugin installed.
- Parameters:
caption (Caption) – (optional) table caption
For details, see the Caption class documentation.
Only if one of the properties of Caption contains content, the caption will be considered.
Having a caption requires DokuWiki to create an additional table environment surrounding the actual table. As this needs to be closed, the closing needs to have the information regarding the caption.
- Returns:
closing – Code for closing the environment
- Return type:
- class aspecd.table.LatexFormat
Bases:
Format
Table formatter for LaTeX output.
Results in a rather generic LaTeX table, and the goal of this formatter is to provide valid LaTeX code without trying to go into too many details of all the possibilities of LaTeX table formatting.
Note
The format requires the package “booktabs” to be loaded, as the horizontal rules defined by this package are automatically added to the LaTeX output.
An example of the LaTeX code of a table may look as follows:
\begin{tabular}{lll}
\toprule
foo & bar & baz \\
\midrule
1.0 & 1.1 & 1.2 \\
2.0 & 2.1 & 2.2 \\
\bottomrule
\end{tabular}
New in version 0.5.
- top_rule(column_widths=None)
Create top rule for table.
Tables usually have three types of rules: top rule, middle rule, and bottom rule. The middle rule gets used to separate column headers from the actual tabular data.
- Parameters:
column_widths (list) – Ignored in this particular case
- Returns:
rule – Actual rule that gets added to the table output
- Return type:
str
- middle_rule(column_widths=None)
Create middle rule for table.
Tables usually have three types of rules: top rule, middle rule, and bottom rule. The middle rule gets used to separate column headers from the actual tabular data.
- Parameters:
column_widths (list) – Ignored in this particular case
- Returns:
rule – Actual rule that gets added to the table output
- Return type:
str
- bottom_rule(column_widths=None)
Create bottom rule for table.
Tables usually have three types of rules: top rule, middle rule, and bottom rule. The middle rule gets used to separate column headers from the actual tabular data.
- Parameters:
column_widths (list) – Ignored in this particular case
- Returns:
rule – Actual rule that gets added to the table output
- Return type:
str
- opening(columns=None, caption=None)
Create opening code.
In case of LaTeX, this is usually:
\begin{tabular}{<column-specification>}
As this class strives for a rather generic, though valid LaTeX code, the column specification is simply ‘l’ times the number of columns (for exclusively left-aligned columns).
- Parameters:
columns (int) – Number of columns of the table
caption (Caption) – (optional) table caption
For details, see the Caption class documentation.
Only if one of the properties of Caption contains content, the caption will be considered.
Having a caption requires LaTeX to create an additional table environment surrounding the actual table. As this needs to be closed, the closing needs to have the information regarding the caption.
- Returns:
opening – Code for opening the environment
- Return type:
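The column specification mentioned above, ‘l’ times the number of columns, can be sketched in a few lines. The functions latex_opening and latex_closing are hypothetical stand-ins covering only the case without a caption.

```python
def latex_opening(columns):
    """Return the opening of a tabular environment with left-aligned columns.

    Hypothetical stand-in for illustration, not part of aspecd.
    """
    return "\\begin{tabular}{" + "l" * columns + "}"


def latex_closing():
    """Return the closing of the tabular environment."""
    return "\\end{tabular}"


# Three data columns plus one column of row indices
print(latex_opening(4))
print(latex_closing())
```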
- closing(caption=None)
Create closing code.
In case of LaTeX, this is usually:
\end{tabular}
- Parameters:
caption (Caption) – (optional) table caption
For details, see the Caption class documentation.
Only if one of the properties of Caption contains content, the caption will be considered.
Having a caption requires LaTeX to create an additional table environment surrounding the actual table. As this needs to be closed, the closing needs to have the information regarding the caption.
- Returns:
closing – Code for closing the environment
- Return type:
- class aspecd.table.Caption
Bases:
Properties
Caption for tables.
- title
usually one sentence describing the intent of the table
Often typeset in bold face in a table caption.
- Type:
- text
additional text directly following the title
Contains more information about the table. Ideally, a table caption is self-contained such that it explains the table sufficiently to understand its intent and content without needing to read all the surrounding text.
- Type:
New in version 0.5.