zensols.datdesc package¶
Submodules¶
zensols.datdesc.app module¶
Generate LaTeX tables in a .sty file from CSV files. The paths to the CSV files to create tables from and their metadata are given as a YAML configuration file. The parameters are either both files or both directories. When using directories, only files that match *-table.yml are considered.
- class zensols.datdesc.app.Application(config_factory, table_factory_name, figure_factory_name, data_file_regex=re.compile('^.+-table\\.yml$'), figure_file_regex=re.compile('^.+-figure\\.yml$'), hyperparam_file_regex=re.compile('^.+-hyperparam\\.yml$'), hyperparam_table_default=None)[source]¶
Bases:
object
Generate LaTeX tables files from CSV files and hyperparameter .sty files.
- __init__(config_factory, table_factory_name, figure_factory_name, data_file_regex=re.compile('^.+-table\\.yml$'), figure_file_regex=re.compile('^.+-figure\\.yml$'), hyperparam_file_regex=re.compile('^.+-hyperparam\\.yml$'), hyperparam_table_default=None)¶
-
config_factory:
ConfigFactory
¶ Creates table and figure factories.
-
data_file_regex:
Pattern
= re.compile('^.+-table\\.yml$')¶ Matches file names of table definitions to process into the LaTeX output.
- property figure_factory: FigureFactory¶
Reads the figure definitions file and writes
eps
figures.
-
figure_factory_name:
str
¶ The section name of the figure factory (see
figure_factory
).
-
figure_file_regex:
Pattern
= re.compile('^.+-figure\\.yml$')¶ Matches file names of figure definitions to process into the LaTeX output.
- generate_hyperparam(input_path, output_path, output_format=_OutputFormat.short)[source]¶
Write hyperparameter formatted data.
-
hyperparam_file_regex:
Pattern
= re.compile('^.+-hyperparam\\.yml$')¶ Matches file names of hyperparameter definitions to process into the LaTeX output.
- list_figures(input_path)[source]¶
Generate figures.
- Parameters:
input_path (Path) – definitions YAML path location or directory
output_path – output file or directory
output_image_format – the output format (defaults to svg)
- show_table(name=None)[source]¶
Print a list of example LaTeX tables.
- Parameters:
name (
str
) – the name of the example table or a listing of tables if omitted
- property table_factory: TableFactory¶
Reads the table definitions file and writes a Latex .sty file of the generated tables from the CSV data.
-
table_factory_name:
str
¶ The section name of the table factory (see
table_factory
).
zensols.datdesc.cli module¶
Command line entry point to the application.
- class zensols.datdesc.cli.ApplicationFactory(*args, **kwargs)[source]¶
Bases:
ApplicationFactory
zensols.datdesc.desc module¶
Metadata container classes.
- class zensols.datdesc.desc.DataDescriber(describers, name='default', output_dir=PosixPath('results'), csv_dir=PosixPath('csv'), yaml_dir=PosixPath('config'), mangle_file_names=False, mangle_sheet_name=False)[source]¶
Bases:
PersistableContainer
,Dictable
Container class for
DataFrameDescriber
instances. It also saves those instances as CSV data files and YAML configuration files.
- __init__(describers, name='default', output_dir=PosixPath('results'), csv_dir=PosixPath('csv'), yaml_dir=PosixPath('config'), mangle_file_names=False, mangle_sheet_name=False)¶
- add_summary()[source]¶
Add a new metadata-like DataFrameDescriber as the first entry in describers that describes what data this instance currently has.
- Return type:
- Returns:
the added metadata
DataFrameDescriber
instance
-
describers:
Tuple
[DataFrameDescriber
,...
]¶ The contained dataframe and metadata.
- property describers_by_name: Dict[str, DataFrameDescriber]¶
Data frame describers keyed by the describer name.
- classmethod from_yaml_file(path)[source]¶
Create a data descriptor from previously written YAML/CSV files using
save()
.- See:
- See:
- Return type:
-
mangle_sheet_name:
bool
= False¶ Whether to normalize the Excel sheet names when
xlsxwriter.exceptions.InvalidWorksheetName
is raised.
- save(output_dir=None, yaml_dir=None, include_excel=False)[source]¶
Save both the CSV and YAML configuration files.
- Parameters:
include_excel (
bool
) – whether to also write the Excel file to its default output file name
- See: save_yaml()
- Return type:
- save_excel(output_file=None)[source]¶
Save all provided dataframe describers to an Excel file.
- Parameters:
output_file (
Path
) – the Excel file to write, which needs an .xlsx extension; this defaults to a path created from output_dir and name
- Return type:
- save_yaml(output_dir=None, yaml_dir=None)[source]¶
Save all provided dataframe describers' YAML files used by the
datdesc
command.
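The following minimal sketch shows the intended save workflow with hypothetical names and data; it uses the DataFrameDescriber class documented below and only the constructor parameters and methods listed on this page:
from pathlib import Path
import pandas as pd
from zensols.datdesc.desc import DataFrameDescriber, DataDescriber

df = pd.DataFrame({'dataset': ['train', 'test'], 'count': [800, 200]})
dfd = DataFrameDescriber(
    name='splits', df=df, desc='Dataset split sizes',
    meta=(('dataset', 'Name of the split'), ('count', 'Number of examples')))
dd = DataDescriber(describers=(dfd,), name='corpus-stats', output_dir=Path('results'))
dd.add_summary()              # prepend a describer that documents the contained data
dd.save(include_excel=False)  # write the CSV data and YAML configuration files
The saved files can presumably be reloaded later with from_yaml_file().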
- class zensols.datdesc.desc.DataFrameDescriber(name, df, desc, head=None, meta_path=None, meta=None, table_kwargs=<factory>, index_meta=None, mangle_file_names=False)[source]¶
Bases:
PersistableContainer
,Dictable
A class that contains a Pandas dataframe, a description of the data, and descriptions of all the columns in that dataframe.
- property T: DataFrameDescriber¶
See
transpose()
.
- __init__(name, df, desc, head=None, meta_path=None, meta=None, table_kwargs=<factory>, index_meta=None, mangle_file_names=False)¶
- derive(*, name=None, df=None, desc=None, meta=None, index_meta=None)[source]¶
Create a new instance based on this instance and replace any non-
None
kwargs.
If
meta
is provided, it is merged with the metadata of this instance. However, any metadata provided must match in both column names and descriptions.
- derive_with_index_meta(index_format=None)[source]¶
Like
derive()
, but the dataframe is generated with df_with_index_meta() using index_format as a parameter.
- Parameters:
index_format (str) – see df_with_index_meta()
- Return type:
- df_with_index_meta(index_format=None)[source]¶
Create a dataframe with the first column containing index metadata. This uses
index_meta
to create the column values.
- Parameters:
index_format (str) – the new index column format using index and value, which defaults to {index}
- Return type:
- Returns:
the dataframe with a new first column of the index metadata, or
df
if index_meta is None
- format_table()[source]¶
Replace (in place) dataframe
df
with the formatted table obtained with Table.formatted_dataframe. The Table is created with create_table().
- classmethod from_columns(source, name=None, desc=None)[source]¶
Create a new instance by transposing column data into a new dataframe describer. If source is a dataframe, it must have the following columns:
Otherwise, each element of the sequence is a row of column, meta descriptions, and data sequences.
-
head:
str
= None¶ A short summary of the table, used in
Table.head
.
-
index_meta:
Dict
[Any
,str
] = None¶ The index metadata, which maps index values to descriptions of the respective row.
- property meta: DataFrame¶
The column metadata for dataframe, which needs columns name and description. If this is not provided, it is read from file meta_path. If this is set to a tuple of tuples, a dataframe is generated from the form: ((<column name 1>, <column description 1>), (<column name 2>, <column description 2>), ...)
If both this and
meta_path
are not provided, the following is used: (('description', 'Description'), ('value', 'Value'))
- save_excel(output_dir=PosixPath('.'))[source]¶
Save as an Excel file using
csv_path
. The same file naming semantics are used as with DataDescriber.save_excel()
.- See:
- Return type:
-
table_kwargs:
Dict
[str
,Any
]¶ Additional keyword arguments given when creating a table in
create_table()
.
- transpose(row_names=((0, 'value', 'Value'),), name_column='name', name_description='Name', index_column='description')[source]¶
Transpose all data in this descriptor by transposing df and swapping meta with index_meta as a new instance.
- Parameters:
row_names (Tuple[int, str, str]) – a tuple of (row index in df, the column in the new df, the metadata description of that column in the new df); the default takes only the first row
description_column – the column description for this instance's df
index_column (str) – the name of the new index in the returned instance
- Return type:
- Returns:
a new derived instance of the transposed data
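A short, hedged sketch of derive() and transpose() with hypothetical data:
import pandas as pd
from zensols.datdesc.desc import DataFrameDescriber

scores = pd.DataFrame({'name': ['svm', 'crf'], 'f1': [0.81, 0.84]})
dfd = DataFrameDescriber(
    name='scores', df=scores, desc='Test set results',
    meta=(('name', 'Model name'), ('f1', 'Macro F1 score')))
# derive a new describer that keeps this instance's metadata but swaps the data
best = dfd.derive(name='best-scores', df=scores[scores['f1'] > 0.82])
# transpose the data, swapping meta with index_meta; by default only the
# first row is kept (see row_names above); also available as the T property
flipped = dfd.transpose()
print(flipped.df)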
zensols.datdesc.dfstash module¶
A stash implementation that uses a Pandas dataframe stored as a CSV file.
- class zensols.datdesc.dfstash.DataFrameStash(path, dataframe=None, key_column='key', columns=('value',), mkdirs=True, auto_commit=True, single_column_index=0)[source]¶
Bases:
CloseableStash
A backing stash that persists to a CSV file via a Pandas dataframe. All modifications go through the pandas.DataFrame and are then saved with commit() or close().
- __init__(path, dataframe=None, key_column='key', columns=('value',), mkdirs=True, auto_commit=True, single_column_index=0)¶
- clear()[source]¶
Delete all data from the stash.
Important: Exercise caution with this method, of course.
-
columns:
Tuple
[str
,...
] = ('value',)¶ The columns to create in the spreadsheet. These must be consistent when the data is restored.
- property dataframe: DataFrame¶
The dataframe to proxy in memory. This is settable on instantiation but read-only afterward. If this is not set, an empty dataframe is created with the metadata in this class.
- delete(name=None)[source]¶
Delete the resource for data pointed to by
name
or the entire resource if name
is not given.
- exists(name)[source]¶
Return True if data with key name exists.
Implementation note: this Stash.exists() method is very inefficient and should be overridden.
- Return type:
- get(name, default=None)[source]¶
Load an object or a default if key
name
doesn't exist. Semantically, this method tries not to re-create the data if it already exists. This means that if a stash has built-in caching mechanisms, this method uses them.
- load(name)[source]¶
Load a data value from the pickled data with key
name
. Semantically, this method loads the data using the stash's implementation. For example DirectoryStash
loads the data from a file if it exists, but factory type stashes will always re-generate the data.
-
mkdirs:
bool
= True¶ Whether to recursively create the directory where
path
is stored if it does not already exist.
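A brief sketch with a hypothetical path and keys; dump() is assumed to be inherited from the parent stash API, since it is not listed above:
from pathlib import Path
from zensols.datdesc.dfstash import DataFrameStash

stash = DataFrameStash(path=Path('scores.csv'), columns=('value',))
stash.dump('accuracy', 0.91)     # assumed parent-class method that adds an entry
stash.dump('f1', 0.88)
print(stash.exists('accuracy'))  # -> True
print(stash.load('f1'))          # -> 0.88 (the single 'value' column by default)
stash.close()                    # persist the dataframe to scores.csv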
zensols.datdesc.figure module¶
A simple object oriented plotting API.
- class zensols.datdesc.figure.Figure(name='Untitled', config_factory=None, title_font_size=0, height=5, width=5, padding=5.0, metadata=<factory>, plots=(), image_dir=PosixPath('.'), image_format='svg', image_file_norm=True, seaborn=<factory>)[source]¶
Bases:
Deallocatable
,Dictable
An object oriented class to manage
matplotlib.figure.Figure
and subplots (matplotlib.pyplot.Axes
).- __init__(name='Untitled', config_factory=None, title_font_size=0, height=5, width=5, padding=5.0, metadata=<factory>, plots=(), image_dir=PosixPath('.'), image_format='svg', image_file_norm=True, seaborn=<factory>)¶
- add_plot(plot)[source]¶
Add to the collection of managed plots. This is needed for the plot to work if not created from this manager instance.
- Parameters:
plot (
Plot
) – the plot to be managed
-
config_factory:
ConfigFactory
= None¶ The configuration factory used to create plots.
- property path: Path¶
The path of the image figure to save. This is constructed from
image_dir
,name
and image_format. Conversely, when set, it updates these fields.
-
plots:
Tuple
[Plot
,...
] = ()¶ The plots managed by this object instance. Use
add_plot()
to add new plots.
-
seaborn:
Dict
[str
,Any
]¶ Seaborn (
seaborn) rendering configuration. It has the following optional keys:
style: parameters used with sns.set_style()
context: parameters used with sns.set_context()
- class zensols.datdesc.figure.FigureFactory(config_factory, plot_section_regex)[source]¶
Bases:
Dictable
Create instances of Figure using create() or from configuration files with from_file(). See the usage documentation for information about the configuration files used by from_file().
- __init__(config_factory, plot_section_regex)¶
-
config_factory:
ConfigFactory
¶ The configuration factory used to create
Figure
instances.
- class zensols.datdesc.figure.Plot(title=None, row=0, column=0, post_hooks=<factory>)[source]¶
Bases:
Dictable
An abstract base class for plots. The subclass overrides plot() to generate the plot. The client can then save() or render() it. The plot is created as a subplot providing attributes for space to be taken in rows, columns, height and width.
- __init__(title=None, row=0, column=0, post_hooks=<factory>)¶
zensols.datdesc.hyperparam module¶
Hyperparameter metadata: access and documentation. This package was designed for the following purposes:
Provide a basic scaffolding to update model hyperparameters with packages such as hyperopt.
Generate LaTeX tables of the hyperparameters and their descriptions for academic papers.
The object instance graph hierarchy is: a HyperparamSet contains HyperparamModel instances, each of which contains Hyperparam instances.
Access to the hyperparameters is done by calling the set or model levels with a dotted path notation string. For example, svm.C first navigates to model svm, then to the hyperparameter named C.
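A hedged sketch of this hierarchy built directly from the dataclass constructors documented below (ordinarily the graph is loaded from YAML with HyperparamSetLoader); the exact shape of flatten()'s return value is an assumption:
from zensols.datdesc.hyperparam import Hyperparam, HyperparamModel, HyperparamSet

c = Hyperparam(name='C', type='float', doc='regularization strength',
               value=1.0, interval=(0.001, 100.0))
svm = HyperparamModel(name='svm', doc='support vector machine', params={'C': c})
hset = HyperparamSet(models={'svm': svm})
print(hset.models['svm'].params['C'].value)  # -> 1.0
print(hset.flatten())                        # dotted path keys, e.g. {'svm.C': 1.0} (assumed)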
- class zensols.datdesc.hyperparam.Hyperparam(name, type, doc, choices=None, value=None, interval=None)[source]¶
Bases:
Dictable
A hyperparameter's metadata, documentation and value. The value is accessed (retrieval and setting) at runtime. Do not use this class explicitly. Instead use HyperparamModel.
The index access only applies when type is list or dict. Otherwise, the value member has the value of the hyperparameter.
-
CLASS_MAP:
ClassVar
[Dict
[str
,Type
]] = {'bool': <class 'bool'>, 'choice': <class 'str'>, 'dict': <class 'dict'>, 'float': <class 'float'>, 'int': <class 'int'>, 'list': <class 'list'>, 'str': <class 'str'>}¶ A mapping for values set in
type
to their Python class equivalents.
-
VALID_TYPES:
ClassVar
[str
] = frozenset({'bool', 'choice', 'dict', 'float', 'int', 'list', 'str'})¶ Valid settings for
type
.
- __init__(name, type, doc, choices=None, value=None, interval=None)¶
-
doc:
str
¶ The human readable documentation for the hyperparameter. This is used in documentation generation tasks.
-
interval:
Union
[Tuple
[float
,float
],Tuple
[int
,int
]] = None¶ Valid intervals for
value
as an inclusive interval.
- class zensols.datdesc.hyperparam.HyperparamContainer[source]¶
Bases:
Dictable
A container class for
Hyperparam
instances.
- __init__()¶
- abstract flatten(deep=False)[source]¶
Return a flattened dictionary with the dotted path notation (see module docs).
- exception zensols.datdesc.hyperparam.HyperparamError[source]¶
Bases:
DataDescriptionError
Raised for any error related to hyperparameter access.
- __module__ = 'zensols.datdesc.hyperparam'¶
- class zensols.datdesc.hyperparam.HyperparamModel(name, doc, desc=None, params=<factory>, table=None)[source]¶
Bases:
HyperparamContainer
The model level class that contains the parameters. This class represents a machine learning model such as a SVM with hyperparameters such as
C
andmaximum iterations
.- __init__(name, doc, desc=None, params=<factory>, table=None)¶
- create_dataframe_describer()[source]¶
Return an object with metadata fully describing the hyperparameters of this model.
- Return type:
-
desc:
str
= None¶ The description of the model used in the documentation when name is not sufficient. Since name has naming constraints, this can be used in its place during documentation generation.
- flatten(deep=False)[source]¶
Return a flattened dictionary with the dotted path notation (see module docs).
- property metadata_dataframe: DataFrame¶
A dataframe describing the
values_dataframe
.
-
name:
str
¶ The name of the model (i.e.
svm
). This name can have only alphanumeric and underscore characters.
-
params:
Dict
[str
,Hyperparam
]¶ The hyperparameters keyed by their names.
-
table:
Optional
[Dict
[str
,Any
]] = None¶ Overriding data used when creating a
Table
from DataFrameDescriber.create_table()
.
- property values_dataframe: DataFrame¶
A dataframe with parameter data. This includes the name, type, value and documentation.
- write(depth=0, writer=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, include_doc=False)[source]¶
Write this instance as either a Writable or as a Dictable. If the class attribute _DICTABLE_WRITABLE_DESCENDANTS is set to True, then the write() method is used on children instead of writing the generated dictionary. Otherwise, this instance is written by first creating a dict recursively using asdict(), then formatting the output.
If the attribute _DICTABLE_WRITE_EXCLUDES is set, those attributes are removed from what is written in the write() method.
Note that this attribute will need to be set in all descendants in the instance hierarchy since writing the object instance graph is done recursively.
- Parameters:
depth (int) – the starting indentation depth
writer (TextIOBase) – the writer to dump the content of this writable
- class zensols.datdesc.hyperparam.HyperparamSet(models=<factory>, name=None)[source]¶
Bases:
HyperparamContainer
The top level in the object graph hierarchy (see module docs). This contains a set of models and is typically where calls by packages such as hyperopt are used to update the hyperparameters of the model(s).
- __init__(models=<factory>, name=None)¶
- create_describer(meta_path=None)[source]¶
Return an object with metadata fully describing the hyperparameters of this model.
- Parameters:
meta_path (
Path
) – if provided, set the path on the returned instance
- Return type:
- flatten(deep=False)[source]¶
Return a flattened dictionary with the dotted path notation (see module docs).
-
models:
Dict
[str
,HyperparamModel
]¶ The models containing hyperparameters for this set.
- write(depth=0, writer=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, include_doc=False)[source]¶
Write this instance as either a Writable or as a Dictable. If the class attribute _DICTABLE_WRITABLE_DESCENDANTS is set to True, then the write() method is used on children instead of writing the generated dictionary. Otherwise, this instance is written by first creating a dict recursively using asdict(), then formatting the output.
If the attribute _DICTABLE_WRITE_EXCLUDES is set, those attributes are removed from what is written in the write() method.
Note that this attribute will need to be set in all descendants in the instance hierarchy since writing the object instance graph is done recursively.
- Parameters:
depth (int) – the starting indentation depth
writer (TextIOBase) – the writer to dump the content of this writable
- class zensols.datdesc.hyperparam.HyperparamSetLoader(data, config=None, updates=())[source]¶
Bases:
object
Loads a set of hyperparameters from a YAML
pathlib.Path
, dict or stream io.TextIOBase.
- __init__(data, config=None, updates=())¶
-
config:
Configurable
= None¶ The application configuration used to update the hyperparameters from other sections.
-
data:
Union
[Dict
[str
,Any
],Path
,TextIOBase
]¶ The source of data to load, which is a YAML
pathlib.Path
, dict or stream io.TextIOBase
.- See:
- load(**kwargs) HyperparamSet ¶
Load and return the hyperparameter object graph from
data
.- Return type:
HyperparamSet
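A minimal sketch of loading from a YAML file (the file name is hypothetical, and the YAML schema, described in the package's usage documentation, is not reproduced here):
from pathlib import Path
from zensols.datdesc.hyperparam import HyperparamSetLoader

loader = HyperparamSetLoader(Path('hyperparams.yml'))
hset = loader.load()           # build the HyperparamSet object graph
hset.write(include_doc=True)   # print a human readable listing to stdout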
- exception zensols.datdesc.hyperparam.HyperparamValueError[source]¶
Bases:
HyperparamError
Raised for bad values set on a hyperparameter.
- __annotations__ = {}¶
- __module__ = 'zensols.datdesc.hyperparam'¶
zensols.datdesc.latex module¶
Contains the manager classes that invoke table generation.
- class zensols.datdesc.latex.CsvToLatexTable(tables, package_name)[source]¶
Bases:
Writable
Generate a Latex table from a CSV file.
- __init__(tables, package_name)¶
- class zensols.datdesc.latex.LatexTable(path, name, template, caption='', head=None, type=None, template_params=<factory>, default_params=<factory>, params=<factory>, definition_file=None, uses=<factory>, hlines=<factory>, double_hlines=<factory>, column_keeps=None, column_removes=<factory>, column_renames=<factory>, column_value_replaces=<factory>, column_aligns=None, round_column_names=<factory>, percent_column_names=(), make_percent_column_names=<factory>, format_thousands_column_names=<factory>, format_scientific_column_names=<factory>, read_params=<factory>, tabulate_params=<factory>, replace_nan=None, blank_columns=<factory>, bold_cells=<factory>, bold_max_columns=<factory>, capitalize_columns=<factory>, index_col_name=None, variables=<factory>, writes=<factory>, code_pre=None, code_post=None, code_format=None, row_range=(1, -1), booktabs=False)[source]¶
Bases:
Table
This subclass generates LaTeX tables.
- __init__(path, name, template, caption='', head=None, type=None, template_params=<factory>, default_params=<factory>, params=<factory>, definition_file=None, uses=<factory>, hlines=<factory>, double_hlines=<factory>, column_keeps=None, column_removes=<factory>, column_renames=<factory>, column_value_replaces=<factory>, column_aligns=None, round_column_names=<factory>, percent_column_names=(), make_percent_column_names=<factory>, format_thousands_column_names=<factory>, format_scientific_column_names=<factory>, read_params=<factory>, tabulate_params=<factory>, replace_nan=None, blank_columns=<factory>, bold_cells=<factory>, bold_max_columns=<factory>, capitalize_columns=<factory>, index_col_name=None, variables=<factory>, writes=<factory>, code_pre=None, code_post=None, code_format=None, row_range=(1, -1), booktabs=False)¶
-
booktabs:
bool
= False¶ Whether or not to use the
booktabs
style table and to format using its style.
- class zensols.datdesc.latex.SlackTable(path, name, template, caption='', head=None, type=None, template_params=<factory>, default_params=<factory>, params=<factory>, definition_file=None, uses=<factory>, hlines=<factory>, double_hlines=<factory>, column_keeps=None, column_removes=<factory>, column_renames=<factory>, column_value_replaces=<factory>, column_aligns=None, round_column_names=<factory>, percent_column_names=(), make_percent_column_names=<factory>, format_thousands_column_names=<factory>, format_scientific_column_names=<factory>, read_params=<factory>, tabulate_params=<factory>, replace_nan=None, blank_columns=<factory>, bold_cells=<factory>, bold_max_columns=<factory>, capitalize_columns=<factory>, index_col_name=None, variables=<factory>, writes=<factory>, code_pre=None, code_post=None, code_format=None, row_range=(1, -1), booktabs=False, slack_column=0)[source]¶
Bases:
LatexTable
An instance of the table that fills up space based on the widest column.
- __init__(path, name, template, caption='', head=None, type=None, template_params=<factory>, default_params=<factory>, params=<factory>, definition_file=None, uses=<factory>, hlines=<factory>, double_hlines=<factory>, column_keeps=None, column_removes=<factory>, column_renames=<factory>, column_value_replaces=<factory>, column_aligns=None, round_column_names=<factory>, percent_column_names=(), make_percent_column_names=<factory>, format_thousands_column_names=<factory>, format_scientific_column_names=<factory>, read_params=<factory>, tabulate_params=<factory>, replace_nan=None, blank_columns=<factory>, bold_cells=<factory>, bold_max_columns=<factory>, capitalize_columns=<factory>, index_col_name=None, variables=<factory>, writes=<factory>, code_pre=None, code_post=None, code_format=None, row_range=(1, -1), booktabs=False, slack_column=0)¶
zensols.datdesc.opt module¶
zensols.datdesc.optscore module¶
zensols.datdesc.plots module¶
Commonly used plots for ML.
- class zensols.datdesc.plots.BarPlot(title=None, row=0, column=0, post_hooks=<factory>, data=None, palette=None, x_axis_label=None, y_axis_label=None, x_column_name=None, y_column_name=None, hue_column_name=None, x_label_rotation=0, key_title=None, log_scale=None, render_value_font_size=None, hue_palette=False, plot_params=<factory>)[source]¶
Bases:
PaletteContainerPlot
,DataFramePlot
Create a bar plot using
seaborn.barplot()
.- __init__(title=None, row=0, column=0, post_hooks=<factory>, data=None, palette=None, x_axis_label=None, y_axis_label=None, x_column_name=None, y_column_name=None, hue_column_name=None, x_label_rotation=0, key_title=None, log_scale=None, render_value_font_size=None, hue_palette=False, plot_params=<factory>)¶
- class zensols.datdesc.plots.DataFramePlot(title=None, row=0, column=0, post_hooks=<factory>, data=None)[source]¶
Bases:
Plot
- __init__(title=None, row=0, column=0, post_hooks=<factory>, data=None)¶
- class zensols.datdesc.plots.HeatMapPlot(title=None, row=0, column=0, post_hooks=<factory>, data=None, palette=None, format='.2f', x_label_rotation=0, params=<factory>)[source]¶
Bases:
PaletteContainerPlot
,DataFramePlot
Create a heat map plot and optionally normalize. This uses seaborn's heatmap.
- __init__(title=None, row=0, column=0, post_hooks=<factory>, data=None, palette=None, format='.2f', x_label_rotation=0, params=<factory>)¶
- class zensols.datdesc.plots.HistPlot(title=None, row=0, column=0, post_hooks=<factory>, palette=None, data=<factory>, x_axis_label=None, y_axis_label=None, key_title=None, log_scale=None, plot_params=<factory>)[source]¶
Bases:
PaletteContainerPlot
Create a histogram plot using
seaborn.histplot()
.- __init__(title=None, row=0, column=0, post_hooks=<factory>, palette=None, data=<factory>, x_axis_label=None, y_axis_label=None, key_title=None, log_scale=None, plot_params=<factory>)¶
-
data:
List
[Tuple
[str
,DataFrame
]]¶ The data to plot. Each element is a tuple with the plot name as the first component.
- class zensols.datdesc.plots.PaletteContainerPlot(title=None, row=0, column=0, post_hooks=<factory>, palette=None)[source]¶
Bases:
Plot
A base class that supports creating a color palette for subclasses.
- __init__(title=None, row=0, column=0, post_hooks=<factory>, palette=None)¶
- class zensols.datdesc.plots.PointPlot(title=None, row=0, column=0, post_hooks=<factory>, palette=None, data=<factory>, x_axis_name=None, y_axis_name=None, x_column_name=None, y_column_name=None, key_title=None, sample_rate=0, plot_params=<factory>)[source]¶
Bases:
PaletteContainerPlot
An abstract base class that renders overlapping lines using a seaborn pointplot.
- __init__(title=None, row=0, column=0, post_hooks=<factory>, palette=None, data=<factory>, x_axis_name=None, y_axis_name=None, x_column_name=None, y_column_name=None, key_title=None, sample_rate=0, plot_params=<factory>)¶
- add(name, line)[source]¶
Add the losses of a dataset by adding X values as incrementing integers up to the size of
line
.
-
data:
List
[Tuple
[str
,DataFrame
]]¶ The data to plot. Each element is a tuple with the plot name as the first component. The second component is a dataframe with columns:
x_column_name: the X values of the graph, usually an incrementing number
y_column_name: a list of loss float values
Optionally use
add_line()
to populate this list.
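A hedged sketch combining a plot from this module with a Figure from the figure module (hypothetical data and column names; the save() call on Figure is an assumption, since only its path property is documented above):
import pandas as pd
from zensols.datdesc.figure import Figure
from zensols.datdesc.plots import BarPlot

df = pd.DataFrame({'dataset': ['train', 'dev', 'test'], 'count': [800, 100, 100]})
plot = BarPlot(title='Dataset sizes', data=df,
               x_column_name='dataset', y_column_name='count')
fig = Figure(name='dataset-sizes', image_format='svg')
fig.add_plot(plot)   # register the plot with the figure
print(fig.path)      # constructed from image_dir, name and image_format
fig.save()           # assumed method that writes the image to fig.path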
zensols.datdesc.table module¶
This module contains classes that generate tables.
- class zensols.datdesc.table.Table(path, name, template, caption='', head=None, type=None, template_params=<factory>, default_params=<factory>, params=<factory>, definition_file=None, uses=<factory>, hlines=<factory>, double_hlines=<factory>, column_keeps=None, column_removes=<factory>, column_renames=<factory>, column_value_replaces=<factory>, column_aligns=None, round_column_names=<factory>, percent_column_names=(), make_percent_column_names=<factory>, format_thousands_column_names=<factory>, format_scientific_column_names=<factory>, read_params=<factory>, tabulate_params=<factory>, replace_nan=None, blank_columns=<factory>, bold_cells=<factory>, bold_max_columns=<factory>, capitalize_columns=<factory>, index_col_name=None, variables=<factory>, writes=<factory>, code_pre=None, code_post=None, code_format=None)[source]¶
Bases:
PersistableContainer
,Dictable
Generates a Zensols styled Latex table from a CSV file.
- __init__(path, name, template, caption='', head=None, type=None, template_params=<factory>, default_params=<factory>, params=<factory>, definition_file=None, uses=<factory>, hlines=<factory>, double_hlines=<factory>, column_keeps=None, column_removes=<factory>, column_renames=<factory>, column_value_replaces=<factory>, column_aligns=None, round_column_names=<factory>, percent_column_names=(), make_percent_column_names=<factory>, format_thousands_column_names=<factory>, format_scientific_column_names=<factory>, read_params=<factory>, tabulate_params=<factory>, replace_nan=None, blank_columns=<factory>, bold_cells=<factory>, bold_max_columns=<factory>, capitalize_columns=<factory>, index_col_name=None, variables=<factory>, writes=<factory>, code_pre=None, code_post=None, code_format=None)¶
- asflatdict(*args, **kwargs)[source]¶
Like
asdict()
but flattened into a data structure suitable for writing to JSON or YAML.
-
blank_columns:
List
[int
]¶ A list of column indexes to set to the empty string (i.e. the 0th to fix the
Unnamed: 0
issues).
-
capitalize_columns:
Dict
[str
,bool
]¶ Capitalize either sentences (
False
values) or every word (True
values). The keys are column names.
-
code_format:
str
= None¶ Like
code_post
but modifies the table after all formatting of the table (including the formatting applied by this class).
-
code_post:
str
= None¶ Like
code_pre
but modifies the table after this class’s modifications of the table.
-
code_pre:
str
= None¶ Python code that manipulates the table's dataframe before modifications made by this class. The code has a local
df
variable and the returned value is used as the replacement. This is usually a one-liner used to subset the data etc. The code is evaluated witheval()
.
-
column_aligns:
str
= None¶ The alignment/justification (i.e.
|l|l|
for two columns). If not provided, they are automatically generated based on the columns of the table.
-
column_value_replaces:
Dict
[str
,Dict
[Any
,Any
]]¶ Data values to replace in the dataframe. It is keyed by the column name and values are the replacements. Each value is a
dict
with original value keys and the replacements as values.
-
default_params:
Sequence
[Sequence
[str
]]¶ Default parameters to be substituted in the template that are interpolated by the LaTeX numeric values such as #1, #2, etc. This is a sequence (list or tuple) of
(<name>, [<default>]) where name is substituted by name in the template and default is the default if not given in params.
-
format_scientific_column_names:
Dict
[str
,Optional
[int
]]¶ Format a column using LaTeX formatted scientific notation using
format_scientific()
. Keys are column names and values are the mantissa length, or 1 if None.
- static format_thousand(x, apply_k=True, add_comma=True, round=None)[source]¶
Format a number as a string with comma separating thousands.
-
format_thousands_column_names:
Dict
[str
,Optional
[Dict
[str
,Any
]]]¶ Columns to format using thousands, and optionally round. The keys are the column names of the table and the values are either
None
or the keyword arguments toformat_thousand()
.
- property formatted_dataframe: DataFrame¶
The
dataframe
with the formatting applied to it that is used to create the Latex table. Modifications such as string replacements for adding percents are applied.
-
head:
str
= None¶ The header to use for the table, which is used as the text in the list of tables and made bold in the table.
-
make_percent_column_names:
Dict
[str
,int
]¶ Each column in the map is multiplied by 100 and rounded to the mapped number of decimal places. For example,
{'ann_per': 3}
will round column ann_per
to 3 decimal places.
-
params:
Dict
[str
,str
]¶ Parameters used in the template that override the
default_params
.
-
read_params:
Dict
[str
,str
]¶ Keyword arguments used in the
read_csv()
call when reading the CSV file.
-
replace_nan:
str
= None¶ Replace NaN values with the value of this field as
tabulate()
is not using the missing value due to some bug I assume.
-
round_column_names:
Dict
[str
,int
]¶ Each column in the map will get rounded to its respective value.
-
tabulate_params:
Dict
[str
,str
]¶ Keyword arguments used in the
tabulate()
call when writing the table. The default tellstabulate
to not parse/format numerical data.
-
variables:
Dict
[str
,Union
[Tuple
[int
,int
],str
]]¶ A mapping of variable names to a dataframe cell or Python code snippet that is evaluated with exec(). In LaTeX, this is done by setting a newcommand (see LatexTable).
If set to a tuple of (<row>, <column>), the value of the pre-formatted dataframe is used (see unformatted below).
If a Python evaluation string, the code must set the variable v to the variable value. A variable stages is a Dict used to get one of the dataframes created at various stages of formatting the table, with entries:
nascent: same as dataframe
unformatted: after the pre-evaluation but before any formatting
postformat: after number formatting and post evaluation, but before remaining column and cell modifications
formatted: same as formatted_dataframe
For example, the following uses the value at row 2 and column 3 of the unformatted dataframe:
v = stages['unformatted'].iloc[2, 3]
- write(depth=0, writer=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>)[source]¶
Write this instance as either a Writable or as a Dictable. If the class attribute _DICTABLE_WRITABLE_DESCENDANTS is set to True, then the write() method is used on children instead of writing the generated dictionary. Otherwise, this instance is written by first creating a dict recursively using asdict(), then formatting the output.
If the attribute _DICTABLE_WRITE_EXCLUDES is set, those attributes are removed from what is written in the write() method.
Note that this attribute will need to be set in all descendants in the instance hierarchy since writing the object instance graph is done recursively.
- Parameters:
depth (int) – the starting indentation depth
writer (TextIOBase) – the writer to dump the content of this writable
- class zensols.datdesc.table.TableFactory(config_factory, table_section_regex, default_table_type)[source]¶
Bases:
Dictable
Reads the table definitions file and writes a Latex
.sty
file of the generated tables from the CSV data. Tables are created with either usage() or from_file(). See the usage documentation for information about the configuration files used by from_file().
- __init__(config_factory, table_section_regex, default_table_type)¶
-
config_factory:
ConfigFactory
¶ The configuration factory used to create
Table
instances.
-
default_table_type:
str
¶ The default name, which resolves to a section name, to use when creating anonymous tables.
Module contents¶
Generate Latex tables in a .sty file from CSV files. The paths to the CSV files to create tables from and their metadata are given as a YAML configuration file.
- Example:
latextablenamehere:
  type: slack
  slack_col: 0
  path: ../config/table-name.csv
  caption: Some Caption
  placement: t!
  size: small
  single_column: true
  percent_column_names: ['Proportion']
- exception zensols.datdesc.DataDescriptionError[source]¶
Bases:
APIError
Thrown for any application level error.
- __annotations__ = {}¶
- __module__ = 'zensols.datdesc'¶
- exception zensols.datdesc.FigureError(reason, figure=None)[source]¶
Bases:
DataDescriptionError
Thrown for any application level error related to creating figures.
- __annotations__ = {}¶
- __module__ = 'zensols.datdesc'¶