zensols.deeplearn.result package

Submodules

zensols.deeplearn.result.compare module

Result diff utilities.

class zensols.deeplearn.result.compare.ModelResultComparer(rm, res_id_a, res_id_b)[source]

Bases: Writable

This class performs a diff on two archived result sets and reports the differences.

__init__(rm, res_id_a, res_id_b)
res_id_a: str

The result ID of the first archived result set to diff.

res_id_b: str

The result ID of the second archived result set to diff.

rm: ModelResultManager

The manager used to retrieve the model results.

write(depth=0, writer=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>)[source]

Write the contents of this instance to writer using indentation depth.

Parameters:
  • depth (int) – the starting indentation depth

  • writer (TextIOBase) – the writer to dump the content of this writable

zensols.deeplearn.result.domain module

Contains container classes for results generated from training and testing a model.

class zensols.deeplearn.result.domain.ClassificationMetrics(labels, predictions, n_outcomes)[source]

Bases: Metrics

Classification results for a ModelType.CLASSIFICATION result.

__init__(labels, predictions, n_outcomes)
property accuracy: float

Return the accuracy metric (num correct / total).

create_metrics(average)[source]

Create a ScoreMetrics instance with the given average.

Return type:

ScoreMetrics

property macro: ScoreMetrics

Compute macro F1, precision and recall.

property micro: ScoreMetrics

Compute micro F1, precision and recall.

property n_correct: int

The number of correct predictions for the classification.

n_outcomes: int

The number of outcomes given for this metrics set.

property weighted: ScoreMetrics

Compute weighted F1, precision and recall.
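The difference between the micro, macro and weighted averages can be illustrated with a small pure-Python sketch. It is not the library's implementation; the function names (per_class_counts, f1_from_counts, averaged_f1) are hypothetical and only meant to show how the three averages differ for F1.

```python
from collections import Counter
from typing import Dict, List, Tuple


def per_class_counts(labels: List[int],
                     preds: List[int]) -> Dict[int, Tuple[int, int, int]]:
    """Return (true positives, false positives, false negatives) per class."""
    counts = {}
    for c in sorted(set(labels) | set(preds)):
        tp = sum(1 for y, p in zip(labels, preds) if y == c and p == c)
        fp = sum(1 for y, p in zip(labels, preds) if y != c and p == c)
        fn = sum(1 for y, p in zip(labels, preds) if y == c and p != c)
        counts[c] = (tp, fp, fn)
    return counts


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 expressed directly in terms of the confusion counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def averaged_f1(labels: List[int], preds: List[int], average: str) -> float:
    """Hypothetical helper: F1 under a micro, macro or weighted average."""
    counts = per_class_counts(labels, preds)
    if average == 'micro':
        # pool the counts over all classes, then compute a single F1
        tp = sum(c[0] for c in counts.values())
        fp = sum(c[1] for c in counts.values())
        fn = sum(c[2] for c in counts.values())
        return f1_from_counts(tp, fp, fn)
    scores = {c: f1_from_counts(*v) for c, v in counts.items()}
    if average == 'macro':
        # unweighted mean of the per-class scores
        return sum(scores.values()) / len(scores)
    if average == 'weighted':
        # mean of the per-class scores weighted by gold label support
        support = Counter(labels)
        return sum(scores[c] * support[c] for c in scores) / len(labels)
    raise ValueError(f'unknown average: {average}')
```

With an imbalanced label set the three averages diverge, which is why a result is usually reported with the average named explicitly.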

write(depth=0, writer=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>)[source]

Write this instance as either a Writable or as a Dictable. If class attribute _DICTABLE_WRITABLE_DESCENDANTS is set as True, then use the write() method on children instead of writing the generated dictionary. Otherwise, write this instance by first creating a dict recursively using asdict(), then formatting the output.

If the attribute _DICTABLE_WRITE_EXCLUDES is set, those attributes are removed from what is written in the write() method.

Note that this attribute will need to be set in all descendants in the instance hierarchy since writing the object instance graph is done recursively.

Parameters:
  • depth (int) – the starting indentation depth

  • writer (TextIOBase) – the writer to dump the content of this writable

class zensols.deeplearn.result.domain.DatasetResult(context)[source]

Bases: ResultsContainer

Contains results for a dataset, such as training, validating and test.

__init__(context)
append(epoch_result)[source]
clone()[source]

Return a clone of the current container. Sub containers (lists) are deep copied in subclasses, but everything else is shallow copied.

This is needed to create a temporary container to persist whose end() gets called by the ModelExecutor.

Return type:

ResultsContainer

property contains_results

True if this container has results.

property converged_epoch: EpochResult

Return the last epoch that arrived at the lowest loss.

property convergence: int

Return the index of the epoch at which this result set converged. If used on an EpochResult, it is the Nth iteration.
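As a rough illustration of convergence, the following sketch picks the index of the lowest loss from a list of per-epoch losses. The function name converged_index is hypothetical, and the tie-breaking rule (the last epoch with the lowest loss wins) is an assumption based on the wording of converged_epoch, not a statement of how the library resolves ties.

```python
from typing import List


def converged_index(losses: List[float]) -> int:
    """Return the 0-based index of the lowest loss.

    Assumption: when several epochs share the lowest loss, the last one
    is taken as the point of convergence.
    """
    if not losses:
        raise ValueError('no losses recorded')
    lowest = min(losses)
    return max(i for i, loss in enumerate(losses) if loss == lowest)
```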

end()[source]

Record the time at which processing ended for the metrics populated in this container.

See:

is_ended

property losses: List[float]

Return the loss for each epoch of the run. If used on an EpochResult, it is the Nth iteration.

property results: List[EpochResult]
property statistics: Dict[str, Any]

Return the statistics of the data set result.

Returns:

a dictionary with the following:

  • n_epochs: the number of epoch results

  • n_epoch_converged: the 0 based index for which epoch converged (lowest validation loss before it went back up)

  • n_batches: the number of batches used for training, testing or validation

  • ave_data_points: the average number of data points per batch used for training, testing or validation

  • n_total_data_points: the number of data points used for training, testing or validation

write(depth=0, writer=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, include_details=False, converged_epoch=True, include_metrics=True, include_all_metrics=False)[source]

Write the results data.

Parameters:
  • depth (int) – the number of indentation levels

  • writer (TextIOBase) – the data sink

  • include_settings – whether or not to include model and network settings in the output

  • include_config – whether or not to include the configuration in the output

class zensols.deeplearn.result.domain.EpochResult(context, index, split_type, batch_losses=<factory>, batch_ids=<factory>, n_data_points=<factory>)[source]

Bases: ResultsContainer

Contains results recorded from an epoch of a neural network model. This is during a training/validation or test cycle.

Note that there is a terminology difference between what the model and the result set call outcomes. For the model, outcomes are the mapped/refined results, which are usually the argmax of the softmax of the logits. For results, these are the predictions of the given data to be compared against the gold labels.

__init__(context, index, split_type, batch_losses=<factory>, batch_ids=<factory>, n_data_points=<factory>)
batch_ids: List[int]

The ID of the batch from each iteration of the epoch.

property batch_labels: List[ndarray]

The batch labels given in the shape as output from the model.

batch_losses: List[float]

The losses generated from each iteration of the epoch.

property batch_outputs: List[ndarray]
property batch_predictions: List[ndarray]

The batch predictions given in the shape as output from the model.

clone()[source]

Return a clone of the current container. Sub containers (lists) are deep copied in subclasses, but everything else is shallow copied.

This is needed to create a temporary container to persist whose end() gets called by the ModelExecutor.

Return type:

ResultsContainer

end()[source]

Record the time at which processing ended for the metrics populated in this container.

See:

is_ended

index: int

The Nth epoch of the run (across training, validation, test).

property losses: List[float]

Return the loss for each epoch of the run. If used on an EpochResult, it is the Nth iteration.

n_data_points: List[int]

The number of data points for each batch for the epoch.

split_type: DatasetSplitType

The name of the split type (i.e. train vs test).

update(batch, loss, labels, preds, outputs)[source]

Add another set of statistics, predictions and gold labels to prediction_updates.

Parameters:
  • batch (Batch) – the batch on which the stats/data were trained, tested or validated; used to update the loss as a multiplier on its size

  • loss (Tensor) – the loss returned by the loss function

  • labels (Tensor) – the gold labels, or None if this is a prediction run

  • preds (Tensor) – the predictions, or None for scored models (see prediction_updates)
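One plausible reading of "used to update the loss as a multiplier on its size" is that each batch loss is weighted by the number of data points in the batch, so that small trailing batches do not skew the epoch average. The sketch below shows that idea only; weighted_average_loss is a hypothetical helper and the exact bookkeeping used by update() is not confirmed by this documentation.

```python
from typing import List


def weighted_average_loss(batch_losses: List[float],
                          n_data_points: List[int]) -> float:
    """Average the batch losses weighted by the size of each batch."""
    if len(batch_losses) != len(n_data_points):
        raise ValueError('expecting one batch size per batch loss')
    total = sum(n_data_points)
    if total == 0:
        return 0.0
    weighted = sum(loss * n for loss, n in zip(batch_losses, n_data_points))
    return weighted / total
```

The two input lists mirror the batch_losses and n_data_points attributes listed for EpochResult.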

write(depth=0, writer=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, include_metrics=False)[source]

Write this instance as either a Writable or as a Dictable. If class attribute _DICTABLE_WRITABLE_DESCENDANTS is set as True, then use the write() method on children instead of writing the generated dictionary. Otherwise, write this instance by first creating a dict recursively using asdict(), then formatting the output.

If the attribute _DICTABLE_WRITE_EXCLUDES is set, those attributes are removed from what is written in the write() method.

Note that this attribute will need to be set in all descendants in the instance hierarchy since writing the object instance graph is done recursively.

Parameters:
  • depth (int) – the starting indentation depth

  • writer (TextIOBase) – the writer to dump the content of this writable

class zensols.deeplearn.result.domain.Metrics(labels, predictions)[source]

Bases: Dictable

A container class that provides results for data stored in a ResultsContainer.

__init__(labels, predictions)
property contains_results: bool

Return True if this container has results.

labels: ndarray

The labels or None if none were provided (i.e. during test/evaluation).

predictions: ndarray

The predictions from the model. This also flattens the predictions into a 1D array for the purpose of computing metrics.

class zensols.deeplearn.result.domain.ModelResult(config, name, model_settings, net_settings, decoded_attributes, context)[source]

Bases: Dictable

A container class used to capture the training, validation and test results. The data captured is used to report and plot curves.

RUNS: ClassVar[int] = 1
__init__(config, name, model_settings, net_settings, decoded_attributes, context)
clone()[source]
Return type:

ModelResult

config: Configurable

Useful for retrieving hyperparameter settings later after unpersisting from disk.

property contains_results: bool
context: ResultContext

The context of the results.

decoded_attributes: Set[str]

The attributes that were coded and used in this model.

get_intermediate()[source]
Return type:

ModelResult

classmethod get_num_runs()[source]
property last_test: DatasetResult

Return either the test or validation results depending on what is available.

property last_test_name: str

Return the name of the dataset that exists in the container, and thus, the last to be populated. In order, this is test and then validation.

model_settings: InitVar

The settings used to configure the model.

name: str

The name of this result set.

net_settings: InitVar

The network settings used by the model for this result set.

property non_empty_dataset_result: Dict[str, DatasetResult]
reset(name)[source]

Clear all results for data set name.

classmethod reset_runs()[source]

Reset the run counter.

property test: DatasetResult

Return the testing run results.

property train: DatasetResult

Return the training run results.

property validation: DatasetResult

Return the validation run results.

write(depth=0, writer=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, include_settings=False, include_converged=False, include_config=False, include_all_metrics=False)[source]

Generate a human readable format of the results.

write_result_statistics(split_type, depth=0, writer=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>)[source]
exception zensols.deeplearn.result.domain.ModelResultError[source]

Bases: DeepLearnError

Thrown when results cannot be compiled or computed.

class zensols.deeplearn.result.domain.ModelType(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Bases: Enum

The type of model given by the type of its output.

CLASSIFICTION = 2
MULTI_LABEL_CLASSIFICATION = 3
PREDICTION = 1
RANKING = 4
class zensols.deeplearn.result.domain.MultiLabelClassificationMetrics(labels, predictions, n_outcomes, context)[source]

Bases: ClassificationMetrics

Metrics for multi-label classification.

__init__(labels, predictions, n_outcomes, context)
property accuracy: float

Return the accuracy metric (num correct / total).

context: ResultContext

The context of the results.

create_metrics(average)[source]

Create a ScoreMetrics instance with the given average.

Return type:

ScoreMetrics

property dataframes: Dict[str, DataFrame]

A multi-label classification report as a dataframe.

property multi_labels: Tuple[str]

The labels used in the multi-label classification.

property n_correct: int

The number of correct predictions for the classification.

write(depth=0, writer=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, report=False)[source]

Write this instance as either a Writable or as a Dictable. If class attribute _DICTABLE_WRITABLE_DESCENDANTS is set as True, then use the write() method on children instead of writing the generated dictionary. Otherwise, write this instance by first creating a dict recursively using asdict(), then formatting the output.

If the attribute _DICTABLE_WRITE_EXCLUDES is set, those attributes are removed from what is written in the write() method.

Note that this attribute will need to be set in all descendants in the instance hierarchy since writing the object instance graph is done recursively.

Parameters:
  • depth (int) – the starting indentation depth

  • writer (TextIOBase) – the writer to dump the content of this writable

class zensols.deeplearn.result.domain.MultiLabelScoreMetrics(labels, predictions, average)[source]

Bases: ScoreMetrics

A container class that provides results for multi-label data.

__init__(labels, predictions, average)
exception zensols.deeplearn.result.domain.NoResultError(cls)[source]

Bases: ModelResultError

Convenience used for helping debug the network.

__init__(cls)[source]
class zensols.deeplearn.result.domain.PredictionMetrics(labels, predictions)[source]

Bases: Metrics

Real valued prediction results for ModelType.PREDICTION result.

__init__(labels, predictions)
property correlation: float

Return the correlation metric.

property mean_absolute_error: float

Return the mean absolute error metric.

property r2_score: float

Return the R^2 score metric.

property root_mean_squared_error: float

Return the root mean squared error metric.
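The four regression metrics above can be sketched in plain Python as follows. This is only an illustration of the standard definitions; the function names are hypothetical, and the choice of Pearson's coefficient for the correlation is an assumption, since this documentation does not say which correlation the library computes.

```python
import math
from typing import Sequence


def mean_absolute_error(labels: Sequence[float], preds: Sequence[float]) -> float:
    """Mean of the absolute differences between labels and predictions."""
    return sum(abs(y - p) for y, p in zip(labels, preds)) / len(labels)


def root_mean_squared_error(labels: Sequence[float], preds: Sequence[float]) -> float:
    """Square root of the mean squared difference."""
    mse = sum((y - p) ** 2 for y, p in zip(labels, preds)) / len(labels)
    return math.sqrt(mse)


def r2_score(labels: Sequence[float], preds: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(labels) / len(labels)
    ss_res = sum((y - p) ** 2 for y, p in zip(labels, preds))
    ss_tot = sum((y - mean) ** 2 for y in labels)
    return 1.0 - ss_res / ss_tot


def pearson_correlation(labels: Sequence[float], preds: Sequence[float]) -> float:
    """Pearson's r (assumed here; the library may use another coefficient)."""
    mean_y = sum(labels) / len(labels)
    mean_p = sum(preds) / len(preds)
    cov = sum((y - mean_y) * (p - mean_p) for y, p in zip(labels, preds))
    var_y = sum((y - mean_y) ** 2 for y in labels)
    var_p = sum((p - mean_p) ** 2 for p in preds)
    return cov / math.sqrt(var_y * var_p)
```

Note that r2_score and pearson_correlation are undefined when the labels (or predictions) are constant; the sketch does not guard against that case.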

write(depth=0, writer=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>)[source]

Write this instance as either a Writable or as a Dictable. If class attribute _DICTABLE_WRITABLE_DESCENDANTS is set as True, then use the write() method on children instead of writing the generated dictionary. Otherwise, write this instance by first creating a dict recursively using asdict(), then formatting the output.

If the attribute _DICTABLE_WRITE_EXCLUDES is set, those attributes are removed from what is written in the write() method.

Note that this attribute will need to be set in all descendants in the instance hierarchy since writing the object instance graph is done recursively.

Parameters:
  • depth (int) – the starting indentation depth

  • writer (TextIOBase) – the writer to dump the content of this writable

class zensols.deeplearn.result.domain.ResultContext(multi_labels=())[source]

Bases: Dictable

Chain-of-responsibility context for generated result objects.

__init__(multi_labels=())
multi_labels: Tuple[str, ...] = ()

The labels for multi-label classification, otherwise None.

class zensols.deeplearn.result.domain.ResultsContainer(context)[source]

Bases: Dictable

The base class for all metrics containers. It helps in calculating loss, finding labels, predictions and other utility helpers.

Every container has a start and stop time, which demarcate the duration for which the populated metrics were being calculated.

FLOAT_TYPES = [<class 'numpy.float32'>, <class 'numpy.float64'>, <class 'float'>]

Used to determine the model_type.

__init__(context)
property ave_loss: float

The average loss of this result set.

property classification_metrics: ClassificationMetrics

Return classification based metrics.

clone()[source]

Return a clone of the current container. Sub containers (lists) are deep copied in subclasses, but everything else is shallow copied.

This is needed to create a temporary container to persist whose end() gets called by the ModelExecutor.

Return type:

ResultsContainer

property contains_results

True if this container has results.

context: ResultContext

The context of the results.

end()[source]

Record the time at which processing ended for the metrics populated in this container.

See:

is_ended

Return type:

datetime

property is_ended: bool

The time at which processing ended for the metrics populated in this container.

See:

end()

property is_started: bool

The time at which processing started for the metrics populated in this container.

See:

start()

property labels: ndarray

The labels or None if none were provided (i.e. during test/evaluation).

property max_loss: float

The highest loss recorded in this container.

property metrics: Metrics

Return the metrics based on the model_type.

property min_loss: float

The lowest loss recorded in this container.

property model_type: ModelType

The type of the model based on whether the outcome data is a float or integer.
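The float-versus-integer rule can be pictured with the sketch below. SketchModelType, FLOAT_TYPES and infer_model_type are hypothetical stand-ins written for this example (the real FLOAT_TYPES also lists NumPy float types), and inspecting only the first outcome is an assumption.

```python
from enum import Enum, auto
from typing import Sequence


class SketchModelType(Enum):
    """Hypothetical stand-in for the library's ModelType enumeration."""
    PREDICTION = auto()
    CLASSIFICATION = auto()


# assumption: plain Python floats only, to keep the sketch dependency free
FLOAT_TYPES = (float,)


def infer_model_type(outcomes: Sequence) -> SketchModelType:
    """Treat float outcomes as real valued predictions, all else as classes."""
    if len(outcomes) == 0:
        raise ValueError('no outcomes to inspect')
    if isinstance(outcomes[0], FLOAT_TYPES):
        return SketchModelType.PREDICTION
    return SketchModelType.CLASSIFICATION
```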

property multi_label_classification_metrics: MultiLabelClassificationMetrics

Return multi-label classification based metrics.

property n_iterations: int

The number of iterations, which is different from the n_outcomes since a single (say training) iteration can produce multiple outcomes (for example sequence classification).

property n_outcomes: int

The number of outcomes.

property prediction_metrics: PredictionMetrics

Return prediction based metrics.

property predictions: ndarray

The predictions from the model. This also flattens the predictions into a 1D array for the purpose of computing metrics.

Returns:

the flattened predictions

start()[source]

Record the time at which processing started for the metrics populated in this container.

See:

is_started

Return type:

datetime

class zensols.deeplearn.result.domain.ScoreMetrics(labels, predictions, average)[source]

Bases: Metrics

Classification metrics having an f1, precision and recall for a configured weighted, micro or macro average.

__init__(labels, predictions, average)
average: str

The type of average to apply to metrics produced by this class, which is one of weighted, micro or macro.

property f1: float

Return the F1 metric as either the micro or macro based on the average attribute.

property long_f1_name: str
property precision: float

Return the precision metric as either the micro or macro based on the average attribute.

property recall: float

Return the recall metric as either the micro or macro based on the average attribute.

property short_f1_name: str
write(depth=0, writer=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>)[source]

Write this instance as either a Writable or as a Dictable. If class attribute _DICTABLE_WRITABLE_DESCENDANTS is set as True, then use the write() method on children instead of writing the generated dictionary. Otherwise, write this instance by first creating a dict recursively using asdict(), then formatting the output.

If the attribute _DICTABLE_WRITE_EXCLUDES is set, those attributes are removed from what is written in the write() method.

Note that this attribute will need to be set in all descendants in the instance hierarchy since writing the object instance graph is done recursively.

Parameters:
  • depth (int) – the starting indentation depth

  • writer (TextIOBase) – the writer to dump the content of this writable

zensols.deeplearn.result.hypsig module

Model hypothesis significance testing. This module has a small framework for hypothesis testing of the model results (typically the results from the test dataset). Disproving the null hypothesis (which is that two classifiers perform the same) means that one classifier has statistically significantly better (or worse) performance than the other.

class zensols.deeplearn.result.hypsig.AnovaSignificanceTest(data)[source]

Bases: SignificanceTest

One-way ANOVA test.

class zensols.deeplearn.result.hypsig.ChiSquareEvaluation(pvalue, alpha, statistic=None, dof=None, expected=None, contingency_table=None)[source]

Bases: Evaluation

The statistics gathered from scipy.stats.chi2_contingency() and created in ChiSquareCalculator.

__init__(pvalue, alpha, statistic=None, dof=None, expected=None, contingency_table=None)
property adjusted_residuals: DataFrame

The adjusted residuals (see class docs).

property associated: bool

Whether or not the variables are associated (rejection of the null hypothesis).

contingency_table: DataFrame = None

The contingency table used for the results.

property contribs: DataFrame

The contribution of each cell to the results of the chi-square computation.

dof: int = None

The degrees of freedom.

expected: ndarray = None

The expected frequencies, based on the marginal sums of the table. It has the same shape as ChiSquareCalculator.observations.

property pearson_residuals: DataFrame

Pearson residuals, aka standardized residuals.

property raw_residuals: DataFrame

The raw residuals as computed as the difference between the observations and the expected cell values.
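The residual properties above follow the usual chi-square definitions, which the sketch below reproduces for a table given as nested lists. It is an illustration under assumptions rather than the library's code: chi_square_residuals is a hypothetical function, the per-cell contribution is taken to be (O - E)^2 / E, no continuity correction is applied, and all marginal sums are assumed to be positive.

```python
import math
from typing import Dict, List

Table = List[List[float]]


def chi_square_residuals(observed: Table) -> Dict[str, Table]:
    """Compute expected frequencies and residuals for a contingency table."""
    row_sums = [sum(row) for row in observed]
    col_sums = [sum(col) for col in zip(*observed)]
    total = sum(row_sums)
    # expected frequencies from the marginal sums of the table
    expected = [[r * c / total for c in col_sums] for r in row_sums]
    # raw residuals: observed minus expected
    raw = [[o - e for o, e in zip(o_row, e_row)]
           for o_row, e_row in zip(observed, expected)]
    # Pearson (standardized) residuals
    pearson = [[d / math.sqrt(e) for d, e in zip(d_row, e_row)]
               for d_row, e_row in zip(raw, expected)]
    # adjusted residuals additionally account for the row/column proportions
    adjusted = [[d / math.sqrt(e * (1 - r / total) * (1 - c / total))
                 for d, e, c in zip(d_row, e_row, col_sums)]
                for d_row, e_row, r in zip(raw, expected, row_sums)]
    # contribution of each cell to the chi-square statistic
    contribs = [[p ** 2 for p in p_row] for p_row in pearson]
    return {'expected': expected, 'raw': raw, 'pearson': pearson,
            'adjusted': adjusted, 'contribs': contribs}
```

Summing the contribs cells gives the uncorrected chi-square statistic, which makes it easy to see which cells drive the result.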

write(depth=0, writer=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>)[source]

Write this instance as either a Writable or as a Dictable. If class attribute _DICTABLE_WRITABLE_DESCENDANTS is set as True, then use the write() method on children instead of writing the generated dictionary. Otherwise, write this instance by first creating a dict recursively using asdict(), then formatting the output.

If the attribute _DICTABLE_WRITE_EXCLUDES is set, those attributes are removed from what is written in the write() method.

Note that this attribute will need to be set in all descendants in the instance hierarchy since writing the object instance graph is done recursively.

Parameters:
  • depth (int) – the starting indentation depth

  • writer (TextIOBase) – the writer to dump the content of this writable

write_associated(depth=0, writer=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>)[source]

Write how the variables relate as a result of the chi-square computation.

See:

write()

class zensols.deeplearn.result.hypsig.ChiSquareSignificanceTest(data)[source]

Bases: SignificanceTest

A chi-square test using the 2x2 contingency table as input.

class zensols.deeplearn.result.hypsig.Evaluation(pvalue, alpha, statistic=None)[source]

Bases: DataFrameDictable

An evaluation metric returned by an implementation of SignificanceTest.

__init__(pvalue, alpha, statistic=None)
alpha: float

Independency threshold for asserting the null hypothesis.

property disprove_null_hyp: bool

Whether the evaluation shows the test disproves the null hypothesis.

pvalue: float

The probability value (p-value).

statistic: float = None

A method specific statistic.
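The relationship between pvalue, alpha and disprove_null_hyp can be sketched with a small dataclass. SketchEvaluation is a hypothetical stand-in, and the strict less-than comparison is an assumption about how the threshold is applied.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchEvaluation:
    """Hypothetical stand-in mirroring the fields of Evaluation."""
    pvalue: float
    alpha: float
    statistic: Optional[float] = None

    @property
    def disprove_null_hyp(self) -> bool:
        # assumption: reject the null hypothesis when the p-value is
        # strictly below the alpha threshold
        return self.pvalue < self.alpha
```

For example, SketchEvaluation(pvalue=0.01, alpha=0.05) rejects the null hypothesis, while a p-value of 0.2 at the same alpha does not.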

class zensols.deeplearn.result.hypsig.McNemarSignificanceTest(data)[source]

Bases: SignificanceTest

McNemar’s test.

Citation:

Quinn McNemar (1947) Note on the sampling error of the difference between correlated proportions or percentages. Psychometrika, 12(2):153–157, June.
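For intuition, McNemar's test only looks at the discordant pairs, that is, the data points that exactly one of the two classifiers got right. The sketch below uses the chi-square approximation with a continuity correction; the function mcnemar is hypothetical, and the library may well use a different variant (for example an exact binomial test).

```python
import math
from typing import Tuple


def mcnemar(b: int, c: int, correction: bool = True) -> Tuple[float, float]:
    """Return (statistic, p-value) for McNemar's test.

    b: count where classifier A is correct and classifier B is wrong
    c: count where classifier A is wrong and classifier B is correct
    """
    n = b + c
    if n == 0:
        # no discordant pairs: nothing distinguishes the classifiers
        return 0.0, 1.0
    diff = abs(b - c)
    if correction:
        diff = max(diff - 1, 0)
    statistic = diff ** 2 / n
    # survival function of a chi-square distribution with 1 degree of freedom
    pvalue = math.erfc(math.sqrt(statistic / 2))
    return statistic, pvalue
```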

exception zensols.deeplearn.result.hypsig.SignificanceError[source]

Bases: APIError

Raised for inconsistent or bad data while testing significance.

class zensols.deeplearn.result.hypsig.SignificanceTest(data)[source]

Bases: DataFrameDictable

A statistical significance hypothesis test for models using test set data results.

__init__(data)
data: SignificanceTestData

Contains the data to be used for the significance hypothesis testing.

property evaluation: Evaluation
property name: str

The name of the test.

write(depth=0, writer=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, include_contingency=True, include_conclusion=True)[source]

Write this instance as either a Writable or as a Dictable. If class attribute _DICTABLE_WRITABLE_DESCENDANTS is set as True, then use the write() method on children instead of writing the generated dictionary. Otherwise, write this instance by first creating a dict recursively using asdict(), then formatting the output.

If the attribute _DICTABLE_WRITE_EXCLUDES is set, those attributes are removed from what is written in the write() method.

Note that this attribute will need to be set in all descendants in the instance hierarchy since writing the object instance graph is done recursively.

Parameters:
  • depth (int) – the starting indentation depth

  • writer (TextIOBase) – the writer to dump the content of this writable

write_conclusion(depth=0, writer=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>)[source]

Write an intuitive explanation of the results.

Parameters:
  • depth (int) – the starting indentation depth

  • writer (TextIOBase) – the writer to dump the content of this writable

class zensols.deeplearn.result.hypsig.SignificanceTestData(a, b, id_col='id', gold_col='label', pred_col='pred', alpha=0.05, null_hypothesis='classifiers have a similar proportion of errors on the test set')[source]

Bases: DataFrameDictable

Metadata needed to create significance tests.

See:

SignificanceTest.

__init__(a, b, id_col='id', gold_col='label', pred_col='pred', alpha=0.05, null_hypothesis='classifiers have a similar proportion of errors on the test set')
a: DataFrame

Test set results from the first model.

alpha: float = 0.05

Used to compare with the p-value to disprove the null hypothesis.

b: DataFrame

Test set results from the second model.

property contingency_table: DataFrame

Return the contingency table using the correct columns from correct_table.

property correct_table: DataFrame

Return a dataframe of the correct values in columns a_correct and b_correct.

gold_col: str = 'label'

The column of the gold label/data.

id_col: str = 'id'

The dataset column that contains the unique identifier of the data point. If this is not None, an assertion on the IDs of a and b is performed.

null_hypothesis: str = 'classifiers have a similar proportion of errors on the test set'

A human readable string of the hypothesis.

pred_col: str = 'pred'

The column of the prediction.
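The way a correctness table leads to a 2x2 contingency table can be sketched without Pandas as follows. The function names echo the properties above but are hypothetical, plain lists stand in for the dataframe columns, and the cell layout (correct first, then incorrect) is an assumption.

```python
from typing import List, Sequence, Tuple


def correct_table(gold: Sequence, pred_a: Sequence,
                  pred_b: Sequence) -> List[Tuple[bool, bool]]:
    """Return (a_correct, b_correct) for each data point."""
    if not (len(gold) == len(pred_a) == len(pred_b)):
        raise ValueError('gold labels and predictions must align')
    return [(a == g, b == g) for g, a, b in zip(gold, pred_a, pred_b)]


def contingency_table(rows: Sequence[Tuple[bool, bool]]) -> List[List[int]]:
    """Count the rows in a 2x2 table.

    Rows index whether A was correct and columns whether B was correct,
    with index 0 meaning correct and index 1 meaning incorrect.
    """
    table = [[0, 0], [0, 0]]
    for a_ok, b_ok in rows:
        table[0 if a_ok else 1][0 if b_ok else 1] += 1
    return table
```

The off-diagonal cells of the result are the discordant counts that tests such as McNemar's operate on.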

class zensols.deeplearn.result.hypsig.SignificanceTestSuite(data, test_names=None)[source]

Bases: DataFrameDictable

A suite of significance tests that use one or more SignificanceTest.

__init__(data, test_names=None)
property available_test_names: Set[str]

All available names of tests (see test_names).

data: SignificanceTestData

Contains the data to be used for the significance hypothesis testing.

property describer: DataFrameDescriber

A dataframe describer of all significance evaluations.

test_names: Tuple[str, ...] = None

The test names (SignificanceTest.name) to be in this suite.

property tests: Tuple[SignificanceTest, ...]

The tests used in this suite.

write(depth=0, writer=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>)[source]

Write this instance as either a Writable or as a Dictable. If class attribute _DICTABLE_WRITABLE_DESCENDANTS is set as True, then use the write() method on children instead of writing the generated dictionary. Otherwise, write this instance by first creating a dict recursively using asdict(), then formatting the output.

If the attribute _DICTABLE_WRITE_EXCLUDES is set, those attributes are removed from what is written in the write() method.

Note that this attribute will need to be set in all descendants in the instance hierarchy since writing the object instance graph is done recursively.

Parameters:
  • depth (int) – the starting indentation depth

  • writer (TextIOBase) – the writer to dump the content of this writable

class zensols.deeplearn.result.hypsig.StudentTTestSignificanceTest(data)[source]

Bases: SignificanceTest

Student’s T-Test, which measures the difference in the mean. This test violates the independence assumption, but it is included as it is still used in papers as a metric.

Citation:

Student (1908) The Probable Error of a Mean. Biometrika, 6(1):1–25.

class zensols.deeplearn.result.hypsig.WilcoxSignificanceTest(data)[source]

Bases: SignificanceTest

Wilcoxon signed-rank test, which is a non-parametric version of Student’s T-Test.

Citation:

Frank Wilcoxon (1945) Individual Comparisons by Ranking Methods. Biometrics Bulletin, 1(6):80–83.

zensols.deeplearn.result.manager module

A class that persists results in various formats.

class zensols.deeplearn.result.manager.ArchivedResult(id, name, txt_path, result_path, model_path, png_path, json_path)[source]

Bases: Dictable

An archived result that provides access to the outcomes of the training, validation and optionally test phases of a model execution.

See:

ModelResultManager

__init__(id, name, txt_path, result_path, model_path, png_path, json_path)
clear()[source]
Return type:

List[Path]

get_paths(excludes=frozenset({}))[source]

Get all paths in the result as an iterable.

Parameters:

excludes (Set[str]) – the extensions to exclude from the returned paths

Return type:

Iterable[Path]

id: int

The result's incremented identifier.

json_path: Path

The path to the results as a parsable JSON file.

model_path: Path

The path to the directory with the PyTorch model and state files.

property model_result: ModelResult

The results container of the run.

name: str

The result’s unique name, which includes id.

png_path: Path

The path to the training/validation loss results.

result_path: Path

The path to the pickled results file.

txt_path: Path

The path to the results as a text file.

class zensols.deeplearn.result.manager.ModelResultManager(path, pattern='{name}.dat', name=None, model_path=True, save_text=True, save_plot=True, save_json=True, file_pattern='{prefix}-{key}.{ext}', file_regex=re.compile('^(.+)-(.+?)\\.([^.]+)$'))[source]

Bases: IncrementKeyDirectoryStash

Saves and loads results from runs (ModelResult) of the ModelExecutor. Keys are incrementing integers, one for each save, which usually corresponds to the run of the model executor.

The stash’s path points to where results are persisted with all file format versions.

__init__(path, pattern='{name}.dat', name=None, model_path=True, save_text=True, save_plot=True, save_json=True, file_pattern='{prefix}-{key}.{ext}', file_regex=re.compile('^(.+)-(.+?)\\.([^.]+)$'))
create_results_stash(prefix=None)[source]

Return a stash that provides access to previous results (not just the last results). The stash iterates over the model results directory with ArchivedResult values.

Parameters:

prefix (str) – the prefix to use when creating the basename’s stem file portion; if None use a file name version of name

Return type:

Stash

dump(result)[source]

If only one argument is given, it is used as the data and the key name is derived from get_last_key.

file_pattern: str = '{prefix}-{key}.{ext}'

The pattern used to store the model and results files.

file_regex: Pattern = re.compile('^(.+)-(.+?)\\.([^.]+)$')

A regular expression analogue to file_pattern.
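The default file_pattern and file_regex values are inverses of each other, which the following sketch demonstrates with the standard library only. The helper names to_name and parse_name are hypothetical, as is the example prefix; only the two pattern strings come from the defaults shown above.

```python
import re
from typing import Tuple

FILE_PATTERN = '{prefix}-{key}.{ext}'
FILE_REGEX = re.compile(r'^(.+)-(.+?)\.([^.]+)$')


def to_name(prefix: str, key: str, ext: str) -> str:
    """Format a file name from its prefix, key and extension."""
    return FILE_PATTERN.format(prefix=prefix, key=key, ext=ext)


def parse_name(name: str) -> Tuple[str, str, str]:
    """Split a file name back in to (prefix, key, extension)."""
    match = FILE_REGEX.match(name)
    if match is None:
        raise ValueError(f'not a result file name: {name}')
    return match.groups()
```

Because the first group is greedy, a prefix that itself contains hyphens is still parsed correctly: the key is taken from the text after the last hyphen.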

get_grapher(figsize=(15, 5), title=None)[source]

Return an instance of a model grapher. This class can plot results using matplotlib.

See:

ModelResultGrapher

Return type:

ModelResultGrapher

get_last_id()[source]

Get the last result ID.

Return type:

str

get_next_graph_path()[source]

Return a path to the available graph file to be written.

Return type:

Path

get_next_json_path()[source]

Return a path to the available JSON file to be written.

Return type:

Path

get_next_model_path()[source]

Return a path to the available model file to be written.

Return type:

Path

get_next_text_path()[source]

Return a path to the available text file to be written.

Return type:

Path

model_path: Path = True

The path to where the results are stored.

name: str = None

The name of the manager in the configuration.

parse_file_name(res_id, raise_ex=True)[source]
Return type:

Tuple[str, str, str]

property results_stash: Stash

The canonical results stash for the application configured prefix.

See:

create_results_stash()

save_json: bool = True

If True save the results as a JSON file.

save_json_result(result)[source]

Save the results of the model in JSON format.

save_plot: bool = True

If True save the plot to the file system.

save_plot_result(result)[source]

Plot and save results of the validation and training loss.

save_text: bool = True

If True save the results as a text file.

save_text_result(result)[source]

Save the text results of the model.

static to_file_name(name)[source]

Return a file name string from human readable name.

Return type:

str

zensols.deeplearn.result.plot module

Provides a class to graph the results.

class zensols.deeplearn.result.plot.ModelResultGrapher(name=None, figsize=(15, 5), split_types=None, title=None, save_path=None)[source]

Bases: object

Graphs an instance of ModelResult. This creates subfigures, one for each of the results given as input to plot.

See:

plot

__init__(name=None, figsize=(15, 5), split_types=None, title=None, save_path=None)
figsize: Tuple[int, int] = (15, 5)

The size of the top-level figure (not the panes).

name: str = None

The name that goes in the title of the graph.

plot(containers)[source]

Create a plot for results containers.

save()[source]

Save the plot to disk.

save_path: Path = None

Where the plot is saved.

show()[source]

Render and display the plot.

split_types: List[DatasetSplitType] = None

The splits to graph (list of size 2); defaults to [DatasetSplitType.train, DatasetSplitType.validation].

title: str = None

The title format used to create each sub pane graph.

zensols.deeplearn.result.pred module

This creates Pandas dataframes containing predictions.

class zensols.deeplearn.result.pred.MultiLabelPredictionsDataFrameFactory(source, result, stash, column_names=None, data_point_transform=None, batch_limit=9223372036854775807, epoch_result=None, label_vectorizer_name=None, metric_metadata=None)[source]

Bases: PredictionsDataFrameFactory

Like the super class, but creates multilabel predictions on sentences and documents.

__init__(source, result, stash, column_names=None, data_point_transform=None, batch_limit=9223372036854775807, epoch_result=None, label_vectorizer_name=None, metric_metadata=None)
class zensols.deeplearn.result.pred.PredictionsDataFrameFactory(source, result, stash, column_names=None, data_point_transform=None, batch_limit=9223372036854775807, epoch_result=None, label_vectorizer_name=None, metric_metadata=None)[source]

Bases: object

Create a pandas.DataFrame containing the labels and predictions from the model.ModelExecutor test data set output. The data frame contains the feature IDs, labels, and predictions mapped back to their original values from the feature data item.

Currently only classification models are supported.

CORRECT_COL: ClassVar[str] = 'correct'

The correct/incorrect indicator column in the dataframes generated by dataframe and metrics_dataframe.

ID_COL: ClassVar[str] = 'id'

The data point ID column in the dataframes generated by dataframe and metrics_dataframe.

LABEL_COL: ClassVar[str] = 'label'

The gold label column in the dataframes generated by dataframe and metrics_dataframe.

METRICS_DF_COLUMNS: ClassVar[Tuple[str, ...]] = ('label', 'wF1', 'wP', 'wR', 'mF1', 'mP', 'mR', 'MF1', 'MP', 'MR', 'acc', 'correct', 'count')

See:

metrics_dataframe

METRICS_DF_MACRO_COLUMNS: ClassVar[Tuple[str, ...]] = ('MF1', 'MP', 'MR')

Macro performance metrics columns.

METRICS_DF_MICRO_COLUMNS: ClassVar[Tuple[str, ...]] = ('mF1', 'mP', 'mR')

Micro performance metrics columns.

METRICS_DF_WEIGHTED_COLUMNS: ClassVar[Tuple[str, ...]] = ('wF1', 'wP', 'wR')

Weighted performance metrics columns.

METRIC_COLUMNS: ClassVar[Tuple[str, ...]] = ('wF1', 'wP', 'wR', 'mF1', 'mP', 'mR', 'MF1', 'MP', 'MR', 'acc')

Weighted, micro, macro and accuracy metrics columns.

METRIC_DESCRIPTIONS: ClassVar[Dict[str, str]] = {'MF1': 'macro F1', 'MF1t': 'macro F1 on the test set', 'MF1v': 'macro F1 on the validation set', 'MP': 'macro precision', 'MPt': 'macro precision on the test set', 'MPv': 'macro precision on the validation set', 'MR': 'macro recall', 'MRt': 'macro recall on the test set', 'MRv': 'macro recall on the validation set', 'acc': 'accuracy', 'acct': 'accuracy on the test set', 'accv': 'accuracy on the validation set', 'converged': 'last epoch with the lowest loss', 'correct': 'number of correct classifications', 'count': 'number of data points in the test set', 'features': 'features used in the model', 'label': 'model class', 'mF1': 'micro F1', 'mF1t': 'micro F1 on the test set', 'mF1v': 'micro F1 on the validation set', 'mP': 'micro precision', 'mPt': 'micro precision on the test set', 'mPv': 'micro precision on the validation set', 'mR': 'micro recall', 'mRt': 'micro recall on the test set', 'mRv': 'micro recall on the validation set', 'name': 'model or result set name', 'resid': 'result ID and file name prefix', 'start': 'when the test started', 'test_occurs': 'the number of data points used to test', 'train_duration': 'time it took to train the model in HH:MM:SS', 'train_occurs': 'the number of data points used to train', 'validation_occurs': 'the number of data points used to validate', 'wF1': 'weighted F1', 'wF1t': 'weighted F1 on the test set', 'wF1v': 'weighted F1 on the validation set', 'wP': 'weighted precision', 'wPt': 'weighted precision on the test set', 'wPv': 'weighted precision on the validation set', 'wR': 'weighted recall', 'wRt': 'weighted recall on the test set', 'wRv': 'weighted recall on the validation set'}

Dictionary of performance metrics column names to human readable descriptions.

PREDICTION_COL: ClassVar[str] = 'pred'

The prediction column in the dataframes generated by dataframe and metrics_dataframe.

TEST_METRIC_COLUMNS: ClassVar[Tuple[str, ...]] = ('wF1t', 'wPt', 'wRt', 'mF1t', 'mPt', 'mRt', 'MF1t', 'MPt', 'MRt', 'acct')

Test set performance metric columns.

VALIDATION_METRIC_COLUMNS: ClassVar[Tuple[str, ...]] = ('wF1v', 'wPv', 'wRv', 'mF1v', 'mPv', 'mRv', 'MF1v', 'MPv', 'MRv', 'accv')

Validation set performance metric columns.
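The column tuples listed above follow a regular naming scheme: a w, m or M prefix for weighted, micro or macro averaging, and a t or v suffix for the test or validation set. The sketch below rebuilds the listed values from the three averaging groups. It is an observation about the documented values only, not a statement of how the library constructs them, and the lowercase variable names are illustrative.

```python
# Tuples copied from the class attributes listed above.
WEIGHTED = ('wF1', 'wP', 'wR')
MICRO = ('mF1', 'mP', 'mR')
MACRO = ('MF1', 'MP', 'MR')

# Weighted, micro and macro columns followed by accuracy.
metric_columns = WEIGHTED + MICRO + MACRO + ('acc',)

# The metrics dataframe adds the label in front and two counts at the end.
metrics_df_columns = ('label',) + metric_columns + ('correct', 'count')

# Test and validation variants append a one-letter suffix to each column.
test_columns = tuple(f'{c}t' for c in metric_columns)
validation_columns = tuple(f'{c}v' for c in metric_columns)

print(metrics_df_columns)
print(test_columns)
print(validation_columns)
```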

__init__(source, result, stash, column_names=None, data_point_transform=None, batch_limit=9223372036854775807, epoch_result=None, label_vectorizer_name=None, metric_metadata=None)
batch_limit: int = 9223372036854775807

The max number of batches of results to output.

column_names: List[str] = None

The list of string column names, one for each element of the tuple returned from data_point_transform, to be added to the results for each label/prediction.

data_point_transform: Callable[[DataPoint], tuple] = None

A function that returns a tuple with one element for each entry in column_names, to be added to the results for each label/prediction; if None (the default), str is used (see the Iris Jupyter Notebook example).

property dataframe: DataFrame

The predictions and labels as a dataframe. The first columns are generated from data_point_transform, and the remaining columns are:

  • id: the ID of the feature (not batch) data item

  • label: the label given by the feature data item

  • pred: the prediction

  • correct: whether or not the prediction was correct
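How column_names, data_point_transform and the four fixed columns fit together can be sketched without the library. Everything below is a hypothetical stand-in: FlowerPoint, transform and to_rows are illustrative names, and plain dictionaries take the place of dataframe rows. Only the column layout (transform columns first, then id, label, pred and correct) comes from the description above.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Sequence, Tuple


@dataclass
class FlowerPoint:
    """Hypothetical stand-in for a data point; not a library class."""
    id: int
    sepal_length: float
    label: str


def transform(dp: FlowerPoint) -> Tuple[Any, ...]:
    """Return one value per entry in ``column_names``."""
    return (dp.sepal_length,)


column_names: List[str] = ['sepal_length']


def to_rows(points: Sequence[FlowerPoint], preds: Sequence[str],
            transform: Callable[[FlowerPoint], tuple],
            column_names: List[str]) -> List[Dict[str, Any]]:
    """Build one row per data point: transform columns, then fixed columns."""
    rows = []
    for dp, pred in zip(points, preds):
        row = dict(zip(column_names, transform(dp)))
        row.update({'id': dp.id, 'label': dp.label,
                    'pred': pred, 'correct': dp.label == pred})
        rows.append(row)
    return rows


points = [FlowerPoint(0, 5.1, 'setosa'), FlowerPoint(1, 6.3, 'virginica')]
rows = to_rows(points, ['setosa', 'versicolor'], transform, column_names)
for row in rows:
    print(row)
```

Each resulting row has the same key order as the columns described above, so a list of such rows could be passed to a dataframe constructor if a tabular form is wanted.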

property dataframe_describer: DataFrameDescriber

Same as dataframe, but return the data with metadata.

epoch_result: EpochResult = None

The epoch containing the results. If None is given, it is taken from the test results.

label_vectorizer_name: str = None

The name of the vectorizer that encodes the labels, which is used to reverse map from integers to their original string nominal values.

property majority_label_metrics_describer: DataFrameDescriber

Compute metrics of the majority label of the test dataset.
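One common reading of a majority label metric is a baseline that always predicts the most frequent gold label. The sketch below computes the accuracy of such a baseline with the standard library only. The function name and the label values are hypothetical, and whether the property computes its metrics in exactly this way is an assumption.

```python
from collections import Counter
from typing import List, Tuple


def majority_baseline(labels: List[str]) -> Tuple[str, float]:
    """Return the most frequent label and the accuracy of always predicting it."""
    majority, count = Counter(labels).most_common(1)[0]
    return majority, count / len(labels)


# Hypothetical gold labels of a small test set.
labels = ['setosa', 'virginica', 'setosa', 'versicolor', 'setosa']
majority, accuracy = majority_baseline(labels)
print(majority, accuracy)
```

A baseline of this kind gives a floor against which a trained model's scores can be compared.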

metric_metadata: Dict[str, str] = None

Additional metadata when creating instances of DataFrameDescriber in addition to METRIC_DESCRIPTIONS.

property metrics_dataframe_describer: DataFrameDescriber

Get a dataframe describer of metrics (see metrics_dataframe).

metrics_to_series(lab, mets)[source]

Create a single row (returned as a Pandas series) from classification metrics.

Return type:

Series

property name: str

The name of the results taken from ModelResult.

result: ModelResult

The model result containing the labels and predictions.

source: Path

The source file from where the results were unpickled.

stash: BatchStash

The batch stash used to generate the results from the ModelExecutor. This is used to get the vectorizer to reverse map the labels.

class zensols.deeplearn.result.pred.SequencePredictionsDataFrameFactory(source, result, stash, column_names=None, data_point_transform=None, batch_limit=9223372036854775807, epoch_result=None, label_vectorizer_name=None, metric_metadata=None)[source]

Bases: PredictionsDataFrameFactory

Like the super class, but creates predictions for sequence-based models.

See:

SequenceNetworkModule

__init__(source, result, stash, column_names=None, data_point_transform=None, batch_limit=9223372036854775807, epoch_result=None, label_vectorizer_name=None, metric_metadata=None)

zensols.deeplearn.result.report module

A utility class to summarize all results in a directory.

class zensols.deeplearn.result.report.ModelResultReporter(result_manager, include_validation=True)[source]

Bases: object

Summarize all results in a directory from the output of model execution from ModelExecutor.

The class iterates through the pickled binary output files from the run and summarizes them in a Pandas dataframe, which is handy for reporting in papers.

__init__(result_manager, include_validation=True)
property cross_validate_describer: DataDescriber

Create a data describer with the results of a cross-validation.

property dataframe: DataFrame

Return the summarized results (see class docs).

Returns:

the Pandas dataframe of the results

property dataframe_describer: DataFrameDescriber

Get a dataframe describer of metrics (see dataframe).

dump(path)[source]

Create the summarized results and write them to the file system.

Return type:

DataFrame

include_validation: bool = True

Whether or not to include validation performance metrics.

result_manager: ModelResultManager

Contains the results to report on, specifically the path to the directory where the results were persisted.

Module contents

This package provides container classes used for the model execution results.

See:

zensols.deeplearn.result.ModelResult