Change Log

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

Unreleased

1.12.0 - 2024-10-14

Changed

  • CLI formatting and file options for training, testing and reporting on models.

  • Layers expect a torch configuration to fix a model quantization issue.

  • Disable fp16 error when creating numpy sparse matrices.

1.11.2 - 2024-05-11

Added

  • A feature to update the configuration of a previously trained model distribution.

Changed

  • Move the batch size setting from the resource library batch.conf to default.conf.

1.11.1 - 2024-03-14

Changed

  • Fix missing weighted classification metrics in JSON.
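For readers unfamiliar with the term, "weighted" classification metrics usually mean per-label scores averaged by label support. The following is a minimal sketch of that idea using only the standard library; the function name `weighted_f1` and the JSON key are hypothetical and are not taken from this package.

```python
import json
from collections import Counter
from typing import List


def weighted_f1(gold: List[str], pred: List[str]) -> float:
    """Average the per-label F1 scores, weighting each by its gold support."""
    support = Counter(gold)
    total = 0.0
    for label, n in support.items():
        tp = sum(1 for g, p in zip(gold, pred) if g == p == label)
        fp = sum(1 for g, p in zip(gold, pred) if p == label and g != label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += f1 * n
    return total / len(gold)


gold = ['a', 'a', 'b', 'b']
pred = ['a', 'b', 'b', 'b']
# hypothetical JSON layout for reporting the metric
report = json.dumps({'weighted_f1': weighted_f1(gold, pred)})
```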

1.11.0 - 2024-03-07

Added

  • A model client (ModelUnpacker) that uses a bundled and downloaded model packaged with ModelPacker. This allows separate code bases to train the model and to run inference with it.

  • A feature to turn off saving results (ModelResults) with the model, since saving them can slow training and bloat models.

  • A feature to store the results as a string with the model when ModelResults is not saved with it. These results are accessible in packaged models.

Changed

  • Move TclError to fix Python bailing out when it is not compiled with Tcl/Tk (tkinter).

  • Fix the “N out of total” accuracy reporting metrics.

1.10.0 - 2024-02-27

Added

  • Model significance testing module zensols.deeplearn.result.hypsig.

  • Data sanity checks on predictions when creating prediction dataframes.

Changed

  • Move CLI application imports to method level for faster importing.

  • Fold zensols.deeplearn.cli.app into zensols.deeplearn.cli. Since app was imported into the cli namespace, this is backward compatible for apps that don’t fully qualify the module name.

  • More flexible configuration options for ModelFacade prediction dataframe factory.

  • Fix CLI batch action ignoring the limit argument.

1.9.0 - 2023-12-05

Changed

  • Upgrade to PyTorch 2.1.1, numpy 1.25, scipy 1.9, scikit-learn 1.3, pandas 2.1, and zensols.util 1.14.

  • Unit and integration test fixes.

Added

  • Support for Python 3.11.

  • Multiple CUDA configuration support.

Removed

  • Support for Python 3.9.

1.8.1 - 2023-08-16

Added

  • A feature to inject application (or any) configuration into a previously packed or persisted model via ModelManager.

  • SubsetConfig: subsets the application configuration.

  • A command-line option on the FacadeApplication’s info action that specifies the model path to allow a dump of the (packed) model’s persisted configuration.

1.8.0 - 2023-08-16

Downstream moderate risk update release.

Changed

  • Set model type in TorchConfig.to only if it is comparable and different.

  • Upgrade to zensols.util v1.13.0.

1.7.0 - 2023-06-09

Changed

  • Upgraded scipy from 1.8.1 to 1.9.3.

1.6.1 - 2023-06-07

Changed

  • Upgrade from torch 1.12 to torch 1.13.

  • TorchConfig allows specific torch device name initialization, which is useful for macOS mps devices.

1.6.0 - 2023-01-23

Added

  • SplitKeyContainer write method for concrete subclasses.

Changed

  • Updated zensols.util to 1.12.0.

  • Type hints and code clean up.

1.5.2 - 2022-11-06

Changed

  • Add batch size to output directory to allow changes without re-batching previous batch size configurations.

  • Bug fix to validation metrics reporting.

1.5.1 - 2022-10-01

Changed

  • Upgrade zensols.install.

1.5.0 - 2022-10-01

Removed

  • Remove mask creation in aggregate vectorizer. Use mask vectorizer instead.

Added

  • Added model packaging API.

  • Added Configurable that reads from a CSV file with Pandas for nominal model labels.

1.4.0 - 2022-08-06

Changed

  • Upgrade a significant portion of the dependent libraries.

1.3.0 - 2022-06-14

This is primarily a refactoring release to simplify the API and examples.

Added

  • Feature creation resource library.

  • Zensols installer and dependency for resource library additions.

Changed

  • Initialize PyTorch system from the zensols.deepnlp module.

  • Stash prime chain fixes.

  • Module logging uses calling logger when not provided.

  • Add stratified split app reporting from facade app when available.

  • Simplify facade application manager for Jupyter notebooks.

  • Memory leak fixes in CLI facade application.

  • Split off batch from info facade application.

  • More error information for missing dataset splits.

  • Fix batch feature mapping for prediction.

Removed

  • The zensols.deeplearn.plot module.

1.2.0 - 2022-05-15

Added

  • New dynamic batch mapping that no longer requires it to be hard coded in a Python source file.

Changed

  • Rename gradient norm to gradient scaling in ModelSettings.

  • Minor bug fixes to results API.

1.1.1 - 2022-05-04

Changed

  • Fix pinned requirements.

1.1.0 - 2022-05-04

Added

  • Add label prediction metric calculations.

  • Add split, label and key access as a Pandas dataframe.

  • Create support for sequence based predictions as a Pandas dataframe.

  • Add PyTorch CrossEntropyLoss ignore index padding.

Changed

  • Bug fixes and clean up warnings.

  • Recurrent CRF network uses new network settings factory method.

  • Remove HuggingFace schedule warnings.

  • Setting OneHotEncodedFeatureDocumentVectorizer.optimize_bools = True explicitly is now needed, as the default has been removed.

  • Move model name configuration from ModelExecutor to ModelSettings.

  • Guard on missing vectorizers.

1.0.0 - 2022-02-12

Stable release

Added

  • Added observer pattern for model facade, execution and trainer.

Changed

  • Resource library clean up.

  • Torch GPU memory reporting bug fix.

  • Terminal progress bar fixes.

0.1.8 - 2022-01-25

Changed

  • PyTorch upgrade API bug fix.

0.1.7 - 2022-01-25

Changed

  • Upgrade to torch 1.9.

  • Remove dependency on pycuda.

  • Suppress numpy warnings from results reporting module, PyTorch start up and MNIST test cases.

  • Better CUDA memory debugging/profiling.

0.1.6 - 2021-10-22

Added

  • Much more documentation.

Changed

  • Upgrade to new zensols.util import semantics.

Removed

  • Learning rate parameter from resource library. It is better for the client to be forced to set it to avoid ambiguous spurious settings.

  • DeepZensols NLP classification configuration that doesn’t belong in this package.

0.1.5 - 2021-09-21

Changed

  • Rename DataframeStash to SplitKeyDataframeStash.

  • Split out FacadePredictApplication from the FacadeModelApplication.

Added

  • Class AutoSplitDataframeStash to automatically split (add a column) a Pandas dataframe into training, validation and test datasets.

  • Configuration resource library.

  • Add a default distribution for StashSplitKeyContainer.

0.1.4 - 2021-09-07

Changed

  • Upgrade to zensols.util version 1.6.0.

  • Better model facade write behavior.

0.1.3 - 2021-08-07

Changed

  • Fix torch config GPU->CPU copy/dealloc; add outcomes/output in batch.

Added

  • CLI application to facade “glue” FacadeApplication.

  • Sequence classification: model base class and batch iterator.

  • Track direct output/outcomes/logits from model in results.

  • More GPU side memory de-allocation.

  • Support for float/quotient batch limits.
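One way to read the float/quotient batch limit entry above, as a minimal sketch: an integer limit is an absolute batch count, while a float in (0, 1] is a fraction of the available batches. The function name `resolve_batch_limit` and the rounding rule are assumptions for illustration, not this package's API.

```python
import math
from typing import Union


def resolve_batch_limit(limit: Union[int, float], total: int) -> int:
    """Return how many of ``total`` batches to use.

    Hypothetical helper: an ``int`` is an absolute count and a ``float``
    is a quotient of the total (an assumed convention).
    """
    if isinstance(limit, bool):
        raise TypeError('limit must be an int or float')
    if isinstance(limit, float):
        if not 0.0 < limit <= 1.0:
            raise ValueError(f'quotient limit must be in (0, 1]: {limit}')
        # round away floating point noise (0.1 * 30 is slightly above 3),
        # then round up so a small quotient still yields at least one batch
        return min(total, math.ceil(round(limit * total, 9)))
    if limit < 0:
        raise ValueError(f'count limit must be non-negative: {limit}')
    return min(total, limit)
```

Under these assumptions, `resolve_batch_limit(0.5, 50)` selects 25 batches and `resolve_batch_limit(100, 50)` is capped at 50.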

0.1.2 - 2021-04-29

Changed

  • Upgraded to torch 1.8 and sklearn 0.24.1.

  • Protect deallocation of non-copied (GPU to GPU) batches.

  • Warning clean up for numpy.

  • Better model naming and file/directory name output.

  • Clone tensor correctly in 1.7 per warning.

Added

  • Sparse support for size 3 tensors.

  • New zensols.util 1.5 CLI application interface to facade.

  • Summary spreadsheet reporting for results. This scans a directory for results and adds performance metrics into a CSV file for easy reporting across multiple models.
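The general shape of such a summary report can be sketched with the standard library. This is not the package's implementation: the one-JSON-file-per-model layout, the metric keys and the function name `summarize_results` are all assumptions made for the example.

```python
import csv
import json
import tempfile
from pathlib import Path
from typing import List


def summarize_results(results_dir: Path, out_csv: Path,
                      metrics: List[str]) -> int:
    """Write one CSV row per ``*.json`` result file and return the row count."""
    rows = []
    for path in sorted(results_dir.glob('*.json')):
        with open(path) as f:
            result = json.load(f)
        # a metric missing from a result file is left as an empty cell
        row = {'model': path.stem}
        row.update({m: result.get(m, '') for m in metrics})
        rows.append(row)
    with open(out_csv, 'w', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=['model'] + metrics)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)


# usage with two made-up result files
with tempfile.TemporaryDirectory() as tmp:
    results_dir = Path(tmp)
    (results_dir / 'a.json').write_text(json.dumps({'accuracy': 0.9, 'f1': 0.88}))
    (results_dir / 'b.json').write_text(json.dumps({'accuracy': 0.8}))
    out_csv = results_dir / 'summary.csv'
    n_rows = summarize_results(results_dir, out_csv, ['accuracy', 'f1'])
    summary_lines = out_csv.read_text().splitlines()
```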

0.1.1 - 2020-12-29

Maintenance release.

Changed

  • Updated dependencies and tested across Python 3.7, 3.8 and 3.9.

0.1.0 - 2020-12-10

First fully functional major feature release, which includes batch persistence/processing, layers, vectorizers, debugging and many other features/support. See documentation for more information.

0.0.6 - 2020-05-11

Added

  • Type conversion from torch to numpy and sparse types.

  • Implementation of sparse matrix encoding and decoding to compensate for a PyTorch bug with hanging child processes in the multiprocessing API.

  • Key split stash implementation for data set splits.

  • Vectorization framework.

  • Model training and testing framework.

  • Notebook example usage.

  • Unit tests, which now give very good coverage.

Removed

  • Travis build: too time consuming to make CUDA, Python, xenial, etc. all play together nicely.

0.0.5 - 2020-05-04

Added

  • Create sparse tensors.

0.0.4 - 2020-04-27

Changed

  • Travis tests.

  • Upgraded to Python 3.7.

0.0.3 - 2019-12-14

Data classes are now used, so Python 3.7 is now required.

Added

  • Arbitrarily deep linear layer using constant or decay parameter counts.

  • Conv2d interface to PyTorch.

Changed

  • Better handling of tensor to memory mapping and device handling in general.

0.0.2 - 2019-07-07

Added

  • Tests.

0.0.1 - 2019-07-07

Added

  • Initial version.