Zensols Utilities¶
Command line, configuration and persistence utilities that are generally useful for any application beyond the most basic. This general purpose library is small, has few dependencies, and is helpful across many applications.
See the full documentation.
Paper on arXiv.
Some features include:
Hydra or Java Spring like application level configuration support that goes beyond configparser.
Construct objects using configuration files (both INI and YAML).
Parse primitives, dictionaries, file system objects, instances of classes.
A command action library that maps an action mnemonic to the invocation of a handler and is integrated with the configuration API. This supports long and short GNU style options as provided by optparse.
Streamlined in-memory and on-disk persistence.
Multi-processing work with a persistence layer.
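To illustrate the idea of constructing objects from configuration, here is a minimal sketch that uses only the standard library. It is not this library's API. It assumes a section names its class with a class_name option (the same key used in the command line example in this document) and passes every other option to the constructor as an integer. The create helper, the half section and the choice of fractions.Fraction as the target class are hypothetical and exist only for illustration.

```python
import configparser
import importlib

CONFIG = """
[half]
class_name = fractions.Fraction
numerator = 1
denominator = 2
"""


def create(config: configparser.ConfigParser, section: str):
    """Instantiate the class named by ``class_name`` in ``section``.

    Hypothetical helper: all remaining options are converted to integers
    and passed as keyword arguments to the constructor.
    """
    params = dict(config[section])
    # split "package.module.Class" into its module and class names
    module_name, _, class_name = params.pop('class_name').rpartition('.')
    cls = getattr(importlib.import_module(module_name), class_name)
    return cls(**{name: int(value) for name, value in params.items()})


config = configparser.ConfigParser()
config.read_string(CONFIG)
half = create(config, 'half')
print(half)
```

The point of the pattern is that the class to build and its arguments live in the configuration file, so changing the object graph does not require changing code.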
A secondary goal of the API is to make prototyping Python code quick and easy using the REPL. Examples include reloading modules in the configuration factory.
Documentation¶
Configuration: powerful but simple configuration system much like Hydra or Java Spring
Command line: automagically creates a fully functional command with help from a Python dataclass
Persistence: cache intermediate data (structures) to the file system
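As a rough picture of what caching intermediate data to the file system means, the following standard-library sketch pickles a computed value on first use and reads it back afterward. It is an assumption-laden illustration rather than this library's persistence API: the cached and expensive functions and the squares.dat file name are hypothetical.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Any, Callable


def cached(path: Path, compute: Callable[[], Any]) -> Any:
    """Return the pickled value at ``path``, computing and saving it if missing."""
    if path.exists():
        with path.open('rb') as f:
            return pickle.load(f)
    value = compute()
    with path.open('wb') as f:
        pickle.dump(value, f)
    return value


calls = []


def expensive() -> dict:
    calls.append(1)  # record each real computation
    return {'squares': [i * i for i in range(5)]}


with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / 'squares.dat'
    first = cached(cache_file, expensive)   # computes and writes the file
    second = cached(cache_file, expensive)  # reads the file, no recomputation
```

The second call returns the same data without running the computation again, which is the behavior a persistence layer provides for slow intermediate steps.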
Obtaining¶
The easiest way to install the command line program is via the pip installer:
pip3 install zensols.util
Command Line Usage¶
This library contains a full persistence layer and other utilities. However, a quick and dirty example that uses the configuration and command line functionality is given below. See the other examples to learn how else to use it.
from dataclasses import dataclass
from enum import Enum, auto
import os
from io import StringIO
from zensols.cli import CliHarness

CONFIG = """
# configure the command line
[cli]
apps = list: app

# define the application, whose code is given below
[app]
class_name = fsinfo.Application
"""


class Format(Enum):
    short = auto()
    long = auto()


@dataclass
class Application(object):
    """Toy application example that provides file system information.

    """
    def ls(self, format: Format = Format.short):
        """List the contents of the directory.

        :param format: the output format

        """
        cmd = ['ls']
        if format == Format.long:
            cmd.append('-l')
        os.system(' '.join(cmd))


if (__name__ == '__main__'):
    harness = CliHarness(app_config_resource=StringIO(CONFIG))
    harness.run()
The framework automatically links each command line action mnemonic (i.e. ls) to the data class Application method ls and command line help. For example:
$ python ./fsinfo.py -h
Usage: fsinfo.py [options]:

List the contents of the directory.

Options:
  -h, --help                        show this help message and exit
  --version                         show the program version and exit
  -f, --format <long|short>  short  the output format

$ python ./fsinfo.py -f short
__pycache__  fsinfo.py
See the full example that demonstrates more complex command line handling, documentation and explanation.
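The link between an action mnemonic, a method and its command line options can be pictured with the standard-library sketch below. It is only an illustration of the general technique (introspecting a method signature to build a parser) and makes no claim about how this framework does it. It uses argparse rather than optparse, returns the command string instead of running it, and handles only Enum-typed keyword parameters. The build_parser and invoke helpers are hypothetical.

```python
import argparse
import inspect
from enum import Enum, auto


class Format(Enum):
    short = auto()
    long = auto()


class Application:
    def ls(self, format: Format = Format.short) -> str:
        """List the contents of the directory."""
        return 'ls -l' if format == Format.long else 'ls'


def build_parser(method) -> argparse.ArgumentParser:
    """Derive one option per Enum-typed keyword parameter of ``method``."""
    parser = argparse.ArgumentParser(description=inspect.getdoc(method))
    for name, param in inspect.signature(method).parameters.items():
        parser.add_argument(
            f'-{name[0]}', f'--{name}',
            choices=[member.name for member in param.annotation],
            default=param.default.name)
    return parser


def invoke(app, mnemonic: str, argv: list):
    """Find the method named by the mnemonic and call it with parsed options."""
    method = getattr(app, mnemonic)
    params = inspect.signature(method).parameters
    args = vars(build_parser(method).parse_args(argv))
    # convert each option string back to its Enum member
    kwargs = {k: params[k].annotation[v] for k, v in args.items()}
    return method(**kwargs)


result_default = invoke(Application(), 'ls', [])
result_long = invoke(Application(), 'ls', ['-f', 'long'])
```

Because the parser is derived from the signature and docstring, adding a parameter to the method is enough to add the matching option and its help text.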
Template¶
The easiest way to get started is to create your own boilerplate project with the mkproj utility. This requires a Java installation. With it, a Python boilerplate project can be created with the following commands:
# clone the boilerplate repo
git clone https://github.com/plandes/template
# download the boilerplate tool
wget https://github.com/plandes/clj-mkproj/releases/download/v0.0.7/mkproj.jar
# create a python template and build it out
java -jar mkproj.jar config -s template/python
java -jar mkproj.jar
This creates a project customized with your organization’s name, author, and other details about the project. It also creates a sample configuration file and command line that is ready to be invoked from either a Python REPL or the command line via GNU make.
If you don’t want to bother installing this program, the following sections have generated code as examples from which you can copy/paste.
Citation¶
If you use this project in your research please use the following BibTeX entry:
@inproceedings{landes-etal-2023-deepzensols,
title = "{D}eep{Z}ensols: A Deep Learning Natural Language Processing Framework for Experimentation and Reproducibility",
author = "Landes, Paul and
Di Eugenio, Barbara and
Caragea, Cornelia",
editor = "Tan, Liling and
Milajevs, Dmitrijs and
Chauhan, Geeticka and
Gwinnup, Jeremy and
Rippeth, Elijah",
booktitle = "Proceedings of the 3rd Workshop for Natural Language Processing Open Source Software (NLP-OSS 2023)",
month = dec,
year = "2023",
address = "Singapore, Singapore",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2023.nlposs-1.16",
pages = "141--146"
}
Changelog¶
An extensive changelog is available here.
License¶
Copyright (c) 2020 - 2023 Paul Landes