zensols.deepnlp package¶

Subpackages¶

Submodules¶

zensols.deepnlp.cli module¶

Facade application implementations for NLP use.

class zensols.deepnlp.cli.NLPClassifyFacadeModelApplication(config, facade_name='facade', model_path=None, config_factory_args=<factory>, config_overwrites=None, cache_global_facade=True, model_config_overwrites=None)[source]¶

Bases: NLPFacadeModelApplication

A facade application for predicting text (for example sentiment classification tasks).

predict_text(text, verbose=False)[source]¶

Classify text and output the results.

Parameters:

text (str) – the sentence to classify, or a dash (-) to read it from standard input
verbose (bool) – if given, print the long format version of the document
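
The dash convention for the text parameter can be illustrated with a minimal sketch. The helper name resolve_text is hypothetical and is not part of zensols.deepnlp; it only shows one plausible way a caller might treat "-" as "read from standard input", with an in-memory stream standing in for the real one.

```python
import io
import sys
from typing import TextIO


def resolve_text(text: str, stdin: TextIO = sys.stdin) -> str:
    """Return ``text``, or the contents of ``stdin`` when ``text`` is a dash.

    Hypothetical helper for illustration only; not a zensols API.
    """
    if text == '-':
        # A lone dash means "take the sentence from standard input".
        return stdin.read().strip()
    return text


# A literal sentence is passed through unchanged.
literal = resolve_text('I loved this film')
# A dash reads from the given stream (a StringIO stands in for stdin here).
piped = resolve_text('-', io.StringIO('Terrible service.\n'))
print(literal)
print(piped)
```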

class zensols.deepnlp.cli.NLPClassifyPackedModelApplication(unpacker)[source]¶

Bases: object

Classifies data using a packed model. The unpacker is used to install the model (if not already installed), then provide access to it. A ModelFacade is created from the packaged model that is downloaded. The model then uses the facade’s zensols.deeplearn.model.facade.ModelFacade.predict() method to output the predictions.

CLI_META = {'mnemonic_excludes': {'predict'}, 'mnemonic_overrides': {'write_model_info': 'modelstat', 'write_predictions': 'predict'}, 'option_excludes': {'unpacker'}, 'option_overrides': {'text_or_file': {'long_name': 'input', 'metavar': '<TEXT|FILE>'}, 'verbose': {'short_name': None}}}¶
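
As a rough illustration of how the mnemonic keys in this CLI_META value might be read, the sketch below derives a mapping from command line action names to method names. The function exposed_actions is hypothetical, and the interpretation (excluded methods are dropped, overridden methods are renamed) is an assumption about the command line API rather than its actual implementation.

```python
from typing import Any, Dict, Iterable

# Copied from the CLI_META attribute documented above.
CLI_META: Dict[str, Any] = {
    'mnemonic_excludes': {'predict'},
    'mnemonic_overrides': {'write_model_info': 'modelstat',
                           'write_predictions': 'predict'},
    'option_excludes': {'unpacker'},
    'option_overrides': {
        'text_or_file': {'long_name': 'input', 'metavar': '<TEXT|FILE>'},
        'verbose': {'short_name': None}},
}


def exposed_actions(method_names: Iterable[str],
                    meta: Dict[str, Any]) -> Dict[str, str]:
    """Map action mnemonics to method names (assumed semantics)."""
    excludes = meta.get('mnemonic_excludes', set())
    overrides = meta.get('mnemonic_overrides', {})
    actions: Dict[str, str] = {}
    for name in method_names:
        if name in excludes:
            # Excluded methods are not offered as command line actions.
            continue
        # Overridden methods are exposed under their mnemonic instead.
        actions[overrides.get(name, name)] = name
    return actions


methods = ['predict', 'write_model_info', 'write_predictions']
actions = exposed_actions(methods, CLI_META)
print(actions)
```

Under this reading, the predict() method itself is hidden from the command line, and the "predict" action name is reused for write_predictions().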

__init__(unpacker)¶

property facade: ModelFacade¶: The packaged model’s facade.

predict(sents)[source]¶

Predict sentiment for each sentence in sents.

Return type: Tuple[Any]

unpacker: ModelUnpacker¶: The model source.

write_model_info()[source]¶: Write the model information and metrics.

write_predictions(text_or_file, verbose=False)[source]¶

Predict sentiment of sentence(s).

Parameters:

text_or_file (str) – newline delimited file of sentences or a sentence
verbose (bool) – write verbose prediction output
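
The text_or_file parameter accepts either a sentence or a newline delimited file of sentences. A minimal sketch of that kind of dispatch follows; the helper read_sentences is hypothetical, and the rule used here (an existing file path is read line by line, anything else is treated as a single sentence) is an assumption, not the library's documented behavior.

```python
import tempfile
from pathlib import Path
from typing import List


def read_sentences(text_or_file: str) -> List[str]:
    """Return sentences from a file if the path exists, else the text itself.

    Hypothetical helper for illustration only; not a zensols API.
    """
    try:
        is_file = Path(text_or_file).is_file()
    except OSError:
        # Very long sentences can be invalid as file names on some systems.
        is_file = False
    if is_file:
        with open(text_or_file) as f:
            # One sentence per line, skipping blank lines.
            return [line.strip() for line in f if line.strip()]
    return [text_or_file]


# Demonstrate both input forms using a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / 'sents.txt'
    sample.write_text('Great movie.\n\nAwful plot.\n')
    from_file = read_sentences(str(sample))
from_text = read_sentences('Great movie.')
print(from_file)
print(from_text)
```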

class zensols.deepnlp.cli.NLPFacadeBatchApplication(config, facade_name='facade', model_path=None, config_factory_args=<factory>, config_overwrites=None, cache_global_facade=True, model_config_overwrites=None)[source]¶

Bases: FacadeApplication

A facade application for creating mini-batches for training.

CLI_META = {'mnemonic_excludes': {'clear_cached_facade', 'create_facade', 'deallocate', 'get_cached_facade'}, 'mnemonic_overrides': {'dump_batches': 'dumpbatch'}, 'option_overrides': {'model_path': {'long_name': 'model', 'short_name': None}, 'out_format': {'long_name': 'format', 'short_name': 'f'}}}¶: Tell the command line app API to ignore subclass and client specific use case methods.

__init__(config, facade_name='facade', model_path=None, config_factory_args=<factory>, config_overwrites=None, cache_global_facade=True, model_config_overwrites=None)¶

dump_batches()[source]¶: Dump the batch dataset with IDs, splits, labels and text.

class zensols.deepnlp.cli.NLPFacadeModelApplication(config, facade_name='facade', model_path=None, config_factory_args=<factory>, config_overwrites=None, cache_global_facade=True, model_config_overwrites=None)[source]¶

Bases: FacadeApplication

A base class facade application for predicting tokens or text.

CLI_META = {'mnemonic_excludes': {'clear_cached_facade', 'create_facade', 'deallocate', 'get_cached_facade'}, 'mnemonic_overrides': {'predict_text': 'predict'}, 'option_overrides': {'model_path': {'long_name': 'model', 'short_name': None}, 'out_format': {'long_name': 'format', 'short_name': 'f'}, 'verbose': {'long_name': 'verbose', 'short_name': None}}}¶: Tell the command line app API to ignore subclass and client specific use case methods.

__init__(config, facade_name='facade', model_path=None, config_factory_args=<factory>, config_overwrites=None, cache_global_facade=True, model_config_overwrites=None)¶

class zensols.deepnlp.cli.NLPSequenceClassifyFacadeModelApplication(config, facade_name='facade', model_path=None, config_factory_args=<factory>, config_overwrites=None, cache_global_facade=True, model_config_overwrites=None)[source]¶

Bases: NLPFacadeModelApplication

A facade application for predicting tokens (for example NER tasks).

__init__(config, facade_name='facade', model_path=None, config_factory_args=<factory>, config_overwrites=None, cache_global_facade=True, model_config_overwrites=None)¶

model_path: Path = None¶: The path to the model or use the last trained model if not provided.

predict_text(text, verbose=False)[source]¶

Classify text and output the results.

Parameters:

text (str) – the sentence to classify, or a dash (-) to read it from standard input
verbose (bool) – if given, print the long format version of the document

zensols.deepnlp.feature module¶

Stashes that parse feature documents.

class zensols.deepnlp.feature.DataframeDocumentFeatureStash(delegate, config, name, chunk_size, workers, factory, vec_manager, document_limit=9223372036854775807, text_column='text', additional_columns=None)[source]¶

Bases: DocumentFeatureStash

Creates FeatureDocument instances from the pandas.Series rows of the pandas.DataFrame stash values.

__init__(delegate, config, name, chunk_size, workers, factory, vec_manager, document_limit=9223372036854775807, text_column='text', additional_columns=None)¶

additional_columns: Tuple[str] = None¶: A tuple of column names to add as positional arguments to the instance.

text_column: str = 'text'¶: The column name for the text to be parsed by the document parser.
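
To make the roles of text_column and additional_columns concrete, here is a simplified sketch that uses a plain dict in place of a pandas.Series row and a small dataclass in place of FeatureDocument. Both Doc and row_to_doc are hypothetical stand-ins; the real stash parses the text with a document parser, which this sketch does not attempt.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional, Tuple


@dataclass
class Doc:
    """Hypothetical stand-in for a parsed document."""
    text: str
    extras: Tuple[Any, ...]


def row_to_doc(row: Dict[str, Any], text_column: str = 'text',
               additional_columns: Optional[Tuple[str, ...]] = None) -> Doc:
    """Build a Doc from one row (assumed behavior, for illustration)."""
    # Values of the additional columns are collected in the given order,
    # mirroring the idea of passing them on as positional arguments.
    extras = tuple(row[col] for col in (additional_columns or ()))
    return Doc(row[text_column], extras)


row = {'text': 'The plot was thin.', 'label': 'neg', 'source': 'imdb'}
doc = row_to_doc(row, additional_columns=('label', 'source'))
plain = row_to_doc(row)
print(doc)
print(plain)
```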

class zensols.deepnlp.feature.DocumentFeatureStash(delegate, config, name, chunk_size, workers, factory, vec_manager, document_limit=9223372036854775807)[source]¶

Bases: MultiProcessStash

This class parses natural language text into FeatureDocument instances in multiple subprocesses.

abstract _parse_document(id, factory_data)[source]¶

Return type: FeatureDocument

ATTR_EXP_META = ('document_limit',)¶

__init__(delegate, config, name, chunk_size, workers, factory, vec_manager, document_limit=9223372036854775807)¶

document_limit: int = 9223372036854775807¶: The maximum number of documents to process.
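
The default document_limit is 2**63 - 1, which is the value of sys.maxsize on 64-bit CPython, so in practice it means "no limit". The sketch below shows one way such a limit could be applied to a stream of documents; limit_documents is a hypothetical helper and does not reflect how the stash applies the limit internally.

```python
import sys
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar('T')

# The documented default, which equals 2**63 - 1.
DEFAULT_DOCUMENT_LIMIT = 9223372036854775807


def limit_documents(items: Iterable[T],
                    document_limit: int = DEFAULT_DOCUMENT_LIMIT
                    ) -> Iterator[T]:
    """Yield at most ``document_limit`` items (illustrative helper)."""
    # islice only accepts stops up to sys.maxsize, so clamp the limit.
    return islice(items, min(document_limit, sys.maxsize))


docs = ['doc-a', 'doc-b', 'doc-c']
first_two = list(limit_documents(docs, 2))
everything = list(limit_documents(docs))
print(first_two)
print(everything)
```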

factory: Stash¶: The stash that creates the factory_data given to _parse_document().

prime()[source]¶: If the delegate stash data does not exist, use this implementation to generate the data and process it in child processes.

vec_manager: FeatureDocumentVectorizerManager¶: Used to parse text into FeatureDocument instances.

zensols.deepnlp.score module¶

Module contents¶

Deep learning for NLP applications.

zensols.deepnlp.init(*args, **kwargs)[source]¶: Initialize the deep NLP system and PyTorch. This calls the initialization of the PyTorch system by passing kwargs to init().