zensols.deepnlp package

Subpackages

Submodules

zensols.deepnlp.cli module

Facade application implementations for NLP use.

class zensols.deepnlp.cli.NLPClassifyFacadeModelApplication(config, facade_name='facade', model_path=None, config_factory_args=<factory>, config_overwrites=None, cache_global_facade=True, model_config_overwrites=None)[source]

Bases: NLPFacadeModelApplication

A facade application for predicting text (for example sentiment classification tasks).

predict_text(text, verbose=False)[source]

Classify text and output the results.

Parameters:
  • text (str) – the sentence to classify, or a dash (-) to read from standard input

  • verbose (bool) – if given, print the long format version of the document
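The dash convention for the text parameter can be illustrated with a minimal sketch. The helper name resolve_text is hypothetical and is not part of zensols.deepnlp; it only shows one plausible way a caller might turn the argument into the text to classify, assuming a lone dash means "read everything from standard input".

```python
import io
import sys
from typing import TextIO


def resolve_text(text: str, stdin: TextIO = sys.stdin) -> str:
    """Return ``text`` as is, or the contents of ``stdin`` when it is a dash.

    Hypothetical helper for illustration only; not a zensols.deepnlp API.
    """
    if text == '-':
        # a lone dash is assumed to mean "read the text from standard input"
        return stdin.read().strip()
    return text


# a literal sentence is passed through unchanged
print(resolve_text('I loved this film'))
# a dash reads from the given stream (a StringIO stands in for a pipe here)
print(resolve_text('-', io.StringIO('Read from a pipe\n')))
```

Passing the stream as a parameter keeps the helper easy to exercise without a real pipe.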

class zensols.deepnlp.cli.NLPClassifyPackedModelApplication(unpacker)[source]

Bases: object

Classifies data using a packed model. The unpacker installs the model (if it is not already installed) and then provides access to it. A ModelFacade is created from the downloaded packaged model. The model then uses the facade’s zensols.deeplearn.model.facade.ModelFacade.predict() method to output the predictions.

CLI_META = {'mnemonic_excludes': {'predict'}, 'mnemonic_overrides': {'write_model_info': 'modelstat', 'write_predictions': 'predict'}, 'option_excludes': {'unpacker'}, 'option_overrides': {'text_or_file': {'long_name': 'input', 'metavar': '<TEXT|FILE>'}, 'verbose': {'short_name': None}}}
__init__(unpacker)
property facade: ModelFacade

The packaged model’s facade.

predict(sents)[source]

Predict sentiment for each sentence in sents.

Return type:

Tuple[Any]

unpacker: ModelUnpacker

The model source.

write_model_info()[source]

Write the model information and metrics.

write_predictions(text_or_file, verbose=False)[source]

Predict sentiment of sentence(s).

Parameters:
  • text_or_file (str) – a newline-delimited file of sentences, or a single sentence

  • verbose (bool) – write verbose prediction output
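The dual meaning of text_or_file can be sketched as follows. The function read_sentences is a hypothetical helper, not the library's implementation; it assumes that an argument naming an existing file is read as one sentence per line, and anything else is treated as a single sentence.

```python
import os
import tempfile
from pathlib import Path
from typing import List


def read_sentences(text_or_file: str) -> List[str]:
    """Return sentences from a newline-delimited file, or the input itself.

    Hypothetical helper for illustration only; not a zensols.deepnlp API.
    """
    try:
        is_file = Path(text_or_file).is_file()
    except OSError:
        # e.g. a long sentence that is not a valid file name on this system
        is_file = False
    if is_file:
        with open(text_or_file, encoding='utf-8') as f:
            # one sentence per line, skipping blank lines
            return [line.strip() for line in f if line.strip()]
    return [text_or_file]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'sents.txt')
    with open(path, 'w', encoding='utf-8') as f:
        f.write('Great movie.\nTerrible plot.\n')
    print(read_sentences(path))
    print(read_sentences('A single sentence.'))
```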

class zensols.deepnlp.cli.NLPFacadeBatchApplication(config, facade_name='facade', model_path=None, config_factory_args=<factory>, config_overwrites=None, cache_global_facade=True, model_config_overwrites=None)[source]

Bases: FacadeApplication

A facade application for creating mini-batches for training.

CLI_META = {'mnemonic_excludes': {'clear_cached_facade', 'create_facade', 'deallocate', 'get_cached_facade'}, 'mnemonic_overrides': {'dump_batches': 'dumpbatch'}, 'option_overrides': {'model_path': {'long_name': 'model', 'short_name': None}}}

Tell the command line app API to ignore subclass and client specific use case methods.
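To make the roles of mnemonic_excludes and mnemonic_overrides concrete, here is a simplified sketch. The function action_names is hypothetical, and its semantics are an assumption: excluded method names are dropped, overridden names are renamed, and all other names are kept unchanged. The actual resolution is performed by the command line app API and may differ in detail.

```python
from typing import Any, Dict, Iterable, Set

# copied from the CLI_META shown above (only the mnemonic entries)
CLI_META: Dict[str, Any] = {
    'mnemonic_excludes': {'clear_cached_facade', 'create_facade',
                          'deallocate', 'get_cached_facade'},
    'mnemonic_overrides': {'dump_batches': 'dumpbatch'},
}


def action_names(methods: Iterable[str], meta: Dict[str, Any]) -> Dict[str, str]:
    """Map method names to command line action names.

    Hypothetical helper; assumes excluded methods are hidden and the rest
    keep their name unless an override is given.
    """
    excludes: Set[str] = meta.get('mnemonic_excludes', set())
    overrides: Dict[str, str] = meta.get('mnemonic_overrides', {})
    return {m: overrides.get(m, m) for m in methods if m not in excludes}


methods = ['create_facade', 'deallocate', 'dump_batches']
print(action_names(methods, CLI_META))
```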

__init__(config, facade_name='facade', model_path=None, config_factory_args=<factory>, config_overwrites=None, cache_global_facade=True, model_config_overwrites=None)
dump_batches()[source]

Dump the batch dataset with IDs, splits, labels and text.

class zensols.deepnlp.cli.NLPFacadeModelApplication(config, facade_name='facade', model_path=None, config_factory_args=<factory>, config_overwrites=None, cache_global_facade=True, model_config_overwrites=None)[source]

Bases: FacadeApplication

A base class facade application for predicting tokens or text.

CLI_META = {'mnemonic_excludes': {'clear_cached_facade', 'create_facade', 'deallocate', 'get_cached_facade'}, 'mnemonic_overrides': {'predict_text': 'predict'}, 'option_overrides': {'model_path': {'long_name': 'model', 'short_name': None}, 'verbose': {'long_name': 'verbose', 'short_name': None}}}

Tell the command line app API to ignore subclass and client specific use case methods.

__init__(config, facade_name='facade', model_path=None, config_factory_args=<factory>, config_overwrites=None, cache_global_facade=True, model_config_overwrites=None)
class zensols.deepnlp.cli.NLPSequenceClassifyFacadeModelApplication(config, facade_name='facade', model_path=None, config_factory_args=<factory>, config_overwrites=None, cache_global_facade=True, model_config_overwrites=None)[source]

Bases: NLPFacadeModelApplication

A facade application for predicting tokens (for example NER tasks).

__init__(config, facade_name='facade', model_path=None, config_factory_args=<factory>, config_overwrites=None, cache_global_facade=True, model_config_overwrites=None)
model_path: Path = None

The path to the model; if not provided, the last trained model is used.

predict_text(text, verbose=False)[source]

Classify text and output the results.

Parameters:
  • text (str) – the sentence to classify, or a dash (-) to read from standard input

  • verbose (bool) – if given, print the long format version of the document

zensols.deepnlp.feature module

class zensols.deepnlp.feature.DataframeDocumentFeatureStash(delegate, config, name, chunk_size, workers, factory, vec_manager, document_limit=9223372036854775807, text_column='text', additional_columns=None)[source]

Bases: DocumentFeatureStash

Creates FeatureDocument instances from the pandas.Series rows of the pandas.DataFrame stash values.

__init__(delegate, config, name, chunk_size, workers, factory, vec_manager, document_limit=9223372036854775807, text_column='text', additional_columns=None)
additional_columns: Tuple[str] = None

A tuple of column names to add as positional arguments to the instance.

text_column: str = 'text'

The column name for the text to be parsed by the document parser.
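How text_column and additional_columns might be read from a row can be sketched as below. The function row_to_args is hypothetical, and a plain dict stands in for a pandas.Series row so the example has no dependencies; the same indexing by column name also works on a Series. How the extracted values are passed on to the FeatureDocument is not shown here and is left to the library.

```python
from typing import Any, Mapping, Optional, Tuple


def row_to_args(row: Mapping[str, Any],
                text_column: str = 'text',
                additional_columns: Optional[Tuple[str, ...]] = None
                ) -> Tuple[str, Tuple[Any, ...]]:
    """Return the text to parse and the extra positional values of a row.

    Hypothetical helper for illustration only; not a zensols.deepnlp API.
    """
    text = row[text_column]
    if additional_columns is None:
        extra: Tuple[Any, ...] = ()
    else:
        # keep the values in the same order as the configured column names
        extra = tuple(row[col] for col in additional_columns)
    return text, extra


row = {'text': 'A fine film.', 'label': 'pos', 'split': 'train'}
print(row_to_args(row))
print(row_to_args(row, additional_columns=('label', 'split')))
```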

class zensols.deepnlp.feature.DocumentFeatureStash(delegate, config, name, chunk_size, workers, factory, vec_manager, document_limit=9223372036854775807)[source]

Bases: MultiProcessStash

This class parses natural language text into FeatureDocument instances in multiple subprocesses.

abstract _parse_document(id, factory_data)[source]
Return type:

FeatureDocument

ATTR_EXP_META = ('document_limit',)
__init__(delegate, config, name, chunk_size, workers, factory, vec_manager, document_limit=9223372036854775807)
document_limit: int = 9223372036854775807

The maximum number of documents to process.
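The default document_limit of 9223372036854775807 equals 2**63 - 1, which in practice means "no limit". A minimal sketch of how such a cap could be applied to a stream of documents follows; limit_documents is a hypothetical helper, not the stash's implementation.

```python
import itertools
import sys
from typing import Iterable, Iterator, TypeVar

T = TypeVar('T')

# the documented default, 2**63 - 1
DEFAULT_LIMIT = 9223372036854775807


def limit_documents(items: Iterable[T],
                    document_limit: int = DEFAULT_LIMIT) -> Iterator[T]:
    """Yield at most ``document_limit`` items.

    Hypothetical helper for illustration only; not a zensols.deepnlp API.
    """
    # islice does not accept a stop value above sys.maxsize, so clamp it
    return itertools.islice(items, min(document_limit, sys.maxsize))


docs = ['doc-a', 'doc-b', 'doc-c']
print(list(limit_documents(docs)))
print(list(limit_documents(docs, document_limit=2)))
```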

factory: Stash

The stash that creates the factory_data given to _parse_document().

prime()[source]

If the delegate stash data does not exist, use this implementation to generate the data and process it in child processes.

vec_manager: FeatureDocumentVectorizerManager

Used to parse text into FeatureDocument instances.

zensols.deepnlp.score module

Additional deep learning based scoring methods.

This needs the BERTScore package; install it with pip install bert-score.

class zensols.deepnlp.score.BERTScoreScoreMethod(reverse_sents=False, use_norm=True, bert_score_params=<factory>)[source]

Bases: ScoreMethod

A scoring method that uses BERTScore. Sentence pairs are ordered as (<references>, <candidates>).

Citation:

Tianyi Zhang, Varsha Kishore, Felix Wu, Kilian Q. Weinberger, and Yoav
Artzi. 2020. BERTScore: Evaluating Text Generation with BERT. In
Proceedings of the 8th International Conference on Learning
Representations, Addis Ababa, Ethiopia, March.
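Because the sentence pairs are ordered as (<references>, <candidates>) while bert_score's BERTScorer.score() takes the candidates first, the ordering is easy to get wrong. The sketch below uses a hypothetical helper, split_pairs, to separate pairs into the two lists; it does not call BERTScore itself, so it runs without the bert-score package.

```python
from typing import Iterable, List, Tuple


def split_pairs(pairs: Iterable[Tuple[str, str]]) -> Tuple[List[str], List[str]]:
    """Split (reference, candidate) pairs into parallel lists.

    Hypothetical helper for illustration only; not a zensols.deepnlp API.
    """
    refs: List[str] = []
    cands: List[str] = []
    for ref, cand in pairs:
        refs.append(ref)
        cands.append(cand)
    return refs, cands


pairs = [('The cat sat on the mat.', 'A cat is sitting on a mat.'),
         ('It rained all day.', 'Rain fell the whole day.')]
refs, cands = split_pairs(pairs)
print(refs)
print(cands)
# with bert-score installed, a scorer would then be called candidates first:
#   precision, recall, f1 = scorer.score(cands, refs)
```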
__init__(reverse_sents=False, use_norm=True, bert_score_params=<factory>)
bert_score_params: Dict[str, Any]

The parameters given to bert_score.scorer.BERTScorer.

property bert_scorer: BERTScorer
use_norm: bool = True

Whether to compare with norm or text.

Module contents

Deep learning for NLP applications.

zensols.deepnlp.init(*args, **kwargs)[source]

Initialize the deep NLP system and PyTorch. This calls the initialization of the PyTorch system by passing kwargs to init().