zensols.deepnlp package¶
Subpackages¶
- zensols.deepnlp.classify package
- Submodules
- zensols.deepnlp.classify.domain module
LabeledBatch
LabeledBatch.COUNTS_ATTRIBUTE
LabeledBatch.DEPENDENCIES_ATTRIBUTE
LabeledBatch.DEPENDENCY_EXPANDER_ATTRIBTE
LabeledBatch.EMBEDDING_ATTRIBUTES
LabeledBatch.ENUMS_ATTRIBUTE
LabeledBatch.ENUM_EXPANDER_ATTRIBUTE
LabeledBatch.FASTTEXT_CRAWL_300_EMBEDDING
LabeledBatch.FASTTEXT_NEWS_300_EMBEDDING
LabeledBatch.GLOVE_300_EMBEDDING
LabeledBatch.GLOVE_50_EMBEDDING
LabeledBatch.LANGUAGE_ATTRIBUTES
LabeledBatch.LANGUAGE_FEATURE_MANAGER_NAME
LabeledBatch.MAPPINGS
LabeledBatch.STATS_ATTRIBUTE
LabeledBatch.TRANSFORMER_FIXED_EMBEDDING
LabeledBatch.TRANSFORMER_TRAINBLE_EMBEDDING
LabeledBatch.WORD2VEC_300_EMBEDDING
LabeledBatch.__init__()
LabeledFeatureDocument
LabeledFeatureDocumentDataPoint
TokenContainerDataPoint
- zensols.deepnlp.classify.facade module
- zensols.deepnlp.classify.model module
- zensols.deepnlp.classify.pred module
ClassificationPredictionMapper
ClassificationPredictionMapper.__init__()
ClassificationPredictionMapper.label_feature_id
ClassificationPredictionMapper.label_vectorizer
ClassificationPredictionMapper.map_results()
ClassificationPredictionMapper.pred_attribute
ClassificationPredictionMapper.softmax_logit_attribute
ClassificationPredictionMapper.vec_manager
SequencePredictionMapper
- Module contents
- zensols.deepnlp.embed package
- Submodules
- zensols.deepnlp.embed.doc module
- zensols.deepnlp.embed.domain module
NoOpWordEmbedModel
WordEmbedError
WordEmbedModel
WordEmbedModel.UNKNOWN
WordEmbedModel.ZERO
WordEmbedModel.__init__()
WordEmbedModel.cache
WordEmbedModel.clear_cache()
WordEmbedModel.deallocate()
WordEmbedModel.get()
WordEmbedModel.keyed_vectors
WordEmbedModel.keys()
WordEmbedModel.lowercase
WordEmbedModel.matrix
WordEmbedModel.model_id
WordEmbedModel.name
WordEmbedModel.prime()
WordEmbedModel.shape
WordEmbedModel.to_matrix()
WordEmbedModel.unk_idx
WordEmbedModel.vector_dimension
WordEmbedModel.vectors
WordEmbedModel.word2idx()
WordEmbedModel.word2idx_or_unk()
WordVectorModel
- zensols.deepnlp.embed.fasttext module
- zensols.deepnlp.embed.glove module
- zensols.deepnlp.embed.word2vec module
- zensols.deepnlp.embed.wordtext module
- Module contents
- zensols.deepnlp.index package
- Submodules
- zensols.deepnlp.index.domain module
- zensols.deepnlp.index.lda module
- zensols.deepnlp.index.lsi module
LatentSemanticDocumentIndexerVectorizer
LatentSemanticDocumentIndexerVectorizer.DESCRIPTION
LatentSemanticDocumentIndexerVectorizer.FEATURE_TYPE
LatentSemanticDocumentIndexerVectorizer.__init__()
LatentSemanticDocumentIndexerVectorizer.components
LatentSemanticDocumentIndexerVectorizer.iterations
LatentSemanticDocumentIndexerVectorizer.lsa
LatentSemanticDocumentIndexerVectorizer.similarity()
LatentSemanticDocumentIndexerVectorizer.vectorizer
LatentSemanticDocumentIndexerVectorizer.vectorizer_params
- Module contents
- zensols.deepnlp.layer package
- Submodules
- zensols.deepnlp.layer.conv module
DeepConvolution1d
DeepConvolution1dNetworkSettings
DeepConvolution1dNetworkSettings.__init__()
DeepConvolution1dNetworkSettings.batch_norm_d
DeepConvolution1dNetworkSettings.clone()
DeepConvolution1dNetworkSettings.embedding_dimension
DeepConvolution1dNetworkSettings.get_module_class_name()
DeepConvolution1dNetworkSettings.layer_factory
DeepConvolution1dNetworkSettings.n_filters
DeepConvolution1dNetworkSettings.padding
DeepConvolution1dNetworkSettings.pool_factory
DeepConvolution1dNetworkSettings.pool_padding
DeepConvolution1dNetworkSettings.pool_stride
DeepConvolution1dNetworkSettings.pool_token_kernel
DeepConvolution1dNetworkSettings.repeats
DeepConvolution1dNetworkSettings.stride
DeepConvolution1dNetworkSettings.token_kernel
DeepConvolution1dNetworkSettings.token_length
DeepConvolution1dNetworkSettings.write()
- zensols.deepnlp.layer.embed module
EmbeddingLayer
EmbeddingNetworkModule
EmbeddingNetworkModule.MODULE_NAME
EmbeddingNetworkModule.__init__()
EmbeddingNetworkModule.embedding_dimension
EmbeddingNetworkModule.forward_document_features()
EmbeddingNetworkModule.forward_embedding_features()
EmbeddingNetworkModule.forward_token_features()
EmbeddingNetworkModule.get_embedding_tensors()
EmbeddingNetworkModule.vectorizer_by_name()
EmbeddingNetworkSettings
TrainableEmbeddingLayer
- zensols.deepnlp.layer.embrecurcrf module
- zensols.deepnlp.layer.wordvec module
- Module contents
- zensols.deepnlp.model package
- Submodules
- zensols.deepnlp.model.facade module
LanguageModelFacade
LanguageModelFacade.__init__()
LanguageModelFacade.count_feature_ids
LanguageModelFacade.doc_parser
LanguageModelFacade.embedding
LanguageModelFacade.enum_feature_ids
LanguageModelFacade.get_max_word_piece_len()
LanguageModelFacade.get_transformer_vectorizer()
LanguageModelFacade.language_attributes
LanguageModelFacade.language_vectorizer_manager
LanguageModelFacade.suppress_transformer_warnings
LanguageModelFacadeConfig
- zensols.deepnlp.model.sequence module
- Module contents
- zensols.deepnlp.transformer package
- Submodules
- zensols.deepnlp.transformer.domain module
TokenizedDocument
TokenizedDocument.__init__()
TokenizedDocument.attention_mask
TokenizedDocument.boundary_tokens
TokenizedDocument.deallocate()
TokenizedDocument.detach()
TokenizedDocument.from_tensor()
TokenizedDocument.get_wordpiece_count()
TokenizedDocument.input_ids
TokenizedDocument.map_to_word_pieces()
TokenizedDocument.map_word_pieces()
TokenizedDocument.offsets
TokenizedDocument.params()
TokenizedDocument.shape
TokenizedDocument.tensor
TokenizedDocument.token_type_ids
TokenizedDocument.truncate()
TokenizedDocument.write()
TokenizedFeatureDocument
- zensols.deepnlp.transformer.embed module
TransformerEmbedding
TransformerEmbedding.ALL_OUTPUT
TransformerEmbedding.LAST_HIDDEN_STATE_OUTPUT
TransformerEmbedding.POOLER_OUTPUT
TransformerEmbedding.__init__()
TransformerEmbedding.cache
TransformerEmbedding.model
TransformerEmbedding.name
TransformerEmbedding.output
TransformerEmbedding.output_attentions
TransformerEmbedding.resource
TransformerEmbedding.tokenize()
TransformerEmbedding.tokenizer
TransformerEmbedding.trainable
TransformerEmbedding.transform()
TransformerEmbedding.vector_dimension
- zensols.deepnlp.transformer.layer module
- zensols.deepnlp.transformer.mask module
- zensols.deepnlp.transformer.optimizer module
- zensols.deepnlp.transformer.pred module
- zensols.deepnlp.transformer.resource module
TransformerError
TransformerResource
TransformerResource.__init__()
TransformerResource.args
TransformerResource.cache
TransformerResource.cache_dir
TransformerResource.cached
TransformerResource.cased
TransformerResource.clear()
TransformerResource.model
TransformerResource.model_args
TransformerResource.model_class
TransformerResource.model_id
TransformerResource.name
TransformerResource.tokenizer
TransformerResource.tokenizer_args
TransformerResource.tokenizer_class
TransformerResource.torch_config
TransformerResource.trainable
- zensols.deepnlp.transformer.tokenizer module
TransformerDocumentTokenizer
TransformerDocumentTokenizer.DEFAULT_PARAMS
TransformerDocumentTokenizer.__init__()
TransformerDocumentTokenizer.all_special_tokens
TransformerDocumentTokenizer.id2tok
TransformerDocumentTokenizer.params
TransformerDocumentTokenizer.pretrained_tokenizer
TransformerDocumentTokenizer.resource
TransformerDocumentTokenizer.token_max_length
TransformerDocumentTokenizer.tokenize()
TransformerDocumentTokenizer.word_piece_token_length
- zensols.deepnlp.transformer.vectorizers module
LabelTransformerFeatureVectorizer
TransformerEmbeddingFeatureVectorizer
TransformerExpanderFeatureContext
TransformerExpanderFeatureVectorizer
TransformerFeatureContext
TransformerFeatureVectorizer
TransformerMaskFeatureVectorizer
TransformerNominalFeatureVectorizer
TransformerNominalFeatureVectorizer.DESCRIPTION
TransformerNominalFeatureVectorizer.__init__()
TransformerNominalFeatureVectorizer.annotations_attribute
TransformerNominalFeatureVectorizer.delegate_feature_id
TransformerNominalFeatureVectorizer.label_all_tokens
TransformerNominalFeatureVectorizer.write()
- zensols.deepnlp.transformer.wordpiece module
CachingWordPieceFeatureDocumentFactory
WordPiece
WordPieceDocumentDecorator
WordPieceFeatureDocument
WordPieceFeatureDocumentFactory
WordPieceFeatureDocumentFactory.__init__()
WordPieceFeatureDocumentFactory.add_sent_embeddings()
WordPieceFeatureDocumentFactory.add_token_embeddings()
WordPieceFeatureDocumentFactory.create()
WordPieceFeatureDocumentFactory.embed_model
WordPieceFeatureDocumentFactory.populate()
WordPieceFeatureDocumentFactory.sent_embeddings
WordPieceFeatureDocumentFactory.token_embeddings
WordPieceFeatureDocumentFactory.tokenizer
WordPieceFeatureSentence
WordPieceFeatureSpan
WordPieceFeatureToken
WordPieceFeatureToken.__init__()
WordPieceFeatureToken.clone()
WordPieceFeatureToken.copy_embedding()
WordPieceFeatureToken.detach()
WordPieceFeatureToken.embedding
WordPieceFeatureToken.indexes
WordPieceFeatureToken.is_unknown
WordPieceFeatureToken.token_embedding
WordPieceFeatureToken.word_iter()
WordPieceFeatureToken.words
WordPieceFeatureToken.write()
WordPieceFeatureVectorizer
WordPieceFeatureVectorizer.DESCRIPTION
WordPieceFeatureVectorizer.FEATURE_TYPE
WordPieceFeatureVectorizer.__init__()
WordPieceFeatureVectorizer.access
WordPieceFeatureVectorizer.decode_embedding
WordPieceFeatureVectorizer.embed_model
WordPieceFeatureVectorizer.encode()
WordPieceFeatureVectorizer.encode_transformed
WordPieceFeatureVectorizer.fold_method
WordPieceFeatureVectorizer.word_piece_doc_factory
WordPieceTokenContainer
- Module contents
- zensols.deepnlp.vectorize package
- Submodules
- zensols.deepnlp.vectorize.embed module
- zensols.deepnlp.vectorize.manager module
FeatureDocumentVectorizer
FeatureDocumentVectorizerManager
FeatureDocumentVectorizerManager.__init__()
FeatureDocumentVectorizerManager.deallocate()
FeatureDocumentVectorizerManager.doc_parser
FeatureDocumentVectorizerManager.get_token_length()
FeatureDocumentVectorizerManager.is_batch_token_length
FeatureDocumentVectorizerManager.parse()
FeatureDocumentVectorizerManager.spacy_vectorizers
FeatureDocumentVectorizerManager.token_feature_ids
FeatureDocumentVectorizerManager.token_length
FoldingDocumentVectorizer
MultiDocumentVectorizer
TextFeatureType
- zensols.deepnlp.vectorize.spacy module
DependencyFeatureVectorizer
NamedEntityRecognitionFeatureVectorizer
PartOfSpeechFeatureVectorizer
SpacyFeatureVectorizer
SpacyFeatureVectorizer.VECTORIZERS
SpacyFeatureVectorizer.__init__()
SpacyFeatureVectorizer.dist()
SpacyFeatureVectorizer.from_spacy()
SpacyFeatureVectorizer.id_from_spacy()
SpacyFeatureVectorizer.id_from_spacy_symbol()
SpacyFeatureVectorizer.torch_config
SpacyFeatureVectorizer.transform()
SpacyFeatureVectorizer.vocab
SpacyFeatureVectorizer.write()
- zensols.deepnlp.vectorize.vectorizers module
CountEnumContainerFeatureVectorizer
CountEnumContainerFeatureVectorizer.ATTR_EXP_META
CountEnumContainerFeatureVectorizer.DESCRIPTION
CountEnumContainerFeatureVectorizer.FEATURE_TYPE
CountEnumContainerFeatureVectorizer.__init__()
CountEnumContainerFeatureVectorizer.decoded_feature_ids
CountEnumContainerFeatureVectorizer.get_feature_counts()
CountEnumContainerFeatureVectorizer.to_symbols()
DepthFeatureDocumentVectorizer
EnumContainerFeatureVectorizer
MutualFeaturesContainerFeatureVectorizer
OneHotEncodedFeatureDocumentVectorizer
OverlappingFeatureDocumentVectorizer
StatisticsFeatureDocumentVectorizer
TokenEmbeddingFeatureVectorizer
WordEmbeddingFeatureVectorizer
- Module contents
Submodules¶
zensols.deepnlp.cli module¶
Facade application implementations for NLP use.
- class zensols.deepnlp.cli.NLPClassifyFacadeModelApplication(config, facade_name='facade', model_path=None, config_factory_args=<factory>, config_overwrites=None, cache_global_facade=True, model_config_overwrites=None)[source]¶
Bases: NLPFacadeModelApplication
A facade application for predicting text (for example sentiment classification tasks).
- class zensols.deepnlp.cli.NLPClassifyPackedModelApplication(unpacker)[source]¶
Bases: object
Classifies data using a packed model. The unpacker is used to install the model (if it is not already installed), then provide access to it. A ModelFacade is created from the packaged model that is downloaded. The model then uses the facade’s zensols.deeplearn.model.facade.ModelFacade.predict() method to output the predictions.
- CLI_META = {'mnemonic_excludes': {'predict'}, 'mnemonic_overrides': {'write_model_info': 'modelstat', 'write_predictions': 'predict'}, 'option_excludes': {'unpacker'}, 'option_overrides': {'text_or_file': {'long_name': 'input', 'metavar': '<TEXT|FILE>'}, 'verbose': {'short_name': None}}}¶
- __init__(unpacker)¶
- property facade: ModelFacade¶
The packaged model’s facade.
- unpacker: ModelUnpacker¶
The model source.
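The install-then-access behavior described for this class can be pictured with a minimal sketch. Everything in it is hypothetical: `SketchUnpacker` and `SketchPackedApp` are stand-ins written for illustration, not the zensols `ModelUnpacker` or application classes, and the "model" is just a constant label stored in a file.

```python
from dataclasses import dataclass
from pathlib import Path
import tempfile


@dataclass
class SketchUnpacker:
    """Hypothetical stand-in for a model unpacker (not the zensols class)."""
    install_dir: Path
    installs: int = 0  # how many times the model was actually installed

    def install(self) -> Path:
        """Install the model only when it is not already present."""
        model_file = self.install_dir / 'model.txt'
        if not model_file.exists():
            self.install_dir.mkdir(parents=True, exist_ok=True)
            # a real unpacker would download and unpack a packaged model here
            model_file.write_text('positive')
            self.installs += 1
        return model_file


@dataclass
class SketchPackedApp:
    """Hypothetical application that predicts with the installed model."""
    unpacker: SketchUnpacker

    def predict(self, texts):
        # the toy "model" is a constant label read from the installed file
        label = self.unpacker.install().read_text()
        return [(text, label) for text in texts]


def demo():
    """Predict twice and report how often the model was installed."""
    with tempfile.TemporaryDirectory() as tmp:
        app = SketchPackedApp(SketchUnpacker(Path(tmp) / 'model'))
        first = app.predict(['great movie'])
        second = app.predict(['fine'])
        return first, second, app.unpacker.installs
```

Calling `demo()` predicts twice but installs only once, which is the point of checking for an existing installation before unpacking.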
- class zensols.deepnlp.cli.NLPFacadeBatchApplication(config, facade_name='facade', model_path=None, config_factory_args=<factory>, config_overwrites=None, cache_global_facade=True, model_config_overwrites=None)[source]¶
Bases: FacadeApplication
A facade application for creating mini-batches for training.
- CLI_META = {'mnemonic_excludes': {'clear_cached_facade', 'create_facade', 'deallocate', 'get_cached_facade'}, 'mnemonic_overrides': {'dump_batches': 'dumpbatch'}, 'option_overrides': {'model_path': {'long_name': 'model', 'short_name': None}}}¶
Tell the command line app API to ignore subclass- and client-specific use case methods.
- __init__(config, facade_name='facade', model_path=None, config_factory_args=<factory>, config_overwrites=None, cache_global_facade=True, model_config_overwrites=None)¶
- class zensols.deepnlp.cli.NLPFacadeModelApplication(config, facade_name='facade', model_path=None, config_factory_args=<factory>, config_overwrites=None, cache_global_facade=True, model_config_overwrites=None)[source]¶
Bases: FacadeApplication
A base class facade application for predicting tokens or text.
- CLI_META = {'mnemonic_excludes': {'clear_cached_facade', 'create_facade', 'deallocate', 'get_cached_facade'}, 'mnemonic_overrides': {'predict_text': 'predict'}, 'option_overrides': {'model_path': {'long_name': 'model', 'short_name': None}, 'verbose': {'long_name': 'verbose', 'short_name': None}}}¶
Tell the command line app API to ignore subclass- and client-specific use case methods.
- __init__(config, facade_name='facade', model_path=None, config_factory_args=<factory>, config_overwrites=None, cache_global_facade=True, model_config_overwrites=None)¶
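One possible reading of the CLI_META dictionary for this class is sketched below. It assumes, without confirmation from this page, that `mnemonic_excludes` hides methods from the command line and that `mnemonic_overrides` renames the action exposed for a method. The helper `action_names` and the method name `hypothetical_report` are invented for illustration and are not part of the zensols API.

```python
# Subset of the CLI_META shown for NLPFacadeModelApplication.
CLI_META = {
    'mnemonic_excludes': {'clear_cached_facade', 'create_facade',
                          'deallocate', 'get_cached_facade'},
    'mnemonic_overrides': {'predict_text': 'predict'},
}


def action_names(method_names, cli_meta):
    """Map method names to command line action names (assumed semantics).

    Excluded methods are dropped; overridden methods are renamed; all
    other methods keep their own name.
    """
    excludes = cli_meta.get('mnemonic_excludes', set())
    overrides = cli_meta.get('mnemonic_overrides', {})
    return {name: overrides.get(name, name)
            for name in method_names if name not in excludes}


# 'hypothetical_report' is a made-up method used only for this example
methods = ['create_facade', 'deallocate', 'predict_text',
           'hypothetical_report']
actions = action_names(methods, CLI_META)
```

Under these assumptions only `predict_text` (exposed as `predict`) and the made-up method remain visible, while the facade life cycle methods stay hidden from command line users.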
- class zensols.deepnlp.cli.NLPSequenceClassifyFacadeModelApplication(config, facade_name='facade', model_path=None, config_factory_args=<factory>, config_overwrites=None, cache_global_facade=True, model_config_overwrites=None)[source]¶
Bases: NLPFacadeModelApplication
A facade application for predicting tokens (for example NER tasks).
- __init__(config, facade_name='facade', model_path=None, config_factory_args=<factory>, config_overwrites=None, cache_global_facade=True, model_config_overwrites=None)¶
zensols.deepnlp.feature module¶
- class zensols.deepnlp.feature.DataframeDocumentFeatureStash(delegate, config, name, chunk_size, workers, factory, vec_manager, document_limit=9223372036854775807, text_column='text', additional_columns=None)[source]¶
Bases: DocumentFeatureStash
Creates FeatureDocument instances from pandas.Series rows from the pandas.DataFrame stash values.
- __init__(delegate, config, name, chunk_size, workers, factory, vec_manager, document_limit=9223372036854775807, text_column='text', additional_columns=None)¶
- class zensols.deepnlp.feature.DocumentFeatureStash(delegate, config, name, chunk_size, workers, factory, vec_manager, document_limit=9223372036854775807)[source]¶
Bases: MultiProcessStash
This class parses natural language text into FeatureDocument instances in multiple subprocesses.
- ATTR_EXP_META = ('document_limit',)¶
- __init__(delegate, config, name, chunk_size, workers, factory, vec_manager, document_limit=9223372036854775807)¶
- factory: Stash¶
The stash that creates the factory_data given to _parse_document().
- prime()[source]¶
If the delegate stash data does not exist, use this implementation to generate the data and process it in child processes.
- vec_manager: FeatureDocumentVectorizerManager¶
Used to parse text into FeatureDocument instances.
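The roles of `chunk_size` and `document_limit` can be illustrated with a small sketch. It assumes that the input is first capped at `document_limit` items and then split into groups of `chunk_size`, each of which would be handed to a child process; that is an interpretation of the parameter names, not the library's confirmed behavior. The function `chunk_documents` is hypothetical. The default limit shown in the signatures, 9223372036854775807, equals 2**63 - 1, which is `sys.maxsize` on 64-bit CPython and so acts as "no limit".

```python
import itertools
import sys


def chunk_documents(texts, chunk_size, document_limit=sys.maxsize):
    """Yield lists of at most ``chunk_size`` texts (hypothetical helper).

    Only the first ``document_limit`` texts are considered.  In a
    multi-process setup each yielded list would be sent to one child
    process for parsing; here the lists are simply returned.
    """
    if chunk_size <= 0:
        raise ValueError('chunk_size must be positive in this sketch')
    limited = itertools.islice(texts, document_limit)
    while True:
        chunk = list(itertools.islice(limited, chunk_size))
        if not chunk:
            return
        yield chunk


texts = ['a b', 'c d', 'e f', 'g h', 'i j']
chunks = list(chunk_documents(texts, chunk_size=2, document_limit=3))
```

With five texts, a limit of three and a chunk size of two, the sketch produces one full chunk and one partial chunk, and the last two texts are never parsed.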
zensols.deepnlp.score module¶
Additional deep learning based scoring methods.
This needs the BERTScore package; install it with pip install bert-score.
- class zensols.deepnlp.score.BERTScoreScoreMethod(reverse_sents=False, use_norm=True, bert_score_params=<factory>)[source]¶
Bases: ScoreMethod
A scoring method that uses BERTScore. Sentence pairs are ordered as (<references>, <candidates>).
Citation:
Tianyi Zhang, Varsha Kishore, Felix Wu, Kilian Q. Weinberger, and Yoav Artzi. 2020. BERTScore: Evaluating Text Generation with BERT. In Proceedings of the 8th International Conference on Learning Representations, Addis Ababa, Ethiopia, March.
- __init__(reverse_sents=False, use_norm=True, bert_score_params=<factory>)¶
- property bert_scorer: BERTScorer¶
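The greedy matching idea behind BERTScore, and why the (reference, candidate) ordering matters, can be sketched in plain Python. This is not the bert-score package or this class's implementation: the token vectors are toy values rather than BERT embeddings, IDF weighting and baseline rescaling are omitted, and `greedy_match_score` is a hypothetical name.

```python
import math


def _normalize(vec):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def greedy_match_score(reference, candidate):
    """Return (precision, recall, f1) from greedy token matching.

    Recall averages, over reference tokens, the best similarity to any
    candidate token; precision does the same over candidate tokens.
    """
    ref = [_normalize(v) for v in reference]
    cand = [_normalize(v) for v in candidate]
    recall = sum(max(_dot(r, c) for c in cand) for r in ref) / len(ref)
    precision = sum(max(_dot(c, r) for r in ref) for c in cand) / len(cand)
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# toy "embeddings": two reference tokens, one candidate token
reference = [[1.0, 0.0], [0.0, 1.0]]
candidate = [[1.0, 0.0]]
precision, recall, f1 = greedy_match_score(reference, candidate)
```

Swapping the two arguments swaps precision and recall, which is why the documented ordering of sentence pairs matters when reading the individual scores; the F1 value is symmetric.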
Module contents¶
Deep learning for NLP applications.