zensols.deepnlp package
Subpackages
- zensols.deepnlp.classify package
- Submodules
- zensols.deepnlp.classify.domain module
- zensols.deepnlp.classify.facade module
ClassifyModelFacade
ClassifyModelFacade.COUNTS_ATTRIBUTE
ClassifyModelFacade.DEPENDENCIES_ATTRIBUTE
ClassifyModelFacade.DEPENDENCY_EXPANDER_ATTRIBTE
ClassifyModelFacade.EMBEDDING_ATTRIBUTES
ClassifyModelFacade.ENUMS_ATTRIBUTE
ClassifyModelFacade.ENUM_EXPANDER_ATTRIBUTE
ClassifyModelFacade.FASTTEXT_CRAWL_300_EMBEDDING
ClassifyModelFacade.FASTTEXT_NEWS_300_EMBEDDING
ClassifyModelFacade.GLOVE_300_EMBEDDING
ClassifyModelFacade.GLOVE_50_EMBEDDING
ClassifyModelFacade.LANGUAGE_ATTRIBUTES
ClassifyModelFacade.LANGUAGE_FEATURE_MANAGER_NAME
ClassifyModelFacade.LANGUAGE_MODEL_CONFIG
ClassifyModelFacade.STATS_ATTRIBUTE
ClassifyModelFacade.TRANSFORMER_FIXED_EMBEDDING
ClassifyModelFacade.TRANSFORMER_TRAINBLE_EMBEDDING
ClassifyModelFacade.WORD2VEC_300_EMBEDDING
ClassifyModelFacade.__init__()
ClassifyModelFacade.predict()
MultilabelClassifyModelFacade
TokenClassifyModelFacade
- zensols.deepnlp.classify.model module
- zensols.deepnlp.classify.multilabel module
- zensols.deepnlp.classify.pred module
ClassificationPredictionMapper
ClassificationPredictionMapper.__init__()
ClassificationPredictionMapper.label_feature_id
ClassificationPredictionMapper.label_vectorizer
ClassificationPredictionMapper.map_results()
ClassificationPredictionMapper.pred_attribute
ClassificationPredictionMapper.softmax_logit_attribute
ClassificationPredictionMapper.vec_manager
SequencePredictionMapper
- Module contents
- zensols.deepnlp.embed package
- Submodules
- zensols.deepnlp.embed.doc module
- zensols.deepnlp.embed.domain module
NoOpWordEmbedModel
WordEmbedError
WordEmbedModel
WordEmbedModel.UNKNOWN
WordEmbedModel.ZERO
WordEmbedModel.__init__()
WordEmbedModel.cache
WordEmbedModel.clear_cache()
WordEmbedModel.deallocate()
WordEmbedModel.get()
WordEmbedModel.keyed_vectors
WordEmbedModel.keys()
WordEmbedModel.lowercase
WordEmbedModel.matrix
WordEmbedModel.model_id
WordEmbedModel.name
WordEmbedModel.prime()
WordEmbedModel.shape
WordEmbedModel.to_matrix()
WordEmbedModel.unk_idx
WordEmbedModel.vector_dimension
WordEmbedModel.vectors
WordEmbedModel.word2idx()
WordEmbedModel.word2idx_or_unk()
WordVectorModel
- zensols.deepnlp.embed.fasttext module
- zensols.deepnlp.embed.glove module
- zensols.deepnlp.embed.word2vec module
- zensols.deepnlp.embed.wordtext module
- Module contents
- zensols.deepnlp.index package
- Submodules
- zensols.deepnlp.index.domain module
- zensols.deepnlp.index.lda module
- zensols.deepnlp.index.lsi module
LatentSemanticDocumentIndexerVectorizer
LatentSemanticDocumentIndexerVectorizer.DESCRIPTION
LatentSemanticDocumentIndexerVectorizer.FEATURE_TYPE
LatentSemanticDocumentIndexerVectorizer.__init__()
LatentSemanticDocumentIndexerVectorizer.components
LatentSemanticDocumentIndexerVectorizer.iterations
LatentSemanticDocumentIndexerVectorizer.lsa
LatentSemanticDocumentIndexerVectorizer.similarity()
LatentSemanticDocumentIndexerVectorizer.vectorizer
LatentSemanticDocumentIndexerVectorizer.vectorizer_params
- Module contents
- zensols.deepnlp.layer package
- Submodules
- zensols.deepnlp.layer.conv module
DeepConvolution1d
DeepConvolution1dNetworkSettings
DeepConvolution1dNetworkSettings.__init__()
DeepConvolution1dNetworkSettings.applies
DeepConvolution1dNetworkSettings.embedding_dimension
DeepConvolution1dNetworkSettings.get_module_class_name()
DeepConvolution1dNetworkSettings.layer_factories
DeepConvolution1dNetworkSettings.out_shape
DeepConvolution1dNetworkSettings.padding
DeepConvolution1dNetworkSettings.pool_padding
DeepConvolution1dNetworkSettings.pool_stride
DeepConvolution1dNetworkSettings.pool_token_kernel
DeepConvolution1dNetworkSettings.repeats
DeepConvolution1dNetworkSettings.stride
DeepConvolution1dNetworkSettings.token_kernel
DeepConvolution1dNetworkSettings.token_length
DeepConvolution1dNetworkSettings.validate()
DeepConvolution1dNetworkSettings.write()
- zensols.deepnlp.layer.embed module
EmbeddingLayer
EmbeddingNetworkModule
EmbeddingNetworkModule.MODULE_NAME
EmbeddingNetworkModule.__init__()
EmbeddingNetworkModule.embedding_dimension
EmbeddingNetworkModule.forward_document_features()
EmbeddingNetworkModule.forward_embedding_features()
EmbeddingNetworkModule.forward_token_features()
EmbeddingNetworkModule.get_embedding_tensors()
EmbeddingNetworkModule.vectorizer_by_name()
EmbeddingNetworkSettings
TrainableEmbeddingLayer
- zensols.deepnlp.layer.embrecurcrf module
- zensols.deepnlp.layer.wordvec module
- Module contents
- zensols.deepnlp.model package
- Submodules
- zensols.deepnlp.model.facade module
LanguageModelFacade
LanguageModelFacade.__init__()
LanguageModelFacade.count_feature_ids
LanguageModelFacade.doc_parser
LanguageModelFacade.embedding
LanguageModelFacade.enum_feature_ids
LanguageModelFacade.get_max_word_piece_len()
LanguageModelFacade.get_transformer_vectorizer()
LanguageModelFacade.language_attributes
LanguageModelFacade.language_vectorizer_manager
LanguageModelFacade.suppress_transformer_warnings
LanguageModelFacadeConfig
- zensols.deepnlp.model.sequence module
- Module contents
- zensols.deepnlp.transformer package
- Submodules
- zensols.deepnlp.transformer.domain module
TokenizedDocument
TokenizedDocument.__init__()
TokenizedDocument.attention_mask
TokenizedDocument.boundary_tokens
TokenizedDocument.deallocate()
TokenizedDocument.detach()
TokenizedDocument.from_tensor()
TokenizedDocument.get_wordpiece_count()
TokenizedDocument.input_ids
TokenizedDocument.is_empty
TokenizedDocument.map_to_word_pieces()
TokenizedDocument.map_word_pieces()
TokenizedDocument.offsets
TokenizedDocument.params()
TokenizedDocument.shape
TokenizedDocument.tensor
TokenizedDocument.token_type_ids
TokenizedDocument.truncate()
TokenizedDocument.write()
TokenizedFeatureDocument
- zensols.deepnlp.transformer.embed module
TransformerEmbedding
TransformerEmbedding.ALL_OUTPUT
TransformerEmbedding.LAST_HIDDEN_STATE_OUTPUT
TransformerEmbedding.POOLER_OUTPUT
TransformerEmbedding.__init__()
TransformerEmbedding.cache
TransformerEmbedding.model
TransformerEmbedding.name
TransformerEmbedding.output
TransformerEmbedding.output_attentions
TransformerEmbedding.resource
TransformerEmbedding.tokenize()
TransformerEmbedding.tokenizer
TransformerEmbedding.trainable
TransformerEmbedding.transform()
TransformerEmbedding.vector_dimension
- zensols.deepnlp.transformer.layer module
- zensols.deepnlp.transformer.mask module
- zensols.deepnlp.transformer.optimizer module
- zensols.deepnlp.transformer.pred module
- zensols.deepnlp.transformer.resource module
TransformerError
TransformerResource
TransformerResource.__init__()
TransformerResource.args
TransformerResource.cache
TransformerResource.cache_dir
TransformerResource.cached
TransformerResource.cased
TransformerResource.clear()
TransformerResource.model
TransformerResource.model_args
TransformerResource.model_class
TransformerResource.model_id
TransformerResource.name
TransformerResource.tokenizer
TransformerResource.tokenizer_args
TransformerResource.tokenizer_class
TransformerResource.torch_config
TransformerResource.trainable
- zensols.deepnlp.transformer.tokenizer module
TransformerDocumentTokenizer
TransformerDocumentTokenizer.DEFAULT_PARAMS
TransformerDocumentTokenizer.__init__()
TransformerDocumentTokenizer.all_special_tokens
TransformerDocumentTokenizer.feature_id
TransformerDocumentTokenizer.id2tok
TransformerDocumentTokenizer.params
TransformerDocumentTokenizer.pretrained_tokenizer
TransformerDocumentTokenizer.resource
TransformerDocumentTokenizer.token_max_length
TransformerDocumentTokenizer.tokenize()
TransformerDocumentTokenizer.word_piece_token_length
- zensols.deepnlp.transformer.vectorizers module
DocumentEmbeddingFeatureVectorizer
DocumentMappedTransformerFeatureContext
LabelTransformerFeatureVectorizer
TransformerEmbeddingFeatureVectorizer
TransformerExpanderFeatureContext
TransformerExpanderFeatureVectorizer
TransformerFeatureContext
TransformerFeatureVectorizer
TransformerMaskFeatureVectorizer
TransformerNominalFeatureVectorizer
TransformerNominalFeatureVectorizer.DESCRIPTION
TransformerNominalFeatureVectorizer.__init__()
TransformerNominalFeatureVectorizer.annotations_attribute
TransformerNominalFeatureVectorizer.delegate_feature_id
TransformerNominalFeatureVectorizer.label_all_tokens
TransformerNominalFeatureVectorizer.write()
- zensols.deepnlp.transformer.wordpiece module
CachingWordPieceFeatureDocumentFactory
WordPiece
WordPieceDocumentDecorator
WordPieceFeatureDocument
WordPieceFeatureDocumentFactory
WordPieceFeatureDocumentFactory.__init__()
WordPieceFeatureDocumentFactory.add_sent_embeddings()
WordPieceFeatureDocumentFactory.add_token_embeddings()
WordPieceFeatureDocumentFactory.create()
WordPieceFeatureDocumentFactory.embed_model
WordPieceFeatureDocumentFactory.populate()
WordPieceFeatureDocumentFactory.sent_embeddings
WordPieceFeatureDocumentFactory.token_embeddings
WordPieceFeatureDocumentFactory.tokenizer
WordPieceFeatureSentence
WordPieceFeatureSpan
WordPieceFeatureToken
WordPieceFeatureToken.__init__()
WordPieceFeatureToken.clone()
WordPieceFeatureToken.copy_embedding()
WordPieceFeatureToken.detach()
WordPieceFeatureToken.embedding
WordPieceFeatureToken.indexes
WordPieceFeatureToken.is_unknown
WordPieceFeatureToken.token_embedding
WordPieceFeatureToken.word_iter()
WordPieceFeatureToken.words
WordPieceFeatureToken.write()
WordPieceFeatureVectorizer
WordPieceFeatureVectorizer.DESCRIPTION
WordPieceFeatureVectorizer.FEATURE_TYPE
WordPieceFeatureVectorizer.__init__()
WordPieceFeatureVectorizer.access
WordPieceFeatureVectorizer.decode_embedding
WordPieceFeatureVectorizer.embed_model
WordPieceFeatureVectorizer.encode()
WordPieceFeatureVectorizer.encode_transformed
WordPieceFeatureVectorizer.fold_method
WordPieceFeatureVectorizer.word_piece_doc_factory
WordPieceTokenContainer
- Module contents
- zensols.deepnlp.vectorize package
- Submodules
- zensols.deepnlp.vectorize.embed module
- zensols.deepnlp.vectorize.manager module
FeatureDocumentVectorizer
FeatureDocumentVectorizerManager
FeatureDocumentVectorizerManager.__init__()
FeatureDocumentVectorizerManager.configured_spacy_vectorizers
FeatureDocumentVectorizerManager.deallocate()
FeatureDocumentVectorizerManager.doc_parser
FeatureDocumentVectorizerManager.get_token_length()
FeatureDocumentVectorizerManager.is_batch_token_length
FeatureDocumentVectorizerManager.ordered_spacy_vectorizers
FeatureDocumentVectorizerManager.parse()
FeatureDocumentVectorizerManager.spacy_vectorizers
FeatureDocumentVectorizerManager.token_feature_ids
FeatureDocumentVectorizerManager.token_length
FoldingDocumentVectorizer
MultiDocumentVectorizer
TextFeatureType
- zensols.deepnlp.vectorize.spacy module
DependencyFeatureVectorizer
NamedEntityRecognitionFeatureVectorizer
PartOfSpeechFeatureVectorizer
SpacyFeatureVectorizer
SpacyFeatureVectorizer.__init__()
SpacyFeatureVectorizer.description
SpacyFeatureVectorizer.dist()
SpacyFeatureVectorizer.from_spacy()
SpacyFeatureVectorizer.id_from_spacy()
SpacyFeatureVectorizer.id_from_spacy_symbol()
SpacyFeatureVectorizer.model
SpacyFeatureVectorizer.symbols
SpacyFeatureVectorizer.torch_config
SpacyFeatureVectorizer.transform()
SpacyFeatureVectorizer.write()
- zensols.deepnlp.vectorize.vectorizers module
CountEnumContainerFeatureVectorizer
CountEnumContainerFeatureVectorizer.ATTR_EXP_META
CountEnumContainerFeatureVectorizer.DESCRIPTION
CountEnumContainerFeatureVectorizer.FEATURE_TYPE
CountEnumContainerFeatureVectorizer.__init__()
CountEnumContainerFeatureVectorizer.get_feature_counts()
CountEnumContainerFeatureVectorizer.string_symbol_feature_ids
CountEnumContainerFeatureVectorizer.to_symbols()
DecodedContainerFeatureVectorizer
DepthFeatureDocumentVectorizer
EnumContainerFeatureVectorizer
MutualFeaturesContainerFeatureVectorizer
OneHotEncodedFeatureDocumentVectorizer
OverlappingFeatureDocumentVectorizer
StatisticsFeatureDocumentVectorizer
TokenEmbeddingFeatureVectorizer
WordEmbeddingFeatureVectorizer
- Module contents
Submodules
zensols.deepnlp.cli module
Facade application implementations for NLP use.
- class zensols.deepnlp.cli.NLPClassifyFacadeModelApplication(config, facade_name='facade', model_path=None, config_factory_args=<factory>, config_overwrites=None, cache_global_facade=True, model_config_overwrites=None)
Bases: NLPFacadeModelApplication
A facade application for predicting text (for example, sentiment classification tasks).
- class zensols.deepnlp.cli.NLPClassifyPackedModelApplication(unpacker)
Bases: object
Classifies data using a packed model. The unpacker is used to install the model (if it is not already installed) and then to provide access to it. A ModelFacade is created from the packaged model that is downloaded. The model then uses the facade’s zensols.deeplearn.model.facade.ModelFacade.predict() method to output the predictions.
- CLI_META = {'mnemonic_excludes': {'predict'}, 'mnemonic_overrides': {'write_model_info': 'modelstat', 'write_predictions': 'predict'}, 'option_excludes': {'unpacker'}, 'option_overrides': {'text_or_file': {'long_name': 'input', 'metavar': '<TEXT|FILE>'}, 'verbose': {'short_name': None}}}
- __init__(unpacker)
- property facade: ModelFacade
The packaged model’s facade.
- unpacker: ModelUnpacker
The model source.
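The "install the model if it is not already installed, then provide access" behavior described above can be illustrated with a minimal install-once sketch. Everything here is an assumption for illustration: `SketchUnpacker`, its `install()` method, and the placeholder file are hypothetical and are not the library's ModelUnpacker API.

```python
import tempfile
from pathlib import Path


class SketchUnpacker:
    """Hypothetical stand-in for an unpacker; NOT the library's ModelUnpacker."""

    def __init__(self, install_dir: Path):
        self.install_dir = install_dir
        self.install_count = 0

    def install(self) -> Path:
        # Only "install" when the target directory does not exist yet.
        if not self.install_dir.exists():
            self.install_dir.mkdir(parents=True)
            # Placeholder for the downloaded and unpacked model files.
            (self.install_dir / 'model.txt').write_text('weights placeholder')
            self.install_count += 1
        return self.install_dir


def run_demo() -> int:
    """Call install twice and report how many real installs happened."""
    with tempfile.TemporaryDirectory() as tmp:
        unpacker = SketchUnpacker(Path(tmp) / 'model')
        unpacker.install()
        unpacker.install()  # finds the existing directory and does nothing
        return unpacker.install_count
```

In this sketch the second call is a no-op, which is the property that lets an application ask for the model on every run without downloading it again.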
- class zensols.deepnlp.cli.NLPFacadeBatchApplication(config, facade_name='facade', model_path=None, config_factory_args=<factory>, config_overwrites=None, cache_global_facade=True, model_config_overwrites=None)
Bases: FacadeApplication
A facade application for creating mini-batches for training.
- CLI_META = {'mnemonic_excludes': {'clear_cached_facade', 'create_facade', 'deallocate', 'get_cached_facade'}, 'mnemonic_overrides': {'dump_batches': 'dumpbatch'}, 'option_overrides': {'model_path': {'long_name': 'model', 'short_name': None}, 'out_format': {'long_name': 'format', 'short_name': 'f'}}}
Tells the command line app API to ignore subclass- and client-specific use case methods.
- __init__(config, facade_name='facade', model_path=None, config_factory_args=<factory>, config_overwrites=None, cache_global_facade=True, model_config_overwrites=None)
- class zensols.deepnlp.cli.NLPFacadeModelApplication(config, facade_name='facade', model_path=None, config_factory_args=<factory>, config_overwrites=None, cache_global_facade=True, model_config_overwrites=None)
Bases: FacadeApplication
A base class facade application for predicting tokens or text.
- CLI_META = {'mnemonic_excludes': {'clear_cached_facade', 'create_facade', 'deallocate', 'get_cached_facade'}, 'mnemonic_overrides': {'predict_text': 'predict'}, 'option_overrides': {'model_path': {'long_name': 'model', 'short_name': None}, 'out_format': {'long_name': 'format', 'short_name': 'f'}, 'verbose': {'long_name': 'verbose', 'short_name': None}}}
Tells the command line app API to ignore subclass- and client-specific use case methods.
- __init__(config, facade_name='facade', model_path=None, config_factory_args=<factory>, config_overwrites=None, cache_global_facade=True, model_config_overwrites=None)
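As a rough illustration of how the `mnemonic_excludes` and `mnemonic_overrides` keys in a CLI_META dictionary might be read, the sketch below filters out excluded method names and renames overridden ones. The `action_names` helper and the `info` method name are hypothetical; the real command line API may apply additional rules that this sketch does not model.

```python
from typing import Any, Dict, List

# Values copied from the NLPFacadeModelApplication entry
# (option_overrides omitted for brevity).
CLI_META: Dict[str, Any] = {
    'mnemonic_excludes': {'clear_cached_facade', 'create_facade',
                          'deallocate', 'get_cached_facade'},
    'mnemonic_overrides': {'predict_text': 'predict'},
}


def action_names(method_names: List[str],
                 meta: Dict[str, Any]) -> Dict[str, str]:
    """Hypothetical helper: map method names to command line action names.

    Excluded methods are dropped; overridden methods get their mnemonic;
    all others keep their own name.
    """
    excludes = meta.get('mnemonic_excludes', set())
    overrides = meta.get('mnemonic_overrides', {})
    return {name: overrides.get(name, name)
            for name in method_names if name not in excludes}


# 'info' is an illustrative method name, not taken from the library.
methods = ['create_facade', 'deallocate', 'predict_text', 'info']
actions = action_names(methods, CLI_META)
print(actions)
```

Under this reading, `predict_text` is exposed as the `predict` action, while `create_facade` and `deallocate` are hidden from the command line.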
- class zensols.deepnlp.cli.NLPSequenceClassifyFacadeModelApplication(config, facade_name='facade', model_path=None, config_factory_args=<factory>, config_overwrites=None, cache_global_facade=True, model_config_overwrites=None)
Bases: NLPFacadeModelApplication
A facade application for predicting tokens (for example, NER tasks).
- __init__(config, facade_name='facade', model_path=None, config_factory_args=<factory>, config_overwrites=None, cache_global_facade=True, model_config_overwrites=None)
zensols.deepnlp.feature module
Stashes that parse feature documents.
- class zensols.deepnlp.feature.DataframeDocumentFeatureStash(delegate, config, name, chunk_size, workers, factory, vec_manager, document_limit=9223372036854775807, text_column='text', additional_columns=None)
Bases: DocumentFeatureStash
Creates FeatureDocument instances from pandas.Series rows from the pandas.DataFrame stash values.
- __init__(delegate, config, name, chunk_size, workers, factory, vec_manager, document_limit=9223372036854775807, text_column='text', additional_columns=None)
- class zensols.deepnlp.feature.DocumentFeatureStash(delegate, config, name, chunk_size, workers, factory, vec_manager, document_limit=9223372036854775807)
Bases: MultiProcessStash
This class parses natural language text into FeatureDocument instances in multiple subprocesses.
- ATTR_EXP_META = ('document_limit',)
- __init__(delegate, config, name, chunk_size, workers, factory, vec_manager, document_limit=9223372036854775807)
- factory: Stash
The stash that creates the factory_data given to _parse_document().
- prime()
If the delegate stash data does not exist, use this implementation to generate the data and process it in child processes.
- vec_manager: FeatureDocumentVectorizerManager
Used to parse text into FeatureDocument instances.
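The default `document_limit` of 9223372036854775807 equals 2**63 - 1, which is `sys.maxsize` on a 64-bit CPython build, so it presumably means "no practical limit". The sketch below shows one way such a cap could be applied to a stream of texts. The `limit_documents` helper is hypothetical and is not how the stash is known to implement the limit.

```python
import sys
from itertools import islice
from typing import Iterable, Iterator

# The default shown in the stash signatures.
DEFAULT_DOCUMENT_LIMIT = 9223372036854775807


def limit_documents(texts: Iterable[str],
                    document_limit: int = DEFAULT_DOCUMENT_LIMIT
                    ) -> Iterator[str]:
    """Hypothetical helper: yield at most ``document_limit`` texts."""
    # islice rejects stop values above sys.maxsize, so clamp the limit.
    return islice(texts, min(document_limit, sys.maxsize))


texts = ['first document', 'second document', 'third document']
capped = list(limit_documents(texts, document_limit=2))
uncapped = list(limit_documents(texts))
```

With the default limit every text passes through, while a small limit truncates the stream, which can be handy for quick experiments on a subset of a corpus.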
zensols.deepnlp.score module
Module contents
Deep learning for NLP applications.