zensols.deepnlp package#
Subpackages#
- zensols.deepnlp.classify package
- Submodules
- zensols.deepnlp.classify.domain
LabeledBatch
LabeledBatch.COUNTS_ATTRIBUTE
LabeledBatch.DEPENDENCIES_ATTRIBUTE
LabeledBatch.DEPENDENCY_EXPANDER_ATTRIBTE
LabeledBatch.EMBEDDING_ATTRIBUTES
LabeledBatch.ENUMS_ATTRIBUTE
LabeledBatch.ENUM_EXPANDER_ATTRIBUTE
LabeledBatch.FASTTEXT_CRAWL_300_EMBEDDING
LabeledBatch.FASTTEXT_NEWS_300_EMBEDDING
LabeledBatch.GLOVE_300_EMBEDDING
LabeledBatch.GLOVE_50_EMBEDDING
LabeledBatch.LANGUAGE_ATTRIBUTES
LabeledBatch.LANGUAGE_FEATURE_MANAGER_NAME
LabeledBatch.MAPPINGS
LabeledBatch.STATS_ATTRIBUTE
LabeledBatch.TRANSFORMER_FIXED_EMBEDDING
LabeledBatch.TRANSFORMER_TRAINBLE_EMBEDDING
LabeledBatch.WORD2VEC_300_EMBEDDING
LabeledBatch.__init__()
LabeledFeatureDocument
LabeledFeatureDocumentDataPoint
TokenContainerDataPoint
- zensols.deepnlp.classify.facade
- zensols.deepnlp.classify.model
- zensols.deepnlp.classify.pred
ClassificationPredictionMapper
ClassificationPredictionMapper.__init__()
ClassificationPredictionMapper.label_feature_id
ClassificationPredictionMapper.label_vectorizer
ClassificationPredictionMapper.map_results()
ClassificationPredictionMapper.pred_attribute
ClassificationPredictionMapper.softmax_logit_attribute
ClassificationPredictionMapper.vec_manager
SequencePredictionMapper
- Module contents
- zensols.deepnlp.embed package
- Submodules
- zensols.deepnlp.embed.doc
- zensols.deepnlp.embed.domain
WordEmbedError
WordEmbedModel
WordEmbedModel.UNKNOWN
WordEmbedModel.ZERO
WordEmbedModel.__init__()
WordEmbedModel.cache
WordEmbedModel.clear_cache()
WordEmbedModel.deallocate()
WordEmbedModel.get()
WordEmbedModel.keyed_vectors
WordEmbedModel.keys()
WordEmbedModel.lowercase
WordEmbedModel.matrix
WordEmbedModel.model_id
WordEmbedModel.name
WordEmbedModel.prime()
WordEmbedModel.shape
WordEmbedModel.to_matrix()
WordEmbedModel.unk_idx
WordEmbedModel.vector_dimension
WordEmbedModel.vectors
WordEmbedModel.word2idx()
WordEmbedModel.word2idx_or_unk()
WordVectorModel
- zensols.deepnlp.embed.fasttext
- zensols.deepnlp.embed.glove
- zensols.deepnlp.embed.word2vec
- zensols.deepnlp.embed.wordtext
- Module contents
- zensols.deepnlp.index package
- Submodules
- zensols.deepnlp.index.domain
- zensols.deepnlp.index.lda
- zensols.deepnlp.index.lsi
LatentSemanticDocumentIndexerVectorizer
LatentSemanticDocumentIndexerVectorizer.DESCRIPTION
LatentSemanticDocumentIndexerVectorizer.FEATURE_TYPE
LatentSemanticDocumentIndexerVectorizer.__init__()
LatentSemanticDocumentIndexerVectorizer.components
LatentSemanticDocumentIndexerVectorizer.iterations
LatentSemanticDocumentIndexerVectorizer.lsa
LatentSemanticDocumentIndexerVectorizer.similarity()
LatentSemanticDocumentIndexerVectorizer.vectorizer
LatentSemanticDocumentIndexerVectorizer.vectorizer_params
- Module contents
- zensols.deepnlp.layer package
- Submodules
- zensols.deepnlp.layer.conv
DeepConvolution1d
DeepConvolution1dNetworkSettings
DeepConvolution1dNetworkSettings.__init__()
DeepConvolution1dNetworkSettings.batch_norm_d
DeepConvolution1dNetworkSettings.clone()
DeepConvolution1dNetworkSettings.embedding_dimension
DeepConvolution1dNetworkSettings.get_module_class_name()
DeepConvolution1dNetworkSettings.layer_factory
DeepConvolution1dNetworkSettings.n_filters
DeepConvolution1dNetworkSettings.padding
DeepConvolution1dNetworkSettings.pool_factory
DeepConvolution1dNetworkSettings.pool_padding
DeepConvolution1dNetworkSettings.pool_stride
DeepConvolution1dNetworkSettings.pool_token_kernel
DeepConvolution1dNetworkSettings.repeats
DeepConvolution1dNetworkSettings.stride
DeepConvolution1dNetworkSettings.token_kernel
DeepConvolution1dNetworkSettings.token_length
DeepConvolution1dNetworkSettings.write()
- zensols.deepnlp.layer.embed
EmbeddingLayer
EmbeddingNetworkModule
EmbeddingNetworkModule.MODULE_NAME
EmbeddingNetworkModule.__init__()
EmbeddingNetworkModule.embedding_dimension
EmbeddingNetworkModule.forward_document_features()
EmbeddingNetworkModule.forward_embedding_features()
EmbeddingNetworkModule.forward_token_features()
EmbeddingNetworkModule.get_embedding_tensors()
EmbeddingNetworkModule.vectorizer_by_name()
EmbeddingNetworkSettings
TrainableEmbeddingLayer
- zensols.deepnlp.layer.embrecurcrf
- zensols.deepnlp.layer.wordvec
- Module contents
- zensols.deepnlp.model package
- Submodules
- zensols.deepnlp.model.facade
LanguageModelFacade
LanguageModelFacade.__init__()
LanguageModelFacade.count_feature_ids
LanguageModelFacade.doc_parser
LanguageModelFacade.embedding
LanguageModelFacade.enum_feature_ids
LanguageModelFacade.get_max_word_piece_len()
LanguageModelFacade.get_transformer_vectorizer()
LanguageModelFacade.language_attributes
LanguageModelFacade.language_vectorizer_manager
LanguageModelFacade.suppress_transformer_warnings
LanguageModelFacadeConfig
- zensols.deepnlp.model.sequence
- Module contents
- zensols.deepnlp.transformer package
- Submodules
- zensols.deepnlp.transformer.domain
TokenizedDocument
TokenizedDocument.__init__()
TokenizedDocument.attention_mask
TokenizedDocument.boundary_tokens
TokenizedDocument.deallocate()
TokenizedDocument.detach()
TokenizedDocument.from_tensor()
TokenizedDocument.get_wordpiece_count()
TokenizedDocument.input_ids
TokenizedDocument.map_to_word_pieces()
TokenizedDocument.map_word_pieces()
TokenizedDocument.offsets
TokenizedDocument.params()
TokenizedDocument.shape
TokenizedDocument.tensor
TokenizedDocument.token_type_ids
TokenizedDocument.truncate()
TokenizedDocument.write()
TokenizedFeatureDocument
- zensols.deepnlp.transformer.embed
TransformerEmbedding
TransformerEmbedding.ALL_OUTPUT
TransformerEmbedding.LAST_HIDDEN_STATE_OUTPUT
TransformerEmbedding.POOLER_OUTPUT
TransformerEmbedding.__init__()
TransformerEmbedding.cache
TransformerEmbedding.model
TransformerEmbedding.name
TransformerEmbedding.output
TransformerEmbedding.output_attentions
TransformerEmbedding.resource
TransformerEmbedding.tokenize()
TransformerEmbedding.tokenizer
TransformerEmbedding.trainable
TransformerEmbedding.transform()
TransformerEmbedding.vector_dimension
- zensols.deepnlp.transformer.layer
- zensols.deepnlp.transformer.mask
- zensols.deepnlp.transformer.optimizer
- zensols.deepnlp.transformer.pred
- zensols.deepnlp.transformer.resource
TransformerError
TransformerResource
TransformerResource.__init__()
TransformerResource.args
TransformerResource.cache
TransformerResource.cache_dir
TransformerResource.cached
TransformerResource.cased
TransformerResource.clear()
TransformerResource.model
TransformerResource.model_args
TransformerResource.model_class
TransformerResource.model_id
TransformerResource.name
TransformerResource.tokenizer
TransformerResource.tokenizer_args
TransformerResource.tokenizer_class
TransformerResource.torch_config
TransformerResource.trainable
- zensols.deepnlp.transformer.tokenizer
TransformerDocumentTokenizer
TransformerDocumentTokenizer.DEFAULT_PARAMS
TransformerDocumentTokenizer.__init__()
TransformerDocumentTokenizer.all_special_tokens
TransformerDocumentTokenizer.id2tok
TransformerDocumentTokenizer.params
TransformerDocumentTokenizer.pretrained_tokenizer
TransformerDocumentTokenizer.resource
TransformerDocumentTokenizer.token_max_length
TransformerDocumentTokenizer.tokenize()
TransformerDocumentTokenizer.word_piece_token_length
- zensols.deepnlp.transformer.vectorizers
LabelTransformerFeatureVectorizer
TransformerEmbeddingFeatureVectorizer
TransformerExpanderFeatureContext
TransformerExpanderFeatureVectorizer
TransformerFeatureContext
TransformerFeatureVectorizer
TransformerMaskFeatureVectorizer
TransformerNominalFeatureVectorizer
TransformerNominalFeatureVectorizer.DESCRIPTION
TransformerNominalFeatureVectorizer.__init__()
TransformerNominalFeatureVectorizer.annotations_attribute
TransformerNominalFeatureVectorizer.delegate_feature_id
TransformerNominalFeatureVectorizer.label_all_tokens
TransformerNominalFeatureVectorizer.write()
- zensols.deepnlp.transformer.wordpiece
CachingWordPieceFeatureDocumentFactory
WordPiece
WordPieceDocumentDecorator
WordPieceFeatureDocument
WordPieceFeatureDocumentFactory
WordPieceFeatureDocumentFactory.__init__()
WordPieceFeatureDocumentFactory.add_sent_embeddings()
WordPieceFeatureDocumentFactory.add_token_embeddings()
WordPieceFeatureDocumentFactory.create()
WordPieceFeatureDocumentFactory.embed_model
WordPieceFeatureDocumentFactory.sent_embeddings
WordPieceFeatureDocumentFactory.token_embeddings
WordPieceFeatureDocumentFactory.tokenizer
WordPieceFeatureSentence
WordPieceFeatureSpan
WordPieceFeatureToken
WordPieceFeatureToken.__init__()
WordPieceFeatureToken.clone()
WordPieceFeatureToken.copy_embedding()
WordPieceFeatureToken.detach()
WordPieceFeatureToken.embedding
WordPieceFeatureToken.indexes
WordPieceFeatureToken.is_unknown
WordPieceFeatureToken.token_embedding
WordPieceFeatureToken.word_iter()
WordPieceFeatureToken.words
WordPieceFeatureToken.write()
WordPieceTokenContainer
- Module contents
- zensols.deepnlp.vectorize package
- Submodules
- zensols.deepnlp.vectorize.embed
- zensols.deepnlp.vectorize.manager
FeatureDocumentVectorizer
FeatureDocumentVectorizerManager
FeatureDocumentVectorizerManager.__init__()
FeatureDocumentVectorizerManager.deallocate()
FeatureDocumentVectorizerManager.doc_parser
FeatureDocumentVectorizerManager.get_token_length()
FeatureDocumentVectorizerManager.is_batch_token_length
FeatureDocumentVectorizerManager.parse()
FeatureDocumentVectorizerManager.spacy_vectorizers
FeatureDocumentVectorizerManager.token_feature_ids
FeatureDocumentVectorizerManager.token_length
FoldingDocumentVectorizer
MultiDocumentVectorizer
TextFeatureType
- zensols.deepnlp.vectorize.spacy
DependencyFeatureVectorizer
NamedEntityRecognitionFeatureVectorizer
PartOfSpeechFeatureVectorizer
SpacyFeatureVectorizer
SpacyFeatureVectorizer.VECTORIZERS
SpacyFeatureVectorizer.__init__()
SpacyFeatureVectorizer.dist()
SpacyFeatureVectorizer.from_spacy()
SpacyFeatureVectorizer.id_from_spacy()
SpacyFeatureVectorizer.id_from_spacy_symbol()
SpacyFeatureVectorizer.torch_config
SpacyFeatureVectorizer.transform()
SpacyFeatureVectorizer.vocab
SpacyFeatureVectorizer.write()
- zensols.deepnlp.vectorize.vectorizers
CountEnumContainerFeatureVectorizer
CountEnumContainerFeatureVectorizer.ATTR_EXP_META
CountEnumContainerFeatureVectorizer.DESCRIPTION
CountEnumContainerFeatureVectorizer.FEATURE_TYPE
CountEnumContainerFeatureVectorizer.__init__()
CountEnumContainerFeatureVectorizer.decoded_feature_ids
CountEnumContainerFeatureVectorizer.get_feature_counts()
CountEnumContainerFeatureVectorizer.to_symbols()
DepthFeatureDocumentVectorizer
EnumContainerFeatureVectorizer
MutualFeaturesContainerFeatureVectorizer
OneHotEncodedFeatureDocumentVectorizer
OverlappingFeatureDocumentVectorizer
StatisticsFeatureDocumentVectorizer
TokenEmbeddingFeatureVectorizer
WordEmbeddingFeatureVectorizer
- Module contents
Submodules#
zensols.deepnlp.cli#
Facade application implementations for NLP use.
- class zensols.deepnlp.cli.NLPClassifyFacadeModelApplication(config, facade_name='facade', model_path=None, config_factory_args=<factory>, config_overwrites=None, cache_global_facade=True, model_config_overwrites=None)[source]#
Bases: NLPFacadeModelApplication
A facade application for predicting text (for example sentiment classification tasks).
- class zensols.deepnlp.cli.NLPClassifyPackedModelApplication(unpacker)[source]#
Bases: object
Classifies data using a packed model. The unpacker is used to install the model (if it is not already installed), then provide access to it. A ModelFacade is created from the packaged model that is downloaded. The model then uses the facade’s zensols.deeplearn.model.facade.ModelFacade.predict() method to output the predictions.
- CLI_META = {'mnemonic_excludes': {'predict'}, 'mnemonic_overrides': {'write_model_info': 'modelstat', 'write_predictions': 'predict'}, 'option_excludes': {'unpacker'}, 'option_overrides': {'text_or_file': {'long_name': 'input', 'metavar': '<TEXT|FILE>'}, 'verbose': {'short_name': None}}}#
- __init__(unpacker)#
- property facade: ModelFacade#
The packaged model’s facade.
- unpacker: ModelUnpacker#
The model source.
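The CLI_META value of this class suggests a mapping from method names to command-line action names. The following is a minimal sketch of one plausible reading, not the zensols.cli implementation: resolve_mnemonics is a hypothetical helper, the method list is assumed for illustration, and the assumption is that names in mnemonic_excludes are hidden while mnemonic_overrides renames the methods that remain.

```python
from typing import Dict, Iterable, Set

# Only the mnemonic keys, copied from the CLI_META of
# NLPClassifyPackedModelApplication.
CLI_META = {
    'mnemonic_excludes': {'predict'},
    'mnemonic_overrides': {'write_model_info': 'modelstat',
                           'write_predictions': 'predict'},
}


def resolve_mnemonics(method_names: Iterable[str],
                      excludes: Set[str],
                      overrides: Dict[str, str]) -> Dict[str, str]:
    """Hypothetical helper: return a mapping of action name to method name."""
    actions: Dict[str, str] = {}
    for name in method_names:
        if name in excludes:
            # Assumption: excluded methods are not exposed as actions.
            continue
        # Assumption: an override replaces the default (method) name.
        actions[overrides.get(name, name)] = name
    return actions


# Assumed method names, for illustration only.
methods = ['predict', 'write_predictions', 'write_model_info']
actions = resolve_mnemonics(methods,
                            CLI_META['mnemonic_excludes'],
                            CLI_META['mnemonic_overrides'])
print(actions)
```

Under this reading, the predict method is hidden and write_predictions is exposed under the action name predict instead.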
- class zensols.deepnlp.cli.NLPFacadeBatchApplication(config, facade_name='facade', model_path=None, config_factory_args=<factory>, config_overwrites=None, cache_global_facade=True, model_config_overwrites=None)[source]#
Bases: FacadeApplication
A facade application for creating mini-batches for training.
- CLI_META = {'mnemonic_excludes': {'clear_cached_facade', 'create_facade', 'deallocate', 'get_cached_facade'}, 'mnemonic_overrides': {'dump_batches': 'dumpbatch'}, 'option_overrides': {'model_path': {'long_name': 'model', 'short_name': None}}}#
Tell the command line app API to ignore subclass and client specific use case methods.
- __init__(config, facade_name='facade', model_path=None, config_factory_args=<factory>, config_overwrites=None, cache_global_facade=True, model_config_overwrites=None)#
- class zensols.deepnlp.cli.NLPFacadeModelApplication(config, facade_name='facade', model_path=None, config_factory_args=<factory>, config_overwrites=None, cache_global_facade=True, model_config_overwrites=None)[source]#
Bases: FacadeApplication
A base class facade application for predicting tokens or text.
- CLI_META = {'mnemonic_excludes': {'clear_cached_facade', 'create_facade', 'deallocate', 'get_cached_facade'}, 'mnemonic_overrides': {'predict_text': 'predict'}, 'option_overrides': {'model_path': {'long_name': 'model', 'short_name': None}, 'verbose': {'long_name': 'verbose', 'short_name': None}}}#
Tell the command line app API to ignore subclass and client specific use case methods.
- __init__(config, facade_name='facade', model_path=None, config_factory_args=<factory>, config_overwrites=None, cache_global_facade=True, model_config_overwrites=None)#
- class zensols.deepnlp.cli.NLPSequenceClassifyFacadeModelApplication(config, facade_name='facade', model_path=None, config_factory_args=<factory>, config_overwrites=None, cache_global_facade=True, model_config_overwrites=None)[source]#
Bases: NLPFacadeModelApplication
A facade application for predicting tokens (for example NER tasks).
- __init__(config, facade_name='facade', model_path=None, config_factory_args=<factory>, config_overwrites=None, cache_global_facade=True, model_config_overwrites=None)#
zensols.deepnlp.feature#
- class zensols.deepnlp.feature.DataframeDocumentFeatureStash(delegate, config, name, chunk_size, workers, factory, vec_manager, document_limit=9223372036854775807, text_column='text', additional_columns=None)[source]#
Bases: DocumentFeatureStash
Creates FeatureDocument instances from pandas.Series rows from the pandas.DataFrame stash values.
- __init__(delegate, config, name, chunk_size, workers, factory, vec_manager, document_limit=9223372036854775807, text_column='text', additional_columns=None)#
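As a rough illustration of the text_column and additional_columns parameters, the sketch below turns row-like records into simple document objects. It is not the library's code: SimpleDocument and rows_to_documents are hypothetical names, plain dictionaries stand in for pandas.Series rows, and copying the additional columns onto the document is an assumption about what the parameter is for.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, List, Mapping, Optional, Sequence


@dataclass
class SimpleDocument:
    """Hypothetical stand-in for FeatureDocument."""
    text: str
    attributes: Dict[str, Any] = field(default_factory=dict)


def rows_to_documents(rows: Iterable[Mapping[str, Any]],
                      text_column: str = 'text',
                      additional_columns: Optional[Sequence[str]] = None
                      ) -> List[SimpleDocument]:
    """Create one document per row from the text column.

    Assumption: values of ``additional_columns`` are carried over to the
    document as extra attributes.
    """
    docs: List[SimpleDocument] = []
    for row in rows:
        extra = {col: row[col] for col in (additional_columns or ())}
        docs.append(SimpleDocument(text=row[text_column], attributes=extra))
    return docs


# Dictionaries stand in for DataFrame rows here.
rows = [{'text': 'A fine film.', 'label': 'pos'},
        {'text': 'Too long.', 'label': 'neg'}]
docs = rows_to_documents(rows, additional_columns=['label'])
print(docs)
```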
- class zensols.deepnlp.feature.DocumentFeatureStash(delegate, config, name, chunk_size, workers, factory, vec_manager, document_limit=9223372036854775807)[source]#
Bases: MultiProcessStash
This class parses natural language text into FeatureDocument instances in multiple subprocesses.
- ATTR_EXP_META = ('document_limit',)#
- __init__(delegate, config, name, chunk_size, workers, factory, vec_manager, document_limit=9223372036854775807)#
- factory: Stash#
The stash that creates the factory_data given to _parse_document().
- prime()[source]#
If the delegate stash data does not exist, use this implementation to generate the data and process it in child processes.
- vec_manager: FeatureDocumentVectorizerManager#
Used to parse text into FeatureDocument instances.
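The default document_limit of 9223372036854775807 equals 2**63 - 1 (sys.maxsize on 64-bit CPython builds), so in practice it means no limit. The sketch below shows one way a limit and a chunk size could partition keys into groups of work. It is an illustration only: chunk_keys is a hypothetical function, and the assumption is that chunk_size is the number of items handed to a child process at a time.

```python
import sys
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar('T')


def chunk_keys(keys: Iterable[T], chunk_size: int,
               document_limit: int = sys.maxsize) -> Iterator[List[T]]:
    """Yield lists of at most ``chunk_size`` keys, stopping after
    ``document_limit`` keys in total (hypothetical helper)."""
    # Cap the total number of keys first, then slice off chunks.
    limited = islice(keys, document_limit)
    while True:
        chunk = list(islice(limited, chunk_size))
        if not chunk:
            return
        yield chunk


chunks = list(chunk_keys(range(10), chunk_size=4, document_limit=7))
print(chunks)
```

Each yielded list could then be handed to one worker, with the last chunk possibly shorter than chunk_size.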
zensols.deepnlp.score#
Module contents#
Deep learning for NLP applications.