Token Normalizers and Mappers List

This package provides a simple yet robust way to generate a stream of token strings using a TokenNormalizer, as described in the parsing documentation (please read that first).

The full list of token normalizers and mappers is given below. Note that the API was written to be easy to extend, so you can create your own using the configuration factory API.

TokenNormalizer: Base token extractor that returns tuples of tokens and their normalized versions.
TokenMapper: Abstract class used to transform token tuples generated from TokenNormalizer.normalize.
MapTokenNormalizer: A normalizer that applies a sequence of TokenMappers to transform the normalized token text.
SplitTokenMapper: Splits the normalized text on a per token basis with a regular expression.
LemmatizeTokenMapper: Lemmatizes tokens and optionally removes entity stop words.
FilterTokenMapper: Filters tokens based on token (spaCy) attributes.
SubstituteTokenMapper: Replaces a string in the normalized token text.
LambdaTokenMapper: Uses a lambda expression to map a token tuple.
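
The way a normalizer chains mappers over (token, normalized text) tuples can be illustrated with a small, self-contained sketch. It does not use this package's classes: every Sketch* name, the map_tokens method, and the constructor parameters below are hypothetical stand-ins chosen for illustration, and plain strings take the place of spaCy tokens. It only mirrors the roles described in the list above (split, substitute, lambda, and a normalizer that applies mappers in sequence).

```python
import re
from typing import Callable, Iterable, Iterator, List, Tuple

# A (token, normalized text) pair; plain strings stand in for spaCy tokens.
TokenTuple = Tuple[str, str]


class SketchMapper:
    """Hypothetical stand-in for the abstract mapper role."""

    def map_tokens(self, tuples: Iterable[TokenTuple]) -> Iterator[TokenTuple]:
        raise NotImplementedError


class SketchSplitMapper(SketchMapper):
    """Split each normalized text with a regular expression."""

    def __init__(self, regex: str):
        self.pattern = re.compile(regex)

    def map_tokens(self, tuples):
        for tok, norm in tuples:
            for part in self.pattern.split(norm):
                if part:  # skip empty pieces produced by the split
                    yield tok, part


class SketchSubstituteMapper(SketchMapper):
    """Replace regex matches in the normalized text."""

    def __init__(self, regex: str, replace: str):
        self.pattern = re.compile(regex)
        self.replace = replace

    def map_tokens(self, tuples):
        for tok, norm in tuples:
            yield tok, self.pattern.sub(self.replace, norm)


class SketchLambdaMapper(SketchMapper):
    """Apply an arbitrary callable to every token tuple."""

    def __init__(self, func: Callable[[TokenTuple], TokenTuple]):
        self.func = func

    def map_tokens(self, tuples):
        return map(self.func, tuples)


class SketchMapNormalizer:
    """Apply a sequence of mappers, in order, to the token stream."""

    def __init__(self, mappers: Iterable[SketchMapper]):
        self.mappers = list(mappers)

    def normalize(self, tokens: Iterable[str]) -> List[TokenTuple]:
        # Start with each token normalized to itself.
        stream: Iterable[TokenTuple] = ((t, t) for t in tokens)
        for mapper in self.mappers:
            stream = mapper.map_tokens(stream)
        return list(stream)


normalizer = SketchMapNormalizer([
    SketchSplitMapper(r'[-_]'),
    SketchLambdaMapper(lambda tn: (tn[0], tn[1].lower())),
    SketchSubstituteMapper(r'\d+', '<NUM>'),
])
result = normalizer.normalize(['State-of-the-Art', 'model_42'])
print([norm for _, norm in result])
# → ['state', 'of', 'the', 'art', 'model', '<NUM>']
```

Each tuple keeps the original token as its first element, so downstream code can still reach the source token after the normalized text has been split or rewritten. Mapper order matters: here the lowercase step runs before the substitution so the `<NUM>` placeholder is not lowercased.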