Token Normalizers and Mappers List
This package provides a simple yet robust way to generate a string stream of tokens using a TokenNormalizer, as described in the parsing documentation (please read that first).
The full list of token normalizers and mappers is given below. The API is designed to be extended, so you can create your own using the configuration factory API.
TokenNormalizer: Base token extractor that returns tuples of tokens and their normalized versions.
TokenMapper: Abstract class used to transform token tuples generated from TokenNormalizer.normalize.
MapTokenNormalizer: A normalizer that applies a sequence of TokenMappers to transform the normalized token text.
SplitTokenMapper: Splits the normalized text on a per token basis with a regular expression.
LemmatizeTokenMapper: Lemmatize tokens and optionally remove entity stop words.
FilterTokenMapper: Filter tokens based on token (spaCy) attributes.
SubstituteTokenMapper: Replace a string in normalized token text.
LambdaTokenMapper: Use a lambda expression to map a token tuple.
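
To illustrate how a chain of mappers transforms token tuples, here is a minimal, self-contained sketch of the idea. It does not use this package's classes: the names `split_mapper`, `substitute_mapper`, `lambda_mapper` and `normalize` are hypothetical stand-ins, and a token tuple is assumed to be a plain `(original text, normalized text)` pair rather than a spaCy token.

```python
import re
from typing import Callable, Iterable, Iterator, List, Tuple

# Assumption: a token tuple is (original token text, normalized text).
TokenTuple = Tuple[str, str]
Mapper = Callable[[Iterable[TokenTuple]], Iterator[TokenTuple]]


def split_mapper(pattern: str) -> Mapper:
    """Hypothetical analogue of a split mapper: split normalized text on a regex."""
    regex = re.compile(pattern)

    def apply(tokens: Iterable[TokenTuple]) -> Iterator[TokenTuple]:
        for orig, norm in tokens:
            for part in regex.split(norm):
                if part:  # skip empty strings produced by the split
                    yield orig, part
    return apply


def substitute_mapper(pattern: str, replacement: str) -> Mapper:
    """Hypothetical analogue of a substitute mapper: replace text in the normalized form."""
    regex = re.compile(pattern)

    def apply(tokens: Iterable[TokenTuple]) -> Iterator[TokenTuple]:
        for orig, norm in tokens:
            yield orig, regex.sub(replacement, norm)
    return apply


def lambda_mapper(fn: Callable[[TokenTuple], TokenTuple]) -> Mapper:
    """Hypothetical analogue of a lambda mapper: apply a function to each token tuple."""
    def apply(tokens: Iterable[TokenTuple]) -> Iterator[TokenTuple]:
        return map(fn, tokens)
    return apply


def normalize(words: Iterable[str], mappers: List[Mapper]) -> List[TokenTuple]:
    """Start with normalized text equal to the original, then apply each mapper in order."""
    tokens: Iterable[TokenTuple] = ((w, w) for w in words)
    for mapper in mappers:
        tokens = mapper(tokens)
    return list(tokens)


result = normalize(
    ["State-of-the-art", "NLP"],
    [
        split_mapper(r"-"),
        substitute_mapper(r"^NLP$", "nlp"),
        lambda_mapper(lambda t: (t[0], t[1].lower())),
    ],
)
print([norm for _, norm in result])
```

Each mapper consumes and produces a stream of token tuples, so mappers compose in sequence and the original token text stays available alongside the normalized form.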