zensols.nlparse.feature.word
Feature utility functions for tokens and words.
dictionary-feature-metas
(dictionary-feature-metas)
(dictionary-feature-metas lang-codes)
See dictionary-features.
dictionary-features
(dictionary-features tokens)
(dictionary-features tokens lang-codes)
Dictionary features include the in/out-of-vocabulary ratio. The lang-codes parameter is a hash set of two-letter language code strings (see zensols.nlparse.wordlist/in-word-list?) to look up, which defaults to en for English.
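To make the in-vocabulary ratio concrete, here is a minimal sketch in plain Clojure. It does not call this library: the word-lists map, the in-vocab-ratio function, and the use of plain strings as tokens are all assumptions made for illustration, and the real word lists and token representation may differ.

```clojure
(require '[clojure.string :as str])

;; Hypothetical stand-in for per-language word lists (not the library's data).
(def word-lists
  {"en" #{"the" "cat" "sat" "on" "mat"}})

(defn in-vocab-ratio
  "Fraction of token strings found in the word lists named by lang-codes.
  Tokens are plain strings here, which is an assumption of this sketch."
  ([tokens] (in-vocab-ratio tokens #{"en"}))
  ([tokens lang-codes]
   (let [;; merge the word lists of every requested language into one set
         vocab (into #{} (mapcat word-lists) lang-codes)
         hits  (count (filter #(contains? vocab (str/lower-case %)) tokens))]
     (if (empty? tokens)
       0
       (/ hits (count tokens))))))

;; Five of the six tokens are in the toy English list.
(in-vocab-ratio ["The" "cat" "sat" "on" "the" "zzyx"])
```

The one-argument arity mirrors the documented default of en, and an unknown language code simply contributes no words in this sketch.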
token-features
(token-features panon tokens)
Return token features for panon for all tokens. The following features are given:
- :utterance-length The character length of the utterance.
- :mention-count Number of mentions in the utterance.
- :sent-count Number of sentences in the utterance.
- :token-count Total tokens across all sentences.
- :token-average-length Average character length of all tokens.
- :stopword-count Number of stop words in the utterance.
- :is-question Whether or not the last token across all sentences is a question.
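The surface features in the list above can be sketched in plain Clojure as follows. This is not the library's implementation: the stopwords set and the simple-token-features function are hypothetical, sentences are modeled as vectors of token strings rather than as a parse annotation, :mention-count is omitted because it needs real parser output, and :is-question is assumed to mean that the final token is a question mark.

```clojure
(require '[clojure.string :as str])

;; Hypothetical stop word set, for illustration only.
(def stopwords #{"is" "the" "a" "it"})

(defn simple-token-features
  "Compute surface features from an utterance string and its sentences,
  where each sentence is a vector of token strings (an assumption)."
  [utterance sents]
  (let [tokens (vec (apply concat sents))
        n      (count tokens)]
    {:utterance-length     (count utterance)
     :sent-count           (count sents)
     :token-count          n
     ;; guard against division by zero for an empty utterance
     :token-average-length (if (zero? n)
                             0.0
                             (double (/ (reduce + (map count tokens)) n)))
     :stopword-count       (count (filter #(contains? stopwords (str/lower-case %))
                                          tokens))
     ;; assumed meaning: the last token overall is "?"
     :is-question          (= "?" (peek tokens))}))

(simple-token-features "It rained. Is the sky gray?"
                       [["It" "rained" "."] ["Is" "the" "sky" "gray" "?"]])
```

Returning a single map keyed by feature name keeps the result easy to merge with feature maps from other sources.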
wordnet-feature-metas
(wordnet-feature-metas)
wordnet-features
(wordnet-features word)
(wordnet-features word pos-tag)
Get WordNet-derived features for word.
- word: the word to look up
- pos-tag: a WordNet POS tag (see zensols.nlparse.wordnet/pos-tags)