NLP Parsing and Feature Creation 0.1.6

Released under the Apache License version 2.0

A library for parsing natural language and creating features.

Installation

To install, add the following dependency to your project or build file:

[com.zensols.nlp/parse "0.1.6"]
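The vector above is in the Leiningen dependency format. As a minimal sketch, assuming a Leiningen build, it would go in the :dependencies vector of project.clj. The project name, project version, and Clojure version below are hypothetical placeholders, not requirements stated by this library.

```clojure
;; project.clj (sketch): my-nlp-app, its version, and the Clojure version
;; are placeholders chosen for illustration only.
(defproject my-nlp-app "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.8.0"]
                 ;; the coordinate given in the installation instructions
                 [com.zensols.nlp/parse "0.1.6"]])
```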

Namespaces

zensols.nlparse.config

Configure the Stanford CoreNLP parser.

zensols.nlparse.config-parse

Parse a pipeline configuration. This namespace supports a simple DSL for describing a pipeline configuration (see zensols.nlparse.config). The configuration string is a set of forms separated by commas, each representing a pipeline component. For example, the forms:

zensols.nlparse.config/tokenize("en"),zensols.nlparse.config/sentence,part-of-speech("english.tagger"),zensols.nlparse.config/morphology

create a pipeline that tokenizes, splits sentences, and adds part-of-speech (POS) tags and lemmas when called with parse. Note the double quotes around the arguments in the tokenize and part-of-speech mnemonics. The parse function does this by calling in order:
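To make the shape of the configuration string concrete, here is a rough sketch that breaks the example string into its ordered forms using only clojure.string and regular expressions. It is an illustration, not this library's implementation: split-forms and parse-form are hypothetical helper names, and the sketch leaves component names exactly as written rather than resolving them to functions.

```clojure
(require '[clojure.string :as str])

;; The example configuration string from the text above.
(def config-str
  (str "zensols.nlparse.config/tokenize(\"en\"),"
       "zensols.nlparse.config/sentence,"
       "part-of-speech(\"english.tagger\"),"
       "zensols.nlparse.config/morphology"))

(defn split-forms
  "Hypothetical helper: split a configuration string into its
  comma-separated forms, keeping a parenthesized argument list
  attached to the form it belongs to."
  [s]
  (re-seq #"[^,(]+(?:\([^)]*\))?" s))

(defn parse-form
  "Hypothetical helper: turn one form into a map holding the
  component name and a vector of its double-quoted arguments."
  [form]
  (let [[_ component args] (re-matches #"([^(]+)(?:\((.*)\))?" form)]
    {:name (str/trim component)
     :args (mapv second (re-seq #"\"([^\"]*)\"" (or args "")))}))

(def parsed-forms
  (mapv parse-form (split-forms config-str)))

;; Print the component names in pipeline order.
(doseq [{:keys [name args]} parsed-forms]
  (println name args))
```

The order of the resulting vector matches the order of the forms in the string, which is the property the DSL relies on: components run in the sequence they are listed.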

zensols.nlparse.feature.lang

Feature utility functions. In this library, panon stands for parsed annotation, the value returned from zensols.nlparse.parse/parse.

zensols.nlparse.feature.word-count

Feature utility functions. See zensols.nlparse.feature.lang.

zensols.nlparse.parse

Parse an utterance using Stanford CoreNLP and the ClearNLP SRL (semantic role labeler).

zensols.nlparse.resource

Configure the environment for the NLP pipeline.

zensols.nlparse.srl

Wraps the ClearNLP SRL.

zensols.nlparse.stanford

Wraps the Stanford CoreNLP parser.

zensols.nlparse.stopword

This namespace provides ways of filtering stop word tokens.

zensols.nlparse.tok-re

This namespace extends the NER system to easily add any regular expression using the Stanford TokensRegex API.

zensols.nlparse.util

Utility functions.

zensols.nlparse.version
