zensols.nlparse.config-parse
Parse a pipeline configuration. This namespace supports a simple DSL for parsing a pipeline configuration (see zensols.nlparse.config). The configuration string is a comma-separated sequence of forms, each naming one pipeline component. For example, the forms:
zensols.nlparse.config/tokenize("en"),zensols.nlparse.config/sentence,part-of-speech("english.tagger"),zensols.nlparse.config/morphology
create a pipeline that tokenizes and adds POS tags and lemmas when called with parse. Note the double quotes in the tokenize and part-of-speech mnemonics. The parse function does this by calling, in order:
- (zensols.nlparse.config/tokenize "en")
- (zensols.nlparse.config/sentence)
- (zensols.nlparse.config/part-of-speech "english.tagger")
- (zensols.nlparse.config/morphology)
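To make the "comma-separated forms" idea concrete, here is a minimal sketch of splitting a configuration string into its individual forms. It is not the library's implementation; `split-forms` and `example-config` are hypothetical names introduced only for illustration. The sketch assumes that only commas outside parentheses separate forms, so that commas inside an argument list are left alone.

```clojure
(defn split-forms
  "Split config-str on commas that are not nested inside parentheses.
  Hypothetical helper for illustration only; quoted strings that contain
  parentheses are not handled."
  [config-str]
  (loop [chars (seq config-str), depth 0, current [], forms []]
    (if-let [c (first chars)]
      (cond
        ;; entering an argument list
        (= c \() (recur (rest chars) (inc depth) (conj current c) forms)
        ;; leaving an argument list
        (= c \)) (recur (rest chars) (dec depth) (conj current c) forms)
        ;; a top-level comma ends the current form
        (and (= c \,) (zero? depth))
        (recur (rest chars) depth [] (conj forms (apply str current)))
        :else (recur (rest chars) depth (conj current c) forms))
      ;; end of input: emit the last form
      (conj forms (apply str current)))))

;; The example configuration from the text above.
(def example-config
  (str "zensols.nlparse.config/tokenize(\"en\"),"
       "zensols.nlparse.config/sentence,"
       "part-of-speech(\"english.tagger\"),"
       "zensols.nlparse.config/morphology"))

(split-forms example-config)
```

For the example string this yields four form strings in the same order as the calls listed above.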
Some configuration functions are parameterized by position or by map. Positional arguments are shown in the example above; a map configuration looks like this:
parse-tree({:use-shift-reduce? true :maxtime 1000})
which creates a shift-reduce parser that times out after one second per sentence.
Note that the arguments (the parenthetical portion of the form) are optional, and so is the namespace, which defaults to zensols.nlparse.config. To use a separate namespace for custom plug-and-play components (see zensols.nlparse.config/register-library), specify your own namespace followed by a /, for example:
example.namespace/myfunc(arg1,arg2)
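The rules above (optional arguments, optional namespace with a default) can be sketched as a small function that turns one form string into the call it stands for. This is an illustrative sketch, not how the library actually parses forms: `form->call` and `default-ns` are hypothetical names, and reading the argument text as EDN is an assumption made here for simplicity.

```clojure
(require '[clojure.edn :as edn]
         '[clojure.string :as str])

;; Namespace used when a form does not name one (per the text above).
(def default-ns "zensols.nlparse.config")

(defn form->call
  "Turn a single form string such as \"tokenize(\\\"en\\\")\" into a list
  whose head is a namespace-qualified symbol and whose tail holds the
  arguments. Returns nil when the form does not match the expected shape.
  Hypothetical helper for illustration only."
  [form]
  (when-let [[_ ns-part fn-name args]
             ;; optional `namespace/`, a function name, optional `(args)`
             (re-matches #"(?:([^/()]+)/)?([^/()]+)(?:\((.*)\))?"
                         (str/trim form))]
    (cons (symbol (or ns-part default-ns) fn-name)
          ;; assumption: arguments are read as EDN, where commas count as
          ;; whitespace; a missing argument list reads as an empty vector
          (edn/read-string (str "[" args "]")))))

(form->call "sentence")
(form->call "parse-tree({:use-shift-reduce? true :maxtime 1000})")
(form->call "example.namespace/myfunc(arg1,arg2)")
```

In this sketch a form without a namespace is qualified with zensols.nlparse.config, a form without parentheses gets no arguments, and a map argument arrives as a single map value.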
parse
(parse config-str)
(parse config-str namespaces)
Parse configuration string config-str into a pipeline configuration. See the namespace (zensols.nlparse.config-parse) documentation for more information.
to-metadata
(to-metadata config-str)
Create form metadata data structures from configuration string config-str.