zensols.annotate.gate
Wrapper for the Gate natural language processing annotation utility. This small wrapper makes the following easier:
- Annotating documents
- Creating and storing documents
- Creating annotation schemas
*corpus-name*
dynamic
The default corpus name when creating a Gate data store.
annotate-document
(annotate-document start end label doc)
(annotate-document start end label features doc)
Annotate document doc with an entity of type label over the half-open character interval [start, end), optionally attaching additional entity metadata given in features.
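The half-open interval convention is the same one Clojure's `subs` uses, so offsets can be checked against the raw text before annotating. The following is a minimal sketch: `mention-span` is a hypothetical helper that is not part of this namespace, and the call inside the `comment` form assumes the library is on the classpath and that the label is passed as a string.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper (not part of zensols.annotate.gate): return the
;; half-open character interval [start end) of the first occurrence of
;; `mention` in `text`, or nil when the mention is absent.
(defn mention-span [text mention]
  (when-let [start (str/index-of text mention)]
    [start (+ start (count mention))]))

(def text "Barack Obama was born in Hawaii.")

(def span (mention-span text "Barack Obama"))

;; `subs` takes an inclusive start and exclusive end, so it recovers
;; exactly the text that the interval covers.
(println span (apply subs text span))

(comment
  ;; Needs the library on the classpath. Passing the label as a string is
  ;; an assumption; the signatures are the ones documented above.
  (require '[zensols.annotate.gate :as gate])
  (let [doc (gate/create-document text "example")
        [start end] span]
    (gate/annotate-document start end "Person" doc)))
```

Checking the span with `subs` first is a cheap way to catch off-by-one errors, since an inclusive end offset would silently include one extra character.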
annotation-schema
(annotation-schema label)
(annotation-schema label options)
Create an annotation schema (i.e. entity) label. If options is given, it provides additional feature schema metadata. See annotation-schema-from-resource.
annotation-schema-from-resource
(annotation-schema-from-resource resource)
Create an annotation schema from the contents of resource (see the Gate docs). See annotation-schema.
configure-plugins
(configure-plugins)
(configure-plugins plugins)
Configure Gate plugins. The no-arg default configures the Alignment plugin.
create-document
(create-document text)
(create-document text name)
Create a document with raw text. You can annotate the returned document with annotate-document.
initialize
(initialize)
Initialize the Gate system. This is called when this namespace is loaded.
retrieve-documents
(retrieve-documents store-dir)
Retrieve Gate documents as maps that were stored by a human annotator or by store-documents. The data is read from the file system directory store-dir.
This returns a lazy sequence of maps that have the following keys:
- :document The gate.Document instance (if you really need it)
- :name The name of the document
- :content The text string content of the document.
- :annotation A map of annotation maps that have the following keys:
  - :text The text of the annotation
  - :label The label of the annotation (*type* in Gate parlance)
  - :annotations The character interval of the annotation text (start/end node in Gate parlance)
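Because the result is a sequence of plain maps, it can be processed with ordinary sequence functions. The sketch below uses a hand-written stand-in for the returned sequence that models only the :name and :content keys; the sample values and the `content-lengths` helper are invented for illustration, and the call inside the `comment` form assumes the library is on the classpath and that store-dir can be given as a string path.

```clojure
;; Stand-in for the sequence returned by retrieve-documents. Only the
;; :name and :content keys are modeled, and the values are invented.
(def sample-docs
  [{:name "doc-1" :content "Barack Obama was born in Hawaii."}
   {:name "doc-2" :content "Gate is an NLP toolkit."}])

(defn content-lengths
  "Map each document name to the character length of its content."
  [docs]
  (into {} (map (juxt :name (comp count :content))) docs))

(println (content-lengths sample-docs))

(comment
  ;; Needs the library on the classpath; a string path for store-dir is
  ;; an assumption.
  (require '[zensols.annotate.gate :as gate])
  (content-lengths (gate/retrieve-documents "target/store")))
```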
store-documents
(store-documents store-dir documents & {:keys [resources]})
Create a Gate data store that can be opened by the Gate GUI. This creates a directory structure at store-dir and populates it with documents that were created with create-document. The name of the corpus is taken from *corpus-name*.
Important: this first deletes the store-dir directory if it exists.
Keys
- resources: resources (i.e. entities created with annotation-schema)
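The `& {:keys [resources]}` part of the argument list means that resources is passed as a trailing keyword argument rather than positionally. The sketch below shows the calling convention with `store-documents-args`, a hypothetical stand-in that has the same argument list but only reports what it received and touches no files. The real call in the `comment` form assumes the library is on the classpath, that store-dir can be a string path, and that resources is a sequence of schemas.

```clojure
;; Hypothetical stand-in with the same argument list as store-documents.
;; It returns what it was given and performs no file system operations.
(defn store-documents-args [store-dir documents & {:keys [resources]}]
  {:store-dir store-dir
   :document-count (count documents)
   :resources resources})

;; Keyword arguments follow the positional ones as alternating key/value.
(println (store-documents-args "target/store" ["doc-1" "doc-2"]
                               :resources ["Person"]))

;; Omitting the keyword argument leaves resources as nil.
(println (store-documents-args "target/store" ["doc-1"]))

(comment
  ;; Needs the library on the classpath. Remember that this call first
  ;; deletes target/store if it exists. Binding *corpus-name* works
  ;; because the var is documented as dynamic; the string label, string
  ;; path, and vector of schemas are assumptions.
  (require '[zensols.annotate.gate :as gate])
  (binding [gate/*corpus-name* "people"]
    (let [doc (gate/create-document "Barack Obama was born in Hawaii.")
          schema (gate/annotation-schema "Person")]
      (gate/annotate-document 0 12 "Person" doc)
      (gate/store-documents "target/store" [doc] :resources [schema]))))
```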