uic.nlp.todo.corpus

This namespace parses the corpus from an Excel file.

annotated-file

(annotated-file)
(annotated-file annotator)

Return the annotated todo corpus spreadsheet file.

annotation-info

(annotation-info)

Return information from the todocorp.conf configuration file.

coder-agreement

(coder-agreement & {:keys [annotators limit], :or {limit Integer/MAX_VALUE}})

Create the output file used by R to compute intercoder agreement (Cohen’s Kappa).
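For orientation, the statistic that the R step computes can be sketched directly in Clojure. This is only an illustration of Cohen's Kappa for two coders, not the code this namespace or its R script uses. The function name `cohens-kappa` and the label values are hypothetical.

```clojure
(defn cohens-kappa
  "Cohen's Kappa for two equal-length, non-empty sequences of labels.
  Returns nil when expected agreement is 1, where Kappa is undefined.
  Hypothetical helper, not part of uic.nlp.todo.corpus."
  [labels-a labels-b]
  {:pre [(= (count labels-a) (count labels-b))
         (pos? (count labels-a))]}
  (let [n        (count labels-a)
        ;; observed agreement: fraction of items both coders labeled the same
        observed (/ (count (filter true? (map = labels-a labels-b))) n)
        freq-a   (frequencies labels-a)
        freq-b   (frequencies labels-b)
        ;; chance agreement: sum over labels of the product of marginal rates
        expected (reduce + (for [[label cnt] freq-a]
                             (* (/ cnt n) (/ (get freq-b label 0) n))))]
    (when (not= expected 1)
      (double (/ (- observed expected) (- 1 expected))))))

;; Two coders labeling four items (labels are made up for the example).
(cohens-kappa [:x :x :y :y] [:x :y :y :y]) ; => 0.5
```

Exact rational arithmetic is used until the final `double` conversion, so small samples do not accumulate floating-point error.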

deserialize-annotation

(deserialize-annotation)

metrics

deprecated

(metrics)

Generate somewhat useful metrics (deprecated).

read-anons

(read-anons & {:keys [annotators limit], :or {limit Integer/MAX_VALUE}})

Read annotations from the Excel file.
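The `& {:keys [...] :or {...}}` signature means the function takes optional keyword arguments, with `limit` defaulting to `Integer/MAX_VALUE` (effectively no limit). The sketch below shows that calling convention with a stand-in function. `read-sketch`, its `rows` parameter, the `:annotator` and `:text` keys, and the filtering behavior are all assumptions made for illustration; the real `read-anons` reads from the Excel file and may treat `annotators` differently.

```clojure
(defn read-sketch
  "Hypothetical stand-in with the same keyword-argument shape as read-anons.
  Filters in-memory rows instead of reading the spreadsheet."
  [rows & {:keys [annotators limit] :or {limit Integer/MAX_VALUE}}]
  (->> rows
       ;; assumption: a nil :annotators means 'all annotators'
       (filter #(or (nil? annotators)
                    (contains? (set annotators) (:annotator %))))
       (take limit)))

(def sample-rows
  [{:annotator "a" :text "buy milk"}
   {:annotator "b" :text "call bob"}
   {:annotator "a" :text "pay rent"}])

;; No options: every row is returned.
(read-sketch sample-rows)

;; Keyword arguments are passed as alternating key/value pairs.
(read-sketch sample-rows :annotators ["a"] :limit 1)
```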

read-for-annotator

(read-for-annotator & {:keys [limit annotator], :or {limit Integer/MAX_VALUE}})

Return a list of maps, each with a Todo list data point.

serialize-annotations

(serialize-annotations)

Write annotations to an intermediate binary serialization file. Note: this should not be confused with the JSON generation, which is handled by uic.nlp.todo.db/freeze-dataset.