Generate, split into folds or train/test and cache a dataset 0.0.12
Released under the MIT License
Generate, split into folds or train/test and cache a dataset.
Installation
To install, add the following dependency to your project or build file:
[com.zensols.ml/dataset "0.0.12"]
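For a Leiningen project, the dependency vector goes inside the :dependencies entry of project.clj. The fragment below is only a sketch: the project name my-app, its version, and the Clojure version are hypothetical placeholders, not values taken from this library.

```clojure
;; project.clj (hypothetical project; only the dataset coordinate
;; comes from this documentation)
(defproject my-app "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.8.0"]      ; assumed Clojure version
                 [com.zensols.ml/dataset "0.0.12"]])
```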
Namespaces
zensols.dataset.db
Preemptively compute a dataset (i.e. features from natural language utterances) and store it in Elasticsearch. This is useful for training, testing, validating and developing machine learning models.
Public variables and functions:
- class-label-key
- clear
- dataset-file
- default-connection-inst
- distribution
- divide-by-fold
- divide-by-preset
- divide-by-set
- elasticsearch-connection
- freeze-dataset
- freeze-dataset-to-writer
- freeze-file
- id-key
- ids
- instance-by-id
- instance-count
- instance-key
- instances
- instances-by-class-label
- instances-count
- instances-load
- set-default-connection
- set-default-set-type
- set-fold
- set-population-use
- stats
- with-connection
- write-dataset
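Names such as divide-by-fold and divide-by-set in the list above refer to partitioning instances into folds or into train/test sets. As a rough illustration of that general idea only, the sketch below splits a sequence of instance IDs into k folds and derives a train/test split from them. It uses plain Clojure and does not call this library; split-folds and train-test are hypothetical helper names, and the library's own functions may behave differently.

```clojure
;; Hypothetical helpers illustrating k-fold partitioning in plain Clojure.
;; They are NOT part of zensols.dataset.db.

(defn split-folds
  "Distribute `ids` round-robin into `k` folds of near-equal size."
  [ids k]
  (mapv (fn [i] (vec (take-nth k (drop i ids))))
        (range k)))

(defn train-test
  "Use the fold at `test-idx` as the test set and all other folds,
  concatenated in order, as the training set."
  [folds test-idx]
  {:test  (nth folds test-idx)
   :train (vec (mapcat second
                       (remove (fn [[i _]] (= i test-idx))
                               (map-indexed vector folds))))})

;; Usage: ten IDs in three folds, holding out the second fold for testing.
(def folds (split-folds (range 10) 3))
(def split (train-test folds 1))
```

Round-robin assignment keeps fold sizes within one element of each other; shuffling the IDs before calling split-folds would randomize fold membership.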
zensols.dataset.elsearch
A simple client wrapper for Elasticsearch. You probably want to use the more client-friendly zensols.dataset.db.
zensols.dataset.thaw
Exactly like zensols.dataset.db, but uses the file system.
Public variables and functions: