zensols.model.execute-classifier

A client entry point library to help with executing a trained classifier. The classifier is tested and trained in the zensols.model.eval-classifier namespace.

This library expects a configured model. To learn how to configure one, see with-model-conf and the repo docs.

classifier-file

(classifier-file model)

Return the default file used to create a model data file with write-classifier.

classify

(classify model & data)

Classify a single instance using a trained model.

confusion-matrix-file

(confusion-matrix-file model)

Return the default file used to create a confusion matrix spreadsheet file with write-confusion-matrix.

cross-fold-instances

(cross-fold-instances)

Called by eval-classifier to create the data set for cross validation. See create-instances.

display-predictions

(display-predictions predictions)

Display predictions given by predict.

dump-model-info

(dump-model-info model & opts)

Write all data from print-model-info to the file system.

See zensols.model.classifier/modeldir for where the model is read from and zensols.model.classifier/analysis-report-resource for where the model information is written.

model-classifier-feature-types

(model-classifier-feature-types)
(model-classifier-feature-types context)

Return the feature metadata from the model config.

model-classifier-label

(model-classifier-label)

Return the class label metadata from the model config.

model-config

(model-config)

Return the currently bound model configuration.

model-exists?

(model-exists?)

Return whether a model file exists on the file system.

predict

(predict model & {:keys [set-type feature-sets], :or {set-type :test}})

Create predictions using the provided model.

Keys

  • :set-type the data set from which to draw the test data; defaults to :test

predictions-file

(predictions-file model)

Return the default file used to create a predictions spreadsheet file with write-predictions.

prime-model

(prime-model model)

Prime a trained or unpersisted (read-model) model for classification with classify.
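
The read, prime, and classify steps can be chained as in the following minimal sketch. The namespace example.run-model and the function classify-utterance are hypothetical, and the sketch assumes that prime-model returns the primed model and that a model has already been trained and persisted for the given configuration.

```clojure
(ns example.run-model
  "Hypothetical client namespace; only the `ec/` calls come from the library."
  (:require [zensols.model.execute-classifier :as ec]))

(defn classify-utterance
  "Classify one piece of text with a previously persisted model.
  `model-conf` is a model configuration map as described under
  with-model-conf."
  [model-conf text]
  (ec/with-model-conf model-conf
    ;; read-model unpersists the model from the file system;
    ;; prime-model is assumed here to return the model ready for classify.
    (let [model (ec/prime-model (ec/read-model))]
      ;; `text` is handed through to :create-features-fn by classify.
      (ec/classify model text))))
```

Reading and priming are likely the expensive steps, so a long-running client would probably cache the primed model rather than rebuild it for every call.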

print-model-info

(print-model-info model & {:keys [metrics? attributes? features? classifier? context? results?], :or {metrics? true, attributes? true, features? false, classifier? false, context? false, results? true}})

Print information from a (usually serialized) model. This data includes performance metrics, the classifier, the features used to create the model, and the context (see zensols.model.execute-classifier).

read-model

(read-model & {:keys [fail-if-not-exists? file], :or {fail-if-not-exists? true, file (:name (model-config))}})

Read/unpersist the model from the file system. If file is given, use that file instead of getting it from zensols.model.classifier/analysis-report-resource.

train-test-instances

(train-test-instances)

Called by eval-classifier to create the data set for the train/test split. See create-instances.

with-model-conf

macro

(with-model-conf model-config & body)

Evaluates body with a model configuration.

See the example repo for an example of how this is used.

The model configuration is a map with the following keys:

  • :name human readable short name, which is used in file names and spreadsheet cells
  • :create-feature-sets-fn a function that creates a sequence of maps, each map holding the key/value pairs of the model features to be populated; it is passed the optional keys:
    • :set-type the data set type: :test or :train, which uses the data set library convention for easy integration
  • :create-features-fn just like :create-feature-sets-fn but creates a single feature map used for test/execution after the classifier is built; it is called with the arguments that classify is given to classify an instance, along with the context generated at train time by :context-fn if it was provided (see below); therefore you must provide a two-argument function if a context is provided at train time
  • :feature-metas-fn a function that creates a map of key/value pairs describing the features where the values are string, boolean, numeric, or a sequence of strings representing possible enumeration values
  • :display-feature-metas-fn like :feature-metas-fn but used to display (i.e. while debugging)
  • :class-feature-meta-fn just like :feature-metas-fn but describes the class
  • :context-fn a function that creates a context (i.e. statistics on the entire training set) that is passed to :create-features-fn
  • :set-context-fn (optional) a function that is called to set the context created with :context-fn and retrieved from the persisted model; this is useful when using/executing the model and the context is needed before :create-features-fn is called; if this function is provided it replaces the unpersisted context in case there is any thawing logic that might be needed for the model
  • :model-return-keys what the classifier will return (by default {:label :distributions})
  • :cross-fold-instances-inst an atom used to cache the weka.core.Instances generated from :create-feature-sets-fn; when this atom dereferences to nil, :create-feature-sets-fn is called to create the feature maps
  • :feature-sets-set a map of key/value pairs where keys are names of feature sets and the values are lists of lists of features as symbols
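
The keys above can be assembled as in the following minimal sketch of a configuration map. Everything in it is illustrative: the namespace, the toy corpus, and the function names are hypothetical, the type markers are assumed to be written as quoted symbols, and the return shape of :class-feature-meta-fn is assumed to mirror :feature-metas-fn. Optional keys are left out so their defaults apply.

```clojure
(ns example.model-conf
  "Hypothetical namespace; every name below is illustrative.")

;; Toy corpus standing in for real training and test data.
(def corpus
  {:train [{:text "hello there" :label "greeting"}
           {:text "what time is it" :label "question"}]
   :test  [{:text "hi" :label "greeting"}]})

(defn create-context
  "Compute statistics over the entire training set (the :context-fn role)."
  []
  (let [lens (map #(count (:text %)) (:train corpus))]
    {:mean-length (/ (reduce + lens) (double (count lens)))}))

(defn create-features
  "Two-argument form: the data given to classify plus the train-time context."
  [text context]
  {:length (count text)
   :above-mean? (> (count text) (:mean-length context))})

(defn create-feature-sets
  "Create one feature map per instance, including the class label."
  [& {:keys [set-type] :or {set-type :train}}]
  (let [context (create-context)]
    (map (fn [{:keys [text label]}]
           (assoc (create-features text context) :label label))
         (get corpus set-type))))

(def model-conf
  {:name "toy-utterance"
   :create-feature-sets-fn create-feature-sets
   :create-features-fn create-features
   ;; Assumption: feature types are given as quoted symbols.
   :feature-metas-fn (fn [& _] {:length 'numeric :above-mean? 'boolean})
   ;; Assumption: same shape as :feature-metas-fn, with enumeration values.
   :class-feature-meta-fn (fn [& _] {:label ["greeting" "question"]})
   :context-fn create-context})

;; The map would then be bound with the macro, for example:
;; (with-model-conf model-conf (model-exists?))

(create-features "hi" (create-context))
;; => {:length 2, :above-mean? false}
```

Because a context is produced at train time, create-features takes two arguments, as the :create-features-fn entry requires.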

write-classifier

(write-classifier model)
(write-classifier model file)

Serialize (just) the classifier to the file system.

The model parameter is a model created from zensols.model.eval-classifier/train-model. If file is given, use that file instead of getting it from zensols.model.classifier/analysis-report-resource.

See classifier-file.

write-confusion-matrix

(write-confusion-matrix model)
(write-confusion-matrix model output-file)

Write the confusion matrix in model.

write-predictions

(write-predictions prediction)
(write-predictions predictions file)

Write predictions given by predict to the analysis directory. If file is given, use that file instead of getting it from zensols.model.classifier/analysis-report-resource.

See zensols.model.classifier/analysis-report-resource for where the spreadsheet is written.

See predictions-file.