zensols.model.weka

Wraps the Weka Java API. This is probably the wrong namespace for most uses; instead, take a look at zensols.model.eval-classifier and zensols.model.execute-classifier.

*classifiers*

dynamic

An (incomplete) set of Weka classifiers keyed by their speed, type or singleton by name.

fast train quickly
slow train slowly
really-slow train very very slowly
lazy lazy category
meta meta classifiers (e.g. boosting)
tree tree based classifiers (typically train quickly)

Each singleton entry is a list like the others but holds only a single element of the class. They include: zeror, svm, j48, random-forest, naivebays, logit, logitboost, smo, kstar.


*cross-fold-info*

dynamic

When two-pass cross-fold validation is used, this is bound to the following map during the validation (see clone-instances):

:train? true if folds are being created for the train phase, otherwise they are being created for the test phase
:fold the number of the fold
:state state shared between training and testing (i.e. context)


*missing-values-ok*

dynamic

Whether the classifier can handle missing values. If not, an exception is thrown when a missing value is encountered.
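
Since this is a dynamic var, it can be rebound for the extent of a call with Clojure's binding form. The following is a minimal sketch, assuming the library is on the classpath and that a value of true means missing values are tolerated; with-missing-values-allowed is a hypothetical helper name and not part of the library.

```clojure
(require '[zensols.model.weka :as weka])

;; Hypothetical helper (not part of zensols.model.weka): call the
;; zero-argument function f with *missing-values-ok* bound to true.
;; ASSUMPTION: true means missing values are tolerated.
(defn with-missing-values-allowed
  [f]
  (binding [weka/*missing-values-ok* true]
    (f)))
```

A binding is thread-local and ends when the form exits, so any lazy sequence produced by f should be realized (for example with doall) before it returns.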


append-instances

(append-instances src dst)

Merge two instances row-wise by adding dst to src.


attribute-by-name

(attribute-by-name instances name)

Return a weka.core.Attribute instance by name from a weka.core.Instances.
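
A short sketch of a lookup, assuming Weka 3.7 or later (where the weka.core.Instances constructor takes a java.util.ArrayList) and that the library is on the classpath. The dataset and the attribute name "length" are made up for illustration.

```clojure
(require '[zensols.model.weka :as weka])
(import '(weka.core Attribute Instances)
        '(java.util ArrayList))

;; Build an empty one-attribute dataset directly with the Weka Java API.
;; (Attribute. "length") creates a numeric attribute.
(def insts
  (Instances. "demo" (ArrayList. [(Attribute. "length")]) (int 0)))

;; Look the attribute up by its name and inspect it.
(def length-attrib (weka/attribute-by-name insts "length"))
(println (.name length-attrib) (.isNumeric length-attrib))
```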


attributes-for-instances

(attributes-for-instances insts & {:keys [sort?], :or {sort? true}})

Return a map with :name and :type for each attribute in a weka.core.Instances.


clone-classifier

(clone-classifier classifier)


clone-instances

(clone-instances inst & {:keys [train-fn test-fn randomize-fn], :as opts})

Return a deep clone of inst, optionally with a specific training and test set. See *cross-fold-info* to get information during the validation for debugging and analysis.

inst an (object) instance of weka.core.Instances (the whole dataset)

Keys

train-fn a function that takes the following arguments: a weka.core.Instances created for the training set, the number of folds, the fold number and a java.util.Random to pass to the Weka layer to shuffle the dataset
test-fn just like train-fn but used to create the test data set, and it does not take the java.util.Random instance
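
The argument lists of train-fn and test-fn line up with Weka's own Instances.trainCV and Instances.testCV methods. The following is a minimal sketch, assuming each function is expected to return the weka.core.Instances for its fold (the docstring does not state the return value). The names train-fold, test-fold and clone-with-folds are hypothetical.

```clojure
(require '[zensols.model.weka :as weka])
(import '(weka.core Instances)
        '(java.util Random))

;; Hypothetical train-fn: takes the documented arguments and delegates
;; to Weka's stock cross-validation split for the training set.
(defn train-fold
  [^Instances insts folds fold ^Random rand]
  (.trainCV insts (int folds) (int fold) rand))

;; Hypothetical test-fn: same as train-fold but without the Random.
(defn test-fold
  [^Instances insts folds fold]
  (.testCV insts (int folds) (int fold)))

;; Deep clone a dataset, supplying the custom fold functions.
(defn clone-with-folds
  [^Instances insts]
  (weka/clone-instances insts
                        :train-fn train-fold
                        :test-fn test-fold))
```

These two functions only reproduce Weka's default split; a real use would substitute custom fold logic, and could read *cross-fold-info* inside either function while a validation is running.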


create-attrib

(create-attrib att-name type)

Create a Weka Attribute instance with att-name.

type is the type of attribute, which can be string, boolean, numeric, or a sequence of strings representing possible enumeration values (nominals in Weka speak).
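
The two kinds of type argument can be sketched as follows. The nominal form (a sequence of strings) comes straight from the description above; how the scalar types are spelled is not stated, so passing them as quoted symbols is an assumption here. The attribute names are made up for illustration.

```clojure
(require '[zensols.model.weka :as weka])

;; Nominal attribute: the type is a sequence of the allowed values.
(def lang-attrib (weka/create-attrib "lang" ["en" "fr" "de"]))

;; Numeric attribute.
;; ASSUMPTION: scalar types are given as quoted symbols ('numeric,
;; 'string, 'boolean); the docstring does not say how they are spelled.
(def length-attrib (weka/create-attrib "length" 'numeric))

;; Inspect the results with the weka.core.Attribute accessors.
(println (.name lang-attrib) (.isNominal lang-attrib) (.numValues lang-attrib))
(println (.name length-attrib) (.isNumeric length-attrib))
```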


instances

(instances inst-name feature-sets feature-metas)
(instances inst-name feature-sets feature-metas class-feature-meta & {:keys [clone?], :or {clone? true}})

Create a new weka.core.Instances instance.

inst-name used to identify the model data set
feature-sets a sequence of maps with each map having key/value pairs of the features of the model to be populated in the returned weka.core.Instances
feature-metas a map of key/value pairs describing the features (they become weka.core.Attributes) where the values are string, boolean, numeric, or a sequence of strings representing possible enumeration values (nominals in Weka speak)
class-feature-meta just like a (single) feature-metas but describes the class


let-classifier

macro

(let-classifier fdef-expr & forms)

fdef-expr ==> (classifier-name [insts] exprs)

Define a classifier, bound to classifier-name, whose Clojure exprs evaluate an instance (insts), and then evaluate the body forms.

Example:

(let-classifier
  (langid-baseline [inst]
    (let [attrib (weka/attribute-by-name inst "langid-1-id")
          val (.stringValue inst attrib)
          rval (= "en" val)]
      (log/infof "langid: %s for: %s: res: %s" val inst rval)
      (if rval 1 0)))
  (terse-results langid-baseline meta-set))


make-classifiers

(make-classifiers)
(make-classifiers set-name-or-instance)

Make classifiers from either a key in *classifiers* or an instance of weka.classifiers.Classifier (meaning an already constructed instance). All classifiers are returned for the 0-arg option.
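
The three ways of calling the function can be sketched as below. This assumes the library and Weka are on the classpath, and that the keys of *classifiers* are Clojure keywords such as :fast; the documentation lists the key names but not their type.

```clojure
(require '[zensols.model.weka :as weka])
(import '(weka.classifiers.trees J48))

;; Every classifier the library knows about (0-arg form).
(def all-classifiers (weka/make-classifiers))

;; Classifiers registered under one key of *classifiers*.
;; ASSUMPTION: the keys are Clojure keywords such as :fast or :zeror.
(def fast-classifiers (weka/make-classifiers :fast))

;; An already constructed classifier instance is also accepted.
(def from-instance (weka/make-classifiers (J48.)))
```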


populate-instances

(populate-instances insts feature-metas feature-sets)

Populate a weka.core.Instances instance from Clojure data structures.

insts a weka.core.Instances that will be populated
feature-metas a map of key/value pairs describing the features (they become weka.core.Attributes) where the values are described as types in create-attrib
feature-sets a sequence of maps with each map having key/value pairs of the features of the model to be populated in the returned weka.core.Instances


remove-attributes

(remove-attributes inst attrib-names & {:keys [invert-selection?]})

Remove a set of attributes from inst (weka.core.Instances) by (string) name.


sparse-instances

(sparse-instances maps dim & {:keys [pattern class-attribute-name instance-name add-class? default-value], :or {pattern "f%d", class-attribute-name "class", instance-name "inst", add-class? true}})

Create a sparse weka.core.Instances using a sequence of maps (maps). The keys of each map are the class, and the values are maps in which each key is the index and each value is the weight. The dim parameter is the dimension of each instance.

Keys

:pattern a format using one integer as the index (default: f%d)
:class-attribute-name the name of the output class (values are given from the keys of maps; default: class)
:instance-name the name of the created Instances object (default: inst)
:add-class? if true, add the class that comes from the key in maps (default: true)
:default-value if a double, replace missing values not in the map with this value, otherwise missing values will be used


value

(value insts n name)

Return the value of the attribute named name for instance n in insts (a weka.core.Instances).


value-for-instance

(value-for-instance val)
(value-for-instance type val)

Return a Java value that plays nicely with the Weka framework. If no type is given, it tries to determine the type on its own.

val is a Java primitive (wrapper)
type if given, is the type of val (see create-attrib)


Generated by Codox

Interface for machine learning modeling, testing and training 0.0.18
