wink-naive-bayes-text-classifier - Wink JS

Methods

computeOdds

computeOdds(input) → {Array.<array>}

Computes the log base-2 odds of every label for the input, and returns an array of [ label, odds ] pairs in descending order of odds.

Example

myClassifier.computeOdds( 'I want to pay my car loan early' );
// -> [
//      [ 'prepay', 6.169686751688911 ],
//      [ 'autoloan', -6.169686751688911 ]
//    ]

Parameters

Name	Type	Description
input	String | Array.<String>	is either text or tokens, as determined by the choice of `preparatory tasks`.

Returns

Array of [ label, odds ] in descending order of odds.

Type: Array.<array>
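
As a rough illustration of what log base-2 odds mean, the sketch below converts per-label probabilities into [ label, odds ] pairs sorted in descending order. It assumes each label already has a probability estimate; `logOddsByLabel` is a hypothetical helper, not part of the library, and the classifier may derive its odds differently.

```javascript
// Hypothetical helper, not part of the library: converts per-label
// probabilities into [ label, log2-odds ] pairs, highest odds first.
function logOddsByLabel( probabilities ) {
  return Object.keys( probabilities )
    .map( function ( label ) {
      var p = probabilities[ label ];
      // Odds compare the chance of the label against all other labels.
      return [ label, Math.log2( p / ( 1 - p ) ) ];
    } )
    .sort( function ( a, b ) {
      return b[ 1 ] - a[ 1 ];
    } );
}

var odds = logOddsByLabel( { prepay: 0.8, autoloan: 0.2 } );
// 'prepay' comes first with odds of about 2; 'autoloan' follows with about -2.
```

With exactly two labels the two values mirror each other, which is the pattern visible in the example above.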

consolidate

consolidate() → {boolean}

Consolidates the learning. It is a prerequisite for evaluate() and/or predict().

Example

myClassifier.consolidate();
// -> true

Throws

Error if training data belongs to only a single class label or the training data is too small for learning.

Returns

Always true.

Type: boolean

defineConfig

defineConfig(cfg) → {boolean}

Defines the configuration for the naive Bayes text classifier. This must be called before attempting to learn; in other words, the configuration cannot be set once learning has started.

Example

myClassifier.defineConfig( { considerOnlyPresence: true, smoothingFactor: 0.5 } );
// -> true

Parameters

Name	Type	Attributes	Default	Description
cfg	object			defines the configuration in terms of the following properties:
considerOnlyPresence	boolean	<optional>	false	true indicates a binarized model.
smoothingFactor	number	<optional>	1	defines the value for additive smoothing. It can have any value between 0 and 1.

Throws

Error if cfg is not a valid JavaScript object, or smoothingFactor is invalid, or an attempt to define configuration is made after learning starts.

Returns

Always true.

Type: boolean
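
The two options can be pictured with a small sketch. It assumes the textbook form of additive smoothing and treats a binarized model as one where repeated tokens in an input count once; `binarize` and `smoothedProbability` are hypothetical helpers for illustration, and the library's internals may differ.

```javascript
// Hypothetical helpers for illustration; not the library's implementation.

// With considerOnlyPresence, repeated tokens in one input count only once.
function binarize( tokens ) {
  return Array.from( new Set( tokens ) );
}

// Additive smoothing: every word gets `smoothingFactor` pseudo-counts so that
// a word unseen under a label never yields a zero probability.
function smoothedProbability( wordCount, totalWords, vocabularySize, smoothingFactor ) {
  return ( wordCount + smoothingFactor ) /
         ( totalWords + ( smoothingFactor * vocabularySize ) );
}

var tokens = binarize( [ 'loan', 'pay', 'loan' ] );
// -> [ 'loan', 'pay' ]
var unseen = smoothedProbability( 0, 26, 24, 0.5 );
// -> 0.5 / 38, roughly 0.0132
```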

definePrepTasks

definePrepTasks(tasks) → {number}

Defines the text preparation tasks to transform raw incoming text into tokens required during learn(), evaluate() and predict() operations. The tasks should be an array of functions; using these functions, a simple pipeline is built to serially transform the input to the output. A single helper function for preparing text is available that (a) tokenizes, (b) removes punctuation, symbols, numerals, URLs and stop words, and (c) stems.

Example

// Load the text preparation helper bundled with the package
var prepText = require( 'wink-naive-bayes-text-classifier/src/prep-text.js' );
// Define the text preparation tasks.
myClassifier.definePrepTasks( [ prepText ] );
// -> 1

Parameters

Name	Type	Description
tasks	Array.<function()>	the first function in this array must accept a string as input and the last function must return tokens, i.e. an array of strings. Please refer to the example.

Throws

Error if tasks is not an array of functions.

Returns

The number of functions in the tasks array.

Type: number
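
The serial pipeline described above can be sketched as a simple reduction over the task array. The tasks `tokenize` and `removeStopWords` and the runner `runPipeline` are hypothetical names used only for illustration; they are not the bundled prep-text helper and are far simpler than it.

```javascript
// Hypothetical tasks for illustration; not the bundled prep-text helper.

// First task: accepts a string and returns lower-cased word tokens.
function tokenize( text ) {
  return text.toLowerCase().split( /[^a-z]+/ ).filter( Boolean );
}

// Last task: accepts tokens and returns tokens without a few stop words.
function removeStopWords( tokens ) {
  var stopWords = new Set( [ 'i', 'to', 'my', 'a', 'the' ] );
  return tokens.filter( function ( t ) {
    return !stopWords.has( t );
  } );
}

// Runs the tasks serially: the output of one task is the input of the next.
function runPipeline( tasks, input ) {
  return tasks.reduce( function ( value, task ) {
    return task( value );
  }, input );
}

var tokens = runPipeline( [ tokenize, removeStopWords ], 'I want to pay my car loan early' );
// -> [ 'want', 'pay', 'car', 'loan', 'early' ]
```

An array such as [ tokenize, removeStopWords ] has the shape the parameter description asks for: the first function accepts a string and the last one returns an array of strings.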

evaluate

evaluate(input, label) → {boolean}

Evaluates the learning against a test data set. The input is used to predict the class label, which is compared with the actual class label to populate the confusion matrix incrementally.

Example

myClassifier.evaluate( 'can i close my loan', 'prepay' );
// -> true

Parameters

Name	Type	Description
input	String | Array.<String>	is either text or tokens, as determined by the choice of `preparatory tasks`.
label	string	is the class to which `input` belongs.

Returns

Always true.

Type: boolean
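
A minimal sketch of incremental confusion-matrix population follows. It assumes the outer key is the predicted label and the inner key is the actual label; that orientation, and the helper names `createConfusionMatrix` and `record`, are assumptions made for illustration rather than the library's documented internals.

```javascript
// Hypothetical sketch; the outer key is the predicted label and the inner key
// is the actual label (an assumed orientation).
function createConfusionMatrix( labels ) {
  var cm = {};
  labels.forEach( function ( predicted ) {
    cm[ predicted ] = {};
    labels.forEach( function ( actual ) {
      cm[ predicted ][ actual ] = 0;
    } );
  } );
  return cm;
}

// Adds one evaluated sample to the matrix.
function record( cm, predicted, actual ) {
  cm[ predicted ][ actual ] += 1;
  return true;
}

var cm = createConfusionMatrix( [ 'prepay', 'autoloan' ] );
record( cm, 'prepay', 'prepay' );     // correct prediction
record( cm, 'prepay', 'autoloan' );   // an autoloan input predicted as prepay
record( cm, 'autoloan', 'autoloan' ); // correct prediction
// cm -> { prepay: { prepay: 1, autoloan: 1 }, autoloan: { prepay: 0, autoloan: 1 } }
```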

exportJSON

exportJSON() → {string}

Exports the learning as JSON, which may be saved as a text file for later use via importJSON().

Example

myClassifier.exportJSON();
// returns JSON.

Returns

Learning in JSON format.

Type: string
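
Saving the exported string to a text file and reading it back could look like the sketch below. The `exported` value is only a placeholder standing in for whatever exportJSON() returns, and the file name is hypothetical; only standard Node.js modules are used.

```javascript
var fs = require( 'fs' );
var os = require( 'os' );
var path = require( 'path' );

// Placeholder standing in for the string returned by exportJSON(); the real
// content is whatever the classifier produces.
var exported = JSON.stringify( { placeholder: 'learning' } );

// Hypothetical file location in the system's temporary directory.
var file = path.join( os.tmpdir(), 'classifier-learning-' + process.pid + '.json' );
fs.writeFileSync( file, exported, 'utf8' );

// Later, possibly in another process: read the text back unchanged.
var restored = fs.readFileSync( file, 'utf8' );
// `restored` is the string that would be handed to importJSON().
```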

importJSON

importJSON(json) → {boolean}

Imports an existing JSON learning for prediction. It is essential to call definePrepTasks() and consolidate() before attempting to predict.

Parameters

Name	Type	Description
json	JSON	containing learnings as exported by `exportJSON`.

Throws

Error if json is invalid.

Returns

Always true.

Type: boolean

learn

learn(input, label) → {boolean}

Learns from the example pair of input and its label.

Example

myClassifier.learn( 'I need loan for a new vehicle', 'autoloan' );
// -> true

Parameters

Name	Type	Description
input	string | Array.<string>	if it is a string, then `definePrepTasks()` must be called before learning so that the `input` string is transformed into tokens on the fly.
label	string	is the class to which `input` belongs.

Throws

Error if learnings have already been consolidated.

Returns

Always true.

Type: boolean

metrics

metrics() → {object}

Computes detailed metrics consisting of macro-averaged precision, recall and f-measure, along with their label-wise values and the confusion matrix.

Example

// Assuming that evaluation has already been carried out
JSON.stringify( myClassifier.metrics(), null, 2 );
// -> {
//      "avgPrecision": 0.75,
//      "avgRecall": 0.75,
//      "avgFMeasure": 0.6667,
//      "details": {
//        "confusionMatrix": {
//          "prepay": {
//            "prepay": 1,
//            "autoloan": 1
//          },
//          "autoloan": {
//            "prepay": 0,
//            "autoloan": 1
//          }
//        },
//        "precision": {
//          "prepay": 0.5,
//          "autoloan": 1
//        },
//        "recall": {
//          "prepay": 1,
//          "autoloan": 0.5
//        },
//        "fmeasure": {
//          "prepay": 0.6667,
//          "autoloan": 0.6667
//        }
//      }
//    }

Throws

Error if an attempt to generate metrics is made prior to proper evaluation.

Returns

Detailed metrics.

Type: object
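
The figures in the example above can be reproduced from its confusion matrix with the sketch below. It assumes the outer key of the matrix is the predicted label and the inner key is the actual label, which is the orientation consistent with the sample numbers; `macroMetrics` and `round4` are hypothetical helpers, not the library's code.

```javascript
// Hypothetical helpers; rounding to 4 decimals mirrors the sample output.
function round4( x ) {
  return +x.toFixed( 4 );
}

function macroMetrics( cm ) {
  var labels = Object.keys( cm );
  var raw = { precision: {}, recall: {}, fmeasure: {} };
  labels.forEach( function ( label ) {
    var hits = cm[ label ][ label ];
    // Row total: everything predicted as `label`.
    var predictedAs = labels.reduce( function ( s, actual ) { return s + cm[ label ][ actual ]; }, 0 );
    // Column total: everything that actually is `label`.
    var actuallyIs = labels.reduce( function ( s, predicted ) { return s + cm[ predicted ][ label ]; }, 0 );
    var p = predictedAs ? hits / predictedAs : 0;
    var r = actuallyIs ? hits / actuallyIs : 0;
    raw.precision[ label ] = p;
    raw.recall[ label ] = r;
    raw.fmeasure[ label ] = ( p + r ) ? ( 2 * p * r ) / ( p + r ) : 0;
  } );
  // Macro average: unweighted mean of the label-wise values.
  function mean( values ) {
    return labels.reduce( function ( s, l ) { return s + values[ l ]; }, 0 ) / labels.length;
  }
  function rounded( values ) {
    var out = {};
    labels.forEach( function ( l ) { out[ l ] = round4( values[ l ] ); } );
    return out;
  }
  return {
    avgPrecision: round4( mean( raw.precision ) ),
    avgRecall: round4( mean( raw.recall ) ),
    avgFMeasure: round4( mean( raw.fmeasure ) ),
    details: {
      precision: rounded( raw.precision ),
      recall: rounded( raw.recall ),
      fmeasure: rounded( raw.fmeasure )
    }
  };
}

var m = macroMetrics( {
  prepay: { prepay: 1, autoloan: 1 },
  autoloan: { prepay: 0, autoloan: 1 }
} );
// m.avgPrecision -> 0.75, m.avgRecall -> 0.75, m.avgFMeasure -> 0.6667
```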

predict

predict(input) → {String}

Predicts the class label for the input. If it is unable to predict, it returns the value `unknown`.

Example

myClassifier.predict( 'I want to pay my car loan early' );
// -> 'prepay'

Parameters

Name	Type	Description
input	String | Array.<String>	is either text or tokens, as determined by the choice of `preparatory tasks`.

Returns

The predicted class label for the input.

Type: String

reset

reset() → {boolean}

Completely resets the classifier by re-initializing all the learning-related variables, except the preparatory tasks. It is useful during k-fold cross-validation.

Example

myClassifier.reset();
// -> true

Returns

Always true.

Type: boolean
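
To show where a reset fits in cross-validation, the sketch below splits a sample list into folds. `kFoldSplits` is a hypothetical helper and the interleaved assignment of samples to folds is just one simple choice; no classifier is involved in the code itself.

```javascript
// Hypothetical helper: splits samples into k folds and returns, for each fold,
// the training and test portions.
function kFoldSplits( samples, k ) {
  var splits = [];
  for ( var fold = 0; fold < k; fold += 1 ) {
    var train = [];
    var test = [];
    samples.forEach( function ( sample, index ) {
      // Every k-th sample, offset by the fold number, is held out for testing.
      if ( index % k === fold ) {
        test.push( sample );
      } else {
        train.push( sample );
      }
    } );
    splits.push( { train: train, test: test } );
  }
  return splits;
}

var samples = [ 's0', 's1', 's2', 's3', 's4', 's5' ];
var splits = kFoldSplits( samples, 3 );
// splits[ 0 ] -> { train: [ 's1', 's2', 's4', 's5' ], test: [ 's0', 's3' ] }
```

For each split, the classifier would be reset, taught the `train` samples, consolidated, and then evaluated on the `test` samples; the preparatory tasks survive the reset and need not be redefined.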

stats

stats() → {object}

Returns basic statistics of the learning: the count of samples under each label, the total words under each label, and the size of the vocabulary.

Example

myClassifier.stats();
// -> {
//      labelWiseSamples: {
//        autoloan: 5,
//        prepay: 4
//      },
//      labelWiseWords: {
//        autoloan: 36,
//        prepay: 26
//      },
//      vocabulary: 24
//    }

Returns

An object containing the count of samples under each label, the total words under each label, and the size of the vocabulary.

Type: object
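
The three quantities can be pictured with a small counting sketch over already-tokenized samples. `basicStats` is a hypothetical helper that mimics the shape of the object shown above; how the library itself counts words (for example under a binarized configuration) is not assumed here.

```javascript
// Hypothetical helper; samples are [ tokens, label ] pairs.
function basicStats( samples ) {
  var labelWiseSamples = {};
  var labelWiseWords = {};
  var vocabulary = new Set();
  samples.forEach( function ( sample ) {
    var tokens = sample[ 0 ];
    var label = sample[ 1 ];
    // One more sample and `tokens.length` more words under this label.
    labelWiseSamples[ label ] = ( labelWiseSamples[ label ] || 0 ) + 1;
    labelWiseWords[ label ] = ( labelWiseWords[ label ] || 0 ) + tokens.length;
    // The vocabulary is the set of distinct tokens across all labels.
    tokens.forEach( function ( t ) {
      vocabulary.add( t );
    } );
  } );
  return {
    labelWiseSamples: labelWiseSamples,
    labelWiseWords: labelWiseWords,
    vocabulary: vocabulary.size
  };
}

var summary = basicStats( [
  [ [ 'need', 'loan', 'new', 'vehicle' ], 'autoloan' ],
  [ [ 'want', 'pay', 'loan', 'early' ], 'prepay' ],
  [ [ 'close', 'loan' ], 'prepay' ]
] );
// -> {
//      labelWiseSamples: { autoloan: 1, prepay: 2 },
//      labelWiseWords: { autoloan: 4, prepay: 6 },
//      vocabulary: 8
//    }
```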
