Methods
computeOdds
Computes the log base-2 of the odds of every label for the input and returns
an array of [ label, odds ] pairs in descending order of odds.
Example
myClassifier.computeOdds( 'I want to pay my car loan early' );
// -> [
//      [ 'prepay', 6.169686751688911 ],
//      [ 'autoloan', -6.169686751688911 ]
//    ]
Parameters
Name | Type | Description |
---|---|---|
input | String or Array.<String> | is either text or tokens determined by the choice of |
Returns
Array of [ label, odds ] pairs in descending order of odds.
- Type
- Array.<array>
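To make the returned numbers easier to interpret, here is a minimal sketch of what a log base-2 odds value means for a probability. The names log2Odds and sortByOdds are hypothetical and are not part of the classifier's API, and the probabilities are made up for illustration; how the classifier derives its own values internally is not shown here.

```javascript
// Hypothetical helper (not part of the classifier's API): converts a
// probability p (0 < p < 1) into the log base-2 of its odds p / (1 - p).
function log2Odds( p ) {
  if ( !( p > 0 && p < 1 ) ) {
    throw new RangeError( 'p must lie strictly between 0 and 1' );
  }
  return Math.log2( p / ( 1 - p ) );
}

// Sorts [ label, odds ] pairs in descending order of odds without
// modifying the input array.
function sortByOdds( pairs ) {
  return pairs.slice().sort( ( a, b ) => b[ 1 ] - a[ 1 ] );
}

// Illustrative probabilities only.
const pairs = sortByOdds( [
  [ 'autoloan', log2Odds( 0.2 ) ],
  [ 'prepay', log2Odds( 0.8 ) ]
] );
console.log( pairs );
```

With two labels whose probabilities sum to 1, the two values are equal in magnitude and opposite in sign, which is consistent with the shape of the example output above. A value of 0 corresponds to even odds.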
consolidate
Consolidates the learning. It is a prerequisite for evaluate() and/or predict().
Example
myClassifier.consolidate();
// -> true
Throws
Error if training data belongs to only a single class label or the training data is too small for learning.
Returns
Always true.
- Type
- boolean
defineConfig
Defines the configuration for the naive Bayes text classifier. This must be called before attempting to learn; in other words, the configuration cannot be changed once learning has started.
Example
myClassifier.defineConfig( { considerOnlyPresence: true, smoothingFactor: 0.5 } );
// -> true
Parameters
Name | Type | Attributes | Default | Description |
---|---|---|---|---|
cfg | object | | | defines the configuration in terms of the following parameters: |
considerOnlyPresence | boolean | <optional> | false | true indicates a binarized model. |
smoothingFactor | number | <optional> | 1 | defines the value for additive smoothing. It can have any value between 0 and 1. |
Throws
Error if cfg is not a valid JavaScript object, or smoothingFactor is invalid, or an attempt to define configuration is made after learning starts.
Returns
Always true.
- Type
- boolean
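The two options can be illustrated with a small, self-contained sketch. It assumes the usual textbook definitions: additive smoothing adds smoothingFactor to every word count, and a binarized model counts each word at most once per input. The helper names binarize and smoothedProbability are hypothetical, and the classifier's actual internals may differ.

```javascript
// Hypothetical helpers illustrating the two configuration options; they are
// not the classifier's internals.

// Binarized model: keep each distinct token once per input.
function binarize( tokens ) {
  return Array.from( new Set( tokens ) );
}

// Additive smoothing (textbook form):
//   P( word | label ) =
//     ( count + smoothingFactor ) / ( totalWords + smoothingFactor * vocabularySize )
function smoothedProbability( count, totalWords, vocabularySize, smoothingFactor ) {
  return ( count + smoothingFactor ) /
         ( totalWords + ( smoothingFactor * vocabularySize ) );
}

console.log( binarize( [ 'loan', 'car', 'loan' ] ) );
// -> [ 'loan', 'car' ]
console.log( smoothedProbability( 0, 26, 24, 1 ) );
// -> 0.02
```

Under this definition a word never seen with a label still receives a small non-zero probability, and a smaller smoothingFactor moves the estimate closer to the raw relative frequency.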
definePrepTasks
Defines the text preparation tasks to transform raw incoming text into tokens required during learn(), evaluate() and predict() operations.
The tasks should be an array of functions; using these functions, a simple pipeline is built to serially transform the input to the output. A single helper function for preparing text is available that (a) tokenizes, (b) removes punctuation, symbols, numerals, URLs and stop words, and (c) stems.
Example
// Load wink NLP utilities
var prepText = require( 'wink-naive-bayes-text-classifier/src/prep-text.js' );
// Define the text preparation tasks.
myClassifier.definePrepTasks( [ prepText ] );
// -> 1
Parameters
Name | Type | Description |
---|---|---|
tasks | Array.<function()> | the first function in this array must accept a string as input and the last function must return tokens i.e. array of strings. Please refer to example. |
Throws
Error if tasks is not an array of functions.
Returns
The number of functions in the tasks array.
- Type
- number
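The serial pipeline described above can be sketched as a reduce over the task functions. This is an illustration only: runPipeline, tokenize and lowerCase are hypothetical names, and the toy tokenizer is far simpler than the bundled helper.

```javascript
// Hypothetical sketch of a serial pipeline: the output of each task becomes
// the input of the next one.
function runPipeline( tasks, input ) {
  return tasks.reduce( ( value, task ) => task( value ), input );
}

// First task: accepts a string and returns tokens (array of strings).
function tokenize( text ) {
  return text.split( /\s+/ ).filter( ( token ) => token.length > 0 );
}

// Last task: accepts tokens and returns tokens.
function lowerCase( tokens ) {
  return tokens.map( ( token ) => token.toLowerCase() );
}

console.log( runPipeline( [ tokenize, lowerCase ], 'Close my Loan' ) );
// -> [ 'close', 'my', 'loan' ]
```

The order of the array matters: the first function must accept a string and the last must return an array of strings, as the parameter description requires.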
evaluate
Evaluates the learning against a test data set. The input is used to predict the class label, which is compared with the actual class label to populate the confusion matrix incrementally.
Example
myClassifier.evaluate( 'can i close my loan', 'prepay' );
// -> true
Parameters
Name | Type | Description |
---|---|---|
input | String or Array.<String> | is either text or tokens determined by the choice of |
label | string | of class to which input belongs. |
Returns
Always true.
- Type
- boolean
exportJSON
Exports the learning as JSON, which may be saved as a text file for later use via importJSON().
Example
myClassifier.exportJSON();
// returns JSON.
Returns
Learning in JSON format.
- Type
- string
importJSON
Imports an existing JSON learning for prediction. It is essential to call definePrepTasks() and consolidate() before attempting to predict.
Parameters
Name | Type | Description |
---|---|---|
json | JSON | containing learnings as exported by exportJSON(). |
Throws
Error if json is invalid.
Returns
Always true.
- Type
- boolean
learn
Learns from the example pair of input and its label.
Example
myClassifier.learn( 'I need loan for a new vehicle', 'autoloan' );
// -> true
Parameters
Name | Type | Description |
---|---|---|
input | string or Array.<string> | if it is a string, then |
label | string | of class to which input belongs. |
Throws
Error if learnings have already been consolidated.
Returns
Always true.
- Type
- boolean
metrics
Computes detailed metrics consisting of macro-averaged precision, recall and f-measure, along with their label-wise values and the confusion matrix.
Example
// Assuming that evaluation has already been carried out
JSON.stringify( myClassifier.metrics(), null, 2 );
// -> {
// "avgPrecision": 0.75,
// "avgRecall": 0.75,
// "avgFMeasure": 0.6667,
// "details": {
// "confusionMatrix": {
// "prepay": {
// "prepay": 1,
// "autoloan": 1
// },
// "autoloan": {
// "prepay": 0,
// "autoloan": 1
// }
// },
// "precision": {
// "prepay": 0.5,
// "autoloan": 1
// },
// "recall": {
// "prepay": 1,
// "autoloan": 0.5
// },
// "fmeasure": {
// "prepay": 0.6667,
// "autoloan": 0.6667
// }
// }
// }
Throws
Error if an attempt to generate metrics is made prior to proper evaluation.
Returns
Detailed metrics.
- Type
- object
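The figures in the example can be reproduced from its confusion matrix with the standard definitions of precision, recall and f-measure. The sketch below is hypothetical (computeMetrics is not part of the classifier's API). It reads the matrix as matrix[ predicted ][ actual ], an assumption chosen because it is the orientation that reproduces the example values, and it rounds to four decimals to match the example.

```javascript
// Hypothetical sketch: derive label-wise and macro-averaged metrics from a
// confusion matrix read as matrix[ predicted ][ actual ] (an assumption).
function computeMetrics( matrix ) {
  const labels = Object.keys( matrix );
  const round = ( x ) => +x.toFixed( 4 );
  const precision = {};
  const recall = {};
  const fmeasure = {};
  labels.forEach( ( label ) => {
    const truePositives = matrix[ label ][ label ];
    // Everything predicted as `label`.
    const predictedCount = labels.reduce(
      ( sum, actualLabel ) => sum + matrix[ label ][ actualLabel ], 0 );
    // Everything whose actual class is `label`.
    const actualCount = labels.reduce(
      ( sum, predictedLabel ) => sum + matrix[ predictedLabel ][ label ], 0 );
    const p = predictedCount === 0 ? 0 : truePositives / predictedCount;
    const r = actualCount === 0 ? 0 : truePositives / actualCount;
    precision[ label ] = round( p );
    recall[ label ] = round( r );
    fmeasure[ label ] = round( ( p + r ) === 0 ? 0 : ( 2 * p * r ) / ( p + r ) );
  } );
  // Macro average: unweighted mean of the label-wise values.
  const mean = ( values ) => round(
    labels.reduce( ( sum, label ) => sum + values[ label ], 0 ) / labels.length );
  return {
    avgPrecision: mean( precision ),
    avgRecall: mean( recall ),
    avgFMeasure: mean( fmeasure ),
    details: { confusionMatrix: matrix, precision, recall, fmeasure }
  };
}

const result = computeMetrics( {
  prepay: { prepay: 1, autoloan: 1 },
  autoloan: { prepay: 0, autoloan: 1 }
} );
console.log( result.avgPrecision, result.avgRecall, result.avgFMeasure );
// -> 0.75 0.75 0.6667
```

Macro averaging gives every label equal weight regardless of how many samples it has, so a rare label with poor precision lowers the average as much as a frequent one.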
predict
Predicts the class label for the input. If it is unable to predict, it returns the value unknown.
Example
myClassifier.predict( 'I want to pay my car loan early' );
// -> prepay
Parameters
Name | Type | Description |
---|---|---|
input | String or Array.<String> | is either text or tokens determined by the choice of |
Returns
The predicted class label for the input.
- Type
- String
reset
Completely resets the classifier by re-initializing all learning-related variables, except the preparatory tasks. It is useful during k-fold cross-validation.
Example
myClassifier.reset();
// -> true
Returns
Always true.
- Type
- boolean
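As a rough illustration of the cross-validation use case, the sketch below runs k folds using only methods documented in this section (learn, consolidate, evaluate, metrics and reset). The function crossValidate and the shape of samples are hypothetical, and the sketch assumes that reset() also clears the evaluation counts between folds, which this section does not state explicitly.

```javascript
// Hypothetical helper: k-fold cross-validation built only on the methods
// documented in this section. `samples` is an array of [ input, label ] pairs.
function crossValidate( classifier, samples, k ) {
  const results = [];
  for ( let fold = 0; fold < k; fold += 1 ) {
    // Every k-th sample, offset by `fold`, is held out for testing.
    const isTest = ( index ) => ( index % k ) === fold;
    samples.forEach( ( [ input, label ], index ) => {
      if ( !isTest( index ) ) classifier.learn( input, label );
    } );
    classifier.consolidate();
    samples.forEach( ( [ input, label ], index ) => {
      if ( isTest( index ) ) classifier.evaluate( input, label );
    } );
    results.push( classifier.metrics() );
    // Discard this fold's learning; the preparatory tasks are retained.
    classifier.reset();
  }
  return results;
}
```

A call such as crossValidate( myClassifier, samples, 5 ) would return one metrics object per fold. Because consolidate() throws when the training data covers only one label or is too small, each fold's training portion needs enough samples of every label.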
stats
Returns basic stats of the learning: the count of samples under each label, the total words under each label, and the size of the vocabulary.
Example
myClassifier.stats();
// -> {
// labelWiseSamples: {
// autoloan: 5,
// prepay: 4
// },
// labelWiseWords: {
// autoloan: 36,
// prepay: 26
// },
// vocabulary: 24
// };
Returns
An object containing the count of samples under each label, the total words under each label, and the size of the vocabulary.
- Type
- object