Methods
defineConfig
Defines the configuration required to read the input data and to generate the regression tree.
Example
// Define each column.
var columns = [
  { name: 'model', categorical: true, exclude: true },
  { name: 'mpg', categorical: false, target: true },
  { name: 'cylinders', categorical: true },
  { name: 'displacement', categorical: true, exclude: false },
  { name: 'horsepower', categorical: true, exclude: false },
  { name: 'weight', categorical: true, exclude: false },
  { name: 'acceleration', categorical: true, exclude: false },
  { name: 'year', categorical: true, exclude: true },
  { name: 'origin', categorical: true, exclude: false }
];
// Define parameters to grow the tree.
var treeParams = {
  minPercentVarianceReduction: 2.5,
  minLeafNodeItems: 10,
  minSplitCandidateItems: 30,
  minAvgChildrenItems: 3
};
// Define the configuration using the two variables above.
myRT.defineConfig( columns, treeParams );
// -> 8
Parameters
Name | Type | Description |
---|---|---|
inputDataCols | Array.<object> | each object in this array defines a column of the input data, listed in the same sequence in which the data will be supplied; its properties are those shown in the example above (name, categorical, exclude, target). |
tree | object | contains key/value pairs of the regression tree's parameters, as shown in treeParams in the example above. |
Returns
number of columns defined.
- Type
- number
evaluate
Incrementally evaluates variance reduction for one data row at a time.
Example
myRT.evaluate( input );
Parameters
Name | Type | Description |
---|---|---|
rowObject | object | contains column name/value pairs, including the target column's name/value pair, which is used in evaluating the variance reduction. |
Returns
always true.
- Type
- boolean
exportJSON
Exports the JSON of the rule tree generated by learn(), which may be saved in a file for later predictions.
Example
var rules = myRT.exportJSON();
Returns
JSON of the rule tree.
- Type
- json
importJSON
Imports the rule tree from the input rulesTree for subsequent use by predict().
Note that after a successful import, the tree can be used ONLY for prediction and not for further ingestion and/or learning.
Example
var anRT = regressionTree();
// Assuming that `rules` holds a valid, previously exported rule tree.
anRT.importJSON( rules );
Parameters
Name | Type | Description |
---|---|---|
rulesTree | json | containing an earlier exported rule tree in JSON format. |
Throws
if rulesTree is null.
- Type
- error
if rulesTree cannot be parsed as valid JSON.
- Type
- error
if rulesTree is of an incorrect version or an incorrect format.
- Type
- error
Returns
always true.
- Type
- boolean
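The export and import calls can be combined into a save-and-restore round trip. The following is a minimal sketch, not part of the library: it assumes that exportJSON() returns a string (the Throws list above suggests that importJSON() parses its input as JSON), and the helper names saveRules and loadRules are hypothetical. The tree arguments can be any objects exposing the methods documented here.

```javascript
var fs = require( 'fs' );

// Hypothetical helper: writes the exported rule tree to `filePath`.
// Assumes exportJSON() returns a string; any other value is stringified.
// Returns the number of characters written.
function saveRules( tree, filePath ) {
  var rules = tree.exportJSON();
  var text = ( typeof rules === 'string' ) ? rules : JSON.stringify( rules );
  fs.writeFileSync( filePath, text, 'utf8' );
  return text.length;
}

// Hypothetical helper: reads the file back and passes it to importJSON().
// Any failure (unreadable file, or a rule tree rejected by importJSON())
// is reported as `false` instead of being thrown.
function loadRules( tree, filePath ) {
  try {
    return tree.importJSON( fs.readFileSync( filePath, 'utf8' ) );
  } catch ( e ) {
    return false;
  }
}
```

A tree restored this way is meant for predict() only, as noted above.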
ingest
Ingests one row of the data at a time. It is especially useful for reading data in an asynchronous manner, where it may be used as a callback function on every row read event.
Example
// Load cars training data set.
var cars = require( 'wink-regression-tree/sample-data/cars.json' );
// Ingest the data.
cars.forEach( function ( row ) {
  myRT.ingest( row );
} );
Parameters
Name | Type | Description |
---|---|---|
row | array | one row of the data to be ingested; column values should be in the same sequence in which they are defined in the data configuration. |
Throws
if the number of elements in row does not match the number of columns defined.
- Type
- error
Returns
always true.
- Type
- boolean
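The callback usage described above can be sketched with Node's EventEmitter. This is an illustration only: the 'row' and 'end' event names and the ingestFrom helper are hypothetical and stand in for whatever asynchronous reader supplies the data, and the tree here is a stub that merely records rows.

```javascript
var EventEmitter = require( 'events' );

// Hypothetical helper: wires a row-emitting source to tree.ingest().
// `source` is assumed to emit 'row' with one array per data row and 'end'
// once the data is exhausted; `done` receives the number of rows ingested.
function ingestFrom( source, tree, done ) {
  var count = 0;
  source.on( 'row', function ( row ) {
    tree.ingest( row ); // throws if the row length does not match the columns
    count += 1;
  } );
  source.on( 'end', function () {
    done( count );
  } );
}

// Usage with a stand-in source and a stub tree that only records rows.
var source = new EventEmitter();
var seen = [];
var stubTree = { ingest: function ( row ) { seen.push( row ); return true; } };
ingestFrom( source, stubTree, function ( n ) {
  console.log( 'rows ingested:', n );
} );
source.emit( 'row', [ 'a', 1 ] );
source.emit( 'row', [ 'b', 2 ] );
source.emit( 'end' );
```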
learn
Learns from the ingested data and generates the rule tree that is used to predict() the value of the target variable from the input. It requires at least 60 data rows to initiate meaningful learning.
Example
myRT.learn();
// -> Number of rules learned
Throws
if the number of rows in the ingested data is less than 60.
- Type
- error
Returns
number of rules learned from the input data.
- Type
- number
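Because learn() throws when fewer than 60 rows have been ingested, a caller may prefer to count rows and defer learning until enough data is present. The wrapper below is a hypothetical sketch of that idea; createGuardedLearner and its -1 return value are not part of the library.

```javascript
// Minimum number of rows required by learn(), as stated above.
var MIN_ROWS = 60;

// Hypothetical wrapper: counts ingested rows so that learn() is only
// attempted once enough data is present.
function createGuardedLearner( tree ) {
  var rows = 0;
  return {
    // Ingests one row and returns the running row count.
    ingest: function ( row ) {
      tree.ingest( row );
      rows += 1;
      return rows;
    },
    // Returns the number of rules learned, or -1 when there are
    // fewer than MIN_ROWS rows and learn() was therefore not called.
    learn: function () {
      return ( rows >= MIN_ROWS ) ? tree.learn() : -1;
    }
  };
}
```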
metrics
Computes the variance reduction observed in the validation data passed to evaluate().
Example
myRT.metrics();
// -> object containing varianceReduction and data size.
Returns
containing the varianceReduction in percentage and the data size.
- Type
- object
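The exact formula behind metrics() is not stated here. One common way to express variance reduction as a percentage compares the squared prediction error with the squared deviation of the targets from their mean. The sketch below follows that definition as an assumption, not as the library's method, and accumulates its sums one row at a time in the spirit of evaluate(). All names, including the size key, are hypothetical.

```javascript
// Hypothetical incremental meter for percentage variance reduction,
// assuming VR = 100 * ( 1 - SSE / SST ), where SSE is the sum of squared
// prediction errors and SST the sum of squared deviations from the mean.
function createVarianceReductionMeter() {
  var n = 0;     // rows seen so far
  var mean = 0;  // running mean of the actual target values
  var sst = 0;   // running sum of squared deviations (Welford's update)
  var sse = 0;   // running sum of squared prediction errors
  return {
    // Adds one validation row: its actual target and the predicted value.
    add: function ( actual, predicted ) {
      var delta = actual - mean;
      n += 1;
      mean += delta / n;
      sst += delta * ( actual - mean );
      sse += ( actual - predicted ) * ( actual - predicted );
      return true;
    },
    // Returns the variance reduction in percentage and the data size;
    // the reduction is reported as 0 when the targets have no variance.
    metrics: function () {
      var vr = ( sst > 0 ) ? 100 * ( 1 - sse / sst ) : 0;
      return { varianceReduction: vr, size: n };
    }
  };
}
```

Under this definition a perfect predictor scores 100, and a predictor that always returns the mean of the targets scores 0.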
predict
Predicts the value of the target variable from the input using the rule tree generated by learn(). If the value of a column in the input data that is required for the prediction is missing, it throws an error by default. If the modifier function is defined, no error is thrown; instead, the name of the missing column is passed to this function, which is expected to handle it.
Example
// Populate sample input
var input = {
  model: 'Ford Gran Torino',
  weight: 'very high weight',
  displacement: 'very large displacement',
  horsepower: 'extremely high power',
  origin: 'US',
  acceleration: 'slow'
};
// Attempt prediction.
myRT.predict( input );
// -> 14.3
Parameters
Name | Type | Attributes | Description |
---|---|---|---|
input | object | | data containing column name/value pairs; the column names must be the same as those defined in the data configuration. |
modifier | function | <optional> | is called once a leaf node is reached during prediction with the following 5 parameters: size, mean and stdev values at the node; an array of column names navigated to reach the leaf; and the column name for which the value is missing in the input. |
Throws
if the input is not a JavaScript object.
- Type
- error
if a value of a column required for prediction is missing in input, provided modifier has not been defined.
- Type
- error
Returns
mean value or whatever is returned by the modifier function, if defined.
- Type
- number
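A modifier can return more than the bare mean. The sketch below is a hypothetical modifier named describeLeaf; the order of its five parameters follows the order in which they are listed in the table above, which is an assumption, as is the missing column being undefined when nothing is missing.

```javascript
// Hypothetical modifier: summarizes the leaf reached during prediction.
// Assumed parameter order: size, mean, stdev, columns navigated, and the
// name of the missing column (taken to be undefined when none is missing).
function describeLeaf( size, mean, stdev, columnsUsed, missingColumn ) {
  return {
    prediction: mean,
    low: mean - stdev,   // one standard deviation below the leaf mean
    high: mean + stdev,  // one standard deviation above the leaf mean
    support: size,       // number of items at the leaf
    path: columnsUsed.join( ' > ' ),
    missing: ( missingColumn === undefined ) ? null : missingColumn
  };
}

// It would be passed as the second argument, for example:
// myRT.predict( input, describeLeaf );
```

Returning an object here relies on predict() handing back whatever the modifier returns, as described under Returns.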
reset
Completely resets the tree by re-initializing all the learning-related variables, except its configuration. It is useful during cross-validation.
Example
myRT.reset();
Returns
nothing!
- Type
- undefined
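The cross-validation use of reset() can be sketched as a loop over folds. This is a hypothetical helper, not part of the library: it assumes that reset() also clears anything accumulated by evaluate(), and it takes a caller-supplied toRowObject function because ingest() receives a row array while evaluate() receives a name/value object.

```javascript
// Hypothetical helper: k-fold cross-validation over `rows`.
// `tree` is any object exposing reset(), ingest(), learn(), evaluate() and
// metrics() as documented here; `toRowObject` converts an ingested row
// into the name/value object that evaluate() expects.
function crossValidate( tree, rows, k, toRowObject ) {
  var results = [];
  for ( var fold = 0; fold < k; fold += 1 ) {
    // Keep the configuration, drop the learning from the previous fold.
    tree.reset();
    // Train on every row that does not belong to the current fold.
    rows.forEach( function ( row, i ) {
      if ( i % k !== fold ) tree.ingest( row );
    } );
    // learn() throws if the training part has fewer than 60 rows.
    tree.learn();
    // Validate on the rows held out for the current fold.
    rows.forEach( function ( row, i ) {
      if ( i % k === fold ) tree.evaluate( toRowObject( row ) );
    } );
    results.push( tree.metrics() );
  }
  return results;
}
```

Each entry of the returned array is the metrics() object for one fold.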
summary
Generates summary of the learnings in terms of the following:
- Relative importance of columns along with the corresponding min/max variance reductions (VR).
- The min/max mean values along with the corresponding standard deviations (SD).
- The minimum standard deviation (SD) discovered during the learning.
Example
myRT.summary();
// -> returns the summary object.
Returns
containing the following:
- table: array of objects, where each object defines level, columnHierarchy, nodesSplit, minVR and maxVR. A lower value of level indicates higher importance; similarly, more nodes at a level split on a columnHierarchy is an indication of importance. Therefore, it is sorted in ascending order of level, followed by descending order of nodesSplit.
- stats: object containing min.mean, min.itsSD, max.mean, max.itsSD, and minSD.
- Type
- object
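Because table is already sorted by importance, its leading entries can be read off directly. The helper below is a hypothetical sketch that formats the first n entries; it assumes only the field names listed above and coerces columnHierarchy to a string, since its exact type is not stated here.

```javascript
// Hypothetical helper: describes the first `n` entries of summary().table.
// Relies on the documented ordering (ascending level, then descending
// nodesSplit), so the first entries are the most important ones.
function topSplits( summary, n ) {
  return summary.table.slice( 0, n ).map( function ( entry ) {
    return 'level ' + entry.level + ': ' + String( entry.columnHierarchy ) +
      ' (' + entry.nodesSplit + ' nodes, VR ' + entry.minVR +
      ' to ' + entry.maxVR + ')';
  } );
}
```

It would be called as topSplits( myRT.summary(), 3 ) once learn() has run.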