Methods
defineConfig
This API has no effect and is retained only for backward compatibility.
The wink-tokenizer will now always add lemma and normal forms.
Note that lemmas are added only for nouns (excluding proper nouns), verbs and
adjectives.
Example
// This call has no effect:
myTagger.defineConfig( { lemma: false } );
// -> { lemma: true, normal: true }
Returns
Always { lemma: true, normal: true }.
- Type
- object
tag
Tags the input tokens with their pos. It is also available under the alias tagTokens().
To pos tag a sentence directly, use the tagSentence API instead.
Example
// Get the `tokenize` method from an instance of `wink-tokenizer`.
var tokenize = require( 'wink-tokenizer' )().tokenize;
// Tag the tokenized sentence.
myTagger.tag( tokenize( 'I ate the entire pizza as I was feeling hungry.' ) );
// -> [ { value: 'I', tag: 'word', normal: 'i', pos: 'PRP' },
// { value: 'ate', tag: 'word', normal: 'ate', pos: 'VBD', lemma: 'eat' },
// { value: 'the', tag: 'word', normal: 'the', pos: 'DT' },
// { value: 'entire', tag: 'word', normal: 'entire', pos: 'JJ', lemma: 'entire' },
// { value: 'pizza', tag: 'word', normal: 'pizza', pos: 'NN', lemma: 'pizza' },
// { value: 'as', tag: 'word', normal: 'as', pos: 'IN' },
// { value: 'I', tag: 'word', normal: 'i', pos: 'PRP' },
// { value: 'was', tag: 'word', normal: 'was', pos: 'VBD', lemma: 'be' },
// { value: 'feeling', tag: 'word', normal: 'feeling', pos: 'VBG', lemma: 'feel' },
// { value: 'hungry', tag: 'word', normal: 'hungry', pos: 'JJ', lemma: 'hungry' },
// { value: '.', tag: 'punctuation', normal: '.', pos: '.' } ]
Parameters
Name | Type | Description |
---|---|---|
tokens | Array.<object> | Tokens to be pos tagged. They are an array of objects and must follow the |
Returns
The pos tagged tokens.
- Type
- Array.<object>
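The returned array can be processed like any other array of objects. The sketch below is illustrative only: rather than calling the tagger, it hard-codes a few tokens copied from the example output above, and the helper name contentLemmas is hypothetical, not part of the API. It relies on the fact noted earlier that only some tokens carry a lemma property.

```javascript
// A few tagged tokens copied from the example output above.
var tagged = [
  { value: 'I', tag: 'word', normal: 'i', pos: 'PRP' },
  { value: 'ate', tag: 'word', normal: 'ate', pos: 'VBD', lemma: 'eat' },
  { value: 'the', tag: 'word', normal: 'the', pos: 'DT' },
  { value: 'pizza', tag: 'word', normal: 'pizza', pos: 'NN', lemma: 'pizza' },
  { value: '.', tag: 'punctuation', normal: '.', pos: '.' }
];

// Hypothetical helper: collect the lemma of every token that carries one.
function contentLemmas( tokens ) {
  return tokens
    .filter( function ( t ) { return t.lemma !== undefined; } )
    .map( function ( t ) { return t.lemma; } );
}

console.log( contentLemmas( tagged ) );
// -> [ 'eat', 'pizza' ]
```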
tagRawTokens
Tags the raw tokens with their pos. Note that it only categorizes each
token into one of the following three categories: (a) word, (b) punctuation,
or (c) number.
To pos tag a sentence directly, use the tagSentence API instead.
Example
var rawTokens = [ 'I', 'ate', 'the', 'entire', 'pizza', 'as', 'I', 'was', 'feeling', 'hungry', '.' ];
// Tag the raw tokens.
myTagger.tagRawTokens( rawTokens );
// -> [ { value: 'I', tag: 'word', normal: 'i', pos: 'PRP' },
// { value: 'ate', tag: 'word', normal: 'ate', pos: 'VBD', lemma: 'eat' },
// { value: 'the', tag: 'word', normal: 'the', pos: 'DT' },
// { value: 'entire', tag: 'word', normal: 'entire', pos: 'JJ', lemma: 'entire' },
// { value: 'pizza', tag: 'word', normal: 'pizza', pos: 'NN', lemma: 'pizza' },
// { value: 'as', tag: 'word', normal: 'as', pos: 'IN' },
// { value: 'I', tag: 'word', normal: 'i', pos: 'PRP' },
// { value: 'was', tag: 'word', normal: 'was', pos: 'VBD', lemma: 'be' },
// { value: 'feeling', tag: 'word', normal: 'feeling', pos: 'VBG', lemma: 'feel' },
// { value: 'hungry', tag: 'word', normal: 'hungry', pos: 'JJ', lemma: 'hungry' },
// { value: '.', tag: 'punctuation', normal: '.', pos: '.' } ]
Parameters
Name | Type | Description |
---|---|---|
rawTokens | Array.<string> | Raw tokens to be pos tagged, given as a simple array of strings. |
Returns
The pos tagged tokens.
- Type
- Array.<object>
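To make the three-way categorization concrete, here is a rough sketch of how a raw string might be sorted into word, punctuation or number. It is not the library's implementation: the helper categorize and its regular expressions are assumptions chosen for plain ASCII input, and the actual rules may differ.

```javascript
// Hypothetical helper, not the library's code: guess a category for a raw token.
function categorize( rawToken ) {
  // Optional sign, digits, optional decimal part.
  if ( /^[+-]?\d+(\.\d+)?$/.test( rawToken ) ) return 'number';
  // No letters, digits or whitespace at all.
  if ( /^[^A-Za-z0-9\s]+$/.test( rawToken ) ) return 'punctuation';
  // Everything else is treated as a word.
  return 'word';
}

var sampleTokens = [ 'I', 'ate', '2', 'pizzas', '.' ];
var categorized = sampleTokens.map( function ( t ) {
  return { value: t, tag: categorize( t ) };
} );

console.log( categorized.map( function ( t ) { return t.tag; } ) );
// -> [ 'word', 'word', 'number', 'word', 'punctuation' ]
```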
tagSentence
Tags each token of the input sentence with its pos.
Example
myTagger.tagSentence( 'A bear just crossed the road.' );
// -> [ { value: 'A', tag: 'word', normal: 'a', pos: 'DT' },
// { value: 'bear', tag: 'word', normal: 'bear', pos: 'NN', lemma: 'bear' },
// { value: 'just', tag: 'word', normal: 'just', pos: 'RB' },
// { value: 'crossed', tag: 'word', normal: 'crossed', pos: 'VBD', lemma: 'cross' },
// { value: 'the', tag: 'word', normal: 'the', pos: 'DT' },
// { value: 'road', tag: 'word', normal: 'road', pos: 'NN', lemma: 'road' },
// { value: '.', tag: 'punctuation', normal: '.', pos: '.' } ]
//
//
myTagger.tagSentence( 'I will bear all the expenses.' );
// -> [ { value: 'I', tag: 'word', normal: 'i', pos: 'PRP' },
// { value: 'will', tag: 'word', normal: 'will', pos: 'MD', lemma: 'will' },
// { value: 'bear', tag: 'word', normal: 'bear', pos: 'VB', lemma: 'bear' },
// { value: 'all', tag: 'word', normal: 'all', pos: 'PDT' },
// { value: 'the', tag: 'word', normal: 'the', pos: 'DT' },
// { value: 'expenses', tag: 'word', normal: 'expenses', pos: 'NNS', lemma: 'expense' },
// { value: '.', tag: 'punctuation', normal: '.', pos: '.' } ]
Parameters
Name | Type | Description |
---|---|---|
sentence | string | to be pos tagged. |
Throws
An error if sentence is not a valid string.
- Type
- Error
Returns
pos tagged tokens.
- Type
- Array.<object>
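Because tagSentence throws on invalid input, callers handling untrusted data may want to guard the call. The following is a minimal sketch under stated assumptions: safeTagSentence is a hypothetical wrapper, and stubTagger is a stand-in object that only mimics the documented throwing behavior so the snippet runs without the library. In real use, myTagger would be passed in its place.

```javascript
// Hypothetical wrapper: returns an empty array instead of throwing.
function safeTagSentence( tagger, sentence ) {
  try {
    return tagger.tagSentence( sentence );
  } catch ( e ) {
    // The input was rejected, e.g. it was not a valid string.
    return [];
  }
}

// Stand-in tagger (an assumption, not the real one): throws on non-string
// input and otherwise returns one bare token object per space-separated word.
var stubTagger = {
  tagSentence: function ( sentence ) {
    if ( typeof sentence !== 'string' ) throw Error( 'sentence must be a string' );
    return sentence.split( ' ' ).map( function ( w ) { return { value: w }; } );
  }
};

console.log( safeTagSentence( stubTagger, 42 ).length );       // -> 0
console.log( safeTagSentence( stubTagger, 'A bear' ).length ); // -> 2
```

Swallowing the error keeps a processing pipeline running, at the cost of hiding bad input; logging or rethrowing may be preferable when the caller needs to know.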
updateLexicon
Updates the internal lexicon using the input lexicon. If a word/pos pair
is found in the internal lexicon then its value is updated with the new pos;
otherwise it is added.
Example
myTagger.updateLexicon( { Obama: [ 'NNP' ] } );
Parameters
Name | Type | Description |
---|---|---|
lexicon | object | Object containing the word/pos pairs to add or update, for example { Obama: [ 'NNP' ] }. |
Throws
An error if lexicon is not a valid JS object.
- Type
- Error
Returns
Nothing!
- Type
- undefined
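The update rule described above (replace the pos of a known word, add an unknown one) can be modelled with plain objects. This is only a sketch of the documented semantics, not the library's internals: mergeLexicon is a hypothetical function, and the sample entries in internalLexicon are made up for illustration.

```javascript
// Hypothetical model of the update rule, not the library's internals.
function mergeLexicon( internal, input ) {
  Object.keys( input ).forEach( function ( word ) {
    // Known word: its pos list is replaced. Unknown word: it is added.
    internal[ word ] = input[ word ].slice();
  } );
  return internal;
}

// Made-up starting lexicon for the sake of the example.
var internalLexicon = { bear: [ 'NN', 'VB' ], Obama: [ 'NN' ] };

mergeLexicon( internalLexicon, { Obama: [ 'NNP' ], pizza: [ 'NN' ] } );

console.log( internalLexicon.Obama ); // -> [ 'NNP' ]
console.log( internalLexicon.pizza ); // -> [ 'NN' ]
console.log( internalLexicon.bear );  // -> [ 'NN', 'VB' ]
```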