tokens()

tokens() → { collection of tokens }

This API returns the collection of tokens contained in a winkNLP document, a sentence item, or an entity item (including a custom entity item).

If the input text is a single word, the returned collection contains a single token.

You can chain it with other winkNLP APIs such as each(), length(), and filter(). To get the output as a JavaScript data type, follow this API with out().

Example:

// Setup (assumes the wink-nlp and wink-eng-lite-web-model packages are installed)
const winkNLP = require( 'wink-nlp' );
const model = require( 'wink-eng-lite-web-model' );
const nlp = winkNLP( model );
const its = nlp.its; // helper used with out(), e.g. its.type

const text = '#Breaking: Can’t get over this #Oscars selfie from @TheEllenShow🤩! Go check it out:)https://pic.twitter.com/C9U5NOtGap #Share your best selfie@r2d2@gmail.com by 11:59 pm!';
const patterns = [
  { name: 'wordEmoji', patterns: [ '[|NOUN|PROPN|ADP] [EMOJI|EMOTICON]' ] },
  { name: 'mentionEmoji', patterns: [ '[MENTION] [EMOJI|EMOTICON]' ] }
];
nlp.learnCustomEntities( patterns );
const doc = nlp.readDoc( text );

// Tokens in winkNLP doc
const tokens = doc.tokens();
console.log( tokens.out() ); // default
// ->  [ '#Breaking', ':', 'Ca', 'n’t',...'by', '11:59', 'pm', '!' ]

console.log( tokens.out( its.type ) ); // get the token types
// -> [ 'hashtag', 'punctuation', 'word',...'word', 'punctuation' ]

// Tokens in a sentence item
const sentence = doc.sentences().itemAt( 0 ); // -> #Breaking: Can’t get..@TheEllenShow🤩!
console.log( 'Tokens in Sentence No.: 1' );
console.log( sentence.tokens().out() );
// -> Tokens in Sentence No.: 1
//     [ '#Breaking', ':', 'Ca', 'n’t',...'@TheEllenShow', '🤩', '!' ]

// Tokens in an entity item
const entity = doc.entities().itemAt( 8 ); // -> 11:59 pm
console.log( 'Tokens in Entity No.: 9' );
console.log( entity.tokens().out() );
// -> Tokens in Entity No.: 9
//    [ '11:59', 'pm' ]

// Tokens in a custom entity
console.log( doc.customEntities().out( its.detail ) );
const customEntity = doc.customEntities().itemAt( 0 ); // -> @TheEllenShow🤩
console.log( 'Tokens in Custom Entity No.: 1' );
console.log( customEntity.tokens().out() );
// -> Tokens in Custom Entity No.: 1
//    [ '@TheEllenShow', '🤩' ]
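
The chaining mentioned earlier (each(), length(), filter(), followed by out()) can be sketched as below. This is a minimal illustration, not part of the winkNLP API: the function name summarizeTokens and the shape of its return value are hypothetical, and it assumes doc comes from nlp.readDoc() and its is the nlp.its helper.

```javascript
// Hypothetical helper (not a winkNLP API): summarizes the tokens of a document.
// Assumes `doc` is the result of nlp.readDoc() and `its` is nlp.its.
function summarizeTokens( doc, its ) {
  const tokens = doc.tokens();

  // filter() narrows the collection; the result can be chained further.
  const words = tokens.filter( ( t ) => t.out( its.type ) === 'word' );

  // each() visits every token item; out() on an item returns its value.
  const hashtags = [];
  tokens.each( ( t ) => {
    if ( t.out( its.type ) === 'hashtag' ) hashtags.push( t.out() );
  } );

  return {
    total: tokens.length(),     // number of tokens in the collection
    wordCount: words.length(),  // number of tokens that passed the filter
    words: words.out(),         // out() turns the selection into a JS array
    hashtags: hashtags
  };
}
```

With a loaded winkNLP instance named nlp, a call such as summarizeTokens( nlp.readDoc( text ), nlp.its ) would return a plain object. Note that only out() and length() yield JavaScript values; filter() and each() operate on winkNLP collections.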
