Item and its properties

There are three item types: token, entity, and sentence. Items are always accessed via the collection that contains them. Like a collection, an item exposes a set of API methods, including the out() method. Every item type has its own set of properties. You can access a property by passing its.propertyName as the parameter to the item.out() method. For example, doc.tokens().itemAt(1).out( its.type ) returns the type of the second token in the document, because itemAt() counts from zero.

Note that all properties are read-only.
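
As a minimal sketch of this access pattern, the helper below reads several properties of one token through out(). The function name describeToken is hypothetical and not part of winkNLP; its is assumed to be the helper object passed to out() (typically nlp.its), and the property names are the ones mentioned in this section.

```javascript
// Hypothetical helper (not part of winkNLP): gathers a few read-only
// properties of a single token item into a plain object.
// `its` is assumed to be the winkNLP helper object, e.g. nlp.its.
function describeToken(token, its) {
  return {
    value: token.out(),                      // no argument: the value property
    type: token.out(its.type),               // token type
    normal: token.out(its.normal),           // normalized form
    isStopWord: token.out(its.stopWordFlag)  // stop word flag
  };
}

// Usage, assuming `doc` was created with nlp.readDoc( text ):
// console.log( describeToken( doc.tokens().itemAt( 1 ), its ) );
```

Because properties are read-only, the helper only reads from the item and returns a new object.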

API Methods

Let’s explore the API methods with the same example text:

var text = `The Godfather premiered on March 15, 1972. It was released on March 24, 1972. It is the first installment in The Godfather trilogy. The story of the movie spans from 1945 to 1955. About 90 percent of the film was shot in New York City. The movie was made on a budget of $7.2 million. And it has a running time of 177 minutes.`;

const doc = nlp.readDoc( text );

An item exposes API methods to:

  • access its parent document, sentence and named entity (if any),
  • find its index,
  • output its value or available properties, and
  • mark up the item for text visualization.

Method            Purpose
index()           Find the index of an item in its collection; for example, entity2.index() returns the number 2.
markup()          Mark up the item to create visualizations from information extracted from the document; see the section on “Using markup”.
out()             Obtain the value or properties of the item. By default, it returns the item's value property; for example, token1.out() returns the string 'Godfather'.
parentDocument()  Access the parent document of an item.
parentEntity()    Access the parent entity (if any) of a token item; only a token may have a parent entity. Returns undefined if there is no parent entity.
parentSentence()  Access the parent sentence of a token item or entity item.
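
The methods above can be combined to find out where a token sits in the document. The following is a rough sketch only: locateToken is a hypothetical name, and returning null for a missing entity is a choice made for this example rather than winkNLP behavior.

```javascript
// Hypothetical helper (not part of winkNLP): reports a token's position
// using only the item methods listed above.
function locateToken(token) {
  // parentEntity() returns undefined when the token is not inside an entity.
  const entity = token.parentEntity();
  return {
    index: token.index(),                    // position in the tokens collection
    sentence: token.parentSentence().out(),  // text of the enclosing sentence
    entity: entity === undefined ? null : entity.out()
  };
}

// Usage, assuming `doc` was created with nlp.readDoc( text ):
// doc.tokens().each( ( t ) => console.log( locateToken( t ) ) );
```

Checking for undefined before calling out() avoids an error for tokens that do not belong to any entity.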

Item’s Properties

Properties are specific to the item type, except for value and normal, which are available for every item type. Here is a glimpse of a few properties for each item type:

  • Token: type, stopWordFlag and normal.
  • Entity: type and span.
  • Sentence: span and sentiment.

The document, which is not exactly an item, also has properties such as sentiment and markedUpText.
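
To illustrate reading the same property at two levels, the sketch below collects the sentiment of the document and of each sentence. It assumes that sentences are reached through doc.sentences() and that the document accepts doc.out( its.sentiment ); neither call is shown in this section, and the name sentimentReport is hypothetical.

```javascript
// Hypothetical helper (not part of winkNLP): collects the document-level
// sentiment and the sentiment of every sentence.
// Assumes doc.sentences() returns the sentence collection and that
// `its` is the winkNLP helper object, e.g. nlp.its.
function sentimentReport(doc, its) {
  const sentences = [];
  doc.sentences().each((s) => {
    sentences.push({ text: s.out(), sentiment: s.out(its.sentiment) });
  });
  return { document: doc.out(its.sentiment), sentences };
}

// Usage, assuming `doc` was created with nlp.readDoc( text ):
// console.log( sentimentReport( doc, its ) );
```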

A variety of NLP tasks can be performed using these API methods and item properties. For example, we can easily extract all the sentences that refer to an event, i.e. sentences that contain a date:

doc.entities()
        .each((e) => {
          // Extract the type of the entity using .out() with its.type
          // as the input parameter.
          if (e.out(its.type) === 'DATE')
            console.log(e.parentSentence().out());
        } );
// -> 'The Godfather premiered on March 15, 1972.'
// -> 'It was released on March 24, 1972.'

Some properties, such as its.stem, are language dependent. For details, refer to the language models.
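
As a rough sketch of using a language-dependent property, the helper below collects the stem of every word token. It assumes that the loaded language model supports its.stem and that word tokens report the type 'word'; the name stemsOfWords is hypothetical.

```javascript
// Hypothetical helper (not part of winkNLP): returns the stems of all
// word tokens in a document.
// Assumes the language model supports its.stem and that the token type
// of a word is the string 'word'.
function stemsOfWords(doc, its) {
  const stems = [];
  doc.tokens().each((t) => {
    if (t.out(its.type) === 'word') stems.push(t.out(its.stem));
  });
  return stems;
}

// Usage, assuming `doc` was created with nlp.readDoc( text ):
// console.log( stemsOfWords( doc, its ) );
```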
