NLP in Agriculture

Nov 18, 2018 by Sanjaya

Farmer query data

Several years ago, even before we had the idea of making Wink, our nascent NLP code was put to the test when we got a chance to analyze the queries of Indian farmers. These queries were in Hinglish (a mix of Hindi and English) and covered subjects ranging from sowing and harvesting to plant protection. Our goal was to find the distribution of various crops and to discover trends and patterns of diseases in those crops across geographies and over time.

In June this year we got a chance to relive those memorable moments when we spotted a large corpus of farmer queries on the Open Government Data Platform. It was a gold mine for us. We decided to analyze about 3 million of them using our current tools. It was real fun! Here is a summary of the methodology:

NLP process in agriculture

We were able to extract hyper-local disease and virus trends from this data and even create day-by-day timelines, which yielded many valuable insights. Below is a heatmap visualization of blight occurrences in Maharashtra:

Blight in Maharashtra (2016)

Another unique insight was how different diseases affected onions in Ahmednagar throughout the year:

Onion Ahmednagar (2016)

These outcomes became the primary component of our keynote address at the IBM Agri Tech Challenge 2018 conference in mid-June this year. With our current Wink NLP packages we could put all of this together rapidly. It took less than a minute on a MacBook Pro to clean, classify, extract crops, and tag diseases on the entire data set.
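To give a feel for the clean, extract and tag steps, here is a minimal sketch in plain Python. It is not our actual pipeline and does not use the Wink packages: the lexicons, function names and sample queries are all hypothetical, a simple dictionary lookup stands in for real crop and disease tagging, and the classification step is left out entirely.

```python
import re
from collections import Counter

# Hypothetical lexicons (illustration only): surface forms in English and
# romanized Hindi mapped to one canonical label each.
CROPS = {"onion": "onion", "pyaz": "onion", "tomato": "tomato",
         "tamatar": "tomato", "cotton": "cotton", "kapas": "cotton"}
DISEASES = {"blight": "blight", "jhulsa": "blight", "rot": "rot", "wilt": "wilt"}


def clean(query):
    """Lowercase, turn every run of non-alphanumeric characters into a space, trim."""
    return re.sub(r"[^a-z0-9]+", " ", query.lower()).strip()


def tag(query):
    """Return the canonical crops and diseases mentioned in a single query."""
    tokens = clean(query).split()
    return {
        "crops": sorted({CROPS[t] for t in tokens if t in CROPS}),
        "diseases": sorted({DISEASES[t] for t in tokens if t in DISEASES}),
    }


def count_pairs(queries):
    """Count how often each (crop, disease) pair occurs together in a query."""
    counts = Counter()
    for query in queries:
        tags = tag(query)
        for crop in tags["crops"]:
            for disease in tags["diseases"]:
                counts[(crop, disease)] += 1
    return counts


# Made-up sample queries, not taken from the real corpus.
sample = [
    "Pyaz mein blight ka ilaj?",
    "Onion me jhulsa rog ki dawa batao",
    "Tamatar ki fasal me wilt",
]
print(count_pairs(sample))
```

Grouping the same counts by district and date would be one way to arrive at timelines and heatmaps like the ones shown above.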

We will be showcasing the detailed recipe of this project on GitHub soon. Keep an eye out as we release the next generation of tools, and be ready for more fun. 😀

For speaking engagements or any other enquiries feel free to get in touch at

