English Dictionary

This project parses the Wiktionary pages for English, with the aim of building the most comprehensive digital dictionary to date. The first two stages of the project were carried out under grants to RV College of Engineering from LexisNexis Risk’s HPCC Systems group.

In Progress

This project is still in progress.

Project Description

The objective of this project is to use NLP++ to parse the Wiktionary pages for English and create a comprehensive digital dictionary for use in natural language processing (NLP) and natural language understanding (NLU).


I’m happy to share that our paper, “Scalable Analysis of English Dictionary Files on HPCC Systems Big Data Platform,” authored by Adarsh U, David De Hilster, Hugo Watanuki, Shobha G, Jyoti Shetty, and myself, was accepted for oral presentation at the esteemed International Conference on Big Data Analytics (ICBDA), hosted at Waseda University in Tokyo. Attending the conference and presenting our findings in person was an enriching experience.