Part of the Natural Language Understanding Global Initiative is the Global Dictionary Initiative.
The idea is to produce NLP++ dictionary files for all the major languages of the world. VisualText and NLP++ are being used to parse Wiktionary pages, as well as other digital resources, to create NLP++ “dict” files that can be used in VisualText when developing human digital readers.
Here is a repository of NLP++ analyzers by David de Hilster that have been used to create numerous dictionaries: https://github.com/VisualText/dehilster-analyzers/tree/main/Dictionaries
Several dictionary efforts are underway or have already been worked on:
- English Dictionary from Wiktionary
- Portuguese Dictionary from Wiktionary
- Tamil Dictionary
- Nepali Dictionary
Other dictionaries that have been constructed include human name dictionaries and various specialized dictionaries for English:
- First-name and surname dictionaries for English, Japanese, Portuguese, Chinese, Spanish, etc.
- English dictionaries for number words, date words (months and days of the week), state names, and US Postal street names.