NLP++ in One Page

SQL grew out of putting data into tables, with columns holding values and relations linking the tables. NLP++ grew out of tokenizing text into trees, matching patterns, and building knowledge. SQL evolved from the need to write programs that manipulate data in database tables; NLP++ evolved from the need to write programs that break down text and find meaning.
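The three steps named above (tokenize into a tree, match patterns, build knowledge) can be sketched in a few lines. This is plain Python, not NLP++ syntax, and every name in it (`tokenize`, `reduce_person`, `build_knowledge`, the `_person` label, the `TITLES` set) is a hypothetical chosen for illustration; it is only meant to show the shape of the idea under those assumptions.

```python
import re

# Hypothetical list of titles that signal a person's name.
TITLES = {"Dr", "Mr", "Ms", "Mrs"}


def tokenize(text):
    """Split text into leaf nodes: the bottom layer of a parse tree."""
    tokens = re.findall(r"[A-Za-z0-9]+|[^A-Za-z0-9\s]", text)
    return [
        {"name": "_word" if tok.isalnum() else "_punct", "text": tok, "children": []}
        for tok in tokens
    ]


def reduce_person(nodes):
    """One rule pass: title, optional period, capitalized word -> one _person node."""
    out, i = [], 0
    while i < len(nodes):
        if nodes[i]["text"] in TITLES:
            j = i + 1
            # Skip the optional period after the title.
            if j < len(nodes) and nodes[j]["text"] == ".":
                j += 1
            # A capitalized word completes the pattern.
            if j < len(nodes) and nodes[j]["text"][:1].isupper():
                group = nodes[i:j + 1]
                out.append({
                    "name": "_person",
                    "text": " ".join(n["text"] for n in group),
                    "children": group,  # matched nodes become children of the new node
                })
                i = j + 1
                continue
        out.append(nodes[i])
        i += 1
    return out


def build_knowledge(nodes):
    """Collect what the tree now 'knows': the surname of each person found."""
    return {"people": [n["children"][-1]["text"] for n in nodes if n["name"] == "_person"]}


def analyze(text):
    """Run the three steps in order and return the tree plus the knowledge."""
    tree = reduce_person(tokenize(text))
    return tree, build_knowledge(tree)


if __name__ == "__main__":
    tree, knowledge = analyze("Dr. Smith met Ms Jones.")
    print([n["name"] for n in tree])
    print(knowledge)
```

Each pass rewrites the node list into a slightly taller tree, and the knowledge is read from the tree rather than from the raw text. A real analyzer would chain many such passes.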

Both languages were logical progressions of the data structures they were required to manipulate. While SQL is ubiquitous and runs almost 90% of all databases in the world, NLP++ is hardly known. This will change when two things happen:

  1. When the world becomes aware that NLP has a universal programming language as databases have SQL.
  2. When rule-based NLP is recognized as the only way to write trustworthy and human-level NLP programs.

There are two problems facing NLP++:

  1. Human language is exponentially more complex than data in database tables and there is no standard for constructing NLP++ analyzers.
  2. Data, rules, knowledge and algorithms must be created manually by humans.

Building rule-based NLP systems takes an immense amount of time, we don’t know what the answer is, and there is no incentive to build these systems (e.g. Wikipedia).

Unlike SQL, which was invented with a well-defined data space, NLP++ was designed without one.

The solution to these problems is the NLP Blockchain.

The NLP Blockchain will organize, incentivize, and decentralize NLP, sparking the “great digital migration” in which people around the world will build trustworthy NLP for all human languages. The progression from dictionaries, to simple phrases, to entity extraction, to story understanding will happen as tens of thousands of people build towards better and better rule-based NLP.

The incentive will be NLP coins mined during the “great digital migration”. Dictionaries, knowledge bases, and simpler patterns will be supervised, while analyzer code will be allowed to develop organically. Certain parsing techniques will bubble to the surface, eventually becoming “standard practice”.

This is a decades-long project where NLP++ code fully captures linguistic and world knowledge and creates the algorithms necessary for distributed, maintainable, controlled, trustworthy, and powerful NLP.

The “great digital migration” requires programmers to shift from thinking like computers, as in traditional programming, to thinking like humans.

That is why NLP++ was created: to allow for the encoding of how humans read and understand text.

NLP++ is the SQL for text.
