NLP Blockchain Harnesses Humans, Not Text

Northeastern University is launching a groundbreaking project that rethinks natural language processing by harnessing human ingenuity and people power instead of huge server farms to produce the trustworthy NLP that industry seeks. This new direction in “AI” is made possible by merging blockchain technology with a unique, rule-based programming language called NLP++. Led by Professor David de Hilster, the initiative—set to begin in fall 2025 at the Miami Campus—aims to shift NLP development away from resource-intensive, centralized large language models toward a decentralized, transparent ecosystem powered by human-crafted dictionaries, algorithms, and incentive-driven contributions through its native cryptocurrency, NLPX.

As de Hilster points out, “for decades we have been fooling ourselves that statistical systems are artificial ‘intelligence’ when they clearly are not.” This innovative project, part of the university’s Experiential Network, brings together students, remote and in-person, as well as industry experts to set new standards for linguistic and computational trustworthiness, ultimately laying the foundation for what will evolve into a human-centric revolution in AI.

Fall 2025 Project

Northeastern University and its students are joining industry experts to form a project team that will help design what Professor David de Hilster calls the “NLP Blockchain”. This is the first of what de Hilster hopes will be a series of projects. The first project lays out a blueprint for the NLP Blockchain, which has unique characteristics.

This “Extracurricular Project” is part of the Experiential Network (XN) at Northeastern University, which helps students gain experience in their respective fields by completing hands-on projects in selected courses. It starts in the fall 2025 semester, is based at the Miami Campus of Northeastern University, and includes in-person and remote students as well as industry leaders. The hope is for this project to eventually become a startup tech company.

The purpose of the NLP Blockchain is to facilitate the “great digital migration”, where human coders pass linguistic and world knowledge, along with algorithms, to computers, so that they can understand natural language as we humans do. NLP++ makes this possible.

Antithetical to LLMs

The NLP Blockchain is antithetical to Large Language Models in the following ways:

Issue | NLP Blockchain | Large Language Models
Creation | Programmers write code | Vast computer farms train on large data sets
Development Set | Small, using human ability to generalize | Huge datasets, scraped from the internet, with copyright and undisclosed data issues
Trustworthiness | Logically based and trustworthy | Probability-based and untrustworthy
Resources | Human programmers | Huge data farms that consume huge water and electricity resources
Distribution | Distributed | Centralized
Ownership | Large number of individuals | Few private hands
Cost | Tens of millions | Hundreds of billions
Timeframe | 5 – 10 years | Weeks to months
Promise | Trustworthy intelligent NLP for critical tasks | Suggestive help for non-critical tasks

What is the NLP Blockchain?

At the heart of the project lies a powerful idea: to decentralize AI and NLP development by combining the strengths of blockchain and the rule-based NLP++ programming language. The goal is to create an open, decentralized ecosystem for developing, maintaining, and using linguistic resources—such as dictionaries, knowledge bases, and algorithms—for every human language and topic domain.

The NLP Blockchain—powered by the NLPX (NLP Coin) cryptocurrency—offers a radical alternative to today’s centralized, resource-heavy AI systems. It aims to put control back in the hands of the people through collaborative, transparent, and incentivized development.

NLP++ is the only programming language explicitly created for text and NLP, enabling the NLP Blockchain to exist at all. It is the universal programming language for creating the NLP blockchain data.

Without the blockchain, NLP++ is a standalone programming language. With the blockchain, NLP++ becomes replete with dictionaries, knowledge, and algorithms that perform at human level and in a trustworthy manner.

Goals

Because the NLP Blockchain is a departure from other blockchain projects, the goals for the project will inevitably change as the project goes forward. For now, the goals of the project are:

  • Come up with a set of standards for the linguistic and world knowledge, much like the Unicode character system
  • Decide whether to use an existing blockchain or create a new one from scratch
  • Create an incentive structure for knowledge and code creators
  • Define ledgers for knowledge and code storage
  • Define ledgers for keeping track of knowledge and code use
  • Keep track of assets used by the NLP Engine
  • Design an NLP token or coin – NLPX
  • Develop apps and websites for investors and creators

Expert Team

The team consists of the co-creators of NLP++, Northeastern Adjunct Professor David de Hilster and Amnon Meyers; Zac Cohen, blockchain expert and entrepreneur; and Matthew Stroul, expert in business tech.

David and Amnon created their “dream programming language” for NLP in a startup company some 25 years ago. With interest in trustworthy NLP on the rise, interest in NLP++ has grown as well, and the pair are preparing a textbook for teaching NLP++. The textbook will be used in the first university course, to be taught in India, with the hope of teaching the class at Northeastern University in the near future.

In 2024, David met Zac Cohen, an entrepreneur and blockchain expert, who asked the fateful question: “Can NLP++ be combined with blockchain technology?” After a month or so, de Hilster and Cohen had hashed out their preliminary understanding of blockchain and NLP++, and the idea of the NLP Blockchain was born.

During the same period, de Hilster enlisted long-time friend Matthew Stroul to co-host a revival of David’s podcast “Dissident Science”. When they discussed the idea of an NLP Blockchain, Matthew was eager to join the team. Matthew has spent the last 5 years helping startups in business tech with their technical workflows.

Acceptance by the Crypto and Blockchain Community

The crypto and blockchain movement has its roots in a broader struggle for decentralized power and economic transparency—an ethos that found a significant stage during the Occupy Wall Street movement. Occupy Wall Street was not just a protest against corporate excess and economic inequality; it was also a rallying cry for reclaiming power from centralized institutions. Many involved in the early crypto scene shared these ideals, seeking to build systems that operated without the control of banks or governments. Bitcoin’s emergence following the 2008 financial crisis resonated with those who saw traditional financial structures as broken, and the subsequent decentralized technologies grew in response to these worries. In this way, blockchain became a tool for enabling trusted networks and peer-to-peer transactions, embodying the grassroots aspiration for a fairer, more transparent system.

Traditionally, AI and NLP efforts have leaned heavily on training with vast swaths of unstructured text scraped from numerous sources, most often without permission from their human authors. That approach produces models that, while useful, are untrustworthy, opaque, and prone to perpetuating biases, simply because they are based on patterns derived from “all the text in the world.” In contrast, the NLP Blockchain takes an entirely different approach. By incentivizing human programmers to create dictionaries, build explicit knowledge bases, and architect algorithms using the dedicated programming language NLP++, the system puts human insight at the forefront of NLP development. This hands-on, rule-based paradigm ensures that language processing is rooted in codified human understanding, rather than in the unpredictable outputs of massive data training.

NLP++ is designed to be, in many ways, the SQL for text. Just as SQL brought precision and structure to working with data in databases, NLP++ strives to bring the same clarity and human-level understanding to language processing. Instead of relying on statistical guesswork, programmers use NLP++ to explicitly define how natural language should be tokenized, parsed, and semantically interpreted. This allows for the creation of structured dictionaries, grammars, and algorithms that are not only transparent and verifiable, but also evolve as contributors refine and upgrade their rules and analyzer code. In doing so, the approach moves toward a standard for constructing NLP analyzers that are inherently trustworthy.
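To make the contrast with statistical guesswork concrete, here is a minimal sketch in Python, not in NLP++ syntax, of what explicit, inspectable rules can look like. The dictionary entries, the labels, and the function names `tokenize` and `tag` are all hypothetical and chosen only for illustration; a real NLP++ analyzer would be written differently. The point is simply that every label comes with a stated reason that a human can check.

```python
import re

# Hypothetical hand-written dictionary: word -> part of speech.
# These entries are illustrative only and do not come from the project.
DICTIONARY = {
    "the": "det",
    "bank": "noun",
    "approved": "verb",
    "loan": "noun",
}

# Explicit tokenization rule: runs of letters, runs of digits,
# or any single remaining non-space character.
TOKEN_PATTERN = re.compile(r"[A-Za-z]+|\d+|[^\sA-Za-z\d]")


def tokenize(text):
    """Split text into tokens using the single explicit pattern above."""
    return TOKEN_PATTERN.findall(text)


def tag(text):
    """Return (token, label, reason) triples so every decision is traceable."""
    results = []
    for token in tokenize(text):
        lower = token.lower()
        if lower in DICTIONARY:
            results.append((token, DICTIONARY[lower], "dictionary lookup"))
        elif token.isdigit():
            results.append((token, "number", "rule: all digits"))
        elif not token.isalnum():
            results.append((token, "punct", "rule: non-alphanumeric"))
        else:
            # No guess is made: unknown words are flagged for a human to add.
            results.append((token, "unknown", "no rule matched"))
    return results
```

Calling `tag("The bank approved 2 loans.")` labels “loans” as unknown rather than guessing, which is the kind of gap a contributor would close by adding a dictionary entry or a plural rule.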

The real power of the NLP Blockchain lies in its use of blockchain technology to decentralize and incentivize the NLP development process. Every contribution—whether it’s a new dictionary entry, a refined parsing algorithm, or an innovative knowledge representation—is immutably recorded on the blockchain. Contributors earn NLP coins as a reward for their work, turning the traditionally hidden labor of creating NLP tools into a verifiable, economically incentivized, and collaborative effort. This mechanism ensures that AI development isn’t monopolized by a few large entities training on uncontrolled and copyrighted data, but is instead a community-driven, democratic process. It’s a fresh take on putting AI back into the hands of the people, aligning technology development with transparency and human-centered values.
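The recording-and-reward mechanism described above can be illustrated with a toy, single-machine ledger in Python. This is only a sketch under stated assumptions: the class name `ContributionLedger`, the fixed reward of 10 coins, and the record fields are hypothetical, and there is no network, consensus, or real NLPX token here. It shows only the core idea that each record carries the hash of the previous one, so altering an earlier contribution is detectable.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class ContributionLedger:
    """Toy append-only ledger: each block stores the hash of the previous one."""

    reward_per_contribution: int = 10  # hypothetical reward, not a project figure
    blocks: list = field(default_factory=list)
    balances: dict = field(default_factory=dict)

    @staticmethod
    def _hash(record: dict) -> str:
        # Serialize with sorted keys so the same record always hashes the same way.
        payload = json.dumps(record, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def add(self, author: str, kind: str, content: str) -> dict:
        """Append a contribution and credit its author with the reward."""
        prev_hash = self.blocks[-1]["hash"] if self.blocks else "0" * 64
        record = {
            "index": len(self.blocks),
            "author": author,
            "kind": kind,        # e.g. "dictionary_entry" or "algorithm"
            "content": content,
            "prev_hash": prev_hash,
        }
        block = dict(record, hash=self._hash(record))
        self.blocks.append(block)
        self.balances[author] = (
            self.balances.get(author, 0) + self.reward_per_contribution
        )
        return block

    def is_valid(self) -> bool:
        """Recompute every hash and check that each block links to the one before."""
        prev_hash = "0" * 64
        for block in self.blocks:
            record = {k: v for k, v in block.items() if k != "hash"}
            if block["prev_hash"] != prev_hash:
                return False
            if self._hash(record) != block["hash"]:
                return False
            prev_hash = block["hash"]
        return True
```

After a few calls to `add`, `is_valid()` returns `True`; editing the `content` of any stored block afterwards makes it return `False`, because the stored hash no longer matches the recomputed one. A production blockchain adds distributed consensus on top of this, which the sketch deliberately leaves out.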

Timeframe

Whereas today’s “AI” and “tech” world moves at light speed, the NLP Blockchain moves at a human pace, but multiplied by an army of human coders.

Over the next decade, growing numbers of programmers, scientists, and laymen from industry and academia will expand on the foundations for trustworthy NLP.

And from those developers will emerge dictionaries, knowledge, and algorithms that will become the standard in logic-based and rule-based NLP of the future.

“I am happy to be the turtle. I see statistical ‘AI’ rabbits running around without a notion of where they are going, hampered by overpromise and the severe theoretical and ethical limitations of their technology. Linguistically, we know where we need to go. The fun part is getting there and using our human ingenuity to do so. Just like humans, computers must be given language. And that is what NLP++ can do. We just need a blockchain to organize and incentivize.”

-David de Hilster

NLP++ is the tortoise in the NLP technology race, being the only entry that is 100% rule-based.

Northeastern Students

If you are a Northeastern University student and want to read an in-depth syllabus for this project, please contact Professor David de Hilster and he will give you the password to this page: Calling Northeastern Students for NLP Blockchain Project – Natural Language Understanding Global Initiative
