8 Weeks with Claude and NLP++ – Natural Language Understanding Global Initiative

For eight weeks I worked alongside Claude to build version 3 of the NLP Engine and its VS Code extension. Together we landed hundreds of commits across more than a dozen repositories: new functions in the NLP++ language, cross-platform fixes, one-click Cloudflare/GitHub compilation, old RoboHelp documentation converted to Markdown, coordinated updates across eight repositories, a new Help tab, and scaffolding for more than 40 NLP++ information extractors.

Claude was a force multiplier through all of it — call it five to ten times my normal output. It was great at working with our existing C++ NLP Engine and our existing TypeScript NLP++ language extension.

But could it write NLP++? This is my journey to the answer.

My Road to LLM Force Multiplying

Let’s be honest: NLP++ is still a fairly rare language. There is very little NLP++ code on GitHub compared with the massive amount of code in C++, TypeScript, or Python. So the question became: can Claude write NLP++ effectively? Can it help? How can it help? And can it be a force multiplier for those who don’t know NLP++ but who want to build deterministic NLP analyzers and gain all the advantages that come with them?

For years, I was skeptical. My first attempts at getting LLMs to write NLP++ code came a couple of years back, when I was using Copilot inside VS Code at my job. Everyone in our programming group was given Copilot, and at that time it could help out with small pieces of code. When Copilot tried writing NLP++, it was generally a disaster — the code was some strange mix of C++, JavaScript, Python, and a peculiar “invented” NLP programming language. NLP++ is unique in the realm of computer programming languages, and with so little NLP++ code on GitHub, this was to be expected.

Fast forward to 2025, when I started using Claude, which was provided to students and faculty at Northeastern University. I am fluent in PHP and WordPress from building many websites, but I had left problems on a few of those sites hanging for years because I simply hadn’t had time for them. I started seeing if Claude could fix them, and to my surprise, it could.

It helped me “unhack” some of the WordPress websites that had been infected, and I saw that its agentic workflow could resolve many issues I never had time for. It truly was a force multiplier. As I got Claude to do more and more tasks that I myself could never get to, I started letting it tackle more and more coding for me.

Once I saw its ability to work with existing code, I moved to the NLP Engine and VisualText to see if it could work through a backlog of issues.

Worked Great for C++ and TypeScript

As it turned out, when it came to taking the NLP Engine and VisualText to version 3, Claude was a true force multiplier. But it worked for me for two very specific reasons.

First, I know the code inside and out, both the C++ engine and the TypeScript extension. I wrote all of the TypeScript code myself, and I worked with Amnon’s C++ code for the NLP Engine for years before taking it over a couple of years ago when Amnon retired. That means I can specify exactly what I want Claude to code for both the NLP Engine and VisualText, and verify what it writes as fast as it writes it.

Second, the code foundation is solid. The underlying codebase was written by senior software engineers: Amnon and me. Amnon is the architect of the NLP++ language interpreter, which is a massively impressive piece of C++ coding. His code is clean and well-structured, which means Claude has clear patterns to follow. The same goes for the TypeScript code I wrote from scratch for the NLP++ language extension for VS Code. Point Claude at good code and it usually produces more good code: plain, deterministic code you can read, diff, version, and re-run. Point it at a mess and it will faithfully extend the mess.

The First Time Claude Wrote NLP++

After the success with version 3 of the NLP Engine and VisualText, I went for broke: I asked Claude if it could write NLP++ code. I had all my repositories in one place on my laptop and pointed Claude to them to investigate. After about 10 minutes, it came back and said “yes.” So I decided to get Claude to write NLP++.

I decided to see if it could improve the analyzers that ship with the NLPPlus Python and NPM packages. So I invited Claude into my NLP++ programming session to see what it could do. The first thing I did was ask it to do some simple things, like creating more text on which to “harden” the analyzer. NLP++ does not train on data; the texts are there so that a human can run the analyzer on them and find problems with the deterministic code. I knew this was busywork it could excel at, and it did so easily.

It then came back and asked: “Do you want to run the analyzer on the new texts I created and find any problems?” Of course I said yes. At first, I did not expect it to know how to run the analyzer at all. The analyzer is usually run through the NLP++ language extension in VS Code, which calls the nlp.exe executable. I watched it look for nlp.exe, and when it couldn’t find it, I pointed out where in the VS Code extension it was located. I then watched it struggle at first with the switches and variables, but eventually it was running the NLP++ analyzer via the command line.

It ran the analyzer, found which texts didn’t produce outputs, and then started to create new rules and new analyzer passes in the NLP++ pipeline without ever being told how. It was doing everything on the command line, and given that NLP++ was designed to do everything in simple text files like the analyzer pipeline, Claude’s probability patterns generated mostly good NLP++ code. Once in a while I would see it get stuck; I would explain what was wrong, and it went on.
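The "find which texts didn’t produce outputs" step can be sketched in a few lines of Python. This is only an illustration, not how the NLP Engine or VisualText actually lays out its files: the two folders and the `.out` naming convention are assumptions made up for the example.

```python
from pathlib import Path


def find_texts_without_output(input_dir: Path, output_dir: Path) -> list[str]:
    """Return the names of input texts whose output file is missing or empty."""
    failures = []
    for text_file in sorted(input_dir.glob("*.txt")):
        # Hypothetical convention: the output for "a.txt" is written to "a.txt.out".
        out_file = output_dir / (text_file.name + ".out")
        if not out_file.exists() or not out_file.read_text(encoding="utf-8").strip():
            failures.append(text_file.name)
    return failures
```

Every name the function returns is a text the analyzer does not yet handle, and therefore a candidate for a new rule or pass.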

It found gaps in the rules, dictionary, knowledge bases, and functions and corrected them — all in NLP++. It went out to the internet and created a folder of new texts to run the analyzer on, to find where it might break. And then it proceeded to fix them. Very much unexpected.

OK, let’s push this one step further.

Can Claude Write an Entire NLP++ Analyzer?

My first attempts at getting Claude to write NLP++ code from scratch came when I asked it to scaffold a company that would use NLP++ extractors to do simple tasks like extracting telephone numbers, emails, and addresses. I got this idea from my sessions with Claude hardening the analyzer package for the NLPPlus Python and NPM packages. I asked Claude to pick out 30–40 simple extraction tasks that LLMs are currently used for, and to create a website of deterministic extractors that would use NLP++ to replace LLMs, which are inherently unreliable at extraction.

LLMs are expensive, trained on anything, and will try to do anything you prompt them for. They are not intelligent, but they are a useful autocomplete that excels at code, given that computer languages are so regular compared to natural language. Using these huge behemoths of technology to do simple extraction is doubly wrong. First, LLMs are not good at extraction: they generate output by probability, so the same input can produce different output from one run to the next. This is fatal for extraction. NLP++, on the other hand, is one input, one output, with no probability used in generation. Second, they are an expensive way to do such a simple task. So using NLP++ to replace LLMs at simple extraction made sense: an obvious candidate for tackling a specific pain point.
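The "one input, one output" property is easy to check mechanically. Here is a minimal Python sketch, with a regex email extractor standing in for a real NLP++ analyzer; the pattern and the function names are illustrative assumptions, not part of NLP++ or the NLPPlus package.

```python
import re
from typing import Callable

# Illustrative pattern only; real email syntax is looser than this.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(text: str) -> list[str]:
    """Rule-based stand-in for a deterministic extractor."""
    return EMAIL_PATTERN.findall(text)


def is_repeatable(extract: Callable[[str], list[str]], text: str, runs: int = 5) -> bool:
    """Run the extractor several times and report whether every run agrees."""
    first = extract(text)
    return all(extract(text) == first for _ in range(runs - 1))
```

A rule-based extractor passes this check by construction. An extractor that samples its output offers no such guarantee.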

In short order, Claude came up with a website and a list of possible extractors currently handled by LLMs. I then asked Claude to use the NLPPlus package to create a demo webpage that would use NLP++ analyzers to do live extraction on text that visitors could cut and paste in. It did just that for the extractors that already existed.

It then came back with a question: “Do you want me to follow the format of the existing NLP++ analyzers and scaffold the remaining analyzers?” Why not?

I said yes, and it proceeded to create NLP++ extractors from scratch, one after another, while I let it go on its own with almost no guidance. Soon there was an entire webpage with all the extractors written in NLP++. I went back and hardened many of the analyzers by pulling more realistic text from the internet. After that hardening, I went in to see what it had written.

The Good and the Bad

The good was that it did fairly well on the simpler analyzers, the kind often written in Python or regex: telephone numbers, emails, and the like. The bad was that for more complicated text, ranging from simple bank transactions all the way to entity extraction, the NLP++ analyzers it wrote were clearly unsatisfactory.

I could also go back to certain extractors, like the resume extractor, and guide it at a higher level — suggesting it create “zones” first and then extract information from those zones. I went back to several of the analyzers and experimented with vibe-coding them into a better state. I wanted to see where Claude could actually help with writing NLP++ analyzers.
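The "zones first" idea is independent of NLP++, so here is a rough Python sketch of it. The header list, the regex patterns, and the function names are all hypothetical and chosen only for illustration; a real resume analyzer would need far richer rules.

```python
import re

# Hypothetical section headers; real resumes use many more variants.
ZONE_HEADERS = {"contact", "experience", "education", "skills"}
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
YEAR_PATTERN = re.compile(r"\b(?:19|20)\d{2}\b")


def split_into_zones(resume: str) -> dict[str, list[str]]:
    """First pass: group non-empty lines under the most recent section header."""
    zones: dict[str, list[str]] = {}
    current = "header"  # lines that appear before the first recognized header
    for line in resume.splitlines():
        label = line.strip().rstrip(":").lower()
        if label in ZONE_HEADERS:
            current = label
            zones.setdefault(current, [])
        elif line.strip():
            zones.setdefault(current, []).append(line.strip())
    return zones


def extract_from_zone(zones: dict[str, list[str]], zone: str, pattern: re.Pattern) -> list[str]:
    """Second pass: apply a pattern only inside one named zone."""
    return [match for line in zones.get(zone, []) for match in pattern.findall(line)]
```

Restricting each pattern to its zone is the point of the exercise: an email address that appears under Experience is never mistaken for the candidate’s own contact address.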

But the majority of the almost 40 extractors it built on its own were not good. They would need heavy coaching, or a rewrite from scratch. Building deterministic analyzers with NLP++ is different from writing in any other programming language, so even though Claude could harden existing NLP++ analyzers written by humans, it was not so good at creating them on its own with no guidance.

In the end, what did these eight weeks of using Claude to code the NLP Engine, VisualText, and finally NLP++ analyzers reveal?

Can Claude Write NLP++?

The answer is yes, it can — but with an important caveat.

First, Claude can help with existing NLP++ analyzers. This is good, because new users can take an existing NLP++ analyzer and modify it with Claude’s help — having it create test files to harden the analyzer, and letting it run the analyzer on new text and change or add code to deal with text it has not seen. It is good at that.
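One way to keep that hardening loop honest is to record the expected output for every text and re-check all of them after each change. The Python sketch below assumes a hypothetical phone-number extractor as a stand-in for an analyzer; none of these names come from NLP++ or the NLPPlus package.

```python
import re
from typing import Callable

# Stand-in extractor: matches only the 617-555-0100 style of phone number.
PHONE_PATTERN = re.compile(r"\b\d{3}-\d{3}-\d{4}\b")


def extract_phones(text: str) -> list[str]:
    return PHONE_PATTERN.findall(text)


def find_mismatches(
    extract: Callable[[str], list[str]],
    expected: dict[str, list[str]],
) -> dict[str, tuple[list[str], list[str]]]:
    """Return {text: (expected, actual)} for every text whose output differs."""
    mismatches = {}
    for text, want in expected.items():
        got = extract(text)
        if got != want:
            mismatches[text] = (want, got)
    return mismatches
```

For example, adding the text "Fax: (617) 555-0101" with the expected result ["(617) 555-0101"] makes the check report a mismatch, which is exactly the kind of gap a new hardening text is meant to expose. Because the extractor is deterministic, the same check also catches any later change that breaks a text that used to pass.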

Second, Claude can write NLP++ and scaffold an analyzer from scratch, but only if you point it to all the NLP++ resources it needs AND you give it a specific way to process the text. You need to tell it what to look for and where, be specific about what the output should be, and give guidance about what to tackle first. Claude can only help if you have an idea of how the particular NLP task you are trying to implement can be tackled, and if you carefully guide it.

Lowering the Barrier for Deterministic Parsing

Another thing these eight weeks made me realize is that Claude can be used to lower the barrier to becoming fluent in NLP++. You can now describe your corpus and what you want extracted, point Claude to the resources it needs to mimic NLP++ code, and let Claude scaffold a starting point. This is something I didn’t expect when I started this journey. The reason Claude can do this is that it is very good at computer programming language patterns. Unlike natural language, computer programming languages like NLP++ are perfectly regular. If you get one semicolon wrong, it most likely will not compile.

Even though Claude saw very little NLP++ during its training, it can mimic the patterns in NLP++ code, because NLP++ is just another logical programming language with the same kind of regular structure.

Although I won’t cover it here, a companion how-to, Using Claude with NLP++, walks through the setup step by step and shows how you can use Claude to get on your way to creating your first deterministic NLP programs. It also explains the prompts included in version 3 of VisualText, which make Claude more capable of helping coders build deterministic NLP systems.

The Irony Is Not Lost

Using a probabilistic LLM to help build deterministic, rule-based, symbolic NLP over these eight weeks has been very ironic. Getting an untrustworthy autocomplete system to produce a trustworthy deterministic system is bizarre, to say the least. It often hit me, while working with LLMs to produce NLP++, that LLMs can help create their own replacements. That is ironic indeed.

A tool that produces an almost infinite number of outputs for the same input, creating NLP systems that produce one output for one input, is not only ironic but very, very satisfying!

What’s Coming

Digital NLP Governing Body, nlp.foundation — the governing body overseeing the great digital migration of linguistic and world knowledge, along with the linguistic algorithms that will be processed and used by computers.

Curated extractors at nlpfix.ai — ready-made, reliable extraction for people who want the result without writing NLP++ themselves. This is a work in progress, and we hope to launch in the near future.

Certification through screamingkoala.com — for people who want genuine fluency in the NLP++ language. This is a project for the future. In the meantime, we have our NLP++ textbook.