Eight Weeks with Claude and NLP++ – Natural Language Understanding Global Initiative

For the last eight weeks I’ve paired with Claude to build version 3 of the NLP Engine and its VS Code extension. Not a side experiment — the real thing, shipping code, hundreds of commits across more than a dozen repositories. Here’s some of what we did together:

Added new functions and capabilities to the NLP++ language itself
Fixed cross-platform problems in the engine
Built one-click compiling using Cloudflare and GitHub
Converted the entire old RoboHelp HTML help set into Markdown
Propagated coordinated updates across eight-plus repositories
Shipped a new Help tab in the extension

Across all of it, Claude was a force multiplier — call it five to ten times my normal output. And every bit of it worked for one reason: I understand every line of code, the code we wrote and the code it wrote.

Why it works for me specifically

Claude works for me because of two things, and I don’t think either is optional.

First: I am 100% familiar with the code. I know the C++ engine and the TypeScript extension inside and out. That is not a footnote; it is the whole reason this partnership works. Claude is a fast, tireless writer of plausible code, and plausible is not the same as correct. Because I can read what it produces and know instantly whether it’s right, it becomes an accelerator instead of a liability. The classic failure mode of AI-assisted development — shipping code you don’t actually understand — never happens here, because I understand all of it. That’s the bar: use Claude where you can check it.

Second: the foundation is solid. The engine and extension were written by senior software engineers, and the code is clean and well-structured. Claude does its best work on top of good code, because good code gives it clear patterns to mimic and consistent conventions to follow. Hand it a disciplined codebase and it extends it faithfully; hand it a mess and it makes a bigger one.

And whatever it writes stays glass-box: plain, deterministic code you can read, diff, version, and re-run to the exact same result. Claude is a development partner, not the runtime.

The big question: can Claude write deterministic analyzers?

Building the engine and tooling is one thing. The question I really wanted answered was about NLP++ itself: could Claude build analyzers from scratch, and could it help harden analyzers when they failed to extract information from new text?

The answer, to both, was yes.

Claude can write NLP++ code. It can even prototype a new analyzer from scratch. But there’s a real condition attached: you have to point it to the right places, and it has to understand how analyzers actually work in NLP++. Writing NLP++ is very different from writing in other programming languages: it’s a rule-based system of passes, rules, a parse tree, wildcards, and knowledge-base operations, not the imperative code most models have seen a million times. Left to guess, Claude flounders. Given the engine paths, the example analyzers, dictionaries, and knowledge bases to study, and the conventions to follow, it does real work. That’s exactly why version 3 ships a small library of ready-to-paste prompts: they hand Claude the right paths and the right guardrails so a new user doesn’t have to know them by heart.

Prototyping your first analyzer

For years the hardest part of NLP++ was the blank editor. Programmers, computational linguists, and coders didn’t know where to start or how to scaffold the specific analyzer they needed for their task. The “build an analyzer” prompt closes that gap: it gives Claude the engine, example, and template paths, and tells it to study the examples first and build on the Knowledge Base template, accumulating results into a knowledge base and emitting JSON with the library’s SaveKB and JsonKB functions instead of hand-rolling strings. It leaves you two blanks to fill in: your corpus and your extraction task. There’s also a complete worked example, a from-scratch analyzer called ChemFormulas that gathers a Wikipedia chemistry corpus, cleans it (H₂O becomes H2O), seeds it with near-miss distractors, and finds chemical formulas in prose, breaking each into its element symbols and atom counts. It’s a great way to watch an analyzer come together, and a solid model for your own first build.
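
To make the ChemFormulas task concrete, here is a minimal Python sketch of the two steps described above: normalizing subscripts and splitting a formula into element symbols and atom counts. It is an illustration only, not NLP++ and not how the ChemFormulas analyzer itself is written. The names clean and parse_formula are hypothetical, and the sketch assumes flat formulas with no parentheses, charges, or hydrates.

```python
import re
from collections import Counter

# Map Unicode subscript digits to ASCII digits (H₂O -> H2O).
SUBSCRIPTS = str.maketrans("₀₁₂₃₄₅₆₇₈₉", "0123456789")

# One element symbol (capital letter, optional lowercase letter)
# followed by an optional atom count.
ELEMENT = re.compile(r"([A-Z][a-z]?)(\d*)")
FLAT_FORMULA = re.compile(r"(?:[A-Z][a-z]?\d*)+")


def clean(text: str) -> str:
    """Replace subscript digits with plain digits (hypothetical helper)."""
    return text.translate(SUBSCRIPTS)


def parse_formula(formula: str) -> dict[str, int]:
    """Split a flat formula such as 'C6H12O6' into symbol -> atom count.

    Returns an empty dict when the string is not a plain sequence of
    symbol/count pairs. Symbols are NOT checked against the periodic
    table here; a real analyzer would use a dictionary for that.
    """
    if not FLAT_FORMULA.fullmatch(formula):
        return {}
    counts: Counter[str] = Counter()
    for symbol, number in ELEMENT.findall(formula):
        counts[symbol] += int(number) if number else 1
    return dict(counts)


water = parse_formula(clean("H₂O"))
glucose = parse_formula("C6H12O6")
near_miss = parse_formula("H2o")  # lowercase 'o' is a distractor, not oxygen
```

The near-miss case shows why distractors belong in the corpus: a pattern that is too loose would accept "H2o", and only a test input like that reveals it.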

Hardening the analyzers you already have

This is the sweet spot. Claude is at its best not on the blank page but on an analyzer that already works: the one that’s right most of the time and then chokes on real-world input you never anticipated. A date format you didn’t handle. A label with an extra word between it and its value. A rule that fires one token too greedily. The “harden” prompt automates the loop I’d otherwise do by hand: get more real text, run it through the engine, see what breaks, tighten the rule, re-run. It generates varied test inputs with the edge cases you name, runs the analyzer over them, and reports back where the extraction looks wrong. Two more prompts round out the workflow: one builds NLP++ dictionaries and knowledge bases in the exact format the engine expects, and another extends the full English dictionary from the missing-words list an analyzer collects, adding properly featured entries and quarantining the noise for your review.
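
The run-and-report part of that loop can be sketched in a few lines of Python. This is an assumption-laden illustration, not the “harden” prompt or the engine’s interface: find_failures is a hypothetical helper, and toy_extract_date is a stand-in for running a real analyzer over the text. The toy deliberately handles only one date format so the harness has something to catch.

```python
import re
from typing import Callable, Optional

# (input text, expected extraction or None when nothing should be found)
Case = tuple[str, Optional[str]]
Failure = tuple[str, Optional[str], Optional[str]]


def find_failures(
    extract: Callable[[str], Optional[str]], cases: list[Case]
) -> list[Failure]:
    """Run the extractor over every case and collect the mismatches
    as (text, expected, got)."""
    failures: list[Failure] = []
    for text, expected in cases:
        got = extract(text)
        if got != expected:
            failures.append((text, expected, got))
    return failures


def toy_extract_date(text: str) -> Optional[str]:
    """Stand-in for an analyzer run: recognizes ISO dates only."""
    match = re.search(r"\b\d{4}-\d{2}-\d{2}\b", text)
    return match.group(0) if match else None


CASES: list[Case] = [
    ("Filed on 2024-03-15.", "2024-03-15"),
    ("Filed on 15 March 2024.", "15 March 2024"),  # a format the toy rule misses
    ("No date here.", None),
]

failures = find_failures(toy_extract_date, CASES)
for text, expected, got in failures:
    print(f"MISMATCH: {text!r} expected {expected!r}, got {got!r}")
```

Because the extractor is deterministic, re-running the same cases after tightening a rule gives a clean before-and-after comparison, which is what makes the loop worth automating.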

The honest limit

So Claude can scaffold a new analyzer and harden an existing one, and for anyone who wants to write deterministic NLP, that is a genuine step forward. But let me be just as clear about what it can’t do: Claude is not capable of writing industry-ready analyzers on its own. That still has to be done by humans. It builds the scaffolding and grinds the edge cases; the engineering judgment — the architecture, the precision, the decisions that make an analyzer trustworthy in production — stays with a person who knows what they’re doing.

The irony isn’t lost on me

There’s a paradox at the center of all this, and I want to name it plainly: I’m using a large language model to help build the very thing that could, one day, replace it. NLP++ is deterministic, transparent, trustworthy symbolic NLP — every decision is a rule you can read, every result one you can reproduce. LLMs are the opposite. They are astonishing, and they are probabilistic black boxes. For anything that demands an auditable, defensible answer — in law, medicine, finance, anywhere the stakes are real — “usually right” isn’t good enough, and “I can’t show you why” is disqualifying. That is precisely the ground symbolic NLP is built to stand on.

So there’s something fitting about a brilliant improviser helping you write down the score. Claude is accelerating the construction of the kind of NLP that doesn’t need Claude at runtime. It will take decades to build symbolic NLP out to the breadth today’s models cover, and I have no illusions about the size of that job. But it can happen, and I’d argue it must. The world is going to run more and more on automated language decisions, and those decisions should be ones we can open up and examine, not take on faith.

What’s coming

Not everyone wants to write analyzers, and not everyone wants to stay a beginner — so we’re building for both. Curated extractors are on the way for people who just want reliable, ready-made extraction without writing a line of NLP++. And for those who want to go the other direction and become genuinely fluent in this new programming language, we’re rolling out certification. Whether you want the results handed to you or you want to master the craft, there will be a path.

The bottom line

After eight weeks and hundreds of commits, my take is straightforward. Claude is a superb development partner for NLP++ when two things are true: you know your code well enough to check its work, and the code it’s building on is solid. Under those conditions it’s a five-to-ten-times force multiplier — fastest at exactly the work I value most, hardening real analyzers against real text — while never taking the wheel. It can scaffold and it can harden; it cannot replace the human who ships the finished, trustworthy analyzer. Let Claude gather the texts, mimic the patterns, and grind the edge cases, and keep the engineering judgment where it belongs. It’s helping build something it can’t be: NLP we can fully trust. That’s the most worthwhile work I could ask of it.