Claude Can Write NLP++ — And That Changes Who Gets to Use It

Claude can write NLP++ — and that opens the door to everyone who once found it too daunting to try. If Claude stays available and its top models stay within reach, that door is open to almost anyone. We built NLP++ Version 3 and scaffolded 40 extractors using Claude Max 20x, but even a Claude Pro subscription puts these models in nearly anyone’s hands today. And what’s on the other side of the door is the point: not an opaque model you take on faith, but real NLP++ you can read, audit, and trust — code that stands on its own, whatever happens to the tool that helped you write it.

Intro

I’ve spent years as something of a warrior on this. I’ve argued, loudly and repeatedly, that large language models are glorified auto-completes — that they aren’t intelligent in any meaningful sense, that they won’t replace anyone’s job, and that the hype around them is mostly hype. I still believe that. Nothing in the eight weeks I’m about to describe has changed my mind on the fundamentals.

But being right about what a tool is isn’t the same as saying the tool is useless. An LLM is a pattern-matcher over text, and computer code, unlike human language, is extraordinarily regular. Human language is riddled with ambiguity, context, and meaning that no amount of surface pattern-detection can fully reach, which is exactly why NLP++ exists. Computer code, by contrast, has strict syntax, predictable structure, and a narrow space of valid forms. Detecting patterns in that kind of text is a natural fit, and arguably what these models are best at. So it shouldn’t be surprising that an LLM turns out to be genuinely useful for helping write code. And because NLP++ is a true programming language with strict syntax, it too can be mimicked by these pattern machines.

That’s the spirit of what follows: a report on putting a tool to work where it actually fits. Because in the end, that’s all it is: a tool. And like every tool ever made, it’s only as good as the person wielding it, and only as smart as the hands that know when and how to use it.

My Journey

For the last eight weeks I’ve paired with Claude to build version 3 of the NLP Engine and its VS Code extension. Not a side experiment — the real thing, shipping code, hundreds of commits across more than a dozen repositories. Here’s some of what we did together:

  • Added new functions and capabilities to the NLP++ language itself
  • Fixed cross-platform problems in the engine
  • Built one-click compiling using Cloudflare and GitHub
  • Converted the entire old RoboHelp HTML help set into Markdown
  • Percolated coordinated updates across eight-plus repositories
  • Shipped a new Help tab in the extension
  • Helped scaffold and harden over 40 information extractors

Across all of it, Claude was a force multiplier — call it five to ten times my normal output. And every bit of it worked for one reason: I understand every line of code, the code we wrote and the code it wrote.

Why it works for me specifically

Claude works for me because of two things, and I don’t think either is optional.

First: I am 100% familiar with the code. I know the C++ engine and the TypeScript extension inside and out. That is not a footnote; it is the whole reason this partnership works. Claude is a fast, tireless writer of plausible code, and plausible is not the same as correct. Because I can read what it produces and know instantly whether it’s right, it becomes an accelerator instead of a liability. The classic failure mode of AI-assisted development — shipping code you don’t actually understand — never happens here, because I understand all of it. That’s the bar: use Claude where you can check it.

Second: the foundation is solid. The engine and extension were written by senior software engineers, and the code is clean and well-structured. Claude does its best work on top of good code, because good code gives it clear patterns to mimic and consistent conventions to follow. Hand it a disciplined codebase and it extends it faithfully; hand it a mess and it makes a bigger one.

And whatever it writes stays glass-box: plain, deterministic code you can read, diff, version, and re-run to the exact same result. Claude is a development partner, not the runtime.

The big question: can Claude write deterministic analyzers?

Building the engine and tooling is one thing. The question I really wanted answered was about NLP++ itself: could Claude build analyzers from scratch, and could it help harden analyzers when they failed to extract information as new text arose?

The answer, to both, was yes.

Claude can write NLP++ code. It can even prototype a new analyzer from scratch. But there’s a real condition attached: you have to point it to the right places, and it has to understand how analyzers actually work in NLP++. Writing NLP++ is very different from writing in other programming languages: it’s a rule-based system of passes, rules, a parse tree, wildcards, and knowledge-base operations, not the imperative code most models have seen a million times. Left to guess, Claude flounders. Given the engine paths, the example analyzers, dictionaries, and knowledge bases to study, and the conventions to follow, it does real work. That’s exactly why version 3 ships a small library of ready-to-paste prompts: they hand Claude the right paths and the right guardrails so a new user doesn’t have to know them by heart.

Prototyping your first analyzer

For years the hardest part of NLP++ was the blank editor. Programmers, computational linguists, and coders didn’t know where to start or how to scaffold the specific analyzer they needed for their specific task. The “build an analyzer” prompt closes that gap: it gives Claude the engine, example, and template paths, tells it to study the examples first and build on the Knowledge Base template (accumulating results into a knowledge base and emitting JSON with the library’s SaveKB and JsonKB functions instead of hand-rolling strings), and leaves you two blanks to describe your corpus and your extraction task. There’s also a complete worked example, a from-scratch analyzer called ChemFormulas that gathers a Wikipedia chemistry corpus, cleans it (H₂O becomes H2O), seeds it with near-miss distractors, and finds chemical formulas in prose, breaking each into its element symbols and atom counts. It’s a great way to watch an analyzer come together, and a solid model for your own first build.
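
To make the ChemFormulas task concrete, here is a minimal Python sketch of the same idea: normalize subscripts, find formula-like tokens, and split each into element symbols and atom counts. This is not the ChemFormulas analyzer and it is not NLP++. The short element list, the function names, and the "at least two atoms" filter are all assumptions made for illustration; a real analyzer would use a full dictionary and proper rules.

```python
import re

# Map Unicode subscript digits to ASCII digits, so "H₂O" becomes "H2O".
SUBSCRIPTS = str.maketrans("₀₁₂₃₄₅₆₇₈₉", "0123456789")

# Illustrative subset only (assumption); a real build would load every element.
ELEMENTS = {"H", "C", "N", "O", "Na", "Cl", "S", "Ca", "Fe"}

# One element symbol followed by an optional atom count.
TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def clean(text):
    """Normalize subscript digits to plain digits."""
    return text.translate(SUBSCRIPTS)


def parse_formula(candidate):
    """Return [(symbol, count), ...], or None if candidate is not a formula."""
    pos = 0
    parts = []
    for m in TOKEN.finditer(candidate):
        if m.start() != pos:          # gap: something other than symbol+count
            return None
        symbol, digits = m.group(1), m.group(2)
        if symbol not in ELEMENTS:    # near-miss distractor such as "No" or "He"
            return None
        parts.append((symbol, int(digits) if digits else 1))
        pos = m.end()
    if not parts or pos != len(candidate):
        return None
    return parts


def find_formulas(text):
    """Map each formula found in prose to its element symbols and counts."""
    found = {}
    for word in re.findall(r"[A-Za-z0-9]+", clean(text)):
        parsed = parse_formula(word)
        # Skip lone symbols like "C" or "Na"; require two or more atoms.
        if parsed and (len(parsed) > 1 or parsed[0][1] > 1):
            found[word] = parsed
    return found
```

Calling `find_formulas("Water is H₂O, and table salt is NaCl. Hello, No help.")` returns entries for `H2O` and `NaCl` and rejects the distractors `Hello` and `No`. The sketch still has blind spots, such as all-caps words that happen to spell element symbols, which is exactly the kind of gap that seeding the corpus with near-miss distractors is meant to surface.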

Hardening the analyzers you already have

This is the sweet spot. Claude is at its best not on the blank page but on an analyzer that already works — the one that’s right a majority of the time and then chokes on new real-world input you never anticipated. A date format you didn’t handle. A label with an extra word between it and its value. A rule that fires one token too greedily. The “harden” prompt automates the loop I’d otherwise do by hand: get more real text, run it through the engine, see what breaks, tighten the rule, re-run. It generates varied test inputs with the edge cases you name, runs the analyzer over them, and reports back where the extraction looks wrong. Two more prompts round out the workflow — one builds NLP++ dictionaries and knowledge bases in the exact format the engine expects, and another extends the full English dictionary from the missing-words list an analyzer collects, adding properly featured entries and quarantining the noise for your review.
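
The hardening loop described above can be sketched in a few lines of Python. This is only an illustration of the run, compare, and report cycle, not the actual "harden" prompt or the NLP++ engine: `extract_date` is a hypothetical stand-in for an analyzer, and the test cases are invented for the example.

```python
import re


def extract_date(text):
    """Toy stand-in for an analyzer (assumption): finds YYYY-MM-DD dates only."""
    m = re.search(r"\b\d{4}-\d{2}-\d{2}\b", text)
    return m.group(0) if m else None


def harden_report(analyzer, cases):
    """Run every (input, expected) case and return the ones that fail."""
    failures = []
    for text, expected in cases:
        got = analyzer(text)
        if got != expected:
            failures.append({"input": text, "expected": expected, "got": got})
    return failures


# Invented cases: one the rule handles, one edge case it never anticipated,
# and one input with nothing to extract.
CASES = [
    ("Filed on 2024-03-05.", "2024-03-05"),
    ("Filed on 03/05/2024.", "2024-03-05"),
    ("No date here.", None),
]

if __name__ == "__main__":
    for failure in harden_report(extract_date, CASES):
        print(failure)
```

Here the report contains a single failure, the slash-formatted date, which tells you which rule to tighten before re-running. Because the stand-in analyzer is deterministic, running the report twice gives the same list, so a fix can be confirmed by watching that list shrink.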

The honest limit

So Claude can scaffold a new analyzer and harden an existing one, and for anyone who wants to write deterministic NLP, that is a genuine step forward. But let me be just as clear about what it can’t do: Claude is not capable of writing industry-ready analyzers on its own. That still has to be done by humans. It builds the scaffolding and grinds the edge cases; the engineering judgment — the architecture, the precision, the decisions that make an analyzer trustworthy in production — stays with a person who knows what they’re doing.

Lowering the barrier to entry

There’s a group I haven’t mentioned yet, and they may matter most: the people who’ve looked at NLP++ and quietly decided it isn’t for them. Not because they doubt the technology, but because a rule-based system of passes, parse trees, and knowledge-base operations sits well outside their comfort zone. It looks like a language you’d have to learn from the ground up before you could do anything useful — and for a lot of capable people, that blank editor was a wall, not a door.

Claude changes that math. You no longer have to be fluent before you start. You can describe the corpus you have and the information you want pulled out of it, hand Claude the right prompt, and watch a working analyzer take shape — one you can then read, run, and tweak. The knowledge is still being built into real, deterministic NLP++ code; you’re just no longer required to summon it all from memory on day one. Claude becomes the on-ramp: it scaffolds, explains, and shows you the patterns, so learning NLP++ happens by reading and adjusting real analyzers instead of staring at an empty file.

That’s the part I find genuinely exciting. For years the price of admission to NLP++ was steep familiarity with an unfamiliar paradigm. With Claude alongside you, curiosity is enough to get started. You still grow into the craft — the engineering judgment is still yours to develop — but the fear of it being “out of my wheelhouse” stops being a reason to never try.

One honest caveat: this on-ramp holds only as long as Claude Pro or better stays within reach. The scaffolding that makes the paradigm approachable depends on access to a capable model — so if you’ve been on the fence, this is an argument for starting now, while the door is open, and for treating what you learn as yours to keep. The NLP++ code you build is deterministic and runs on its own; the analyzer you walk away with doesn’t need Claude to keep working, even if your on-ramp someday does.

The irony isn’t lost on me

There’s a paradox at the center of all this, and I want to name it plainly: I’m using a large language model to help build the very thing that could, one day, replace it. NLP++ is deterministic, transparent, trustworthy symbolic NLP — every decision is a rule you can read, every result one you can reproduce. LLMs are the opposite. They are astonishing, and they are probabilistic black boxes. For anything that demands an auditable, defensible answer — in law, medicine, finance, anywhere the stakes are real — “usually right” isn’t good enough, and “I can’t show you why” is disqualifying. That is precisely the ground symbolic NLP is built to stand on.

So there’s something fitting about a brilliant improviser helping you write down the score. Claude is accelerating the construction of the kind of NLP that doesn’t need Claude at runtime. It will take decades to build symbolic NLP out to the breadth today’s models cover, and I have no illusions about the size of that job. But it can happen (see the NLP Foundation website), and I’d argue it must. The world is going to run more and more on automated language decisions, and those decisions should be ones we can open up and examine, not take on faith.

What’s coming

Not everyone wants to write analyzers, and not everyone wants to stay a beginner — so we’re building for both. Curated extractors are on the way for people who just want reliable, ready-made extraction without writing a line of NLP++. And for those who want to go the other direction and become genuinely fluent in this new programming language, we’re rolling out certification. Whether you want the results handed to you or you want to master the craft, there will be a path.

The bottom line

After eight weeks and hundreds of commits, my take is straightforward. Claude is a superb development partner for NLP++ when two things are true: you know your code well enough to check its work, and the code it’s building on is solid. Under those conditions it’s a five-to-ten-times force multiplier — fastest at exactly the work I value most, hardening real analyzers against real text — while never taking the wheel. It can scaffold and it can harden; it cannot replace the human who ships the finished, trustworthy analyzer. Let Claude gather the texts, mimic the patterns, and grind the edge cases, and keep the engineering judgment where it belongs. It’s helping build something it can’t be: NLP we can fully trust. That’s the most worthwhile work I could ask of it.