I have been in computational linguistics for more than 40 years, and this is the first time I have been to the most important conference in our field: the annual Association for Computational Linguistics (ACL) Conference. As part of the registration process, I became a member for the first time – a somewhat embarrassing fact, given that 25 years ago Amnon Meyers and I came up with what we called our “dream computer programming language” for computational linguistics: NLP++. And now, with everyone realizing (something we in computational linguistics have known for a while) that large language models and other statistical methods have severe limitations, NLP++ is attracting more and more interest.
Attending ACL 2024 made me realize this: why didn’t we come to the community with this programming language earlier?
This year, I was invited by Clemson to accompany master’s degree recipient Ashton Williamson, who presented a poster in the BioNLP workshop titled “Low-resource ICD Coding of Hospital Discharge Summaries”. I mentored Ashton during the last year and a half of his master’s thesis, which used NLP++ as a core technology. I found the conference fruitful and eye-opening. It was not only amazing to be in Thailand, but amazing to see the current state of NLP and both the “lack of” and the “appetite for” rule-based and knowledge-based systems.
I spent a lot of my time during the last few decades in the NLP industry using our then “proprietary” technology. But recently, I have become more involved with universities and students around the world, given that in 2018, Amnon and I made our dream NLP computer language and framework open source and available to the general public. Going open-source led me to mentor interns during my last years at LexisNexis, which spawned various projects involving NLP++, including the first master’s thesis using that technology.
Attendance
Most of the 3800 attendees came from Thailand and its surroundings, including students and professors from China, Japan, and India. Many also joined from Europe and the United States.
All the big tech companies were present, including Apple, Google, Facebook, and Amazon, all recruiting due to the LLM frenzy. The ACL took full advantage of the extra attention from both sponsors and students. But the attention is both good and bad, an opinion I share with others in the computational linguistics world. More on that later.
Posters – LLMs Everywhere
I was shocked to see that 95% of the poster sessions were about LLMs. There was little in the way of computational linguistics – a point emphasized in the president’s closing keynote. Given the inherent problems with statistical systems, NLP++ is presented with a great opportunity, enabling linguists and industry to incorporate rule-based and knowledge-based methods.
Ashton’s Poster Presentation
On the last day of the workshops, Ashton presented his poster. He spent the time talking with conference-goers who were intrigued by the use of a rule-based system instead of statistical systems. Some people had no clue what a rule-based system was, and some even laughed at the idea. I was also there to interact with conference attendees, and I handed out information about downloading NLP++ and using it right away with VSCode. Computational linguists in particular were intrigued by the rule-based systems approach.
Teach NLP Workshop
I attended the full-day Teach NLP workshop, held every few years and organized by Laura Biester and Margot Mieskes. Clearly, a good way to promote grassroots interest in and use of NLP++ is to teach a course in hands-on rule-based and knowledge-based NLP.
The opening presentation comprised a round-table discussion of teaching NLP, with participants David Adelani from McGill, Graham Neubig from CMU, Lori Levin from CMU, and Aiala Rosá from Universidad de la República, Uruguay. My interest was piqued by Dr. Lori Levin and her project, the “International Linguistics Olympiad”, whose goal is to interest high school students in linguistics. I have contacted her about using NLP++ in teaching NLP, and I look forward to speaking with her and others from the group about that possibility.
Having talked with numerous people from the group, I expanded our NLP Course Project page with more detailed information. I look forward to helping universities around the world reintroduce computational linguistic programming into their curricula, following the statistical takeover of the NLP curriculum during the past decades.
Keynotes
Although I unfortunately missed the first days of the conference, I did attend the other keynote talks in the main room. All were enlightening, but I most enjoyed the president’s talk. I especially enjoyed seeing a colleague receive the lifetime achievement award.
Blast from the Past: Ralph Grishman
It was a pleasant surprise when the lifetime achievement award went to a person whom Amnon and I have known for decades: Ralph Grishman. He received his award for work in information extraction, something near and dear to Amnon and me. Even though we were often at odds with Ralph over the years, it was sad to see that he had Parkinson’s disease, and I was glad to see someone from our era get this distinguished award. He appeared live for a few minutes, but speaking was laborious. His presentation was delivered as a recording of his talk read by a colleague.
His talk featured the Message Understanding Conferences (MUC) sponsored by DARPA, in which Amnon and I participated with precursory work to NLP++. Amnon was instrumental in working with Beth Sundheim in organizing the first MUC conferences in the early 1990s.
I spoke briefly with ACL president Dr. Emily Bender after her “controversial” talk “ACL is not an AI Conference” (something I agree with wholeheartedly), mentioning to her how shocked I was to see almost no computational linguistics posters or papers. She seemed to agree, and her talk made it clear that the ACL needs to get back to its roots. This is something I think NLP++ could help with immensely.
I also enjoyed how she set one of the students straight, arguing (correctly) that language models don’t reason and that their unreliable output can cause harm.
NLP++ And Computational Linguistics
After the conference, I realized that computational linguistics could take advantage of the NLP++ framework in various ways. NLP++ provides an open-source, uniform computer language and knowledge representation that is human readable and usable world-wide. And it works for all languages and Unicode. It even works with emojis. Here are just some of the ways computational linguistics can exploit NLP++ (some items are already in progress):
- Design and refine rule-based extraction systems for specific tasks
- Build knowledge bases and dictionaries for areas such as law, medicine, and business
- Build dictionaries and grammars for lesser-known languages
- Teach hands-on rule-based and knowledge-based NLP
- Use VisualText as a corpus study tool
- Test computational and linguistic theories and concepts on real-world text
As per the thesis of our NLU Global Initiative, NLP++ and the world-wide computational linguistics community can come together to build dictionaries, knowledge bases, analyzers, and algorithms to do trustworthy NLP.
NLP++ Needs to be Active in ACL Events
So what were my main takeaways from my first ACL conference, as a computational linguist for four decades? For me, they are: NLP++ needs to have a greater presence at the conference and at universities, and the ACL community needs such a system and can greatly benefit from it.
NLP++ needs to be an active component of the ACL conferences now and in the future. The ACL provides an excellent venue for spreading the word about NLP++ and the NLU Global Initiative. Computational linguists I talked to showed keen interest in knowing more about the NLP++ rule-based and knowledge-based framework.
We need to publish papers, deliver keynote presentations, and help universities use NLP++ in their coursework as well as research. This has already started to happen, for example with Ashton obtaining a master’s degree from Clemson University using the technology. Other examples include more than a dozen interns from universities around the world, who have used NLP++ in formal projects and have published papers. In my opinion, we have just begun.
I can see a day in the future where ACL conferences are replete with projects taking advantage of NLP++, the only computer programming language dedicated to computational linguistics, and even integrating with statistical processing to help create more trustworthy systems.
Overall, I’m extremely happy to have participated in my first ACL conference, and it certainly will not be my last. I look forward to many more interactions with this amazing community, and I’m hoping to contribute to the wonderful area of study we call computational linguistics. After all, understanding how people understand text reveals a lot about how we act and think about the world around us.
Thank Yous
I want to thank LexisNexis Risk and HPCC Systems for their support of Ashton’s project that used the NLP++ technology. I also want to thank the co-authors, Amnon Meyers, Hugo Wantanuki, Dr. Amy Apon and Dr. Nina Hubig.
I also want to thank Dr. Apon and Clemson University for sponsoring me to accompany Ashton on this amazing trip. I had never been to Thailand, and it was a most memorable trip.
I’d also like to thank the ACL for continuing to move computational linguistics forward, and I hope to become a more integral part of this amazing community.
And lastly, thank you to my amazing Thai sister Nantarika Chansue, who introduced me to Thailand and showed me Bangkok and its amazing culture, customs and food!