Regex for NLP

Regex is ubiquitous in programming because it is a useful rule-based language for parsing text. Programmers find comfort in writing explicit, modifiable rules to parse text. This contrasts with black-box statistical models, which cannot be modified when things go wrong – […]


NLP++ and LLM

Trustworthy NLP systems must be rule- and knowledge-based, given that statistical systems such as large language models, machine learning, and neural networks are not trustworthy. With the advent of large language models that can be queried about common knowledge, it is natural to use them to generate linguistic and world knowledge […]


Guilherme Santos da Silva

Guilherme Santos da Silva has a degree in Computer Engineering from the Federal Technological University of Paraná, Brazil and is currently an employee of LexisNexis Risk Brazil. He discovered HPCC Systems in 2021 when he joined LexisNexis as an intern and participated in the 2021 HPCC Systems Poster Contest with […]


Python Package for NLP++

The first version of our NLPPlus Python package is ready to use. We are still waiting on approval of the package on the Python package website, but it is available as a download from our GitHub repository (https://github.com/VisualText/py-package-nlpengine). The NLPPlus Python package for NLP++ allows Python programmers to call NLP++ analyzers […]


Portuguese Dictionary

The first steps toward creating a Portuguese dictionary have been taken, and the work can be found in the GitHub repository: http://github.com/VisualText/dict-pt-br. This was started by NLP++ co-author David de Hilster, given that he is fluent in Portuguese and that no digital dictionary for Portuguese is available. Video Sessions This is a video […]


English Dictionary

This project involves parsing the Wiktionary pages for English into the most comprehensive digital dictionary ever created. The first two stages of this project have been done via grants to RV College of Engineering from LexisNexis Risk’s HPCC Systems group. In Progress This project is still in progress. Project Description […]


Dr. Jyoti Shetty

Dr. Jyoti Shetty is an Assistant Professor in the Computer Science and Engineering Department at the RV College of Engineering. In collaboration with students, she has executed several projects on HPCC Systems, including implementing a distributed DBSCAN, providing evaluation metrics for a clustering algorithm, an IoT plugin for HPCC Systems, an OpenCV […]


NLP Course Using NLP++

One of the more important projects we are currently working on is the creation of high school and college-level courses on NLP using NLP++. NLP courses at universities almost exclusively concentrate on statistical methods like Machine Learning, Neural Networks, and Large Language Models. NLP courses that do not use statistical […]
