This was an internship project by Dheemonth Kolali in the summer of 2023, sponsored by HPCC Systems of LexisNexis Risk. The purpose of this NLP++ analyzer was to analyze tweets from cricket games to determine the sentiment about a cricket player or team.
Cricket is followed by millions of people around the world, and a great deal of emotion about it is expressed on Twitter. These tweets reflect the passion people have for the game and show how cricket can bring people together and create shared experiences.
Sentiment analysis is one of the most popular methods for analyzing emotions on Twitter. Traditional Natural Language Processing (NLP) for English, however, often assigns a sentiment different from the one the author intended, because the way people construct phrases and express sentiment in tweets differs from what modern machine learning (ML) algorithms expect. A user may tweet “He is a great player” followed by a laughing emoji, which is actually sarcasm rather than the positive sentiment an ML algorithm would typically assign. But what if the assigned sentiments could closely match human emotions? NLP++ provides that “plus-plus” feature for applying human-based sentiments to tweets.
In the first stage, we created a number of parsers and an analyzer using NLP++ (VisualText). We defined rules that assign sentiments in a generic way rather than targeting specific cases. NLP++ helped in constructing parsers that assign sentiments based on the user, cricket terms, player and team interests, and team support.
The second phase centered on the sentiments assigned to emojis. NLP++ dictionaries can hold emojis, and we used this capability to assign sentiments to the cricket tweets. We started by building an emoji dictionary with attributes such as positive, negative, and neutral. The rules defined in the parsers then use these attribute values to generate the sentiments.
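The idea of an emoji dictionary whose attributes feed the sentiment rules can be illustrated with a minimal Python sketch. This is not the NLP++ implementation from the project: the emoji entries, the small word list, the function name `tweet_sentiment`, and the rule that emoji attributes take precedence over word-level sentiment are all assumptions made for illustration.

```python
# Illustrative emoji dictionary. The entries and their attributes are
# assumptions, not the values used in the actual analyzer.
EMOJI_SENTIMENT = {
    "\U0001F389": "positive",  # party popper
    "\U0001F44F": "positive",  # clapping hands
    "\U0001F621": "negative",  # pouting face
    "\U0001F602": "negative",  # face with tears of joy, treated here as mockery
    "\U0001F610": "neutral",   # neutral face
}

# Tiny hypothetical word lexicon standing in for the rule-based parsers.
WORD_SENTIMENT = {
    "great": "positive",
    "brilliant": "positive",
    "poor": "negative",
    "dropped": "negative",
}


def tweet_sentiment(tweet: str) -> str:
    """Return 'positive', 'negative', or 'neutral' for a tweet."""
    # Look up the sentiment attribute of every known emoji in the tweet.
    emoji_votes = [EMOJI_SENTIMENT[ch] for ch in tweet if ch in EMOJI_SENTIMENT]

    # Look up the sentiment of every known word, ignoring punctuation and case.
    words = [w.strip(".,!?").lower() for w in tweet.split()]
    word_votes = [WORD_SENTIMENT[w] for w in words if w in WORD_SENTIMENT]

    # Assumed rule: non-neutral emoji attributes take precedence over words,
    # so a mocking emoji can override an apparently positive sentence.
    decisive = [v for v in emoji_votes if v != "neutral"]
    if not decisive:
        decisive = [v for v in word_votes if v != "neutral"]

    positives = decisive.count("positive")
    negatives = decisive.count("negative")
    if positives > negatives:
        return "positive"
    if negatives > positives:
        return "negative"
    return "neutral"
```

Under these assumptions, the sarcastic tweet from the example above, “He is a great player” followed by a laughing emoji, is labeled negative, while the same sentence without the emoji is labeled positive.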
NLP++ can also generate a knowledge base that stores the results at each stage of assigning sentiments, which HPCC Systems can then use to produce a detailed analysis. This add-on capability is what makes NLP++ compelling, efficient, and interesting enough to keep us using it. In summary, we opened a path toward a natural, human-based approach to assigning sentiments through NLP++, rather than the traditional approach, with the help of the community.
Here is a video presentation by Dheemonth of his work:
Because each sentiment analyzer is different, the emoji attributes for this particular analyzer were constructed by Dheemonth.
The repository for this sentiment analyzer can be found here: