How we trained a sentiment analysis model

Robots read emotions in text: a report on a new sentiment analysis model from Hello Ebbot

Recently, there has been an increasing need among businesses for automated sentiment analysis, especially in e-commerce. There are many pre-trained models available that can be applied quickly, but they are often not entirely free and do not yet support the Swedish language well. Google's Natural Language API, for example, has a sentiment module for Swedish, but it does not work well and failed the tests by our critical "judges" here at Hello Ebbot.

At Hello Ebbot, we want sentiment analysis functionality for our chatbots, so we decided to train a model ourselves. It certainly takes more time to collect the data, train the model, and bring it into production, but that is not a problem: we want our products to offer not just conversational virtual assistants, but assistants that can think a little. If you are curious how we built a model with an average accuracy of 75% that can spot sarcastic comments, read on for our secrets.

Sentiment Analysis and the Problems it Solves:

There is plenty of research and many articles out there explaining sentiment analysis in professional business or machine learning terms, so if you are looking for a complete guide to sentiment analysis, we recommend checking out this blog from MonkeyLearn. But here is a short and simple round-up for you: sentiment analysis is a technique for processing textual data to classify the emotion (positive, neutral, or negative) that the text reflects.

With this automated model, we want Ebbot to respond with a tonality based on the customer's emotions. For example, if a customer is angry, Ebbot would sense the negativity in the message and provide a reassuring response. In addition, Ebbot can collect feedback on customer service and product reviews for business-to-consumer (B2C) companies to analyze. The solutions we have in mind are not yet available, but we hope to get them up and running as soon as possible.
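As a rough illustration of the idea (not Ebbot's actual implementation), the tonality dispatch could look like the sketch below, where `classify_sentiment` is a placeholder standing in for the trained model and the tone names are hypothetical:

```python
def classify_sentiment(message: str) -> str:
    """Placeholder for the trained model; the real one
    returns 'positive', 'neutral' or 'negative'."""
    return "negative" if "angry" in message.lower() else "neutral"


# Hypothetical mapping from predicted sentiment to response tone.
TONE_BY_SENTIMENT = {
    "negative": "reassuring",   # de-escalate and offer help
    "neutral": "informative",   # answer plainly
    "positive": "cheerful",     # match the customer's energy
}


def pick_tone(message: str) -> str:
    """Choose the chatbot's response tone from the predicted sentiment."""
    return TONE_BY_SENTIMENT[classify_sentiment(message)]
```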

The development process:

In this section we show the technical background of the model. A few machine learning terms lie ahead, so if you just want to see how the model behaves, you can jump straight to that section instead.

We started by collecting data from Trustpilot using the BeautifulSoup web scraping library. Our dataset contained reviews of several well-known companies in Sweden, such as Telia and Qliro. Since sentiment classification is basically a supervised learning task (Datacamp 2020), we labelled the reviews based on the number of stars each one received: five stars were labelled "positive", three as "neutral", and one star as "negative".
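The collection-and-labelling step can be sketched roughly as follows. The page markup (the CSS classes and the data-rating attribute) is hypothetical, since Trustpilot's real markup differs and changes over time; only the star-to-label mapping comes from the text above.

```python
from urllib.request import urlopen

from bs4 import BeautifulSoup


def star_to_label(stars: int) -> str:
    """Map a star rating to a sentiment label, as described in the post."""
    if stars >= 5:
        return "positive"
    if stars == 3:
        return "neutral"
    return "negative"


def parse_reviews(html: str) -> list[tuple[str, str]]:
    """Extract (review text, sentiment label) pairs from one review page."""
    soup = BeautifulSoup(html, "html.parser")
    samples = []
    # Hypothetical markup: one <div class="review"> per review, with the
    # text in <p class="review-text"> and the stars in a data-rating attribute.
    for review in soup.select("div.review"):
        text = review.select_one("p.review-text").get_text(strip=True)
        stars = int(review["data-rating"])
        samples.append((text, star_to_label(stars)))
    return samples


def scrape_reviews(url: str) -> list[tuple[str, str]]:
    """Download one review page and turn it into labelled training samples."""
    return parse_reviews(urlopen(url).read().decode("utf-8"))
```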

The next step was to decide which classifier to use for this task. First we implemented a Multinomial Naive Bayes classifier with NLTK and achieved amazing results for positive/negative analysis: our model reached 92% accuracy and was able to spot some false positives.
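A minimal sketch of that first binary classifier, here with scikit-learn's MultinomialNB instead of the NLTK wrapper, and a tiny illustrative English dataset (the actual model was trained on Swedish reviews):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy stand-in for the labelled review data.
texts = [
    "great service, fast delivery",
    "really happy with my purchase",
    "terrible support, never again",
    "broken on arrival, very disappointed",
]
labels = ["positive", "positive", "negative", "negative"]

# Bag-of-words features feeding a Multinomial Naive Bayes classifier.
nb = make_pipeline(CountVectorizer(), MultinomialNB())
nb.fit(texts, labels)

print(nb.predict(["fast delivery and great support"]))  # -> ['positive']
```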

As it turned out, the Naive Bayes (NB) model did not work well once neutral sentiment came into play. In fact, DatumBox did thorough research on several classifiers and concluded that NB gives high accuracy for binary prediction (positive and negative) but much lower accuracy for three-class output (positive, negative, and neutral).

That was when scikit-learn's Support Vector Machine classifier (SVC) came to save the day! Inspired by another machine learning enthusiast's project with IMDB reviews and a LinearSVC pipeline, we also used a K-Fold cross-validator to split the data into train/test sets. We divided our dataset into 10 consecutive folds instead of the standard five. Across the ten runs, the highest accuracy was 95%, and on average our model gave correct predictions on the test sets 75% of the time.
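That setup can be sketched like this. The toy English samples are only there to make the snippet runnable, and the TF-IDF vectorizer is an assumption (the post does not say which text features were used):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy stand-in for the labelled Trustpilot reviews.
texts = [
    "great product", "excellent service", "love it", "fantastic experience",
    "it was okay", "nothing special", "neither good nor bad", "delivery on time",
    "terrible service", "worst purchase ever", "very disappointed", "broken and useless",
]
labels = ["positive"] * 4 + ["neutral"] * 4 + ["negative"] * 4

# A LinearSVC pipeline, here over TF-IDF features (an assumption).
model = make_pipeline(TfidfVectorizer(), LinearSVC())

# 10 consecutive folds instead of the standard 5.
scores = cross_val_score(model, texts, labels, cv=KFold(n_splits=10))
print(f"best fold: {scores.max():.0%}, average: {scores.mean():.0%}")
```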

Here are a few ambiguous and/or sarcastic comments that our model was surprisingly able to analyze correctly:

"Thank you for being the worst in the world, you make it easy to switch to a competitor" - negative

"Thanks for nothing" - negative

"I like your product, but you are fantastically bad at customer service" - negative

"My social life is now ruined because the computer I got from you was beyond my expectations!" - positive

This is what the "judges" from Ebbot say:

After a month and a half, the model was finally ready for testing! We tried whatever comments popped into our heads, and here is some feedback from Team Ebbot:

"I thought I was clever and challenged the algorithm with very tricky and ambiguous sentences, but after a few tries I was amazed at how difficult it was to fool it." - Oliver

"I've tried a lot of sentiment models, but this is the first one that made me think, 'Wow, this really works!'. It's just super cool and super useful!" - Different

"The accuracy has been around 75% so far. Impressive stuff!" - Gustav

Even though the results were better than we expected, we would like to point out some of the model's limitations. First, the average accuracy is only 75%, so comments that send mixed signals or are too sophisticated will confuse the model. It is also worth mentioning that the model only works in Swedish, and therefore only applies to chatbots that "speak" Swedish.

All in all, this is an exciting step in the right direction. The implications of what this can do for customer service are very promising. And we look forward to developing practical applications for the model.