Today, sentiment analysis is a topic of great interest and development since it has many practical applications.
As the volume of publicly and privately available information on the Internet keeps growing, a large number of texts expressing opinions are available in forums, blogs, review sites and social media channels.
The growing need to extract subjective information from written texts leads brands to analyse sentiment and viewpoints in order to learn about the feelings, attitudes, emotions and opinions of existing and potential buyers towards their products or services.
Sentiment analysis is the process of determining whether a piece of content is positive, negative or neutral in tonality.
It is based on algorithms that evaluate whether the words in a post are associated with positive, negative or neutral emotions, and it combines natural language processing (NLP) and machine learning techniques.
For specific use cases that require 100% accuracy, Sensika (like other media monitoring solutions) also allows manual sentiment evaluation.
Usually, besides identifying the opinion, these systems extract attributes of the expression for example:
There are many ways that people analyse bodies of text for sentiment or opinions, but it usually boils down to two methods.
Using Natural Language Processing (NLP), and the attempt to truly “understand” the text.
This model attempts to have the machine actually understand the structure of the sentences and the context. It is therefore more focused on the order in which words follow one another. Usually, this approach requires the machine to have an understanding of grammar principles. To achieve this, NLP techniques are used to tag parts of speech, named entities and more, in order to actually understand the “language” of the text, and not just look for target words.
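To make the point about word order concrete, here is a minimal sketch in Python. It is not a real NLP pipeline (there is no part-of-speech tagging or entity recognition here). It only applies one hand-written rule, and the word lists and the function name `order_aware_score` are assumptions made up for this illustration.

```python
# Hypothetical word lists, invented for this illustration only.
POSITIVE = {"good", "great", "happy"}
NEGATIVE = {"bad", "poor", "sad"}
NEGATORS = {"not", "never", "no"}


def order_aware_score(text: str) -> int:
    """Sum word polarities, flipping a word's polarity when a
    negator appears immediately before it."""
    tokens = text.lower().replace(".", " ").replace(",", " ").split()
    score = 0
    for i, token in enumerate(tokens):
        # +1 for a positive word, -1 for a negative word, 0 otherwise.
        polarity = (token in POSITIVE) - (token in NEGATIVE)
        if polarity and i > 0 and tokens[i - 1] in NEGATORS:
            polarity = -polarity
        score += polarity
    return score


print(order_aware_score("The service was good"))      # prints 1
print(order_aware_score("The service was not good"))  # prints -1
```

A system that only looked for the target word “good” would score both sentences the same way. Taking the preceding word into account is the simplest possible step towards using sentence structure.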
This method works quite well for Western languages but, unfortunately, it is not very useful for languages such as Arabic, Farsi, Urdu and Hindi. The challenge comes from the specific characters of their scripts, their word formation and their different dialects, which largely don’t follow any written rules.
Although still challenging, the next method has proven to be more successful in the sentiment evaluation of texts written in Arabic.
“Bag of Words” model.
This model focuses entirely on the words, or sometimes a string of words, but usually pays no attention to the “context”, so to speak. The “Bag of Words” model usually has a large list, probably better thought of as a sort of “dictionary”, which is considered to be
Most of the current thinking in sentiment analysis happens in a categorical framework: sentiment is analysed as belonging to a certain bucket, to a certain degree. For example, a given sentence may be 23% sad, 89% excited, 45% happy and 55% anxious. These numbers don’t add up to 100.
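One way such percentages can arise is when every category is scored independently of the others. The sketch below assumes that setup: each category has its own raw score, and a sigmoid squashes it into a 0 to 100 range on its own. The raw values and the names `category_scores` and `raw_scores` are invented for illustration, chosen to land close to the figures in the example above.

```python
import math


def sigmoid(x: float) -> float:
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def category_scores(raw: dict[str, float]) -> dict[str, float]:
    """Convert each raw score to a percentage on its own,
    without normalising across categories."""
    return {label: round(100 * sigmoid(value), 1) for label, value in raw.items()}


# Hypothetical raw scores for one sentence (assumed values).
raw_scores = {"sad": -1.2, "excited": 2.1, "happy": -0.2, "anxious": 0.2}

scores = category_scores(raw_scores)
total = sum(scores.values())

for label, pct in scores.items():
    print(f"{label}: {pct}%")
print(f"total: {total:.1f}%")  # well above 100, since nothing forces a sum of 100
```

Because no category competes with another, a sentence can be strongly “excited” and fairly “anxious” at the same time.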
“My plane has been delayed. Brilliant!”
Most of us would be able to quickly interpret that the person was being sarcastic and to categorise the sentence as negative. But without contextual understanding, the machine will detect the word “Brilliant” and will categorise the sentence as positive.
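The failure described above is easy to reproduce with a toy “Bag of Words” scorer. The following is a minimal sketch, not how any particular product works. The two word lists and the function name `bag_of_words_sentiment` are assumptions made up for this example, and the lexicon deliberately contains no entry for “delayed”.

```python
import re
from collections import Counter

# Hypothetical mini "dictionary" of sentiment-bearing words.
POSITIVE_WORDS = {"brilliant", "great", "good", "excellent"}
NEGATIVE_WORDS = {"awful", "terrible", "bad", "horrible"}


def bag_of_words_sentiment(text: str) -> str:
    """Count positive and negative words, ignoring order and context."""
    counts = Counter(re.findall(r"[a-z']+", text.lower()))
    score = sum(counts[w] for w in POSITIVE_WORDS) - sum(
        counts[w] for w in NEGATIVE_WORDS
    )
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(bag_of_words_sentiment("My plane has been delayed. Brilliant!"))
# prints "positive", although a human reader hears sarcasm
```

The scorer sees one positive word and nothing else it recognises, so it has no way to notice that the first sentence changes the meaning of the second.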
Most well-trained machine learning systems can reach about 80% accuracy on average. But sentiment analysis is not a perfect science. It can’t fully capture the complexities of human language, and sentiment definitely can’t always be categorised simply as positive, negative or neutral.
Usually, we humans are good at judging sentiment in a given context. Detecting attitudes like irony, sarcasm, scepticism or anxiety, which can depend on something as simple as the tone of one’s voice, is a complex task even for some humans; for machines, it is impossible.
Did you know that human sentiment accuracy is not 100% either? Sentiment basically refers to the contextual polarity of a text or a document, meaning, the emotional effect the text or the document has on the reader. It also indicates the attitude of the author on the subject.
Feelings are inherently subjective: you and I may interpret the attitude of the same text differently, depending on our personal morals, values and beliefs.
Have a look at the example below:
If your analysis is meant to be in favor of the Democratic Party, what would the sentiment be? The sentiment of this article would be different for the Republicans, right?
Probably the biggest problem in this analysis is subjectivity. Someday, well-trained machine learning systems may reach 99% accuracy in dissecting text. But even then, they will never be able to predict the reader’s personal biases based on gender, race and political views.
Sentiment analysis is very useful in social media marketing and PR because it helps in forecasting trends and gathering public opinion about products, services or situations.