VADER Sentiment Analysis Explained

VADER (Valence Aware Dictionary and sEntiment Reasoner) is a model for text sentiment analysis that is sensitive to both the polarity (positive/negative) and the intensity (strength) of emotion. Introduced in 2014, VADER takes a human-centric approach, combining qualitative analysis with empirical validation by human raters and the wisdom of the crowd.

In this post, I’ll discuss how VADER sentiment analysis calculates the sentiment score of an input text. It combines a dictionary, which maps lexical features to emotion intensity, and five simple heuristics, which encode how contextual elements increment, decrement, or negate the sentiment of text.

Before doing that, let’s go one level above and talk about sentiment analysis in general.

(The paper which introduces VADER can be accessed here.)

What is VADER Sentiment Analysis?

Consider the following sentences:

“The party is wonderful.”

and

“I hate that man.”

Do you get a sense of the feelings that these sentences imply? The first one clearly conveys positive emotion, whereas the second conveys negative emotion. Humans associate words, phrases, and sentences with emotion. The field of sentiment analysis attempts to use computational algorithms to decode and quantify the emotion contained in media such as text, audio, and video. Text Sentiment Analysis, as the name suggests, focuses on text.

Text Sentiment Analysis is a really big field with a lot of academic literature behind it. However, its tools really just boil down to two approaches: the lexical approach and the machine learning approach.

Lexical approaches aim to map words to sentiment by building a lexicon or a ‘dictionary of sentiment.’ We can use this dictionary to assess the sentiment of phrases and sentences without needing to look at anything else. Sentiment can be categorical – such as {negative, neutral, positive} – or it can be numerical – like a range of intensities or scores. Lexical approaches look at the sentiment category or score of each word in the sentence and decide what the sentiment category or score of the whole sentence is. The power of lexical approaches lies in the fact that we do not need to train a model using labeled data, since everything we need to assess the sentiment of sentences is already in the dictionary. VADER is an example of a lexical method.

Machine learning approaches, on the other hand, look at previously labeled data in order to determine the sentiment of never-before-seen sentences. The machine learning approach involves training a model using previously seen text to predict/classify the sentiment of some new input text. The nice thing about machine learning approaches is that, with a greater volume of data, we generally get better prediction or classification results. However, unlike lexical approaches, we need previously labeled data in order to actually use machine learning models.

Quantifying the Emotion of a Word (or Emoticon)

Primarily, VADER sentiment analysis relies on a dictionary which maps lexical features to emotion intensities called sentiment scores. The sentiment score of a text can be obtained by summing up the intensity of each word in the text.

What is a lexical feature? How do we even measure emotional intensity?

By lexical feature, I mean anything that we use for textual communication. Think of a tweet as an example. In a typical tweet, we can find not only words, but also emoticons like “:-)”, acronyms like “LOL”, and slang like “meh”. The cool thing about VADER sentiment analysis is that these colloquialisms get mapped to intensity values as well.

Emotion intensity or sentiment score is measured on a scale from -4 to +4, where -4 is the most negative and +4 is the most positive. The midpoint 0 represents a neutral sentiment. Sample entries in the dictionary are “horrible” and “okay,” which get mapped to -2.5 and 0.9, respectively. In addition, the emoticons “/-:” and “0:-3” get mapped to -1.3 and 1.5.
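To make the dictionary idea concrete, here is a minimal sketch in Python. It uses only the four sample entries quoted above as a toy lexicon and splits text on whitespace; the names TOY_LEXICON and raw_sentiment are my own for illustration, and the real VADER lexicon and tokenizer are considerably richer.

```python
# Toy lexicon built from the four sample entries quoted above.
# The real VADER lexicon contains thousands of entries.
TOY_LEXICON = {
    "horrible": -2.5,
    "okay": 0.9,
    "/-:": -1.3,
    "0:-3": 1.5,
}


def raw_sentiment(text, lexicon=TOY_LEXICON):
    """Sum the valence of every whitespace-separated token found in the lexicon."""
    tokens = text.lower().split()
    # Tokens that are not in the lexicon contribute nothing to the total.
    return sum(lexicon.get(token, 0.0) for token in tokens)


print(raw_sentiment("the service was okay"))
print(raw_sentiment("the food was horrible /-:"))
```

Words that are not in the dictionary simply contribute zero, which is why only sentiment-bearing features move the score.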

The next question is, how do we construct this dictionary?

By using human raters from Amazon Mechanical Turk!

Ok. You might be thinking that emotional intensity can be very arbitrary since it depends on who you ask. Some words might not seem very negative to you, but they might be to me. To counter this, the creators of VADER sentiment analysis enlisted not just one, but a number of human raters and averaged their ratings for each word. This relies on the concept of the wisdom of the crowd: collective opinion is oftentimes more trustworthy than individual opinion. Think of the game show “Who Wants to Be a Millionaire?” One of the lifelines that contestants can use is Ask the Audience, which also relies on the wisdom of the crowd.
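The averaging step can be sketched in a few lines of Python. The ratings below are invented purely for illustration (they are not taken from the VADER data), and the variable names are hypothetical.

```python
from statistics import mean

# Hypothetical ratings from ten raters on the -4..+4 scale.
# These numbers are made up for illustration; they are not VADER's data.
hypothetical_ratings = {
    "wonderful": [3, 3, 2, 3, 4, 3, 2, 3, 3, 3],
    "meh": [-1, 0, -1, 0, -1, 0, 0, -1, -1, 0],
}

# The lexicon entry for each word is the mean of its ratings.
averaged = {word: mean(scores) for word, scores in hypothetical_ratings.items()}

for word, score in averaged.items():
    print(word, score)
```

Individual raters disagree, but the mean smooths out the disagreement into a single intensity per word.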

Quantifying the Emotion of a Sentence

VADER sentiment analysis (well, in the Python implementation anyway) returns a sentiment score in the range -1 to 1, from most negative to most positive.

The sentiment score of a sentence is calculated by summing up the sentiment scores of each VADER-dictionary-listed word in the sentence. Careful readers may notice an apparent contradiction: individual words have a sentiment score between -4 and 4, but the returned sentiment score of a sentence is between -1 and 1.

They’re both true. The sentiment score of a sentence is the sum of the sentiment scores of its sentiment-bearing words. However, we apply a normalization to the total to map it to a value between -1 and 1.

The normalization used by Hutto is

$$\dfrac{x}{\sqrt{x^2 + \alpha}}$$

where $x$ is the sum of the sentiment scores of the constituent words of the sentence and $\alpha$ is a normalization parameter that we set to 15. The normalization is graphed below.

[Figure: VADER sentiment analysis normalization]

We see here that as x grows larger in magnitude, the normalized score gets closer and closer to -1 or 1. The same thing happens if there are a lot of sentiment-bearing words in the document you’re applying VADER sentiment analysis to: the sum grows large, and you get a score close to -1 or 1. Thus, VADER sentiment analysis works best on short documents, like tweets and sentences, not on large documents.
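The normalization is easy to try out yourself. Here is a small Python sketch of the formula with the parameter set to 15 as described above; the function name normalize is my own choice for illustration.

```python
import math


def normalize(score, alpha=15):
    """Map a raw summed score to the open interval (-1, 1)."""
    return score / math.sqrt(score * score + alpha)


# A raw sum of 1 maps to 1 / sqrt(16) = 0.25.
print(normalize(1))

# Larger sums saturate toward 1 (or -1 for negative sums).
for raw in (4, 10, 50, -50):
    print(raw, round(normalize(raw), 4))
```

Running it shows the saturation: once the raw sum is large, adding more sentiment-bearing words barely changes the normalized score.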

Five Simple Heuristics

Lexical features aren’t the only things in the sentence which affect the sentiment. There are other contextual elements, like punctuation, capitalization, and modifiers, which also impart emotion. VADER sentiment analysis takes these into account by considering five simple heuristics. The effect of these heuristics is, again, quantified using human raters.

The first heuristic is punctuation. Compare “I like it.” and “I like it!!!” It’s not really hard to argue that the second sentence has more intense emotion than the first, and therefore must have a higher VADER sentiment score.

VADER sentiment analysis takes this into account by amplifying the sentiment score of the sentence in proportion to the number of exclamation points and question marks ending the sentence. VADER first computes the sentiment score of the sentence. If the score is positive, VADER adds a certain empirically-obtained quantity for every exclamation point (0.292) and question mark (0.18). If the score is negative, VADER subtracts the same quantities instead, pushing the score further from zero.
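Here is a rough Python sketch of the punctuation heuristic as I described it. It follows the description literally and ignores any caps or special cases the open-source implementation may apply; the function names and the raw score of 1.5 are made up for illustration.

```python
EXCLAMATION_BOOST = 0.292  # per "!" (value quoted above)
QUESTION_BOOST = 0.18      # per "?" (value quoted above)


def trailing_marks(text):
    """Count the "!" and "?" characters in the run that ends the text."""
    stripped = text.rstrip()
    tail = stripped[len(stripped.rstrip("!?")):]
    return tail.count("!"), tail.count("?")


def apply_punctuation(raw_score, text):
    """Push a non-zero raw score further from zero based on trailing punctuation."""
    exclamations, questions = trailing_marks(text)
    emphasis = exclamations * EXCLAMATION_BOOST + questions * QUESTION_BOOST
    if raw_score > 0:
        return raw_score + emphasis
    if raw_score < 0:
        return raw_score - emphasis
    return raw_score


# 1.5 is a made-up raw score for "I like it", used only for illustration.
print(apply_punctuation(1.5, "I like it."))
print(apply_punctuation(1.5, "I like it!!!"))
```

The second call returns a larger score than the first, matching the intuition that “I like it!!!” is more intense.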

The second heuristic is capitalization. “AMAZING performance.” is definitely more intense than “amazing performance.” And so VADER takes this into account by incrementing or decrementing the sentiment score of the word by 0.733, depending on whether the word is positive or negative, respectively.
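A minimal Python sketch of the capitalization heuristic might look like this. The function name apply_caps and the example valences are assumptions of mine; a real implementation would look the valence up in the lexicon and may apply extra conditions that this sketch skips.

```python
CAPS_INCREMENT = 0.733  # value quoted above


def apply_caps(token, valence):
    """Intensify the valence of an ALL-CAPS sentiment word."""
    if not token.isupper() or valence == 0:
        return valence
    # Positive words get more positive, negative words more negative.
    if valence > 0:
        return valence + CAPS_INCREMENT
    return valence - CAPS_INCREMENT


# 2.8 is a hypothetical valence for "amazing", used only for illustration.
print(apply_caps("amazing", 2.8))
print(apply_caps("AMAZING", 2.8))
```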

The third heuristic is the use of degree modifiers. Take for example “effing cute” and “sort of cute”. The effect of the modifier in the first sentence is to increase the intensity of cute, while in the second sentence, it is to decrease the intensity. VADER maintains a booster dictionary which contains a set of boosters and dampeners.

The effect of the degree modifier also depends on its distance to the word it’s modifying. Farther words have a relatively smaller intensifying effect on the base word. A modifier right beside the base word adds or subtracts 0.293 to the sentiment score of the sentence, depending on whether the base word is positive or not. A modifier two words away from the base word adds or subtracts 95% of 0.293, and one three words away adds or subtracts 90%.
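The distance rule is easier to see in code. Below is a hedged Python sketch: the tiny modifier table, the function name, and the example valences are all hypothetical, and only the 0.293, 95%, and 90% figures come from the description above.

```python
BOOST = 0.293  # value quoted above

# Hypothetical toy modifier table: boosters are positive, dampeners negative.
TOY_MODIFIERS = {
    "very": BOOST,
    "extremely": BOOST,
    "slightly": -BOOST,
    "barely": -BOOST,
}

# Weight by distance from the base word: adjacent, two away, three away.
DISTANCE_WEIGHTS = (1.0, 0.95, 0.9)


def adjust_for_modifiers(preceding_tokens, base_valence):
    """Adjust a word's valence using up to three tokens that come before it.

    preceding_tokens is in reading order, so its last item is the token
    right beside the base word.
    """
    adjusted = base_valence
    nearest_first = list(reversed(preceding_tokens[-3:]))
    for distance, token in enumerate(nearest_first):
        scalar = TOY_MODIFIERS.get(token.lower(), 0.0) * DISTANCE_WEIGHTS[distance]
        # For a negative base word, a booster pushes the score further down.
        if base_valence < 0:
            scalar = -scalar
        adjusted += scalar
    return adjusted


# 2.0 is a hypothetical valence for "cute", used only for illustration.
print(adjust_for_modifiers(["very"], 2.0))
print(adjust_for_modifiers(["slightly"], 2.0))
```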

The fourth heuristic is the shift in polarity due to “but”. Oftentimes, “but” connects two clauses with contrasting sentiments. The dominant sentiment, however, is the latter one. For example, “I love you, but I don’t want to be with you anymore.” The first clause “I love you” is positive, but the second one “I don’t want to be with you anymore.” is negative and obviously more dominant sentiment-wise.

VADER implements a “but” checker. Basically, all sentiment-bearing words before the “but” have their valence reduced to 50% of their values, while those after the “but” increase to 150% of their values.
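As a rough illustration, the “but” checker can be sketched in Python as follows. The function name and the example valences are hypothetical; the 50% and 150% factors are the ones described above.

```python
def apply_but(tokens, valences):
    """Halve valences before the first "but" and scale those after it by 1.5."""
    lowered = [token.lower() for token in tokens]
    if "but" not in lowered:
        return list(valences)
    but_index = lowered.index("but")
    adjusted = []
    for position, valence in enumerate(valences):
        if position < but_index:
            adjusted.append(valence * 0.5)
        elif position > but_index:
            adjusted.append(valence * 1.5)
        else:
            adjusted.append(valence)
    return adjusted


tokens = ["great", "food", "but", "horrible", "service"]
# Hypothetical per-token valences; 0.0 marks tokens with no sentiment.
valences = [3.1, 0.0, 0.0, -2.5, 0.0]
print(apply_but(tokens, valences))
```

After the adjustment, the negative clause outweighs the positive one, so the sentence as a whole leans negative.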

The fifth heuristic is examining the tri-gram before a sentiment-laden lexical feature to catch polarity negation. Here, a tri-gram refers to a set of three lexical features. VADER maintains a list of negator words. Negation is captured by multiplying the sentiment score of the sentiment-laden lexical feature by an empirically-determined value -0.74.
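A simplified Python sketch of the negation check is shown below. The small negator set, the function name, and the example valence are assumptions for illustration; VADER’s own negator list is longer and its implementation handles more cases.

```python
NEGATION_SCALAR = -0.74  # value quoted above

# Hypothetical toy list of negators; VADER maintains a longer one.
TOY_NEGATORS = {"not", "never", "no", "isn't", "don't"}


def apply_negation(tokens, index, valence):
    """Flip and dampen a word's valence if a negator is among the three tokens before it."""
    window = tokens[max(0, index - 3):index]
    if any(token.lower() in TOY_NEGATORS for token in window):
        return valence * NEGATION_SCALAR
    return valence


tokens = "the movie is not good".split()
# 1.9 is a hypothetical valence for "good", used only for illustration.
print(apply_negation(tokens, 4, 1.9))
```

Note that the negated score is not a mirror image of the original: “not good” comes out less intense than “good”.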

VADER Sentiment Analysis Wrap Up

VADER sentiment analysis combines a dictionary that maps lexical features to sentiment scores with a set of five heuristics. The model works best when applied to social media text, but it has also proven itself to be a great tool when analyzing the sentiment of movie reviews and opinion articles.

The great thing about VADER sentiment analysis is that an open-source implementation in Python is available here. Sentiment analysis becomes a joy using the code. To see an application of VADER sentiment analysis, check out my post on Black Mirror, wherein I rank the show’s episodes according to how negative they are.