Sentiment Analysis in R
Sentiment analysis assigns emotional scores to text, revealing whether opinions are positive, negative, or neutral. This tutorial covers lexicon-based sentiment analysis using the tidytext package, the most common approach for getting started with text emotion classification in R.
Prerequisites
You should be familiar with tidytext basics—tokenization, stop word removal, and word frequency analysis. If you need a refresher, work through the Tidytext Basics tutorial first.
Sentiment lexicons
The tidytext package provides several sentiment lexicons. Each word receives a score indicating its emotional tone.
Getting started with lexicons
library(tidytext)
library(tidyverse)
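# With no argument, get_sentiments() returns the bing lexicon;
# the available options are "bing", "afinn", "loughran", and "nrc"
get_sentiments()
# Load the AFINN lexicon (scores from -5 to +5)
# afinn and nrc are downloaded through the textdata package on first use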
afinn <- get_sentiments("afinn")
# Load the bing lexicon (binary positive/negative)
bing <- get_sentiments("bing")
# Load the nrc lexicon (emotions)
nrc <- get_sentiments("nrc")
Understanding lexicon structure
Each lexicon structures sentiment differently:
# AFINN: numeric scores
afinn %>%
head(10)
# # A tibble: 10 × 2
# word value
# 1 abandon -2
# 2 abandoned -2
# 3 abandons -2
# Bing: binary classification
bing %>%
head(10)
# # A tibble: 10 × 2
# word sentiment
# 1 2-faced negative
# 2 2-faces negative
# NRC: multiple emotions
nrc %>%
distinct(sentiment)
# # A tibble: 10 × 1
# sentiment
# 1 anger
# 2 anticipation
# 3 disgust
# 4 fear
# 5 joy
# 6 sadness
# 7 surprise
# 8 trust
# 9 positive
# 10 negative
Basic sentiment analysis workflow
The core workflow joins sentiment scores to your tokenized text:
# Start with tokenized text
text_data <- tibble(
id = 1:3,
text = c(
"I love this product, it is amazing and wonderful!",
"This is terrible, I hate it and would not recommend.",
"The product works as expected, nothing special."
)
)
# Tokenize
tidy_text <- text_data %>%
unnest_tokens(word, text)
# Join with sentiment lexicon
tidy_text %>%
inner_join(bing, by = "word")
This gives you sentiment labels for each word.
Sentiment scoring methods
Method 1: AFINN scores
AFINN provides numeric scores—the most intuitive for overall sentiment:
# Calculate sentiment score per document
sentiment_scores <- tidy_text %>%
inner_join(afinn, by = "word") %>%
group_by(id) %>%
summarise(
sentiment_score = sum(value),
word_count = n()
)
print(sentiment_scores)
Method 2: bing binary classification
Use bing for simple positive/negative counts:
sentiment_counts <- tidy_text %>%
inner_join(bing, by = "word") %>%
count(id, sentiment) %>%
pivot_wider(names_from = sentiment, values_from = n, values_fill = 0) %>%
mutate(
net_sentiment = positive - negative,
total_sentiment_words = positive + negative
)
print(sentiment_counts)
Method 3: NRC emotion categories
NRC lets you analyze specific emotions:
emotion_counts <- tidy_text %>%
inner_join(nrc, by = "word") %>%
count(id, sentiment) %>%
filter(!sentiment %in% c("positive", "negative"))
print(emotion_counts)
Practical example: Pride and Prejudice
Let us apply these concepts to a real dataset, a Jane Austen novel from the janeaustenr package:
library(janeaustenr)
# Get text from Pride and Prejudice
text <- austen_books() %>%
filter(book == "Pride & Prejudice") %>%
mutate(linenumber = row_number()) %>%
unnest_tokens(word, text) %>%
filter(!word %in% stop_words$word)
# Calculate sentiment over the book
sentiment_by_section <- text %>%
inner_join(afinn, by = "word") %>%
mutate(
section = floor(linenumber / 80)
) %>%
group_by(section) %>%
summarise(
score = sum(value),
words = n()
)
print(sentiment_by_section)
Visualizing sentiment trends
Visualizations reveal emotional arcs in text:
library(ggplot2)
# Plot sentiment over the narrative
ggplot(sentiment_by_section, aes(section, score, fill = score > 0)) +
geom_col(show.legend = FALSE) +
scale_fill_manual(values = c("TRUE" = "steelblue", "FALSE" = "coral")) +
labs(
x = "Section of Book",
y = "Sentiment Score",
title = "Emotional Arc in Pride and Prejudice"
) +
theme_minimal()
Comparing sentiments between texts
Compare sentiment across different sources:
# Compare two books
books <- austen_books() %>%
filter(book %in% c("Pride & Prejudice", "Emma"))
book_sentiment <- books %>%
unnest_tokens(word, text) %>%
anti_join(stop_words, by = "word") %>%
inner_join(afinn, by = "word") %>%
group_by(book) %>%
summarise(
avg_sentiment = mean(value),
total_score = sum(value),
sentiment_words = n()
)
print(book_sentiment)
Handling negation
Words like “not” or “never” flip sentiment polarity:
# Custom negation words
negation_words <- c("not", "no", "never", "neither", "nobody", "nothing")
# Split the text into bigrams so each word is paired with the word before it
negation_bigrams <- tibble(
id = 1,
text = "This is not good, I am not happy with this."
) %>%
unnest_tokens(bigram, text, token = "ngrams", n = 2) %>%
separate(bigram, c("word1", "word2"), sep = " ")
# Keep sentiment words preceded by a negation word and store the result
negated <- negation_bigrams %>%
filter(word1 %in% negation_words) %>%
inner_join(afinn, by = c("word2" = "word")) %>%
mutate(value = -value) # Flip the sentiment
print(negated)
This technique reverses sentiment scores for negated words.
What you have learned
| Method | Best For |
|---|---|
| AFINN | Overall sentiment score (numeric) |
| Bing | Simple positive/negative classification |
| NRC | Detailed emotion analysis |
Lexicon-based sentiment
Lexicon-based sentiment assigns sentiment scores using a pre-built dictionary of words with known polarity. tidytext::get_sentiments("bing") returns a data frame of words labeled “positive” or “negative”. get_sentiments("afinn") returns scores from -5 to +5. get_sentiments("nrc") includes emotion categories like “joy”, “anger”, “fear”. Apply lexicons by joining tokenized text against the lexicon on the word column.
Limitations: lexicons built on general text may not capture domain-specific usage. Words like “breaking” are negative in general language but neutral in news headlines. Negation (“not good”) is not handled without additional processing.
Sentiment over time
For time-stamped text data, sentiment can be tracked over a time axis. After joining with a sentiment lexicon and aggregating sentiment scores per time unit (day, month, publication), plot the trend with ggplot2. geom_smooth() adds a trend line. Peaks and troughs in the sentiment line often correspond to real events, which makes this useful for social media monitoring and financial news analysis.
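As a minimal sketch of that aggregation, the example below scores a handful of invented posts per day. The `posts` data, its column names, and the +1/-1 scoring of bing labels are assumptions made for illustration; `method = "lm"` is chosen only because the toy series is too short for a flexible smoother.

```r
library(dplyr)
library(tidytext)
library(ggplot2)

# Hypothetical time-stamped posts (invented for illustration)
posts <- tibble(
  date = as.Date("2024-01-01") + c(0, 0, 1, 1, 2, 2, 3, 3),
  text = c(
    "great launch, love the new design",
    "happy with the update",
    "slow and buggy release",
    "terrible crash after the update",
    "support was helpful and fast",
    "still broken, very disappointing",
    "excellent fix, works perfectly",
    "amazing improvement"
  )
)

# Net sentiment per day: +1 for each positive word, -1 for each negative word
daily_sentiment <- posts %>%
  unnest_tokens(word, text) %>%
  inner_join(get_sentiments("bing"), by = "word") %>%
  mutate(score = if_else(sentiment == "positive", 1, -1)) %>%
  group_by(date) %>%
  summarise(net_sentiment = sum(score), .groups = "drop")

# Daily line plus a straight trend line from geom_smooth()
sentiment_plot <- ggplot(daily_sentiment, aes(date, net_sentiment)) +
  geom_line() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  labs(x = "Date", y = "Net sentiment")
```

Printing `sentiment_plot` draws the chart. For a longer series, swapping the `group_by(date)` key for a week or month column changes the time unit without touching the rest of the pipeline.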
Sentence vs document level
Sentiment at the word level loses sentence context. Sentence-level analysis averages word sentiments within sentences, then aggregates sentences to documents. sentimentr::sentiment() computes sentence-level sentiment and handles negation, amplifiers (“very”), and de-amplifiers (“slightly”) automatically. It is generally more accurate than word-level averaging for texts with mixed polarity within a single document.
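The two-stage aggregation can be sketched with tidytext alone. This is an illustrative version under stated assumptions: the `reviews` data is invented, bing labels are mapped to +1/-1, and sentences with no lexicon words simply drop out of the average.

```r
library(dplyr)
library(tidytext)

# Hypothetical reviews with mixed polarity inside each document
reviews <- tibble(
  id = 1:2,
  text = c(
    "The location was wonderful. The beds were awful.",
    "Friendly staff. Great breakfast. Terrible parking."
  )
)

# Split into sentences, number them within each document, then split into words
sentence_words <- reviews %>%
  unnest_tokens(sentence, text, token = "sentences") %>%
  group_by(id) %>%
  mutate(sentence_id = row_number()) %>%
  ungroup() %>%
  unnest_tokens(word, sentence)

# Stage 1: average word polarity within each sentence
sentence_scores <- sentence_words %>%
  inner_join(get_sentiments("bing"), by = "word") %>%
  mutate(score = if_else(sentiment == "positive", 1, -1)) %>%
  group_by(id, sentence_id) %>%
  summarise(sentence_score = mean(score), .groups = "drop")

# Stage 2: average sentence scores within each document
document_scores <- sentence_scores %>%
  group_by(id) %>%
  summarise(document_score = mean(sentence_score), n_sentences = n())
```

Averaging per sentence first keeps one long, emphatic sentence from dominating the document score, which is the main reason to prefer it over a flat word-level mean.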
Lexicon-based methods
Lexicon-based sentiment analysis assigns sentiment scores to individual words and aggregates them to the document level. The accuracy depends on lexicon quality and on how well the lexicon’s domain matches your text. Lexicons built on general text perform poorly on domain-specific text, and a lexicon built for one domain can mislead in another: a financial lexicon treats “loss” as negative, but in medical text “weight loss” is often positive.
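The tidytext package exposes its lexicons through get_sentiments(). Bing (binary positive/negative, ~6,800 words) ships with the package and is fast for simple classification. AFINN (numeric -5 to +5, ~2,500 words) enables arithmetic aggregation. NRC (10 categories: eight emotions plus positive and negative, ~14,000 words) captures nuanced emotions, but its labels may not transfer well to specialized domains. AFINN and NRC are downloaded through the textdata package the first time you request them.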
Custom lexicons extend coverage. Domain-specific vocabulary (medical terms, financial jargon, product names) often requires manual curation. Start with an existing lexicon and add domain-specific words: custom_lexicon <- bind_rows(get_sentiments("bing"), tibble(word = c("backorder", "defect"), sentiment = "negative")).
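One detail worth handling when extending a lexicon is a domain word that already exists in the base lexicon, since a plain bind_rows() would then count it twice. The sketch below removes overlapping entries first. The `domain_terms` and `tickets` data are hypothetical.

```r
library(dplyr)
library(tidytext)

# Hypothetical domain terms for a retail support corpus
domain_terms <- tibble(
  word = c("backorder", "defect", "restocked"),
  sentiment = c("negative", "negative", "positive")
)

# Drop any base entry that clashes with a domain term, then append the domain terms
custom_lexicon <- get_sentiments("bing") %>%
  anti_join(domain_terms, by = "word") %>%
  bind_rows(domain_terms)

# Hypothetical support tickets scored with the extended lexicon
tickets <- tibble(
  id = 1:2,
  text = c("item on backorder again", "restocked quickly, great service")
)

ticket_sentiment <- tickets %>%
  unnest_tokens(word, text) %>%
  inner_join(custom_lexicon, by = "word") %>%
  count(id, sentiment)
```

The same anti_join() step can also remove general-language entries that are misleading in your domain before you join against the corpus.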
Valence shifters and context
Simple word-counting ignores context. “This is not good” has the word “good” but negative sentiment. Valence shifters change the effective sentiment of nearby words; they include negations (“not”, “never”, “hardly”), amplifiers (“very”, “extremely”), and de-amplifiers (“slightly”, “somewhat”).
The sentimentr package handles valence shifters by scoring sentiment in clause windows rather than on individual words. sentimentr::sentiment(text) returns a row per sentence with a polarity score that accounts for negation and amplification. For text where negation is common (reviews, clinical notes), sentimentr tends to outperform simple lexicon matching.
Sentence-level analysis: split text into sentences with tidytext::unnest_sentences(), score each sentence with sentimentr::sentiment(), then aggregate to document level. Sentence-level scores can reveal the emotional arc within a document: how sentiment changes from beginning to end.
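A short sketch of sentimentr in use, assuming the package is installed. It uses sentimentr's own get_sentences() for the splitting step rather than tidytext, and the example sentences are invented.

```r
library(sentimentr)

# Hypothetical reviews; the first relies on negation, the second on an amplifier
reviews <- c(
  "This is not good. I am not happy with this.",
  "The service was very good."
)

# One row per sentence: element_id, sentence_id, word_count, sentiment
sentence_scores <- sentiment(get_sentences(reviews))

# One row per input element, averaged over its sentences
document_scores <- sentiment_by(get_sentences(reviews))
```

Compared with the bigram approach, no explicit list of negation words is needed here, because the package applies its own valence shifter table around each polarized word.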
Machine learning approaches
Pre-trained transformer models provide state-of-the-art sentiment analysis. text::textClassify() from the text package can call HuggingFace transformer models from R. Models such as distilbert-base-uncased-finetuned-sst-2-english, which is fine-tuned on movie reviews, give accurate sentiment predictions on similar text. The tradeoff is that they are much slower than lexicon methods and require Python and the transformers library.
tidymodels with textrecipes preprocesses text for machine learning. step_tokenize(), step_stopwords(), and step_tfidf() build a TF-IDF feature matrix from text. A logistic regression or random forest trained on labeled examples learns dataset-specific patterns that often outperform generic lexicons on that domain.
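The preprocessing half of that pipeline might look like the sketch below, which stops at the feature matrix rather than fitting a model. The `labeled` data is invented, `step_tokenfilter()` with `max_tokens = 50` is an added assumption to cap vocabulary size, and `step_stopwords()` needs the stopwords package installed.

```r
library(dplyr)
library(recipes)
library(textrecipes)

# Hypothetical labeled examples from the target domain
labeled <- tibble(
  label = factor(c("pos", "neg", "pos", "neg")),
  text = c(
    "love the fast delivery",
    "the package arrived broken",
    "great quality and great price",
    "refund was slow and support was rude"
  )
)

# Tokenize, drop stop words, cap the vocabulary, then weight by TF-IDF
tfidf_recipe <- recipe(label ~ text, data = labeled) %>%
  step_tokenize(text) %>%
  step_stopwords(text) %>%
  step_tokenfilter(text, max_tokens = 50) %>%
  step_tfidf(text)

# Estimate the recipe on the training data and return the feature matrix
tfidf_features <- tfidf_recipe %>%
  prep() %>%
  bake(new_data = NULL)
```

Each retained token becomes one numeric column. In a full tidymodels setup the recipe would be combined with a model specification in a workflow and fitted on a much larger labeled set.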
For binary classification (positive/negative), train on labeled data from your domain. Active learning (label 200 examples, train a model, then use the model to prioritize uncertain examples for labeling) is an efficient strategy when labels are expensive to produce.
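The prioritization step can be illustrated with uncertainty sampling, one common way to pick examples in active learning. The probabilities below are invented stand-ins for a model's output, and the batch size of 3 is arbitrary.

```r
library(dplyr)

# Hypothetical model output: predicted probability of "positive" for unlabeled texts
unlabeled <- tibble(
  doc_id = 1:6,
  prob_positive = c(0.97, 0.52, 0.08, 0.45, 0.70, 0.31)
)

# Uncertainty is highest (1) at the 0.5 decision boundary and lowest (0) at 0 or 1
to_label_next <- unlabeled %>%
  mutate(uncertainty = 1 - 2 * abs(prob_positive - 0.5)) %>%
  slice_max(uncertainty, n = 3)

to_label_next$doc_id
# → 2 4 6
```

After the selected documents are labeled, they join the training set, the model is refit, and the scoring step repeats on the remaining unlabeled pool.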
Aspect-based sentiment
Aspect-based sentiment analysis identifies sentiment toward specific entities or features. A review might express positive sentiment toward a hotel’s location but negative sentiment toward the beds. Document-level sentiment misses this nuance.
A simple approach: extract noun phrases around sentiment words. For each sentiment word, identify the nearest noun within a window of N words. This noun is the “aspect”; the sentiment word’s score is the “aspect sentiment”. Aggregate aspect scores across documents to understand which aspects are liked or disliked.
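The window-based approach can be sketched as follows. Two simplifications are assumed: the aspect nouns come from a hand-written `aspects` vector instead of a part-of-speech tagger, and the window is 3 tokens. The `relationship` argument requires dplyr 1.1.0 or later.

```r
library(dplyr)
library(tidytext)

reviews <- tibble(id = 1, text = "Great location but the beds were awful")

# Assumed aspect vocabulary; a POS tagger would normally supply the nouns
aspects <- c("location", "beds", "staff")
window <- 3

# Tokenize and record each word's position within its document
tokens <- reviews %>%
  unnest_tokens(word, text) %>%
  group_by(id) %>%
  mutate(position = row_number()) %>%
  ungroup()

sentiment_tokens <- tokens %>%
  inner_join(get_sentiments("bing"), by = "word") %>%
  mutate(score = if_else(sentiment == "positive", 1, -1)) %>%
  select(id, sentiment_word = word, sentiment_pos = position, score)

aspect_tokens <- tokens %>%
  filter(word %in% aspects) %>%
  select(id, aspect = word, aspect_pos = position)

# Pair each sentiment word with the nearest aspect inside the window
aspect_sentiment <- sentiment_tokens %>%
  inner_join(aspect_tokens, by = "id", relationship = "many-to-many") %>%
  mutate(distance = abs(sentiment_pos - aspect_pos)) %>%
  filter(distance <= window) %>%
  group_by(id, sentiment_pos) %>%
  slice_min(distance, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  select(id, aspect, sentiment_word, score)
```

Here “great” attaches to “location” and “awful” to “beds”. Summing `score` by `aspect` across many documents gives the per-aspect totals described above.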
More sophisticated approaches use dependency parsing (spacyr wraps spaCy’s parser) to find syntactic relationships between adjectives and their noun heads, providing more accurate aspect-sentiment association than window-based methods.
Evaluation
Without labeled test data, sentiment scores cannot be validated rigorously. For any production system, sample 200-500 examples, manually assign ground-truth labels, and compute precision, recall, and F1 against your method. This reveals whether the lexicon matches your domain and guides decisions about customization.
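The three metrics are simple to compute by hand once labels exist. The sketch below uses ten invented label pairs and treats "pos" as the class of interest.

```r
# Hypothetical hand labels and method output for ten documents
truth     <- c("pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg", "pos", "neg")
predicted <- c("pos", "neg", "neg", "neg", "pos", "pos", "pos", "neg", "pos", "neg")

# Confusion counts for the positive class
tp <- sum(predicted == "pos" & truth == "pos")  # correctly flagged positive
fp <- sum(predicted == "pos" & truth == "neg")  # flagged positive, actually negative
fn <- sum(predicted == "neg" & truth == "pos")  # missed positives

precision <- tp / (tp + fp)
recall    <- tp / (tp + fn)
f1        <- 2 * precision * recall / (precision + recall)

c(precision = precision, recall = recall, f1 = f1)
# → precision 0.8, recall 0.8, f1 0.8
```

Repeating the calculation with "neg" as the class of interest shows whether the method is weaker on one polarity than the other.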
Inter-annotator agreement (Cohen’s kappa) measures whether human labelers agree on the “correct” sentiment. Low agreement indicates ambiguous text: the task may be inherently subjective in that domain, which sets an upper bound on achievable accuracy.
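Cohen’s kappa compares observed agreement with the agreement expected by chance, kappa = (p_o - p_e) / (1 - p_e). A base R sketch with two invented annotators:

```r
# Hypothetical labels from two annotators on the same ten documents
rater_a <- c("pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg", "pos", "neg")
rater_b <- c("pos", "neg", "neg", "neg", "pos", "pos", "pos", "neg", "pos", "neg")

# Cross-tabulate with a shared set of levels so the table is square
levels_used <- union(rater_a, rater_b)
confusion <- table(
  factor(rater_a, levels = levels_used),
  factor(rater_b, levels = levels_used)
)

# Observed agreement: share of documents on the diagonal
observed <- sum(diag(confusion)) / sum(confusion)
# Chance agreement: product of each rater's marginal proportions, summed over labels
expected <- sum(rowSums(confusion) * colSums(confusion)) / sum(confusion)^2

cohen_kappa <- (observed - expected) / (1 - expected)
cohen_kappa
# → 0.6
```

The raters agree on 8 of 10 documents, but half of that agreement would be expected by chance given their label frequencies, which is why kappa is 0.6 and not 0.8.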
Practical considerations
Sentiment analysis results are only as good as the training data and lexicon. Generic lexicons built from Twitter or movie reviews perform poorly on medical notes, legal documents, or technical documentation. Before deploying a sentiment system, always validate it against a labeled sample from your target domain. Even a sample of 100 manually labeled items reveals whether the lexicon’s vocabulary matches your corpus and whether the direction of sentiment words aligns with how they are used in context.
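A quick first check, before any labeling, is how much of the corpus the lexicon covers at all. The sketch below computes the share of non-stop-word tokens found in bing and lists the matches so their polarity can be inspected. The clinical-style `corpus` is invented for illustration.

```r
library(dplyr)
library(tidytext)

# Hypothetical clinical-style notes
corpus <- tibble(
  id = 1:2,
  text = c(
    "Patient tolerated the procedure well",
    "Negative biopsy, no malignancy detected"
  )
)

bing <- get_sentiments("bing")

# Content tokens only, flagged by whether the lexicon knows them
corpus_tokens <- corpus %>%
  unnest_tokens(word, text) %>%
  anti_join(stop_words, by = "word") %>%
  mutate(in_lexicon = word %in% bing$word)

# Share of content tokens the lexicon can score
coverage <- corpus_tokens %>%
  summarise(tokens = n(), matched = sum(in_lexicon), coverage = mean(in_lexicon))

# Matched words with their lexicon polarity, for a manual sanity check
matched_words <- corpus_tokens %>%
  filter(in_lexicon) %>%
  inner_join(bing, by = "word")
```

Low coverage suggests the lexicon lacks your vocabulary, while reading through `matched_words` catches words whose lexicon polarity runs against domain usage, such as a “negative” test result that is good news.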
Key takeaways
- Join tokenized text with sentiment lexicons using inner_join()
- Aggregate word-level scores to document level
- Visualize sentiment trends to find emotional arcs
- Handle negation for more accurate analysis
Next steps
Continue your text mining journey:
- Topic Modeling with LDA in R — Uncover latent topics in document collections
- Text Classification in R — Build supervised models to categorize text
See also
- dplyr::filter(): filter rows after tokenization
- dplyr::count(): essential for word frequency analysis