
Sentiment Analysis in R

Sentiment analysis assigns emotional scores to text, revealing whether opinions are positive, negative, or neutral. This tutorial covers lexicon-based sentiment analysis using the tidytext package, the most common approach for getting started with text emotion classification in R.

Prerequisites

You should be familiar with tidytext basics—tokenization, stop word removal, and word frequency analysis. If you need a refresher, work through the Tidytext Basics tutorial first.

Sentiment lexicons

The tidytext package provides access to several sentiment lexicons. Each one maps words to a score or label indicating emotional tone.

Getting started with lexicons

library(tidytext)
library(tidyverse)

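# get_sentiments() with no argument returns the default lexicon, "bing".
# Valid names are "bing", "afinn", "nrc", and "loughran".
get_sentiments()

# Load the AFINN lexicon (scores from -5 to +5)
# AFINN and NRC are downloaded on first use via the textdata package
afinn <- get_sentiments("afinn")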

# Load the bing lexicon (binary positive/negative)
bing <- get_sentiments("bing")

# Load the nrc lexicon (emotions)
nrc <- get_sentiments("nrc")

Understanding lexicon structure

Each lexicon structures sentiment differently:

# AFINN: numeric scores
afinn %>% 
  head(10)
# # A tibble: 10 × 2
#   word      value
# 1 abandon      -2
# 2 abandoned    -2
# 3 abandons     -2

# Bing: binary classification
bing %>% 
  head(10)
# # A tibble: 10 × 2
#   word sentiment
# 1 2-faced negative
# 2 2-faces negative

# NRC: eight emotions plus positive/negative (row order may differ)
nrc %>% 
  distinct(sentiment)
# # A tibble: 10 × 1
#   sentiment
# 1 anger
# 2 anticipation
# 3 disgust
# 4 fear
# 5 joy
# 6 sadness
# 7 surprise
# 8 trust
# 9 positive
# 10 negative

Basic sentiment analysis workflow

The core workflow joins sentiment scores to your tokenized text:

# Start with tokenized text
text_data <- tibble(
  id = 1:3,
  text = c(
    "I love this product, it is amazing and wonderful!",
    "This is terrible, I hate it and would not recommend.",
    "The product works as expected, nothing special."
  )
)

# Tokenize
tidy_text <- text_data %>%
  unnest_tokens(word, text)

# Join with sentiment lexicon
tidy_text %>%
  inner_join(bing, by = "word")

This gives you sentiment labels for each word.

Sentiment scoring methods

Method 1: AFINN scores

AFINN provides numeric scores—the most intuitive for overall sentiment:

# Calculate sentiment score per document
sentiment_scores <- tidy_text %>%
  inner_join(afinn, by = "word") %>%
  group_by(id) %>%
  summarise(
    sentiment_score = sum(value),
    word_count = n()
  )

print(sentiment_scores)

Method 2: bing binary classification

Use bing for simple positive/negative counts:

sentiment_counts <- tidy_text %>%
  inner_join(bing, by = "word") %>%
  count(id, sentiment) %>%
  pivot_wider(names_from = sentiment, values_from = n, values_fill = 0) %>%
  mutate(
    net_sentiment = positive - negative,
    total_sentiment_words = positive + negative
  )

print(sentiment_counts)

Method 3: NRC emotion categories

NRC lets you analyze specific emotions:

emotion_counts <- tidy_text %>%
  inner_join(nrc, by = "word") %>%
  count(id, sentiment) %>%
  filter(!sentiment %in% c("positive", "negative"))

print(emotion_counts)

Practical example: Pride and Prejudice

Let us apply these concepts to a real dataset:

library(janeaustenr)

# Get text from Pride and Prejudice
text <- austen_books() %>%
  filter(book == "Pride & Prejudice") %>%
  mutate(linenumber = row_number()) %>%
  unnest_tokens(word, text) %>%
  filter(!word %in% stop_words$word)

# Calculate sentiment over the book
sentiment_by_section <- text %>%
  inner_join(afinn, by = "word") %>%
  mutate(
    section = floor(linenumber / 80)
  ) %>%
  group_by(section) %>%
  summarise(
    score = sum(value),
    words = n()
  )

print(sentiment_by_section)

Visualizations reveal emotional arcs in text:

library(ggplot2)

# Plot sentiment over the narrative
ggplot(sentiment_by_section, aes(section, score, fill = score > 0)) +
  geom_col(show.legend = FALSE) +
  scale_fill_manual(values = c("TRUE" = "steelblue", "FALSE" = "coral")) +
  labs(
    x = "Section of Book",
    y = "Sentiment Score",
    title = "Emotional Arc in Pride and Prejudice"
  ) +
  theme_minimal()

Comparing sentiments between texts

Compare sentiment across different sources:

# Compare two books
books <- austen_books() %>%
  filter(book %in% c("Pride & Prejudice", "Emma"))

book_sentiment <- books %>%
  unnest_tokens(word, text) %>%
  anti_join(stop_words, by = "word") %>%
  inner_join(afinn, by = "word") %>%
  group_by(book) %>%
  summarise(
    avg_sentiment = mean(value),
    total_score = sum(value),
    sentiment_words = n()
  )

print(book_sentiment)

Handling negation

Words like “not” or “never” flip sentiment polarity:

# Custom negation words
negation_words <- c("not", "no", "never", "neither", "nobody", "nothing")

# Find negation + sentiment word pairs
text_data <- tibble(
  id = 1,
  text = "This is not good, I am not happy with this."
) %>%
  unnest_tokens(bigram, text, token = "ngrams", n = 2) %>%
  separate(bigram, c("word1", "word2"), sep = " ")

# Find negated sentiments
negated <- text_data %>%
  filter(word1 %in% negation_words) %>%
  inner_join(afinn, by = c("word2" = "word")) %>%
  mutate(value = -value)  # Flip the sentiment

print(negated)

This technique reverses sentiment scores for negated words.

What you have learned

Method   Best for
AFINN    Overall sentiment score (numeric)
Bing     Simple positive/negative classification
NRC      Detailed emotion analysis

Lexicon-based sentiment

Lexicon-based sentiment analysis assigns scores using a pre-built dictionary of words with known polarity. tidytext::get_sentiments("bing") returns a data frame of words labeled “positive” or “negative”. get_sentiments("afinn") returns scores from -5 to +5. get_sentiments("nrc") includes emotion categories like “joy”, “anger”, and “fear”. Apply a lexicon by joining tokenized text against it on the word column.

Limitations: lexicons built on general text may not capture domain-specific usage. Words like “breaking” are negative in general language but neutral in news headlines. Negation (“not good”) is not handled without additional processing.

Sentiment over time

For time-stamped text data, sentiment can be tracked over a time axis. After joining with a sentiment lexicon and aggregating sentiment scores per time unit (day, month, publication), plot the trend with ggplot2. geom_smooth() adds a trend line. Peaks and troughs in the sentiment line often correspond to real events, which makes this useful for social media monitoring and financial news analysis.
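As a rough sketch of this workflow, the following uses a small invented set of dated posts and a hypothetical four-word lexicon (toy_lexicon) standing in for a real numeric lexicon such as AFINN. The column names date and text, and all the data, are assumptions for illustration only.

```r
library(dplyr)
library(tidytext)
library(ggplot2)

# Invented example data: one short post per row, with a date
posts <- tibble(
  date = as.Date(c("2024-01-01", "2024-01-01", "2024-01-02", "2024-01-03")),
  text = c("great launch", "love it", "terrible outage", "awful but great support")
)

# Hypothetical stand-in for a numeric lexicon such as AFINN
toy_lexicon <- tibble(
  word  = c("great", "love", "terrible", "awful"),
  value = c(3, 3, -3, -3)
)

# Tokenize, attach scores, and sum them per day
daily_sentiment <- posts %>%
  unnest_tokens(word, text) %>%
  inner_join(toy_lexicon, by = "word") %>%
  group_by(date) %>%
  summarise(score = sum(value), .groups = "drop")

# A linear smoother is used only because the toy data has three points;
# with real data the default geom_smooth() method is usually a better start
sentiment_plot <- ggplot(daily_sentiment, aes(date, score)) +
  geom_line() +
  geom_smooth(method = "lm", se = FALSE)
```

Note that inner_join() silently drops any time unit with no lexicon matches, so a day without sentiment words is missing from the result rather than scored as zero.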

Sentence vs document level

Sentiment at the word level loses sentence context. Sentence-level analysis averages word sentiments within sentences, then aggregates sentences to documents. sentimentr::sentiment() computes sentence-level sentiment and handles negation, amplifiers (“very”), and de-amplifiers (“slightly”) automatically. It is often more accurate than word-level averaging for texts with mixed polarity within a single document.

Lexicon-based methods

Lexicon-based sentiment analysis assigns sentiment scores to individual words and aggregates them to the document level. The accuracy depends on lexicon quality and the match between the lexicon’s domain and your text. Sentiment lexicons built on general text perform poorly on domain-specific text: a financial lexicon would treat “loss” negatively, but in a medical text “weight loss” is often positive.

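The three lexicons used in this tutorial are all available through get_sentiments(); AFINN and NRC are downloaded on first use via the textdata package. Bing (binary positive/negative, ~6,800 words) is fast for simple classification. AFINN (numeric -5 to +5, ~2,500 words) enables arithmetic aggregation. NRC (eight emotions plus positive and negative, ~14,000 word-sentiment pairs) captures nuanced emotions, but its word-emotion associations may not transfer well to specialized domains.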

Custom lexicons extend coverage. Domain-specific vocabulary (medical terms, financial jargon, product names) often requires manual curation. Start with an existing lexicon and add domain-specific words: custom_lexicon <- bind_rows(get_sentiments("bing"), tibble(word = c("backorder", "defect"), sentiment = "negative")).
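One detail the one-liner glosses over is duplicates: if an added word is already in the base lexicon, bind_rows() leaves two entries and the word is counted twice after a join. The sketch below, using the same two illustrative terms, removes existing entries first so the domain label wins. The example sentence and the name domain_terms are assumptions.

```r
library(dplyr)
library(tidytext)

# Hypothetical domain terms for a retail support corpus
domain_terms <- tibble(
  word = c("backorder", "defect"),
  sentiment = "negative"
)

custom_lexicon <- get_sentiments("bing") %>%
  # Drop any existing entries for these words so each appears exactly once
  filter(!word %in% domain_terms$word) %>%
  bind_rows(domain_terms)

# Apply the extended lexicon to an invented sentence
matched <- tibble(id = 1, text = "The item is on backorder again") %>%
  unnest_tokens(word, text) %>%
  inner_join(custom_lexicon, by = "word")
```

The same pattern can override a polarity that is wrong for your domain: list the word in domain_terms with the label you want.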

Valence shifters and context

Simple word-counting ignores context. “This is not good” has the word “good” but negative sentiment. Valence shifters change the effective sentiment of nearby words: negations (“not”, “never”, “hardly”), amplifiers (“very”, “extremely”), and de-amplifiers (“slightly”, “somewhat”).

The sentimentr package handles valence shifters by scoring sentiment in clause windows rather than on individual words. sentimentr::sentiment(text) returns a row per sentence with a polarity score that accounts for negation and amplification. For text where negation is common (reviews, clinical notes), sentimentr generally outperforms simple lexicon matching.

Sentence-level analysis: split text into sentences with tidytext::unnest_sentences(), score each sentence with sentimentr::sentiment(), then aggregate to document level. Sentence-level scores can reveal the emotional arc within a document, that is, how sentiment changes from beginning to end.
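A minimal sketch of sentence-level scoring with sentimentr follows. It uses sentimentr's own get_sentences() for the splitting step rather than unnest_sentences(), since that is the input sentiment() is designed around. The review texts are invented.

```r
library(sentimentr)

# Invented reviews containing a negation, an amplifier, and a de-amplifier
reviews <- c(
  "The room was not good. The staff were very helpful.",
  "I was slightly disappointed."
)

# Split each element into sentences once and reuse the result
sentences <- get_sentences(reviews)

# One row per sentence: element_id, sentence_id, word_count, sentiment
sentence_scores <- sentiment(sentences)

# One row per original element, aggregated from its sentence scores
document_scores <- sentiment_by(sentences)
```

In sentence_scores, element_id identifies the original document and sentence_id its position, so plotting sentiment against sentence_id within one element gives a simple view of the arc.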

Machine learning approaches

Pre-trained transformer models provide state-of-the-art sentiment analysis. text::textClassify() from the text package can call HuggingFace transformer models from R. Models like distilbert-base-uncased-finetuned-sst-2-english, which is fine-tuned on movie reviews, give accurate sentiment predictions. The tradeoff: this approach is much slower than lexicon methods and requires Python and the transformers library.

tidymodels with textrecipes preprocesses text for machine learning. step_tokenize(), step_stopwords(), and step_tfidf() build a TF-IDF feature matrix from text. A logistic regression or random forest trained on labeled examples learns dataset-specific patterns, which often fit your data better than a generic lexicon can.
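The preprocessing part can be sketched as below, assuming the recipes, textrecipes, and stopwords packages are installed. The four labeled reviews are invented and far too few to train on; they only show the shape of the feature matrix.

```r
library(dplyr)
library(recipes)
library(textrecipes)

# Invented labeled examples (a real training set needs far more rows)
labeled <- tibble(
  text = c(
    "great product, works well",
    "terrible, broke after a day",
    "really great value",
    "awful and terrible support"
  ),
  label = factor(c("pos", "neg", "pos", "neg"))
)

# Tokenize, drop stop words, and convert tokens to TF-IDF columns
tfidf_recipe <- recipe(label ~ text, data = labeled) %>%
  step_tokenize(text) %>%
  step_stopwords(text) %>%
  step_tfidf(text)

# Estimate the recipe on the training data and return the processed features
features <- tfidf_recipe %>%
  prep() %>%
  bake(new_data = NULL)
```

Each remaining token becomes one numeric column prefixed with tfidf_text_. In a full pipeline the recipe would typically be passed to a tidymodels workflow together with a model specification instead of being baked by hand.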

For binary classification (positive/negative), train on labeled data from your domain. Active learning (label 200 examples, train a model, then use the model to prioritize uncertain examples for labeling) is an efficient strategy when labels are expensive to produce.
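The prioritization step can be illustrated with uncertainty sampling, one common way to pick what to label next. The sketch assumes a model has already produced a probability of “positive” for each unlabeled document; the probabilities and document ids below are made up.

```r
# Hypothetical model output: probability of "positive" per unlabeled document
predicted <- data.frame(
  doc_id = c("a", "b", "c", "d", "e"),
  p_positive = c(0.97, 0.52, 0.08, 0.41, 0.88),
  stringsAsFactors = FALSE
)

# Uncertainty is 1 when the probability is 0.5 and 0 when it is 0 or 1
predicted$uncertainty <- 1 - 2 * abs(predicted$p_positive - 0.5)

# Queue the most uncertain documents for manual labeling next
batch_size <- 2
ranked <- predicted[order(-predicted$uncertainty), ]
to_label <- head(ranked$doc_id, batch_size)
```

After labeling the selected batch, retrain and repeat. Documents the model is already confident about (such as "a" here) are left for later.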

Aspect-based sentiment

Aspect-based sentiment analysis identifies sentiment toward specific entities or features. A review might express positive sentiment toward a hotel’s location but negative sentiment toward the beds. Document-level sentiment misses this nuance.

A simple approach: extract noun phrases around sentiment words. For each sentiment word, identify the nearest noun within a window of N words. This noun is the “aspect”; the sentiment word’s score is the “aspect sentiment”. Aggregate aspect scores across documents to understand which aspects are liked or disliked.
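The window approach can be sketched with dplyr as follows. To stay self-contained it uses a hand-written list of aspect nouns and a two-word toy lexicon, both hypothetical; a real version would take nouns from a part-of-speech tagger and scores from a full lexicon. The relationship argument to inner_join() assumes dplyr 1.1.0 or later.

```r
library(dplyr)
library(tidytext)

# Invented review plus hypothetical aspect nouns and lexicon
reviews <- tibble(id = 1, text = "The location was great but the beds were terrible")
aspect_nouns <- c("location", "beds", "staff")
toy_lexicon <- tibble(word = c("great", "terrible"), value = c(3, -3))
window <- 3  # maximum distance, in tokens, between sentiment word and noun

# Tokenize and record each token's position within its document
tokens <- reviews %>%
  unnest_tokens(word, text) %>%
  group_by(id) %>%
  mutate(pos = row_number()) %>%
  ungroup()

sentiment_tokens <- tokens %>%
  inner_join(toy_lexicon, by = "word") %>%
  rename(sent_word = word, sent_pos = pos)

noun_tokens <- tokens %>%
  filter(word %in% aspect_nouns) %>%
  rename(aspect = word, noun_pos = pos)

# Pair every sentiment word with every noun in the same document,
# keep pairs inside the window, then keep the nearest noun per sentiment word
aspect_scores <- sentiment_tokens %>%
  inner_join(noun_tokens, by = "id", relationship = "many-to-many") %>%
  mutate(dist = abs(sent_pos - noun_pos)) %>%
  filter(dist <= window) %>%
  group_by(id, sent_pos) %>%
  slice_min(dist, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  group_by(aspect) %>%
  summarise(score = sum(value), .groups = "drop")
```

Here “great” attaches to “location” and “terrible” to “beds”, so the two aspects receive opposite scores even though the document-level sum is zero.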

More sophisticated approaches use dependency parsing (spacyr wraps spaCy’s parser) to find syntactic relationships between adjectives and their noun heads, providing more accurate aspect-sentiment association than window-based methods.

Evaluation

Without labeled test data, sentiment scores cannot be validated rigorously. For any production system, sample 200-500 examples, manually assign ground-truth labels, and compute precision, recall, and F1 against your method. This reveals whether the lexicon matches your domain and guides decisions about customization.
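For reference, the three metrics can be computed in base R as in the sketch below. The label vectors are invented, and “pos” is treated as the class of interest.

```r
# Invented ground-truth labels and method predictions for eight texts
truth <- c("pos", "pos", "pos", "neg", "neg", "neg", "pos", "neg")
pred  <- c("pos", "pos", "neg", "neg", "neg", "pos", "pos", "neg")

# Counts for the "pos" class
true_pos  <- sum(pred == "pos" & truth == "pos")
false_pos <- sum(pred == "pos" & truth == "neg")
false_neg <- sum(pred == "neg" & truth == "pos")

precision <- true_pos / (true_pos + false_pos)  # how many predicted "pos" are right
recall    <- true_pos / (true_pos + false_neg)  # how many actual "pos" are found
f1        <- 2 * precision * recall / (precision + recall)
```

With these vectors there are three true positives, one false positive, and one false negative, so precision, recall, and F1 all come out to 0.75.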

Inter-annotator agreement (Cohen’s kappa) measures whether human labelers agree on the “correct” sentiment. Low agreement indicates ambiguous text: the task may be inherently subjective in that domain, which sets an upper bound on achievable accuracy.
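Cohen’s kappa compares observed agreement with the agreement expected by chance, kappa = (p_o - p_e) / (1 - p_e). A base R sketch with two invented annotators:

```r
# Invented labels from two annotators for the same ten texts
annotator_1 <- c("pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg", "pos", "neg")
annotator_2 <- c("pos", "pos", "neg", "pos", "pos", "neg", "neg", "neg", "pos", "neg")

# Cross-tabulate with shared levels so the table is always square
label_levels <- c("neg", "pos")
agreement_table <- table(
  factor(annotator_1, levels = label_levels),
  factor(annotator_2, levels = label_levels)
)
n <- sum(agreement_table)

# Observed agreement: share of texts where both annotators chose the same label
p_observed <- sum(diag(agreement_table)) / n

# Chance agreement: product of each annotator's label proportions, summed over labels
p_expected <- sum(rowSums(agreement_table) * colSums(agreement_table)) / n^2

kappa <- (p_observed - p_expected) / (1 - p_expected)
```

The annotators agree on 8 of 10 texts and each uses both labels half the time, so observed agreement is 0.8, chance agreement is 0.5, and kappa is 0.6.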

Practical considerations

Sentiment analysis results are only as good as the training data and lexicon. Generic lexicons built from Twitter or movie reviews perform poorly on medical notes, legal documents, or technical documentation. Before deploying a sentiment system, always validate it against a labeled sample from your target domain. Even a sample of 100 manually labeled items reveals whether the lexicon’s vocabulary matches your corpus and whether the direction of sentiment words aligns with how they are used in context.
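A quick first check on vocabulary match, before any labeling, is lexicon coverage: the share of tokens in your corpus that the lexicon recognizes at all. The sketch below uses two invented clinical-style sentences and a hypothetical three-word lexicon.

```r
library(dplyr)
library(tidytext)

# Invented corpus and a hypothetical tiny lexicon
corpus <- tibble(
  id = 1:2,
  text = c("Patient reports great relief", "Lesion is benign and stable")
)
toy_lexicon <- tibble(
  word = c("great", "relief", "terrible"),
  sentiment = c("positive", "positive", "negative")
)

# Flag each token by whether the lexicon contains it
tokens <- corpus %>%
  unnest_tokens(word, text) %>%
  mutate(in_lexicon = word %in% toy_lexicon$word)

# Overall coverage: matched tokens as a share of all tokens
coverage <- tokens %>%
  summarise(
    n_tokens = n(),
    n_matched = sum(in_lexicon),
    share_matched = mean(in_lexicon)
  )

# Most frequent unmatched words are candidates for a custom lexicon
unmatched <- tokens %>%
  filter(!in_lexicon) %>%
  count(word, sort = TRUE)
```

Low coverage does not prove the lexicon is wrong for the domain, but it does mean most of the text contributes nothing to the score, which is worth knowing before interpreting results.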

Key takeaways

  1. Join tokenized text with sentiment lexicons using inner_join()
  2. Aggregate word-level scores to document level
  3. Visualize sentiment trends to find emotional arcs
  4. Handle negation for more accurate analysis

Next steps

Continue your text mining journey:

  • Topic Modeling with LDA in R — Uncover latent topics in document collections
  • Text Classification in R — Build supervised models to categorize text
