🎬 YouTube Comment Sentiment Analyzer

Fine-Tuned Twitter-RoBERTa for Real-World YouTube Comments



🎯 Real-World Performance

| Metric | Score |
|---|---|
| Accuracy | 88.00% |
| Macro F1 | 87.68% |
| Weighted F1 | 87.73% |

Evaluated on a custom real-world YouTube benchmark containing slang, emojis, internet culture references, and creator-style comments.


🚀 Overview

This model is a sentiment classifier specifically adapted for YouTube comments.

The backbone model, Twitter-RoBERTa, was originally trained on Twitter/X posts and social media text.

To adapt it to YouTube comment culture, the model was further fine-tuned on a dataset of more than 1 million labeled YouTube comments.

The result is a model that better understands:

✅ Emojis

✅ Internet slang

✅ Creator-audience interactions

✅ Short-form comments

✅ Social media language

✅ YouTube culture


🧠 Model Architecture

Twitter/X Posts
        ↓
Twitter-RoBERTa
(cardiffnlp/twitter-roberta-base-sentiment-latest)
        ↓
Fine-Tuning on 1M+ YouTube Comments
        ↓
YouTube Comment Sentiment Analyzer
        ↓
Negative | Neutral | Positive

📊 Project Pipeline

[Figure: project pipeline diagram]


🎯 Sentiment Classes

| Label | Meaning |
|---|---|
| 🔴 Negative | Criticism, dislike, frustration, complaints |
| ⚪ Neutral | Informational, factual, objective comments |
| 🟢 Positive | Praise, appreciation, excitement, support |

🧪 Real-World Benchmark Results

A manually curated benchmark was created to simulate actual YouTube comments.

The benchmark includes:

  • Internet slang
  • Emoji-heavy comments
  • Creator terminology
  • Sarcasm
  • Mixed sentiment
  • Short comments
  • Viral internet phrases

Results

| Metric | Score |
|---|---|
| Accuracy | 88.00% |
| Macro F1 | 87.68% |
| Weighted F1 | 87.73% |
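For readers unfamiliar with the three metrics, the sketch below shows how accuracy, macro F1, and weighted F1 can be computed from true and predicted labels. It is a minimal pure-Python illustration, not the evaluation script used for this model; the function names and the toy labels are hypothetical.

```python
from collections import Counter

LABELS = ["negative", "neutral", "positive"]


def f1_for_label(y_true, y_pred, label):
    """F1 for one class, computed from true/false positives and false negatives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def summarize(y_true, y_pred, labels=LABELS):
    """Return accuracy, macro F1 (plain mean) and weighted F1 (support-weighted mean)."""
    n = len(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    f1 = {label: f1_for_label(y_true, y_pred, label) for label in labels}
    support = Counter(y_true)  # number of true examples per class
    macro_f1 = sum(f1.values()) / len(labels)
    weighted_f1 = sum(f1[label] * support[label] for label in labels) / n
    return {"accuracy": accuracy, "macro_f1": macro_f1, "weighted_f1": weighted_f1}


# Toy labels for illustration only (not taken from the benchmark).
y_true = ["positive", "positive", "positive", "positive", "negative", "neutral"]
y_pred = ["positive", "positive", "positive", "neutral", "negative", "neutral"]
scores = summarize(y_true, y_pred)
```

Macro F1 treats every class equally, while weighted F1 scales each class by its number of true examples, so the two diverge when classes are imbalanced.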

📉 Normalized Confusion Matrix

Performance is balanced across the three sentiment classes. Most classification errors are confusions between Neutral and Positive comments.

Negative comments are generally identified more reliably.

[Figure: normalized confusion matrix]
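A normalized confusion matrix of this kind can be built as sketched below. The card does not state which axis was normalized; this example assumes row normalization (each row is a true class and sums to 1), and the function name and toy labels are hypothetical.

```python
LABELS = ["negative", "neutral", "positive"]


def normalized_confusion_matrix(y_true, y_pred, labels=LABELS):
    """Rows are true labels, columns are predicted labels; each row sums to 1."""
    index = {label: i for i, label in enumerate(labels)}
    counts = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        counts[index[t]][index[p]] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Guard against classes with no true examples.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Toy labels for illustration only (not taken from the benchmark).
y_true = ["positive", "positive", "positive", "positive", "negative", "neutral"]
y_pred = ["positive", "positive", "positive", "neutral", "negative", "neutral"]
cm = normalized_confusion_matrix(y_true, y_pred)
```

With row normalization, the diagonal entries are the per-class recall, which is what makes Neutral versus Positive confusions easy to spot.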


🏋️ Training Details

Base Model

cardiffnlp/twitter-roberta-base-sentiment-latest

Fine-Tuning Dataset

1M+ YouTube Comments

Hardware

NVIDIA RTX 3050 Laptop GPU (6GB)

Training Features

  • Mixed Precision Training (AMP)
  • Layer-wise Learning Rate Decay (LLRD)
  • Gradient Accumulation
  • Cosine Learning Rate Scheduler
  • Warmup Scheduling
  • Gradient Checkpointing
  • Class Weighted Loss
  • Early Stopping
  • Resume Training Support
  • Dynamic GPU Configuration
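
Three of the listed features (warmup with a cosine schedule, layer-wise learning rate decay, and class-weighted loss) reduce to small formulas, sketched below in pure Python. The card does not publish hyperparameters, so every number here (base learning rate, decay factor, step counts, class counts) is an illustrative assumption, and the function names are hypothetical rather than taken from the training code.

```python
import math


def lr_multiplier(step, total_steps, warmup_steps):
    """Linear warmup from 0 to 1, then cosine decay from 1 to 0."""
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))


def layerwise_lrs(base_lr, num_layers=12, decay=0.9):
    """LLRD: the top encoder layer keeps base_lr; each layer below is scaled by `decay`.

    num_layers=12 matches a RoBERTa-base encoder; decay=0.9 is an assumed value.
    """
    return [base_lr * decay ** (num_layers - 1 - i) for i in range(num_layers)]


def class_weights(counts):
    """Inverse-frequency weights: total / (num_classes * class_count)."""
    total = sum(counts.values())
    return {label: total / (len(counts) * n) for label, n in counts.items()}


# Assumed values for illustration only.
lrs = layerwise_lrs(base_lr=2e-5)
weights = class_weights({"negative": 100, "neutral": 300, "positive": 600})
```

In a PyTorch training loop, a multiplier like this would typically be passed to `torch.optim.lr_scheduler.LambdaLR`, the per-layer rates would become optimizer parameter groups, and the class weights would go to `torch.nn.CrossEntropyLoss(weight=...)`.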

📋 Internal Test Set Results

For reference, the held-out test split produced:

| Metric | Score |
|---|---|
| Accuracy | 77.43% |
| Macro F1 | 77.38% |
| Weighted F1 | 77.41% |

The external benchmark is considered a better estimate of real-world deployment performance.


🔥 Example Predictions

| Comment | Expected | Prediction |
|---|---|---|
| W video bro 🔥 | Positive | Positive |
| Absolute cinema | Positive | Positive |
| nah this ain't it | Negative | Negative |
| Worst update ever | Negative | Negative |
| Uploaded 2 hours ago | Neutral | Neutral |

💻 Usage

from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification
)

MODEL_ID = "rishigupta04/yt-comments-sentiment-analyzer"

tokenizer = AutoTokenizer.from_pretrained(
    MODEL_ID
)

model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_ID
)
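
The snippet above only loads the tokenizer and model. A minimal sketch of running inference follows. It assumes the checkpoint exposes its label names through `model.config.id2label`; the fallback mapping (0 = negative, 1 = neutral, 2 = positive) follows the base model's convention and should be verified against the config. The helper names and `max_length=256` are illustrative choices, not part of the published model.

```python
import math

MODEL_ID = "rishigupta04/yt-comments-sentiment-analyzer"

# Assumed label order, following the base model's convention.
# Verify against model.config.id2label before relying on it.
FALLBACK_ID2LABEL = {0: "negative", 1: "neutral", 2: "positive"}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits, id2label=FALLBACK_ID2LABEL):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


def predict(texts, model_id=MODEL_ID):
    """Classify a list of comments; downloads the model on first use."""
    # Imported lazily so the helpers above work without the heavy dependencies.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()  # disable dropout for inference

    inputs = tokenizer(
        texts, padding=True, truncation=True, max_length=256, return_tensors="pt"
    )
    with torch.no_grad():
        logits = model(**inputs).logits
    return [decode(row.tolist(), model.config.id2label) for row in logits]
```

Calling `predict(["W video bro 🔥", "Worst update ever"])` returns one `(label, probability)` pair per comment.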

🎯 Intended Use Cases

Suitable For

  • YouTube Comment Analysis
  • Social Media Monitoring
  • Creator Analytics
  • Content Moderation Assistance
  • Brand Sentiment Tracking
  • NLP Education
  • Research Projects

Not Intended For

  • Medical Decisions
  • Legal Decisions
  • Financial Decisions
  • High-Risk Automated Systems

⚠️ Known Limitations

As with most sentiment models, performance may degrade on:

  • Heavy sarcasm
  • Irony
  • Context-dependent jokes
  • Multi-comment conversations
  • Ambiguous sentiment

Example:

Wow another genius from YouTube University

A sarcastic comment like this may require additional context to classify correctly.


🛠️ Project Stack

Python
Transformers
PyTorch
FastAPI
Hugging Face
Scikit-Learn
Pandas
NumPy

👨‍💻 Author

Rishiraj Gupta

M.Sc. Data Science

This project demonstrates:

  • End-to-End NLP Pipeline
  • Transformer Fine-Tuning
  • Model Evaluation
  • Error Analysis
  • MLOps Practices
  • FastAPI Deployment
  • Hugging Face Deployment
  • Real-World Sentiment Analysis

⭐ If you find this model useful, please consider liking the repository.

Built with 🤗 Transformers + PyTorch
