YouTube Comment Sentiment Analyzer
Fine-Tuned Twitter-RoBERTa for Real-World YouTube Comments
Real-World Performance
| Metric | Score |
|---|---|
| Accuracy | 88.00% |
| Macro F1 | 87.68% |
| Weighted F1 | 87.73% |
Evaluated on a custom real-world YouTube benchmark containing slang, emojis, internet culture references, and creator-style comments.
Overview
This model is a sentiment classifier specifically adapted for YouTube comments.
The backbone model, Twitter-RoBERTa, was originally trained on Twitter/X posts and social media text.
To adapt it to YouTube comment culture, the model was further fine-tuned on a dataset of more than 1 million labeled YouTube comments.
The result is a model that better understands:
- Emojis
- Internet slang
- Creator-audience interactions
- Short-form comments
- Social media language
- YouTube culture
Model Architecture
Twitter/X Posts
↓
Twitter-RoBERTa
(cardiffnlp/twitter-roberta-base-sentiment-latest)
↓
Fine-Tuning on 1M+ YouTube Comments
↓
YouTube Comment Sentiment Analyzer
↓
Negative | Neutral | Positive
Project Pipeline
[Pipeline diagram not available in this copy.]
Sentiment Classes
| Label | Meaning |
|---|---|
| Negative | Criticism, dislike, frustration, complaints |
| Neutral | Informational, factual, objective comments |
| Positive | Praise, appreciation, excitement, support |
Real-World Benchmark Results
A manually curated benchmark was created to simulate actual YouTube comments.
The benchmark includes:
- Internet slang
- Emoji-heavy comments
- Creator terminology
- Sarcasm
- Mixed sentiment
- Short comments
- Viral internet phrases
Results
| Metric | Score |
|---|---|
| Accuracy | 88.00% |
| Macro F1 | 87.68% |
| Weighted F1 | 87.73% |
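The three metrics in the table can be sketched in plain Python to show how macro and weighted F1 differ. This is an illustrative sketch, not the author's evaluation script: the function names and the toy labels in the usage note are assumptions, and the label strings simply mirror the class names used in this card.

```python
from collections import Counter


def f1_per_class(y_true, y_pred, labels):
    """Return {label: F1}, treating each label one-vs-rest."""
    scores = {}
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def summarize_metrics(y_true, y_pred, labels=("Negative", "Neutral", "Positive")):
    """Compute accuracy, macro F1 (plain mean) and weighted F1 (mean by support)."""
    per_class = f1_per_class(y_true, y_pred, labels)
    support = Counter(y_true)  # number of true examples per class
    n = len(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    macro_f1 = sum(per_class.values()) / len(labels)
    weighted_f1 = sum(per_class[label] * support[label] for label in labels) / n
    return {"accuracy": accuracy, "macro_f1": macro_f1, "weighted_f1": weighted_f1}
```

Macro F1 gives every class the same weight, while weighted F1 scales each class by how often it appears in the ground truth. On a roughly balanced benchmark the two land close together, which is consistent with the near-identical scores reported above.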
Normalized Confusion Matrix
The model performs consistently across all sentiment classes and shows balanced classification behavior. Most classification errors occur between Neutral and Positive comments.
Negative comments are generally identified more reliably.
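For readers who want to reproduce this kind of view on their own predictions, a row-normalized confusion matrix can be sketched as follows. The function name is hypothetical and the card's actual matrix values are not reproduced here; each row is divided by the number of true examples of that class, so the diagonal holds per-class recall.

```python
def normalized_confusion_matrix(y_true, y_pred, labels):
    """Rows are true labels, columns are predicted labels, each row sums to 1."""
    index = {label: i for i, label in enumerate(labels)}
    counts = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        counts[index[t]][index[p]] += 1

    matrix = []
    for row in counts:
        total = sum(row)
        # Leave an all-zero row when a class never appears in y_true.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix
```

Off-diagonal mass in the Neutral and Positive rows would correspond to the confusion between those two classes described above.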
Training Details
Base Model
cardiffnlp/twitter-roberta-base-sentiment-latest
Fine-Tuning Dataset
1M+ YouTube Comments
Hardware
NVIDIA RTX 3050 Laptop GPU (6GB)
Training Features
- Mixed Precision Training (AMP)
- Layer-wise Learning Rate Decay (LLRD)
- Gradient Accumulation
- Cosine Learning Rate Scheduler
- Warmup Scheduling
- Gradient Checkpointing
- Class Weighted Loss
- Early Stopping
- Resume Training Support
- Dynamic GPU Configuration
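Three of the features listed above come down to simple arithmetic, sketched here in plain Python. The card does not state the hyperparameters or the exact weighting scheme used, so every function name, argument, and formula choice below is an assumption for illustration: the class weights use the common inverse-frequency ("balanced") form, and the schedule is the usual linear warmup followed by a half-cosine decay.

```python
import math


def llrd_learning_rates(base_lr, num_layers, decay):
    """Layer-wise LR decay: the top layer gets base_lr, each layer below is scaled by `decay`."""
    return [base_lr * decay ** (num_layers - 1 - i) for i in range(num_layers)]


def warmup_cosine_factor(step, warmup_steps, total_steps):
    """LR multiplier in [0, 1]: linear warmup, then cosine decay towards zero."""
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))


def inverse_frequency_weights(class_counts):
    """Class weights n_total / (n_classes * n_class), so rarer classes weigh more in the loss."""
    total = sum(class_counts.values())
    k = len(class_counts)
    return {label: total / (k * n) for label, n in class_counts.items()}
```

In a PyTorch training loop, the per-layer rates would typically become optimizer parameter groups, the factor would multiply the base learning rate at each step, and the weights would be passed to the loss function.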
Internal Test Set Results
For reference, the held-out test split produced:
| Metric | Score |
|---|---|
| Accuracy | 77.43% |
| Macro F1 | 77.38% |
| Weighted F1 | 77.41% |
The external benchmark is considered a better estimate of real-world deployment performance.
Example Predictions
| Comment | Sentiment | Prediction |
|---|---|---|
| W video bro 🔥 | Positive | Positive |
| Absolute cinema | Positive | Positive |
| nah this ain't it | Negative | Negative |
| Worst update ever | Negative | Negative |
| Uploaded 2 hours ago | Neutral | Neutral |
Usage
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_ID = "rishigupta04/yt-comments-sentiment-analyzer"

# Load the tokenizer and the fine-tuned classifier from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
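The loading code above stops before inference. A minimal sketch of one way to classify a batch of comments with the Transformers `pipeline` helper, and then summarize the result, might look like this. The function names are hypothetical, and the label strings returned depend on the `id2label` mapping in the model's config, which this card does not spell out.

```python
from collections import Counter

MODEL_ID = "rishigupta04/yt-comments-sentiment-analyzer"


def classify_comments(comments, model_id=MODEL_ID):
    """Return one {'label': ..., 'score': ...} dict per comment."""
    # Imported here so sentiment_shares stays usable without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id)
    return classifier(list(comments))


def sentiment_shares(predictions):
    """Fraction of comments assigned to each predicted label."""
    counts = Counter(p["label"] for p in predictions)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()} if total else {}
```

Calling `classify_comments(["W video bro", "nah this ain't it"])` downloads the model on first use; passing its output to `sentiment_shares` gives a per-label breakdown of the kind a creator dashboard might display.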
Intended Use Cases
Suitable For
- YouTube Comment Analysis
- Social Media Monitoring
- Creator Analytics
- Content Moderation Assistance
- Brand Sentiment Tracking
- NLP Education
- Research Projects
Not Intended For
- Medical Decisions
- Legal Decisions
- Financial Decisions
- High-Risk Automated Systems
Known Limitations
Like most sentiment models, performance may degrade on:
- Heavy sarcasm
- Irony
- Context-dependent jokes
- Multi-comment conversations
- Ambiguous sentiment
For example, a sarcastic comment such as "Wow another genius from YouTube University" may require additional context to classify correctly.
Project Stack
Python
Transformers
PyTorch
FastAPI
Hugging Face
Scikit-Learn
Pandas
NumPy
Author
Rishiraj Gupta
M.Sc. Data Science
This project demonstrates:
- End-to-End NLP Pipeline
- Transformer Fine-Tuning
- Model Evaluation
- Error Analysis
- MLOps Practices
- FastAPI Deployment
- Hugging Face Deployment
- Real-World Sentiment Analysis

