# RoBERTa Fine-tuned on Amazon Reviews (5-Star Rating)

## Model Description
This model is a fine-tuned version of roberta-base for 5-class sentiment classification, predicting star ratings (1-5) from Amazon product reviews.
## Comparison with DistilBERT
This model was trained as part of a model comparison study:
| Model | Parameters | Accuracy | Off-by-one Accuracy | Inference Speed |
|---|---|---|---|---|
| DistilBERT | 67M | 54.95% | 92.45% | 1.83x faster |
| RoBERTa | 125M | 59.90% | 95.10% | Baseline |
RoBERTa achieves roughly 5 percentage points higher accuracy (59.90% vs. 54.95%) at the cost of roughly 1.8x slower inference.
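The off-by-one accuracy column counts a prediction as correct when it lands within one star of the true rating, which is a common relaxation for ordinal labels like star ratings. A minimal sketch of that metric (the function name is illustrative, not from the training code):

```python
def off_by_one_accuracy(preds, labels):
    """Fraction of predictions within one star of the true rating."""
    assert len(preds) == len(labels) and labels
    hits = sum(1 for p, y in zip(preds, labels) if abs(p - y) <= 1)
    return hits / len(labels)

# Example: predicted vs. true star ratings (1-5)
preds = [5, 4, 2, 1, 3]
labels = [5, 3, 4, 1, 3]
print(off_by_one_accuracy(preds, labels))  # 4 of 5 within one star -> 0.8
```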
## Training Data
- Dataset: SetFit/amazon_reviews_multi_en
- Train samples: 20,000 (subset)
- Test samples: 2,000 (subset)
- Classes: 1 star, 2 stars, 3 stars, 4 stars, 5 stars
## Training Procedure
- Base model: roberta-base
- Epochs: 3
- Batch size: 16
- Learning rate: 2e-5
- Max sequence length: 256
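With the hyperparameters above, the total number of optimization steps works out as follows (assuming full batches and no gradient accumulation):

```python
# Hyperparameters from the training procedure above
train_samples = 20_000
batch_size = 16
epochs = 3

steps_per_epoch = train_samples // batch_size  # 20,000 / 16 = 1,250
total_steps = steps_per_epoch * epochs         # 1,250 * 3 = 3,750
print(steps_per_epoch, total_steps)
```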
## Usage

```python
from transformers import pipeline

classifier = pipeline("text-classification", model="Nav772/roberta-amazon-reviews-5star")
result = classifier("This product exceeded my expectations! Great quality.")
print(result)
```
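Depending on how the label names were saved in the model config, the pipeline may return generic ids such as `LABEL_0`…`LABEL_4` rather than human-readable names. A hedged helper to convert those to star ratings, assuming zero-indexed `LABEL_<i>` names (adjust if the config stores names like `"5 stars"`):

```python
def label_to_stars(label: str) -> int:
    """Map a pipeline label like 'LABEL_4' to a 1-5 star rating.

    Assumes zero-indexed LABEL_<i> names; if the model config already
    stores readable names such as '5 stars', parse those instead.
    """
    return int(label.rsplit("_", 1)[-1]) + 1

print(label_to_stars("LABEL_4"))  # -> 5 stars
```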
## When to Use This Model
- Choose RoBERTa when accuracy is the priority and latency is less critical
- Choose DistilBERT when you need faster inference or have resource constraints
## Demo
Try the model comparison demo: sentiment-model-comparison
## Limitations
- Trained on Amazon product reviews; may not generalize to other review domains
- Adjacent star ratings (e.g., 2 vs 3 stars) are inherently difficult to distinguish
- English language only