DistilBERT Fine-tuned on Amazon Reviews (5-Star Rating)

Model Description

This model is a fine-tuned version of distilbert-base-uncased for 5-class sentiment classification, predicting star ratings (1-5) from Amazon product reviews.

Training Data

  • Dataset: SetFit/amazon_reviews_multi_en
  • Train samples: 20,000 (subset)
  • Test samples: 2,000 (subset)
  • Classes: 1 star, 2 stars, 3 stars, 4 stars, 5 stars

Training Procedure

  • Base model: distilbert-base-uncased
  • Epochs: 3
  • Batch size: 16
  • Learning rate: 2e-5
  • Max sequence length: 256
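The hyperparameters above can be sketched as a standard Hugging Face Trainer setup. This is an assumed reconstruction, not the author's actual training script; column names, argument choices, and the train/test split are assumptions based on the card.

```python
# Hypothetical fine-tuning sketch matching the listed hyperparameters.
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)
from datasets import load_dataset

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=5)  # 1-5 star classes

dataset = load_dataset("SetFit/amazon_reviews_multi_en")

def tokenize(batch):
    # Max sequence length of 256 tokens, as stated above.
    return tokenizer(batch["text"], truncation=True, max_length=256)

tokenized = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="distilbert-amazon-reviews-5star",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    learning_rate=2e-5,
)

trainer = Trainer(model=model, args=args,
                  train_dataset=tokenized["train"],
                  eval_dataset=tokenized["test"])
trainer.train()
```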

Evaluation Results

  • Accuracy: 54.95%
  • Off-by-one accuracy: 92.45%

Note: at 54.95%, accuracy on this 5-class problem is roughly 2.75x random chance (20%). The high off-by-one accuracy (92.45%) indicates the model rarely misses the true rating by more than one star.
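The off-by-one metric reported above can be defined as follows. The function name and exact definition are assumptions consistent with the card's description (a prediction counts as correct if it is within one star of the true rating):

```python
# Assumed definition of the off-by-one accuracy metric described above.
def off_by_one_accuracy(preds, labels):
    """Fraction of predicted star ratings within +/-1 of the true rating."""
    assert len(preds) == len(labels) and len(preds) > 0
    close = sum(abs(p - y) <= 1 for p, y in zip(preds, labels))
    return close / len(preds)

# Example: one exact hit, two off-by-one misses, one off-by-two miss.
print(off_by_one_accuracy([5, 4, 2, 1], [5, 3, 1, 3]))  # 0.75
```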

Usage

from transformers import pipeline

classifier = pipeline("text-classification", model="Nav772/distilbert-amazon-reviews-5star")
result = classifier("This product exceeded my expectations! Great quality.")
print(result)
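The pipeline returns a label string plus a confidence score. If the model config uses the default index-based label names ("LABEL_0" through "LABEL_4"), which is an assumption here, a small helper can convert them to star ratings; adapt the parsing if the config defines human-readable labels.

```python
# Hypothetical helper: map a default index-based label to a 1-5 star rating.
def label_to_stars(label: str) -> int:
    """Convert e.g. 'LABEL_4' to the star rating 5 (index + 1)."""
    return int(label.rsplit("_", 1)[-1]) + 1

print(label_to_stars("LABEL_4"))  # 5
```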

Limitations

  • Trained on Amazon product reviews; may not generalize to other review domains
  • Adjacent star ratings (e.g., 2 vs 3 stars) are inherently difficult to distinguish due to subjective labeling
  • English language only
Model Details

  • Format: Safetensors
  • Model size: 67M params
  • Tensor type: F32