Thi144
/

sentiment-distilbert-7class

+---
+language: en
+license: apache-2.0
+tags:
+- sentiment-analysis
+- text-classification
+- distilbert
+- pytorch
+- transformers
+datasets:
+- imdb
+metrics:
+- accuracy
+- f1
+widget:
+- text: "This movie was absolutely amazing! Best film I've seen all year!"
+  example_title: "Very Positive"
+- text: "Pretty good movie, enjoyed it overall."
+  example_title: "Slightly Positive"
+- text: "It was okay, nothing special but not bad either."
+  example_title: "Neutral"
+- text: "Not a great movie, pretty disappointing."
+  example_title: "Slightly Negative"
+- text: "Terrible film, complete waste of time and money!"
+  example_title: "Very Negative"
+---
+# DistilBERT 7-Class Sentiment Analysis Model
+A fine-tuned DistilBERT model for nuanced sentiment analysis with 7 sentiment classes on a scale from -3 (Very Negative) to +3 (Very Positive).
+## Model Description
+This model performs fine-grained sentiment classification, providing more nuanced predictions than traditional binary positive/negative models. It's particularly useful for business applications where understanding the intensity of sentiment matters (e.g., identifying "at-risk" customers vs. extremely dissatisfied ones).
+**Architecture:** DistilBERT (distilbert-base-uncased)
+**Parameters:** 66 million
+**Training Data:** 6,000 IMDB movie reviews
+**Accuracy:** 73.7%
+## Sentiment Classes
+| Class | Scale | Label | Description |
+|-------|-------|-------|-------------|
+| 0 | -3 | Very Negative | Extremely dissatisfied, angry |
+| 1 | -2 | Negative | Clearly unhappy, disappointed |
+| 2 | -1 | Slightly Negative | Somewhat disappointed |
+| 3 | 0 | Neutral | Balanced, neither positive nor negative |
+| 4 | +1 | Slightly Positive | Somewhat satisfied |
+| 5 | +2 | Positive | Clearly satisfied, happy |
+| 6 | +3 | Very Positive | Extremely satisfied, delighted |
+## Usage
+```python
+from transformers import AutoTokenizer, AutoModelForSequenceClassification
+import torch
+# Load model and tokenizer
+model_id = "Thi144/sentiment-distilbert-7class"
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+model = AutoModelForSequenceClassification.from_pretrained(model_id)
+# Class mapping
+CLASS_LABELS = {
+    0: {"scale": -3, "label": "negative", "name": "Very Negative"},
+    1: {"scale": -2, "label": "negative", "name": "Negative"},
+    2: {"scale": -1, "label": "negative", "name": "Slightly Negative"},
+    3: {"scale": 0, "label": "neutral", "name": "Neutral"},
+    4: {"scale": 1, "label": "positive", "name": "Slightly Positive"},
+    5: {"scale": 2, "label": "positive", "name": "Positive"},
+    6: {"scale": 3, "label": "positive", "name": "Very Positive"}
+}
+# Predict sentiment
+def predict_sentiment(text):
+    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
+    with torch.no_grad():
+        outputs = model(**inputs)
+        predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
+        class_id = predictions.argmax().item()
+        confidence = predictions[0][class_id].item()
+    result = CLASS_LABELS[class_id]
+    return {
+        "class": class_id,
+        "scale": result["scale"],
+        "label": result["label"],
+        "name": result["name"],
+        "confidence": confidence
+    }
+# Example
+result = predict_sentiment("This movie was absolutely amazing!")
+print(f"Sentiment: {result['name']} (Scale: {result['scale']}, Confidence: {result['confidence']:.2%})")
+```
+## Performance Metrics
+**Overall Accuracy:** 73.7%
+**Class-Specific Performance:**
+- **Very Negative (-3):** 81% precision, 88% recall
+- **Negative (-2):** 83% precision, 77% recall
+- **Slightly Negative (-1):** 54% precision, 58% recall
+- **Neutral (0):** 86% precision, 64% recall
+- **Slightly Positive (+1):** 58% precision, 54% recall
+- **Positive (+2):** 79% precision, 83% recall
+- **Very Positive (+3):** 88% precision, 81% recall
+The model performs best at identifying strong sentiments (Very Negative/Positive) and struggles most with subtle distinctions (Slightly Negative/Positive).
+## Training Details
+- **Base Model:** distilbert-base-uncased
+- **Dataset:** 6,000 IMDB reviews (4,800 train, 1,200 test)
+- **Label Conversion:** Binary labels converted to 7-class using text intensity analysis
+- **Epochs:** 4
+- **Batch Size:** 16
+- **Optimizer:** AdamW (lr=2e-5)
+- **Training Time:** ~15-20 minutes on CPU
+## Limitations
+- Trained on movie reviews, may not generalize perfectly to other domains
+- Slightly Negative/Positive classes have lower accuracy (~54-58%)
+- Performance depends on text clarity and length
+- May struggle with sarcasm or complex sentiment
+## Intended Use
+**Primary Use Cases:**
+- Customer feedback analysis with nuanced sentiment scoring
+- Product review sentiment classification
+- Social media monitoring with intensity detection
+- Business intelligence dashboards requiring granular sentiment
+**Not Recommended For:**
+- Safety-critical applications
+- Legal decision-making
+- Medical diagnosis
+## License
+Apache 2.0
+## Citation
+If you use this model, please cite:
+```
+@model{thi144-sentiment-distilbert-7class,
+  author = {Thi144},
+  title = {DistilBERT 7-Class Sentiment Analysis},
+  year = {2025},
+  publisher = {HuggingFace},
+  url = {https://huggingface.co/Thi144/sentiment-distilbert-7class}
+}
+```