DistilBERT for Twitter Sentiment Analysis 🐦

This model is a fine-tuned version of distilbert-base-uncased for sentiment classification on Twitter/X data using LoRA (Low-Rank Adaptation) for parameter-efficient fine-tuning.

Model Description

  • Base Model: DistilBERT (66M parameters)
  • Fine-tuning Method: LoRA/PEFT (only ~1.5M parameters trained)
  • Task: 3-class sentiment classification
    • 😊 Positive
    • 😐 Neutral
    • 😑 Negative
  • Dataset: tweet_eval sentiment subset
  • Language: English
  • Training Framework: Hugging Face Transformers + PEFT

🎯 Performance

The model achieves the following results on the test set:

Metric               Score
Accuracy             67.84%
F1 Score (weighted)  0.6785

Per-Class Performance

Class        Precision  Recall  F1-Score  Support
Negative 😑      0.71    0.65      0.67    3,972
Neutral 😐       0.69    0.70      0.69    5,937
Positive 😊      0.62    0.67      0.65    2,375
Overall          0.68    0.68      0.68   12,284

Confusion Matrix

              Predicted
           Neg    Neu    Pos
Actual Neg [2562  1210   200]
       Neu [ 987  4170   780]
       Pos [  77   697  1601]
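The headline metrics can be recomputed directly from this confusion matrix; a quick NumPy sketch (matrix values copied from the table above):

```python
import numpy as np

# Rows = actual class, columns = predicted class (order: neg, neu, pos),
# copied from the confusion matrix above.
cm = np.array([
    [2562, 1210,  200],   # actual negative
    [ 987, 4170,  780],   # actual neutral
    [  77,  697, 1601],   # actual positive
])

support = cm.sum(axis=1)              # true samples per class
accuracy = np.trace(cm) / cm.sum()    # correct predictions / total

precision = np.diag(cm) / cm.sum(axis=0)
recall = np.diag(cm) / support
f1 = 2 * precision * recall / (precision + recall)
weighted_f1 = (f1 * support).sum() / support.sum()

print(f"accuracy={accuracy:.4f}, weighted F1={weighted_f1:.4f}")
# → accuracy=0.6784, weighted F1=0.6785
```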

🚀 Usage

Quick Start (Recommended)

# Install required packages first:
#   pip install transformers peft torch

from transformers import AutoTokenizer, AutoModelForSequenceClassification
from peft import PeftModel, PeftConfig
import torch

# Load model
model_name = "SeifElislamm/distilbert-sentiment-twitter"

# Load PEFT config
config = PeftConfig.from_pretrained(model_name)

# Load base model
base_model = AutoModelForSequenceClassification.from_pretrained(
    config.base_model_name_or_path,
    num_labels=3,
    id2label={0: "negative", 1: "neutral", 2: "positive"},
    label2id={"negative": 0, "neutral": 1, "positive": 2}
)

# Load LoRA adapters
model = PeftModel.from_pretrained(base_model, model_name)
model = model.merge_and_unload()  # Merge for faster inference
model.eval()

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Predict sentiment
def predict_sentiment(text):
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
    
    with torch.no_grad():
        outputs = model(**inputs)
        probs = torch.softmax(outputs.logits, dim=-1)
        pred_class = torch.argmax(probs).item()
        confidence = probs[0][pred_class].item()
    
    labels = {0: "negative", 1: "neutral", 2: "positive"}
    return labels[pred_class], confidence

# Test it
text = "I love this product! It's amazing!"
sentiment, confidence = predict_sentiment(text)
print(f"Sentiment: {sentiment.upper()} (confidence: {confidence:.1%})")

Batch Prediction

texts = [
    "I love this so much! 😍",
    "This is terrible. 😑",
    "It's okay, nothing special. 😐"
]

for text in texts:
    sentiment, confidence = predict_sentiment(text)
    print(f"{text} → {sentiment.upper()} ({confidence:.1%})")

Expected Output

I love this so much! 😍 → POSITIVE (85.3%)
This is terrible. 😡 → NEGATIVE (79.2%)
It's okay, nothing special. 😐 → NEUTRAL (71.5%)
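Looping over predict_sentiment runs one forward pass per text. For larger batches, a single padded forward pass is usually faster; a sketch along these lines (reusing the model and tokenizer loaded in Quick Start):

```python
import torch

def predict_batch(model, tokenizer, texts, max_length=128):
    """Classify a list of texts in a single forward pass using dynamic padding."""
    labels = {0: "negative", 1: "neutral", 2: "positive"}
    inputs = tokenizer(texts, return_tensors="pt", padding=True,
                       truncation=True, max_length=max_length)
    with torch.no_grad():
        probs = torch.softmax(model(**inputs).logits, dim=-1)
    confs, preds = probs.max(dim=-1)  # per-row confidence and class index
    return [(labels[p.item()], c.item()) for p, c in zip(preds, confs)]

# Usage with the model/tokenizer from Quick Start:
# for (label, conf), text in zip(predict_batch(model, tokenizer, texts), texts):
#     print(f"{text} → {label.upper()} ({conf:.1%})")
```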

🧪 Quick Test in Google Colab

Want to test the model immediately? Copy this into a new Colab notebook:

!pip install -q transformers peft torch

from transformers import AutoTokenizer, AutoModelForSequenceClassification
from peft import PeftModel, PeftConfig
import torch

model_name = "SeifElislamm/distilbert-sentiment-twitter"
config = PeftConfig.from_pretrained(model_name)
base_model = AutoModelForSequenceClassification.from_pretrained(
    config.base_model_name_or_path, num_labels=3
)
model = PeftModel.from_pretrained(base_model, model_name).merge_and_unload()
tokenizer = AutoTokenizer.from_pretrained(model_name)

def predict(text):
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
    with torch.no_grad():
        outputs = model(**inputs)
        probs = torch.softmax(outputs.logits, dim=-1)
        pred = torch.argmax(probs).item()
    labels = {0: "NEGATIVE", 1: "NEUTRAL", 2: "POSITIVE"}
    return labels[pred], probs[0][pred].item()

# Test it!
text = input("Enter text: ")
sentiment, conf = predict(text)
print(f"→ {sentiment} ({conf:.1%})")

📊 Training Details

Training Hyperparameters

Parameter            Value
Base Model           distilbert-base-uncased
Learning Rate        2e-5
Batch Size           32
Epochs               3
Weight Decay         0.01
Max Sequence Length  128
Optimizer            AdamW
LR Scheduler         Linear

LoRA Configuration

Parameter             Value
LoRA Rank (r)         16
LoRA Alpha            32
LoRA Dropout          0.1
Target Modules        q_lin, v_lin
Trainable Parameters  ~1.5M / 66M (2.3%)

Training Results

Epoch  Training Loss  Validation Loss  Accuracy  F1 Score
1      0.6845         0.7014           0.6805    0.6817
2      0.6841         0.6861           0.6925    0.6936
3      0.6718         0.6819           0.6975    0.6985

✅ Model converged successfully with decreasing loss and improving metrics!

📚 Training Data

The model was trained on the tweet_eval sentiment dataset:

Split       Samples
Training    45,615
Validation  2,000
Test        12,284

Dataset characteristics:

  • Short text (typical tweets: 10-50 words)
  • Informal language with emojis, hashtags, and mentions
  • Balanced across negative, neutral, and positive sentiments
  • Real-world social media data

💡 Intended Uses

✅ Recommended Uses

  • Social Media Monitoring: Analyze sentiment of tweets, posts, and comments
  • Customer Feedback Analysis: Classify product reviews and feedback
  • Brand Reputation Tracking: Monitor public opinion about brands
  • Market Research: Understand customer sentiment trends
  • Content Moderation: Flag potentially negative content
  • Academic Research: Study sentiment patterns in social media

⚠️ Limitations

  • Domain-specific: Trained on Twitter data; may not generalize well to:
    • Formal documents (legal, academic)
    • Long-form content (articles, essays)
    • Domain-specific language (medical, technical)
  • English only: Not suitable for other languages
  • Context limitations:
    • May struggle with sarcasm and irony
    • Limited understanding of cultural context
    • Can misinterpret complex or nuanced sentiments
  • Bias: May reflect biases present in Twitter data
  • Temporal: Trained on data up to 2024; may not capture emerging slang

❌ Out of Scope

  • Multi-lingual sentiment analysis
  • Emotion detection beyond positive/neutral/negative
  • Aspect-based sentiment analysis
  • Spam detection or content classification
  • Real-time critical decision making

πŸ”§ Technical Details

Model Architecture

  • Base: DistilBERT (distilled version of BERT)
  • Layers: 6 transformer layers
  • Hidden Size: 768
  • Attention Heads: 12
  • Parameters: 66M total, ~1.5M trained (LoRA)
  • Classification Head: Linear layer (768 → 3)

Preprocessing

  • Tokenization: WordPiece tokenization
  • Max Length: 128 tokens
  • Padding: Dynamic padding to max length in batch
  • Truncation: Enabled for sequences > 128 tokens
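Dynamic padding means each batch is padded only to its own longest sequence rather than a fixed 128 tokens; a toy illustration with torch (the token ids are made up):

```python
import torch
from torch.nn.utils.rnn import pad_sequence

# Hypothetical token-id sequences of different lengths, as a tokenizer
# would produce them before batching.
seqs = [torch.tensor([101, 2023, 102]),
        torch.tensor([101, 2023, 2003, 2307, 102])]

# Pad to the longest sequence in *this* batch (5 tokens), not the 128 maximum.
batch = pad_sequence(seqs, batch_first=True, padding_value=0)
print(batch.shape)  # torch.Size([2, 5])
```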

Inference Speed

On GPU (T4):

  • Single prediction: ~10-15ms
  • Batch of 32: ~50-80ms

On CPU:

  • Single prediction: ~50-100ms
  • Batch of 32: ~500-800ms
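Latency numbers like these depend heavily on hardware, batch size, and library versions; one minimal way to measure on your own machine (shown with a stand-in linear layer so the sketch runs anywhere; swap in the loaded model and real tokenized inputs):

```python
import time
import torch

model = torch.nn.Linear(768, 3)   # stand-in; substitute the loaded DistilBERT
x = torch.randn(32, 768)          # stand-in batch; substitute tokenized inputs

with torch.no_grad():
    model(x)                      # warm up once before timing
    start = time.perf_counter()
    for _ in range(10):
        model(x)
    elapsed_ms = (time.perf_counter() - start) / 10 * 1000

print(f"avg forward pass: {elapsed_ms:.2f} ms")
```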

🎓 Citation

If you use this model in your research or application, please cite:

@misc{seif2025distilbert-sentiment,
  author = {Seif Elislam},
  title = {DistilBERT Fine-tuned for Twitter Sentiment Analysis},
  year = {2025},
  publisher = {Hugging Face},
  journal = {Hugging Face Model Hub},
  howpublished = {\url{https://huggingface.co/SeifElislamm/distilbert-sentiment-twitter}}
}

📜 License

This model is released under the Apache 2.0 License. The base DistilBERT model is also Apache 2.0 licensed.

📞 Contact

For questions or issues, please open an issue on the model's discussion page.


Model Card Authors: Seif Elislam
Last Updated: November 2025
Model Version: 1.0
