Tweet Sentiment Classifier (DistilBERT)

This model classifies tweets into two sentiment classes: positive and negative.
It is fine-tuned from distilbert-base-uncased on a dataset of labeled tweets.

Training Methodology

The model was fine-tuned using the Hugging Face 🤗 transformers library on the Sentiment140 dataset.

  • Base model: distilbert-base-uncased
  • Task: Sentiment classification (positive, negative)
  • Data Preprocessing:
    • Stripped links, mentions, and hashtags from tweets
    • Removed duplicates and empty samples
  • Split: 60% training / 20% validation / 20% test
  • Optimizer: Adam
  • Learning rate: 2e-6
  • Batch size: 32
  • Epochs: 7
  • Loss function: CrossEntropyLoss
  • Evaluation metrics: Accuracy, F1, Precision, Recall, ROC-AUC
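The preprocessing and split steps listed above could be sketched roughly as follows. This is a minimal illustration, not the author's actual pipeline: the helper names `clean_tweet` and `prepare`, the shuffle seed, and the choice to deduplicate on the cleaned text are all assumptions. The cleaning pattern mirrors the one used in the usage example on this card.

```python
import random
import re


def clean_tweet(text: str) -> str:
    """Strip links, @mentions and '#' symbols, then collapse whitespace."""
    text = re.sub(r"http\S+|www\S+|@\w+|#", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def prepare(samples, seed=42):
    """Clean (text, label) pairs, drop empties and duplicates, split 60/20/20.

    Hypothetical helper: the seed and the dedup-after-cleaning order are
    assumptions, not details taken from the model card.
    """
    seen = set()
    cleaned = []
    for text, label in samples:
        text = clean_tweet(text)
        if text and text not in seen:  # skip empty and duplicate tweets
            seen.add(text)
            cleaned.append((text, label))

    random.Random(seed).shuffle(cleaned)
    n = len(cleaned)
    n_train = (6 * n) // 10  # 60% training
    n_val = (2 * n) // 10    # 20% validation, remainder is the test set
    train = cleaned[:n_train]
    val = cleaned[n_train:n_train + n_val]
    test = cleaned[n_train + n_val:]
    return train, val, test
```

Integer arithmetic is used for the split sizes so that rounding never shifts a sample between splits; any leftover samples fall into the test set.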

Training ran on a local machine with a GPU.

Evaluation

  • Accuracy: 0.8070
  • Precision: 0.7880
  • Recall: 0.8400
  • F1 Score: 0.8132
  • ROC AUC: 0.8910
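As a quick consistency check, F1 is the harmonic mean of precision and recall, so the reported F1 can be recomputed from the other two figures. The sketch below assumes all three metrics refer to the same class; the card does not state which class was treated as the positive one.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the Evaluation list above.
reported = {"precision": 0.7880, "recall": 0.8400, "f1": 0.8132}

f1 = f1_score(reported["precision"], reported["recall"])
print(f"{f1:.4f}")  # prints 0.8132, matching the reported F1
```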

Example Usage

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import re

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = AutoModelForSequenceClassification.from_pretrained("KotYrod/tweet-sentiment-distilbert").to(device)
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

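def predict_sentiment(text):
    model.eval()  # inference mode: disables dropout
    # Replace URLs, @mentions, '#' symbols, and whitespace runs with a space
    text = re.sub(r"http\S+|www\S+|\@\w+|\#|\s+", ' ', text).strip()
    inputs = tokenizer(text, return_tensors='pt', truncation=True, padding='max_length', max_length=128).to(device)
    with torch.no_grad():  # no gradients needed for prediction
        logits = model(**inputs).logits
        probs = torch.softmax(logits, dim=1)
        pred = torch.argmax(probs, dim=1).item()
    # Index 1 maps to positive, anything else to negative; also return the predicted class probability
    return ("Positive 😊" if pred == 1 else "Negative 😠", probs[0][pred].item())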

if __name__ == "__main__":
    label, conf = predict_sentiment("wow im so glad")
    print(f"Prediction: {label} (Confidence: {conf:.2f})")
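To make the confidence value in the example above more concrete, here is a dependency-free sketch of the same softmax and argmax step. The logits are made up for illustration, and `softmax` and `decode` are hypothetical helpers rather than part of the model or the transformers API. The label mapping (index 1 is positive) follows the usage example.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits):
    """Return (label, confidence) for a pair of [negative, positive] logits."""
    probs = softmax(logits)
    pred = max(range(len(probs)), key=probs.__getitem__)  # argmax
    label = "Positive" if pred == 1 else "Negative"
    return label, probs[pred]


# Hypothetical logits, not real model output.
label, conf = decode([-1.2, 2.3])
print(f"Prediction: {label} (Confidence: {conf:.2f})")
```

With two classes, the confidence is always at least 0.5, since it is the larger of two probabilities that sum to 1.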