🎭 Sentiment Analysis — IMDB Reviews

A binary sentiment classifier fine-tuned on IMDB movie reviews, predicting POSITIVE or NEGATIVE sentiment with confidence scores.

📊 Model Performance

Metric	Score
Accuracy	0.894
F1 Score	0.893
ROC-AUC	0.960
Precision	0.884
Recall	0.902
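
As a quick consistency check, F1 is the harmonic mean of precision and recall, so the reported F1 can be recomputed from the other two rows. A minimal sketch using only the values in the table above (the variable names are illustrative):

```python
# Values copied from the metrics table above.
precision = 0.884
recall = 0.902

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(f"F1 = {f1:.3f}")  # prints "F1 = 0.893"
```

The recomputed value agrees with the reported F1 Score to three decimal places.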

Confusion matrix: see artifacts/confusion_matrix.png

🤖 Model Details

Property	Value
Base model	`distilbert-base-uncased`
Task	Binary text classification
Labels	`NEGATIVE` (0), `POSITIVE` (1)
Max token length	256
Training samples	5,000 (IMDB subset)
Epochs	2
Batch size	16
Learning rate	2e-5
Framework	HuggingFace Transformers + Trainer API
Experiment tracking	MLflow
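
To put these hyperparameters in perspective, the number of optimizer steps can be estimated from the table. This is a rough sketch that assumes all 5,000 samples are used for training, a single device, no gradient accumulation, and that the last partial batch is kept. None of these are stated in the card.

```python
import math

# Hyperparameters copied from the table above.
train_samples = 5_000
batch_size = 16
epochs = 2

# Assumptions (not stated in the card): single device, no gradient
# accumulation, and the final partial batch is kept rather than dropped.
steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs

print(steps_per_epoch, total_steps)  # prints "313 626"
```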

🚀 How to Use

from transformers import pipeline, AutoTokenizer, AutoModelForSequenceClassification

MODEL_PATH = "amarshiv86/sentiment-analysis-imdb-model"

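# The weights and tokenizer live in the repo's model/ subfolder, so pass it
# via `subfolder` rather than appending it to the Hub repo id.
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, subfolder="model")
model     = AutoModelForSequenceClassification.from_pretrained(MODEL_PATH, subfolder="model")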

clf = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer, truncation=True)

results = clf([
    "This movie was absolutely fantastic, loved every minute!",
    "Terrible film, complete waste of time.",
])

for r in results:
    print(f"{r['label']} — {r['score']:.1%} confidence")
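
Each pipeline result is a dict with a `label` and a `score` for that label. If a single probability of POSITIVE is more convenient downstream, it can be derived as sketched below. This assumes the pipeline applies a softmax over the two labels, so the two probabilities sum to 1. The helper name `positive_probability` is hypothetical, and the example scores are placeholders, not real model outputs.

```python
def positive_probability(result: dict) -> float:
    """Convert one pipeline result into P(POSITIVE)."""
    score = result["score"]
    return score if result["label"] == "POSITIVE" else 1.0 - score

# Placeholder results in the pipeline's output shape; the scores are made up
# for illustration and are not real model outputs.
example_results = [
    {"label": "POSITIVE", "score": 0.75},
    {"label": "NEGATIVE", "score": 0.75},
]

probs = [positive_probability(r) for r in example_results]
print(probs)  # prints "[0.75, 0.25]"
```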

🔁 MLOps Pipeline

The model is retrained automatically via GitHub Actions whenever files under src/ or params.yaml change:

GitHub Push
    ↓
GitHub Actions
    ↓
prepare.py → train.py → evaluate.py
    ↓                        ↓
model files            metrics.json
                       confusion_matrix.png
    ↓
HuggingFace Hub (this repo)
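
The trigger rule above can be illustrated in a few lines of Python. This is only a sketch of the logic. The real workflow presumably expresses it as a path filter in the GitHub Actions configuration, which is not shown here, and the function name `should_retrain` is hypothetical.

```python
def should_retrain(changed_files: list[str]) -> bool:
    """Return True if any changed path is under src/ or is params.yaml."""
    return any(
        path == "params.yaml" or path.startswith("src/")
        for path in changed_files
    )

# Stage order taken from the diagram above.
STAGES = ["prepare.py", "train.py", "evaluate.py"]

print(should_retrain(["src/train.py", "README.md"]))  # prints "True"
print(should_retrain(["README.md"]))                  # prints "False"
```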

📁 Repository Structure

amarshiv86/sentiment-analysis-imdb-model
├── model/
│   ├── model.safetensors      # fine-tuned weights (268 MB)
│   ├── config.json            # model architecture config
│   ├── tokenizer.json         # tokenizer vocab
│   └── tokenizer_config.json  # tokenizer settings
├── artifacts/
│   └── confusion_matrix.png   # evaluation plot
└── metrics.json               # latest eval metrics
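
Once metrics.json has been downloaded (for example with `hf_hub_download` from `huggingface_hub`), its numeric fields can be read as sketched below. The key names inside the file are not documented here, so the sample content is hypothetical. The sketch also assumes the file holds a single top-level JSON object.

```python
import json

def numeric_metrics(json_text: str) -> dict[str, float]:
    """Parse metrics JSON text and keep only top-level numeric fields."""
    data = json.loads(json_text)
    return {
        key: float(value)
        for key, value in data.items()
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(value, (int, float)) and not isinstance(value, bool)
    }

# Hypothetical content: the real key names in metrics.json are not documented here.
sample = '{"accuracy": 0.894, "f1": 0.893, "note": "example"}'
print(numeric_metrics(sample))  # prints "{'accuracy': 0.894, 'f1': 0.893}"
```

For a local file, pass its contents, for example `numeric_metrics(open("metrics.json").read())`.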

📄 Dataset

Trained on a 5,000-sample subset of the IMDB dataset. Full processed dataset: amarshiv86/sentiment-analysis-imdb-dataset

📄 License

MIT — free to use, modify, and distribute.
