🎭 Sentiment Analysis: IMDB Reviews

A binary sentiment classifier fine-tuned on IMDB movie reviews, predicting POSITIVE or NEGATIVE sentiment with confidence scores.


📊 Model Performance

Metric      Score
Accuracy    0.894
F1 Score    0.893
ROC-AUC     0.960
Precision   0.884
Recall      0.902
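
As a quick consistency check on the reported metrics, F1 is the harmonic mean of precision and recall. The sketch below recomputes it from the precision and recall listed here; the helper name `f1_from_precision_recall` is only illustrative and is not part of this repository.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # avoid dividing by zero when both are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the metrics table.
precision, recall = 0.884, 0.902
f1 = f1_from_precision_recall(precision, recall)
print(f"F1 = {f1:.3f}")  # F1 = 0.893
```

The recomputed value agrees with the reported F1 score to three decimal places.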

Confusion Matrix

[Figure: confusion matrix (artifacts/confusion_matrix.png)]


🤖 Model Details

Property              Value
Base model            distilbert-base-uncased
Task                  Binary text classification
Labels                NEGATIVE (0), POSITIVE (1)
Max token length      256
Training samples      5,000 (IMDB subset)
Epochs                2
Batch size            16
Learning rate         2e-5
Framework             HuggingFace Transformers + Trainer API
Experiment tracking   MLflow
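
To give a feel for the size of this training run, the sketch below derives the number of optimizer steps implied by the settings listed here. It assumes single-device training, no gradient accumulation, and that the final partial batch is kept; none of these are stated on this card.

```python
import math

# Hyperparameters as listed in the model details.
train_samples = 5_000
batch_size = 16
epochs = 2

# Assumption: one device, no gradient accumulation, last partial batch kept.
steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs

print(f"{steps_per_epoch} steps per epoch, {total_steps} steps in total")
# 313 steps per epoch, 626 steps in total
```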

🚀 How to Use

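from transformers import pipeline, AutoTokenizer, AutoModelForSequenceClassification

MODEL_PATH = "amarshiv86/sentiment-analysis-imdb-model"

# The weights and tokenizer files live in the repo's model/ subfolder,
# so pass it via `subfolder` instead of appending it to the repo id.
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, subfolder="model")
model = AutoModelForSequenceClassification.from_pretrained(MODEL_PATH, subfolder="model")

# truncation=True clips reviews that exceed the model's maximum input length.
clf = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer, truncation=True)

results = clf([
    "This movie was absolutely fantastic, loved every minute!",
    "Terrible film, complete waste of time.",
])

# Each result is a dict with the predicted label and its confidence score.
for r in results:
    print(f"{r['label']}: {r['score']:.1%} confidence")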
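
The pipeline reports only the winning label and its score. If a single "probability of POSITIVE" value is more convenient (for example, to apply a custom threshold), it can be derived as sketched below. The helper `positive_probability` is hypothetical, and the sketch assumes the two class scores sum to 1 and that the label names match the ones listed on this card.

```python
def positive_probability(result: dict) -> float:
    """Convert one pipeline result into the probability of the POSITIVE class.

    Assumes a two-class model whose scores sum to 1 and whose labels are
    named POSITIVE and NEGATIVE.
    """
    label, score = result["label"], result["score"]
    if label == "POSITIVE":
        return score
    if label == "NEGATIVE":
        return 1.0 - score
    raise ValueError(f"unexpected label: {label!r}")


# Illustrative input only; these are not real model outputs.
example = {"label": "NEGATIVE", "score": 0.75}
print(positive_probability(example))  # 0.25
```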

πŸ” MLOps Pipeline

The model is retrained automatically via GitHub Actions whenever files under src/ or params.yaml change:

GitHub Push
    ↓
GitHub Actions
    ↓
prepare.py → train.py → evaluate.py
    ↓                        ↓
model files            metrics.json
                       confusion_matrix.png
    ↓
HuggingFace Hub (this repo)
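
The workflow file itself is not shown on this card. In GitHub Actions, a trigger like this is usually written as a path filter on the push event; the Python sketch below only illustrates the same decision logic and the stage order from the diagram. The names `should_retrain` and `STAGES` are hypothetical.

```python
from pathlib import PurePosixPath

WATCHED_DIR = "src"            # any change under src/ triggers retraining
WATCHED_FILE = "params.yaml"   # as does a change to params.yaml

# Stage order as drawn in the pipeline diagram.
STAGES = ["prepare.py", "train.py", "evaluate.py"]


def should_retrain(changed_paths) -> bool:
    """Return True if any changed path is under src/ or is params.yaml."""
    for raw in changed_paths:
        path = PurePosixPath(raw)
        if path == PurePosixPath(WATCHED_FILE):
            return True
        if path.parts and path.parts[0] == WATCHED_DIR:
            return True
    return False


print(should_retrain(["README.md", "src/train.py"]))  # True
print(should_retrain(["README.md"]))                  # False
```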

πŸ“ Repository Structure

amarshiv86/sentiment-analysis-imdb-model
├── model/
│   ├── model.safetensors      # fine-tuned weights (268 MB)
│   ├── config.json            # model architecture config
│   ├── tokenizer.json         # tokenizer vocab
│   └── tokenizer_config.json  # tokenizer settings
├── artifacts/
│   └── confusion_matrix.png   # evaluation plot
└── metrics.json               # latest eval metrics
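
Since metrics.json sits at the repository root, the latest evaluation metrics can be fetched without downloading the model weights. This is a minimal sketch using `hf_hub_download` from the `huggingface_hub` package; the helper names are hypothetical, and the layout of metrics.json is not documented here, so the code only assumes it is valid JSON.

```python
import json

REPO_ID = "amarshiv86/sentiment-analysis-imdb-model"


def load_metrics(path):
    """Parse a metrics file from a local path (contents assumed to be JSON)."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def download_metrics(repo_id: str = REPO_ID):
    """Download metrics.json from the Hub and parse it.

    huggingface_hub is imported lazily so that load_metrics can be used
    on a local file without the package installed.
    """
    from huggingface_hub import hf_hub_download

    local_path = hf_hub_download(repo_id=repo_id, filename="metrics.json")
    return load_metrics(local_path)
```

Calling `download_metrics()` needs network access and returns whatever structure the file contains; `load_metrics("metrics.json")` does the same for a local copy.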

📄 Dataset

Trained on a 5,000-sample subset of the IMDB dataset. Full processed dataset: amarshiv86/sentiment-analysis-imdb-dataset


📄 License

MIT: free to use, modify, and distribute.
