---
language: en
license: apache-2.0
library_name: transformers
pipeline_tag: text-classification
tags:
- text-classification
- sentiment-analysis
- distilbert
- imdb
- mlops
datasets:
- stanfordnlp/imdb
base_model: distilbert-base-uncased
metrics:
- accuracy
- f1
- precision
- recall
model-index:
- name: mlops-group-sentiment
  results:
  - task:
      type: text-classification
      name: Sentiment Classification
    dataset:
      type: stanfordnlp/imdb
      name: IMDB
    metrics:
    - type: accuracy
      value: 0.90
      name: Test Accuracy
    - type: f1
      value: 0.90
      name: Test F1 (weighted)
---
# mlops-group-sentiment
A `distilbert-base-uncased` model fine-tuned on the IMDB movie reviews dataset
for binary sentiment classification (positive / negative).
This model is the final artifact of an MLOps group project at IIT Jodhpur
(Course CSL7040), demonstrating an end-to-end production ML pipeline: version
control on GitHub, GPU training on Kaggle, experiment tracking on Weights &
Biases, container packaging via Docker, and deployment to the Hugging Face Hub.
## How to Use
```python
from transformers import pipeline

# Load the fine-tuned model from the Hugging Face Hub
classifier = pipeline("sentiment-analysis", model="pujaniitj/mlops-group-sentiment")

result = classifier("This movie was fantastic!")
print(result)
# Example output (the exact score will vary):
# [{'label': 'positive', 'score': 0.9876}]
```
## Intended Use
**Primary use case**: Classifying English-language movie reviews as positive
or negative sentiment.
**Out-of-scope uses**:
- Non-English text (the model was trained only on English IMDB reviews)
- Text from other domains, e.g. tweets, product reviews, news articles, or customer
support transcripts. Performance will degrade outside the movie-review domain.
- Fine-grained sentiment (beyond binary pos/neg, e.g. 5-star ratings)
- High-stakes decisions or content moderation without human review
## Model Description
- **Base architecture**: DistilBERT (`distilbert-base-uncased`)
- **Changes from base**: Adds a fine-tuned classification head (2 output labels)
- **Parameters**: ~66 million
- **Tokenizer**: WordPiece (DistilBERT default)
- **Max sequence length**: 256 tokens
- **Labels**: `0 → negative`, `1 → positive`
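
To make the label mapping concrete, here is a minimal sketch of how two raw output logits could be turned into a label and a score. It assumes the classification head emits one logit per class in the order listed above. The `ID2LABEL` dict and the `predict_from_logits` helper are illustrative names, not part of the released model, and the logits are made up.

```python
import math

# Label mapping as listed in the model description above.
ID2LABEL = {0: "negative", 1: "positive"}


def predict_from_logits(logits):
    """Convert a pair of raw logits into a (label, probability) tuple."""
    # Subtract the max logit before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # The predicted class is the index with the highest probability.
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


# Hypothetical logits, chosen only for illustration.
label, score = predict_from_logits([-2.0, 3.0])
print(label, round(score, 4))
```

The `pipeline` helper performs the equivalent softmax and lookup internally, reading the label names from the model's configuration.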
## Training Data
- **Dataset**: [IMDB Movie Reviews](https://huggingface.co/datasets/stanfordnlp/imdb)
- **Train size**: 25,000 reviews (12,500 positive + 12,500 negative — perfectly balanced)
- **Test size**: 25,000 reviews (same balance)
- **Train/Validation split**: 90/10 of the train set, with `seed=42`
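
The 90/10 split works out to 22,500 training and 2,500 validation reviews. The sketch below reproduces that arithmetic with a seeded shuffle over index positions. It is an illustration only: the card does not state which splitting utility the project used, and `split_indices` is a hypothetical helper.

```python
import random


def split_indices(n_examples, val_fraction=0.1, seed=42):
    """Shuffle example indices with a fixed seed and split off a validation set."""
    indices = list(range(n_examples))
    # A dedicated Random instance keeps the shuffle reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(indices)
    n_val = round(n_examples * val_fraction)
    return indices[n_val:], indices[:n_val]


train_idx, val_idx = split_indices(25_000)
print(len(train_idx), len(val_idx))  # 22500 2500
```

Fixing the seed means the same reviews land in the validation set on every run, which keeps metrics comparable across experiments.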
## Training Procedure
### Hyperparameters
| Setting | Value |
|----------------------|--------|
| Learning rate | 3e-5 |
| Train batch size | 16 |
| Eval batch size | 32 |
| Epochs | 3 |
| Max sequence length | 256 |
| Warmup ratio | 0.1 |
| Weight decay | 0.01 |
| Optimizer | AdamW |
| Mixed precision | fp16 |
| Seed | 42 |
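
As a rough illustration of what the warmup ratio means in practice, the sketch below computes the step counts implied by the table, then a linear warmup followed by linear decay. Several things here are assumptions rather than facts from this card: that 16 is the global batch size (with two GPUs the effective batch may be larger), that the training split holds 22,500 reviews after the 90/10 split, and that the schedule is linear. Linear is the default in the Hugging Face `Trainer`, but the card does not say which scheduler was used.

```python
import math

# Values from the hyperparameter table.
LEARNING_RATE = 3e-5
BATCH_SIZE = 16      # assumed to be the global batch size
EPOCHS = 3
WARMUP_RATIO = 0.1

# Assumption: 90% of the 25,000 IMDB training reviews are used for training.
N_TRAIN = 22_500

steps_per_epoch = math.ceil(N_TRAIN / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def lr_at(step):
    """Learning rate at an optimizer step under linear warmup and linear decay."""
    if step < warmup_steps:
        return LEARNING_RATE * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return LEARNING_RATE * remaining / max(1, total_steps - warmup_steps)


print(steps_per_epoch, total_steps, warmup_steps)  # 1407 4221 423
print(lr_at(0), lr_at(warmup_steps), lr_at(total_steps))
```

Under these assumptions the learning rate climbs from zero to its 3e-5 peak over the first 423 steps and then decays back to zero by step 4221.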
### Training Environment
- **Platform**: Kaggle Notebook
- **Hardware**: 2× NVIDIA Tesla T4 GPUs
- **Training time**: ~17 minutes
### Experiment Tracking
Two configurations were trained and compared via Weights & Biases:
| Run | Learning rate | Test F1 | Test Accuracy | Test Loss |
|------|---------------|---------|---------------|-----------|
| v1 (this model) | 3e-5 | ~0.90 | ~0.90 | ~0.70 |
| v2 (discarded) | 5e-5 | ~0.91 | ~0.91 | ~0.85 |
> The values above are rounded approximations; the exact decimals are recorded
> in the W&B run summary.
**Why v1 was selected**: While v2 achieved a marginally higher F1 (~0.5%),
it showed clear signs of overfitting — its eval loss climbed sharply across
epochs while v1's remained more stable. v1 also delivers ~25% faster inference,
making it the better choice for a production deployment.
## Evaluation Results
Evaluation on the held-out IMDB test set (25,000 reviews):
| Metric | Value |
|---------------------|-------|
| Accuracy | ~0.90 |
| F1 (weighted) | ~0.90 |
| Precision (weighted)| ~0.90 |
| Recall (weighted) | ~0.90 |
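
For readers unfamiliar with the "weighted" variants, the following sketch shows one way such metrics can be computed from a confusion matrix: each class's precision, recall, and F1 is averaged with a weight proportional to that class's support. The counts are hypothetical and chosen only so that accuracy comes out at 0.90. They are not the project's actual predictions, and `weighted_metrics` is an illustrative helper rather than the evaluation code used here.

```python
def weighted_metrics(confusion):
    """Compute accuracy and support-weighted precision/recall/F1.

    confusion[true_label][predicted_label] holds the example count.
    """
    labels = list(confusion)
    total = sum(sum(row.values()) for row in confusion.values())
    correct = sum(confusion[c][c] for c in labels)
    out = {"accuracy": correct / total, "precision": 0.0, "recall": 0.0, "f1": 0.0}
    for c in labels:
        support = sum(confusion[c].values())
        predicted = sum(confusion[t][c] for t in labels)
        precision = confusion[c][c] / predicted if predicted else 0.0
        recall = confusion[c][c] / support if support else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        weight = support / total
        out["precision"] += weight * precision
        out["recall"] += weight * recall
        out["f1"] += weight * f1
    return out


# Hypothetical counts for a balanced 25,000-review test set.
# These are NOT the project's actual predictions.
example = {
    "negative": {"negative": 11_400, "positive": 1_100},
    "positive": {"negative": 1_400, "positive": 11_100},
}
metrics = weighted_metrics(example)
print({name: round(value, 4) for name, value in metrics.items()})
```

In single-label classification, support-weighted recall always equals accuracy, and on a perfectly balanced test set such as IMDB the weighted and macro averages coincide.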
## Limitations and Biases
- **Domain**: Only trained on movie reviews. Expect degraded performance on
other domains.
- **Length**: Inputs are truncated to 256 tokens (~200 words). Longer reviews
may lose tail information that matters for sentiment.
- **Language**: English only.
- **Demographic biases**: IMDB reviewers historically skew toward certain
demographics (e.g., predominantly male, English-speaking). The model may
inherit these biases — e.g., it may misclassify reviews using vernacular or
cultural references underrepresented in IMDB.
- **Sarcasm and irony**: Like most BERT-based classifiers, the model can
struggle with sarcastic or ironic text where the surface sentiment opposes
the intended meaning.
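
The length limitation can be illustrated with a small sketch of head-only truncation. It assumes truncation drops tokens from the end of the input and that the two special tokens (`[CLS]` and `[SEP]`) count toward the 256-token limit, which matches the usual DistilBERT tokenizer behaviour but is not stated in this card. Whitespace-separated words stand in for real WordPiece tokens, and `truncate_for_model` is a hypothetical helper.

```python
MAX_LENGTH = 256  # max sequence length from the model description


def truncate_for_model(tokens, max_length=MAX_LENGTH):
    """Keep the head of a token list, reserving two slots for [CLS] and [SEP]."""
    kept = tokens[: max_length - 2]
    dropped = tokens[max_length - 2:]
    return ["[CLS]"] + kept + ["[SEP]"], dropped


# Stand-in for a long review: whitespace tokens instead of real WordPiece tokens.
review = ["great"] * 300 + ["but", "the", "ending", "ruined", "it"]
model_input, dropped = truncate_for_model(review)
print(len(model_input), len(dropped))  # 256 51
```

In this toy review the negative conclusion falls entirely in the dropped tail, so the model would only ever see the positive opening.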
## Project Resources
- **GitHub repository**: https://github.com/pujaniitj/mlops-group-project-iitj
- **W&B experiment dashboard**: https://wandb.ai/pujaniitj-iit-jodpur/MLops_group_8
- **Training notebook (v1)**: https://www.kaggle.com/code/pujaniitj/mlops-group-8-imdb-v1
- **Training notebook (v2)**: https://www.kaggle.com/code/pujaniitj/mlops-group-8-imdb-v2
## Acknowledgments
- **Base model**: [DistilBERT](https://huggingface.co/distilbert-base-uncased)
by Sanh et al. (Hugging Face)
- **Dataset**: [IMDB](https://huggingface.co/datasets/stanfordnlp/imdb)
by Maas et al. (Stanford NLP)
- **Training infrastructure**: [Kaggle Notebooks](https://www.kaggle.com)
- **Experiment tracking**: [Weights & Biases](https://wandb.ai)