
DistilBERT-Base-Uncased Quantized Model for Sentiment Analysis on IMDB Reviews

This repository hosts a quantized version of the DistilBERT model, fine-tuned for sentiment classification using the IMDB movie reviews dataset. The model has been optimized using FP16 quantization for efficient deployment without significant accuracy loss.

Model Details

  • Model Architecture: DistilBERT Base Uncased
  • Task: Binary Sentiment Classification (Positive/Negative)
  • Dataset: IMDB (Hugging Face Datasets)
  • Model Size: ~67M parameters
  • Quantization: Float16 (FP16)
  • Fine-tuning Framework: Hugging Face Transformers

Installation

pip install transformers torch datasets scikit-learn

Loading the Model

from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Load tokenizer and model
# NOTE: this loads the base checkpoint; point model_path at this repository's model ID to use the fine-tuned FP16 weights
model_path = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForSequenceClassification.from_pretrained(model_path)
# Define test sentences
texts = [
    "The movie was fantastic! I loved every part of it.",
    "It was a total waste of time. Boring and slow.",
    "The acting was great, but the story was predictable."
]

# Tokenize and predict
for text in texts:
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=256)
    with torch.no_grad():
        outputs = model(**inputs)
    predicted_class = torch.argmax(outputs.logits, dim=1).item()
    label_map = {0: "Negative", 1: "Positive"}
    print(f"Text: {text}")
    print(f"Predicted Sentiment: {label_map[predicted_class]}\n")

Performance Metrics

  • Accuracy: 0.9206
  • Precision: 0.9193
  • Recall: 0.9229
  • F1 Score: 0.9211
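As a quick sanity check, the reported F1 score is consistent with the precision and recall above, since F1 is their harmonic mean:

```python
# Verify the reported F1 score from precision and recall.
precision = 0.9193
recall = 0.9229

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # 0.9211, matching the reported F1 score
```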

Fine-Tuning Details

Dataset

The dataset is sourced from Hugging Face’s imdb dataset. It contains 50,000 labeled movie reviews (positive or negative).
The original training and testing sets were merged, shuffled, and re-split using an 80/20 ratio.
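The merge/shuffle/re-split step can be sketched in plain Python (the actual preprocessing presumably used the Hugging Face datasets library; the seed and toy data here are illustrative):

```python
import random

# Toy stand-ins for the original IMDB train and test splits (25,000 reviews each).
train = [("a positive review", 1)] * 25000
test = [("a negative review", 0)] * 25000

# Merge the original splits, shuffle, then re-split 80/20.
merged = train + test
random.seed(42)  # illustrative seed, not stated in the card
random.shuffle(merged)

cut = int(0.8 * len(merged))
new_train, new_test = merged[:cut], merged[cut:]
print(len(new_train), len(new_test))  # 40000 10000
```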

Training

  • Epochs: 3
  • Batch size: 8
  • Learning rate: 2e-5
  • Evaluation strategy: epoch

Quantization

Post-training quantization was applied by converting the model weights to FP16 with PyTorch's half() method, reducing model size and inference time with minimal accuracy loss.
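FP16 halves the storage per weight (2 bytes instead of 4) at the cost of precision. Python's struct module, which supports the IEEE 754 half-precision format code "e", can illustrate the rounding without requiring PyTorch:

```python
import struct

# Round-trip a value through IEEE 754 half precision (struct format "e").
x = 0.1
x_fp16 = struct.unpack("e", struct.pack("e", x))[0]

print(struct.calcsize("f"), struct.calcsize("e"))  # 4 2 (bytes per float32 vs. float16)
print(x_fp16)  # slightly off from 0.1 due to the 10-bit FP16 mantissa
```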


Repository Structure

.
├── quantized-model/               # Contains the quantized model files
│   ├── config.json
│   ├── model.safetensors
│   ├── tokenizer_config.json
│   ├── vocab.txt
│   └── special_tokens_map.json
└── README.md                      # Model documentation

Limitations

  • The model is trained specifically for binary sentiment classification on movie reviews.
  • FP16 quantization may result in slight numerical instability in edge cases.
  • Performance may degrade when used outside the IMDB domain.

Contributing

Feel free to open issues or submit pull requests to improve the model or documentation.
