Urdu Sentiment Classifier 🇵🇰

A fine-tuned bert-base-multilingual-cased model for Urdu sentiment analysis that classifies Urdu text as positive or negative.

Live Demo

Try it on HuggingFace Spaces

Performance

Metric	Score
Accuracy	81.00%
F1 Score (weighted)	0.8098
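
For readers unfamiliar with the weighted F1 metric reported above: it is the mean of the per-class F1 scores, with each class weighted by its share of the true labels. The sketch below computes it in plain Python on a handful of toy labels. The label lists and the helper names `weighted_f1` and `accuracy` are made up for illustration and are not drawn from the model's evaluation set or code.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 scores, weighted by each class's support."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1
    return score


# Toy labels for illustration only (not from the evaluation set).
y_true = ["positive", "positive", "positive", "negative", "negative"]
y_pred = ["positive", "positive", "negative", "negative", "positive"]

print(accuracy(y_true, y_pred))
print(weighted_f1(y_true, y_pred))
```

On a roughly balanced two-class dataset, accuracy and weighted F1 tend to land close together, which is consistent with the two scores in the table.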

Example Predictions

from transformers import pipeline

# Load the fine-tuned model from the Hugging Face Hub
classifier = pipeline("text-classification", model="H-Layba/urdu-sentiment-classifier")

# "This film was very good"
classifier("یہ فلم بہت اچھی تھی")
# [{'label': 'positive', 'score': 0.9936}]

# "Today was a very bad day"
classifier("آج کا دن بہت برا تھا")
# [{'label': 'negative', 'score': 0.9918}]
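
The pipeline returns a list of dicts with `label` and `score` keys, as in the outputs above. One possible way to consume that output is to flag low-confidence predictions instead of trusting every label. This is only a sketch: the helper name `label_with_threshold`, the 0.9 threshold, and the second sample score are hypothetical and are not part of the model or its documentation.

```python
def label_with_threshold(predictions, threshold=0.9):
    """Return each predicted label, or "uncertain" when its score is below threshold."""
    return [
        p["label"] if p["score"] >= threshold else "uncertain"
        for p in predictions
    ]


# Dicts in the same shape the pipeline returns; the 0.62 score is made up.
sample = [
    {"label": "positive", "score": 0.9936},
    {"label": "negative", "score": 0.62},
]

print(label_with_threshold(sample))
```

A suitable threshold would need to be chosen on held-out data; 0.9 here is only a placeholder.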

Training Details

Base model: bert-base-multilingual-cased
Dataset: 50,000 Urdu movie reviews
Epochs: 5
Learning rate: 2e-5
Batch size: 32 (train), 64 (eval)
Hardware: Kaggle T4 GPU
Mixed precision: fp16
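
The hyperparameters above map directly onto fields of `transformers.TrainingArguments`. The sketch below collects them in a dict and estimates the number of optimizer steps for a given training-set size. It is an illustration, not the author's training script: the `output_dir` value and the 40,000-example split are assumptions (the card does not say how the 50,000 reviews were split), and the step count assumes a single GPU with no gradient accumulation.

```python
import math

# Hyperparameters from the list above, keyed by TrainingArguments field names.
training_kwargs = {
    "num_train_epochs": 5,
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 32,
    "per_device_eval_batch_size": 64,
    "fp16": True,  # mixed precision; requires a CUDA GPU such as the T4
}


def total_optimizer_steps(n_train_examples, kwargs=training_kwargs):
    """Optimizer steps for a single-GPU run without gradient accumulation."""
    steps_per_epoch = math.ceil(
        n_train_examples / kwargs["per_device_train_batch_size"]
    )
    return steps_per_epoch * kwargs["num_train_epochs"]


def build_training_arguments(output_dir="urdu-sentiment-out"):
    """Build TrainingArguments; imported lazily so the rest runs without transformers."""
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **training_kwargs)


# Assumed split size, for illustration only.
print(total_optimizer_steps(40_000))
```

Calling `build_training_arguments()` on a machine without a GPU may fail because of `fp16=True`; setting that field to `False` is the usual workaround for a CPU-only dry run.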

Dataset

Trained on mirfan899/imdb_urdu_reviews, a set of 50,000 Urdu translations of IMDB movie reviews with positive/negative sentiment labels.

Part of Urdu NLP Suite

This model is part of a larger collection of fine-tuned Urdu NLP models:

Sentiment Classification ← this model
Text Summarization
Question Answering
Urdu → English Translation

Safetensors

Model size: 0.2B params
Tensor type: F32

Evaluation results (self-reported, IMDB Urdu Reviews)

Accuracy: 0.810
F1: 0.810