AraBERT for Arabic Sentiment Analysis

Fine-tuned aubmindlab/bert-base-arabertv02 for Arabic sentiment classification (positive/negative).

🎯 Model Description

This model classifies Arabic text as positive or negative, with a reported accuracy of 92.87%. It was built by transfer learning: the pre-trained aubmindlab/bert-base-arabertv02 checkpoint was fine-tuned for binary sentiment analysis.

🚀 Quick Start

from transformers import pipeline

# Load model
classifier = pipeline("sentiment-analysis", model="Belall87/arabert-arabic-sentiment")

# Predict
result = classifier("هذا المنتج رائع جداً")  # "This product is really great"
print(result)
# Example output: [{'label': 'POSITIVE', 'score': 0.95}]

# Batch prediction
texts = [
    "الخدمة ممتازة والموظفين متعاونين",  # "The service is excellent and the staff are helpful"
    "تجربة سيئة جداً ولا أنصح بها",  # "A very bad experience, I do not recommend it"
]
results = classifier(texts)
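
The pipeline returns one dictionary per input, each with a `label` and a `score`. A minimal sketch of pairing batch predictions with their inputs and flagging low-confidence ones is shown below. The helper name `pair_predictions`, the 0.6 threshold, and the mocked output values are illustrative assumptions, not part of the model's API.

```python
from typing import Dict, List, Tuple


def pair_predictions(
    texts: List[str],
    results: List[Dict[str, object]],
    min_score: float = 0.6,  # assumed threshold; tune for your use case
) -> List[Tuple[str, str, float, bool]]:
    """Pair each text with (label, score, is_confident)."""
    if len(texts) != len(results):
        raise ValueError("texts and results must have the same length")
    paired = []
    for text, res in zip(texts, results):
        score = float(res["score"])
        paired.append((text, str(res["label"]), score, score >= min_score))
    return paired


# Mocked output in the shape the pipeline returns; real labels and scores will differ.
texts = ["الخدمة ممتازة والموظفين متعاونين", "تجربة سيئة جداً ولا أنصح بها"]
mock_results = [
    {"label": "POSITIVE", "score": 0.97},
    {"label": "NEGATIVE", "score": 0.55},
]

for text, label, score, confident in pair_predictions(texts, mock_results):
    flag = "" if confident else " (low confidence)"
    print(f"{label} {score:.2f}{flag}: {text}")
```

In real use, `mock_results` would be replaced by the list returned from `classifier(texts)`.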

📊 Performance

| Metric    | Value  |
|-----------|--------|
| Accuracy  | 92.87% |
| F1-Score  | 92.87% |
| Precision | 92.85% |
| Recall    | 92.89% |
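
As a quick sanity check on these figures, F1 is conventionally the harmonic mean of precision and recall. The sketch below assumes the reported numbers are single aggregate values for which that relation holds; the averaging mode (for example macro or weighted) is not stated here, so treat this as an assumption.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


precision, recall = 0.9285, 0.9289  # reported precision and recall
f1 = f1_from_precision_recall(precision, recall)
print(f"F1 = {f1:.2%}")  # F1 = 92.87%
```

The result agrees with the reported F1-Score to two decimal places.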

🔧 Training Details

Base Model

  • Architecture: BERT-base (12 layers, 768 hidden, 12 attention heads)
  • Pre-trained: aubmindlab/bert-base-arabertv02
  • Parameters: ~110M

Training

  • Optimizer: AdamW
  • Learning Rate: 2e-5 with cosine scheduling
  • Batch Size: 8
  • Epochs: 3
  • Max Length: 256 tokens
  • Hardware: GPU (fp16 mixed precision)
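
To illustrate what cosine scheduling does to the 2e-5 learning rate, here is a minimal sketch in plain Python. It assumes no warmup and a decay to zero over the whole run. The warmup setting, the step count (`total_steps = 1000`), and the function name `cosine_lr` are illustrative assumptions, since the card does not state them.

```python
import math


def cosine_lr(step: int, total_steps: int, base_lr: float = 2e-5) -> float:
    """Cosine-annealed learning rate without warmup (assumed), decaying from base_lr to 0."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


# Hypothetical step count; in practice roughly epochs * ceil(num_train_examples / batch_size).
total_steps = 1000
for step in (0, 250, 500, 750, 1000):
    print(step, f"{cosine_lr(step, total_steps):.2e}")
```

The rate starts at the base value, passes through half of it at the midpoint, and approaches zero at the final step.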

Data Preprocessing

  • Arabic text normalization (alef, yeh, hamza variants)
  • Diacritics removal
  • URL, mention, hashtag filtering
  • Stratified train/val/test split (72%/8%/20%)
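
The normalization and filtering steps listed above can be sketched with the standard `re` module. The exact rules used for this model are not given here, so the character mappings (alef variants to bare alef, alef maqsura to yeh, hamza-on-carrier forms to their base letters), the choice to drop hashtags entirely, and the function name `normalize_arabic` are assumptions that follow common Arabic NLP practice.

```python
import re

# Arabic diacritics (tashkeel): fathatan through sukun, plus superscript alef.
DIACRITICS = re.compile(r"[\u064B-\u0652\u0670]")
TATWEEL = "\u0640"  # elongation character, carries no meaning
URL = re.compile(r"https?://\S+|www\.\S+")
MENTION = re.compile(r"@\w+")
HASHTAG = re.compile(r"#\w+")

# Assumed orthographic mappings; other normalizers make different choices.
CHAR_MAP = str.maketrans({
    "\u0623": "\u0627",  # alef with hamza above -> alef
    "\u0625": "\u0627",  # alef with hamza below -> alef
    "\u0622": "\u0627",  # alef with madda -> alef
    "\u0649": "\u064A",  # alef maqsura -> yeh
    "\u0624": "\u0648",  # waw with hamza -> waw
    "\u0626": "\u064A",  # yeh with hamza -> yeh
})


def normalize_arabic(text: str) -> str:
    """Strip diacritics and noise tokens, then unify orthographic variants."""
    text = DIACRITICS.sub("", text)
    text = text.replace(TATWEEL, "")
    text = URL.sub(" ", text)
    text = MENTION.sub(" ", text)
    text = HASHTAG.sub(" ", text)
    text = text.translate(CHAR_MAP)
    return re.sub(r"\s+", " ", text).strip()


print(normalize_arabic("أحببتُ هذا المنتج جداً @shop https://example.com #رائع"))
```

If a tokenizer-specific preprocessor was used during training, the same one should be applied at inference time so that inputs match what the model saw.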

💡 Intended Use

Direct Use

  • Arabic social media sentiment monitoring
  • Product review analysis
  • Customer feedback classification
  • Opinion mining in Arabic text

Limitations

  • Binary classification only (positive/negative)
  • Trained on a mix of Modern Standard Arabic (MSA) and dialectal Arabic
  • May underperform on domain-specific jargon
  • Works best on inputs shorter than 256 tokens, the maximum length used in training

📚 Training Data

Arabic sentiment dataset with stratified sampling for balanced training. Preprocessing includes normalization of Arabic orthographic variants and removal of noise.
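
The 72%/8%/20% train/val/test proportions listed under Data Preprocessing are what results from holding out 20% for testing and then 10% of the remaining 80% for validation (0.8 × 0.1 = 0.08). Whether the split was actually produced that way is an assumption. The sketch below shows one stdlib-only way to build a stratified split with those proportions; the function name, the seed, and the toy labels are illustrative.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def stratified_split(
    labels: Sequence[int],
    fractions: Tuple[float, float, float] = (0.72, 0.08, 0.20),
    seed: int = 42,  # arbitrary seed for reproducibility
) -> Tuple[List[int], List[int], List[int]]:
    """Return (train, val, test) index lists that preserve per-class proportions."""
    rng = random.Random(seed)
    by_class: Dict[int, List[int]] = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train: List[int] = []
    val: List[int] = []
    test: List[int] = []
    for indices in by_class.values():
        rng.shuffle(indices)
        n = len(indices)
        n_train = round(n * fractions[0])
        n_val = round(n * fractions[1])
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to test
    return train, val, test


# Toy data: 100 positive (1) and 100 negative (0) examples.
labels = [1] * 100 + [0] * 100
train, val, test = stratified_split(labels)
print(len(train), len(val), len(test))  # 144 16 40
```

Because each class is split separately, every subset keeps the same positive/negative ratio as the full dataset.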

๐Ÿ“ Citation

@misc{belal-arabert-sentiment-2025,
  author = {Belal Mahmoud Hussien},
  title = {AraBERT for Arabic Sentiment Analysis},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Belall87/arabert-arabic-sentiment}}
}

📧 Contact

Belal Mahmoud Hussien

📄 License

MIT License
