AraBERT for Arabic Sentiment Analysis

Fine-tuned aubmindlab/bert-base-arabertv02 for Arabic sentiment classification (positive/negative).

🎯 Model Description

This model classifies Arabic text as positive or negative, with a reported accuracy of 92.87%. It was built by transfer learning: the pre-trained aubmindlab/bert-base-arabertv02 checkpoint was fine-tuned for binary sentiment classification.

🚀 Quick Start

from transformers import pipeline

# Load model
classifier = pipeline("sentiment-analysis", model="Belall87/arabert-arabic-sentiment")

# Predict
result = classifier("هذا المنتج رائع جداً")  # "This product is really great"
print(result)
# [{'label': 'POSITIVE', 'score': 0.95}]

# Batch prediction
texts = [
    "الخدمة ممتازة والموظفين متعاونين",  # "The service is excellent and the staff are helpful"
    "تجربة سيئة جداً ولا أنصح بها"  # "A very bad experience; I do not recommend it"
]
results = classifier(texts)
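
The batch call returns one result per input, in the same order as the inputs. A minimal sketch of pairing each text with its prediction follows. `pair_predictions` and the `min_score` threshold are hypothetical names introduced here for illustration, and the result shape is assumed to match the single-prediction example above (a dict with `label` and `score`). The sample results are hand-written stand-ins, not real model output.

```python
from typing import Dict, List, Tuple


def pair_predictions(
    texts: List[str],
    results: List[Dict[str, object]],
    min_score: float = 0.0,
) -> List[Tuple[str, str, float]]:
    """Pair each input text with its predicted label and score.

    Predictions whose score is below `min_score` are dropped.
    """
    if len(texts) != len(results):
        raise ValueError("texts and results must have the same length")
    paired = []
    for text, result in zip(texts, results):
        score = float(result["score"])
        if score >= min_score:
            paired.append((text, str(result["label"]), score))
    return paired


# Hand-written stand-in for `classifier(texts)`; real scores will differ.
example_texts = ["text A", "text B"]
example_results = [
    {"label": "POSITIVE", "score": 0.97},
    {"label": "NEGATIVE", "score": 0.61},
]
for text, label, score in pair_predictions(example_texts, example_results, min_score=0.8):
    print(f"{label} ({score:.2f}): {text}")
```

Filtering on the score is one simple way to route low-confidence predictions to manual review; the 0.8 threshold here is arbitrary.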

📊 Performance

| Metric    | Value  |
|-----------|--------|
| Accuracy  | 92.87% |
| F1-Score  | 92.87% |
| Precision | 92.85% |
| Recall    | 92.89% |
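
For readers who want to see how these four metrics relate, here is a minimal sketch that computes them from binary confusion-matrix counts. The card does not say whether the reported figures are positive-class, macro, or weighted averages, so the positive-class definitions used here are an assumption, and the counts in the usage lines are hypothetical.

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, precision, recall and F1 for the positive class."""
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1_from(precision, recall),
    }


# Hypothetical counts, for illustration only.
print(binary_metrics(tp=90, fp=10, fn=5, tn=95))

# Consistency check on the figures reported in this card.
print(round(f1_from(0.9285, 0.9289), 4))  # 0.9287
```

Plugging the reported precision (92.85%) and recall (92.89%) into the harmonic-mean formula gives 92.87% after rounding, which matches the F1-Score listed above.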

🔧 Training Details

Base Model

  • Architecture: BERT-base (12 layers, 768 hidden, 12 attention heads)
  • Pre-trained: aubmindlab/bert-base-arabertv02
  • Parameters: ~110M

Training

  • Optimizer: AdamW
  • Learning Rate: 2e-5 with cosine scheduling
  • Batch Size: 8
  • Epochs: 3
  • Max Length: 256 tokens
  • Hardware: GPU (fp16 mixed precision)
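
The card lists a 2e-5 learning rate with cosine scheduling but gives neither the warmup length nor the total number of steps. As a rough sketch under those gaps, the function below implements the common half-cosine decay with optional linear warmup. `cosine_lr`, `warmup_steps`, and the step counts are hypothetical, and this is not necessarily the exact schedule used in training.

```python
import math


def cosine_lr(
    step: int,
    total_steps: int,
    base_lr: float = 2e-5,
    warmup_steps: int = 0,
) -> float:
    """Learning rate at `step` under linear warmup then half-cosine decay."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


# Hypothetical run of 1000 optimizer steps with no warmup.
for step in (0, 250, 500, 750, 1000):
    print(step, f"{cosine_lr(step, total_steps=1000):.2e}")
```

The rate starts at the base value, falls to half of it at the midpoint of the decay phase, and reaches zero at the final step.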

Data Preprocessing

  • Arabic text normalization (alef, yeh, hamza variants)
  • Diacritics removal
  • URL, mention, hashtag filtering
  • Stratified train/val/test split (72%/8%/20%)
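
The cleaning steps listed above could look roughly like the sketch below. The exact rules used for this model are not documented here, so every mapping is an assumption: in particular, folding hamza carriers into their base letters and dropping hashtags together with their text are only one common choice. `normalize_arabic` is a hypothetical helper name.

```python
import re

# Arabic diacritics (fathatan through sukun) plus the superscript alef.
_DIACRITICS = re.compile(r"[\u064B-\u0652\u0670]")
_URL = re.compile(r"https?://\S+|www\.\S+")
_MENTION_OR_HASHTAG = re.compile(r"[@#]\w+")
_WHITESPACE = re.compile(r"\s+")

# Assumed normalization table; adjust to match the training pipeline.
_CHAR_MAP = str.maketrans({
    "\u0623": "\u0627",  # alef with hamza above -> bare alef
    "\u0625": "\u0627",  # alef with hamza below -> bare alef
    "\u0622": "\u0627",  # alef with madda -> bare alef
    "\u0649": "\u064A",  # alef maksura -> yeh
    "\u0624": "\u0648",  # waw with hamza -> waw
    "\u0626": "\u064A",  # yeh with hamza -> yeh
})


def normalize_arabic(text: str) -> str:
    """Strip URLs, mentions, hashtags and diacritics, then fold letter variants."""
    text = _URL.sub(" ", text)
    text = _MENTION_OR_HASHTAG.sub(" ", text)
    text = _DIACRITICS.sub("", text)
    text = text.translate(_CHAR_MAP)
    return _WHITESPACE.sub(" ", text).strip()


print(normalize_arabic("هذا المنتج رائع جداً https://example.com @user #tag"))
```

Whatever rules are chosen, applying the same normalization at inference time as at training time keeps the inputs consistent with what the model saw.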

💡 Intended Use

Direct Use

  • Arabic social media sentiment monitoring
  • Product review analysis
  • Customer feedback classification
  • Opinion mining in Arabic text

Limitations

  • Binary classification only (positive/negative)
  • Trained on a mix of Modern Standard Arabic (MSA) and dialectal Arabic
  • May underperform on domain-specific jargon
  • Works best on text shorter than 256 tokens, the maximum length used in training

📚 Training Data

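The model was trained on an Arabic sentiment dataset, split with stratified sampling so that class proportions stay consistent across the train, validation, and test sets. Preprocessing normalizes Arabic orthographic variants and removes noise such as URLs, mentions, and hashtags.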
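
The 72%/8%/20% proportions are consistent with a two-stage split: hold out 20% for testing, then hold out 10% of the remaining 80% for validation (0.8 × 0.1 = 0.08). Whether the split was actually produced this way is an assumption. The sketch below shows one way to do it per class in plain Python; `stratified_split`, its parameters, and the toy labels are hypothetical.

```python
import random
from collections import defaultdict
from typing import List, Sequence, Tuple


def stratified_split(
    labels: Sequence[str],
    test_frac: float = 0.20,
    val_frac_of_rest: float = 0.10,
    seed: int = 42,
) -> Tuple[List[int], List[int], List[int]]:
    """Return (train, val, test) index lists, preserving class proportions."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    train, val, test = [], [], []
    for indices in by_label.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_frac)
        n_val = round((len(indices) - n_test) * val_frac_of_rest)
        test.extend(indices[:n_test])
        val.extend(indices[n_test:n_test + n_val])
        train.extend(indices[n_test + n_val:])
    return train, val, test


# Toy, perfectly balanced label list for illustration.
labels = ["positive"] * 500 + ["negative"] * 500
train, val, test = stratified_split(labels)
print(len(train), len(val), len(test))  # 720 80 200
```

Because each class is split separately, every subset keeps the same positive-to-negative ratio as the full dataset.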

📝 Citation

@misc{belal-arabert-sentiment-2025,
  author = {Belal Mahmoud Hussien},
  title = {AraBERT for Arabic Sentiment Analysis},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Belall87/arabert-arabic-sentiment}}
}

📧 Contact

Belal Mahmoud Hussien

📄 License

MIT License
