# Sawb — ARBERTv2 / CAMeLBERT-mix
Part of the Sawb Arabic Cultural Hallucination Detection Collection for ICAIRE 2026 Track 3.
## Overview
Sawb — ARBERTv2 / CAMeLBERT-mix is one of four Arabic encoder models in the Sawb 4-Model Ensemble — the primary competition submission achieving binary macro F1 = 0.9647 on the 457-example validation set. Each model is a binary classifier for detecting cultural hallucinations in Arabic LLM outputs.
The base model was pre-trained on a mixture of Modern Standard Arabic (MSA) and dialectal Arabic, giving the ensemble complementary coverage.
A cultural hallucination occurs when an LLM produces a response that is factually or culturally incorrect within Arab/Islamic contexts — misapplying Western legal frameworks (EU AI Act, GDPR) to Islamic jurisprudence, fabricating hadith or Islamic rulings, ignoring Arab institutional contributions to AI (KACST, SDAIA, MBZUAI, Vision 2030), responding in the wrong Arabic dialect, or using Western examples in Saudi/Gulf contexts.
## Ensemble (Primary Submission)
The full Sawb detect-then-explain pipeline:
- 4-Model Ensemble Detection: Average the hallucination probability from all four models (sawb-arabert, sawb-arabert-large, sawb-arbertv2 (this model), and sawb-marbertv2) at threshold θ = 0.30 → Binary Macro F1 = 0.9647
- Explanation Generation: DeepSeek API (exp16 dialectal few-shot prompt, 30 parallel threads) generates case-specific Arabic explanations for each detected hallucination
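The detection stage reduces to a simple decision rule: average the four per-model hallucination probabilities and compare the result against θ = 0.30. A minimal sketch of that rule (the probabilities below are illustrative, not real model outputs):

```python
def ensemble_decision(probs, threshold=0.30):
    """Average per-model hallucination probabilities and apply the threshold."""
    avg = sum(probs) / len(probs)
    return avg, avg > threshold

# Illustrative per-model probabilities for one (question, answer) pair
avg, flagged = ensemble_decision([0.82, 0.74, 0.91, 0.65])
print(f"avg={avg:.2f} flagged={flagged}")
```

Each example flagged by this rule is then passed to the explanation stage.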
## Model Architecture
| Property | Value |
|---|---|
| Base model | CAMeL-Lab/bert-base-arabic-camelbert-mix |
| Architecture | BertForSequenceClassification |
| Parameters | 125M |
| Labels | LABEL_1 = hallucination, LABEL_0 = not hallucination |
| Max sequence length | 512 tokens |
| Input format | `السؤال: {question}\n\nإجابة النموذج: {answer[:500]}` ("Question: {question}" / "Model answer: {answer[:500]}") |
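The input template from the table can be wrapped in a small helper; the function name `build_input` is ours for illustration, not part of the model repository:

```python
def build_input(question: str, answer: str, max_answer_chars: int = 500) -> str:
    """Format a (question, answer) pair using the Arabic template the model expects:
    'السؤال: {question}\n\nإجابة النموذج: {answer}' ("Question: ... / Model answer: ...").
    The answer is truncated to max_answer_chars characters, matching answer[:500]."""
    return f"السؤال: {question}\n\nإجابة النموذج: {answer[:max_answer_chars]}"
```

Note that the 500-character cut is applied to the raw answer string before tokenization; the tokenizer then truncates the full template to 512 tokens.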
## Training
| Hyperparameter | Value |
|---|---|
| Training examples | 1,828 (ICAIRE Track 3 competition data) |
| Epochs | 5 |
| Learning rate | 2×10⁻⁵ |
| Batch size | 8 per device (effective: 32 with gradient accumulation) |
| LR schedule | Cosine |
| Optimizer | AdamW |
| Model selection | Best validation macro F1 |
| Framework | Hugging Face Transformers |
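As a reading aid for the schedule row: cosine decay lowers the learning rate from 2×10⁻⁵ toward zero over training. The sketch below uses the standard cosine formula with no warmup; whether the actual run used warmup is not stated in the table, so treat the exact curve as an assumption:

```python
import math

def cosine_lr(step: int, total_steps: int, base_lr: float = 2e-5) -> float:
    """Cosine decay from base_lr at step 0 down to 0 at total_steps (no warmup)."""
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * step / total_steps))

# Rough step count: 1,828 examples / effective batch 32, over 5 epochs
total = (1828 // 32) * 5
print(cosine_lr(0, total), cosine_lr(total // 2, total), cosine_lr(total, total))
```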
## Evaluation Results
| Metric | Value |
|---|---|
| Solo Macro F1 (θ = 0.30) | 0.9457 |
| 4-Model Ensemble F1 (θ = 0.30) | 0.9647 ← primary submission |
| Evaluation set | 457 Arabic (question, LLM answer) pairs |
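Binary macro F1 at a fixed threshold averages the per-class F1 of LABEL_0 and LABEL_1. A self-contained sketch of that computation (the toy labels and probabilities are illustrative; in practice `sklearn.metrics.f1_score(y_true, y_pred, average="macro")` on the thresholded predictions gives the same value):

```python
def macro_f1(y_true, probs, threshold=0.30):
    """Binary macro F1: mean of per-class F1 over classes 0 and 1."""
    y_pred = [int(p > threshold) for p in probs]
    f1s = []
    for cls in (0, 1):
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        f1s.append(2 * tp / (2 * tp + fp + fn) if tp else 0.0)
    return sum(f1s) / 2

# Toy example: 4 labeled pairs scored with ensemble probabilities
print(macro_f1([1, 0, 1, 0], [0.9, 0.1, 0.6, 0.4]))
```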
## Usage — Solo Model

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("HassanB4/sawb-arbertv2")
model = AutoModelForSequenceClassification.from_pretrained("HassanB4/sawb-arbertv2")
model.eval()

# Question: "How are AI ethics principles applied in the Islamic judiciary?"
question = "كيف تُطبَّق مبادئ أخلاقيات الذكاء الاصطناعي في القضاء الإسلامي؟"
# Answer: "The European AI Act must be applied to Islamic courts..." (a cultural hallucination)
answer = "يجب تطبيق AI Act الأوروبي على المحاكم الإسلامية..."

text = f"السؤال: {question}\n\nإجابة النموذج: {answer[:500]}"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    logits = model(**inputs).logits
prob = torch.softmax(logits, dim=-1)[0, 1].item()  # P(LABEL_1 = hallucination)
is_hallucination = prob > 0.30
print(f"Hallucination probability: {prob:.3f} | Detected: {is_hallucination}")
```
## Usage — 4-Model Ensemble (Primary Submission)

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

MODELS = [
    "HassanB4/sawb-arabert",
    "HassanB4/sawb-arabert-large",
    "HassanB4/sawb-arbertv2",
    "HassanB4/sawb-marbertv2",
]
THRESHOLD = 0.30

# Question: "How are AI ethics principles applied in the Islamic judiciary?"
question = "كيف تُطبَّق مبادئ أخلاقيات الذكاء الاصطناعي في القضاء الإسلامي؟"
# Answer: "The European AI Act must be applied to Islamic courts..." (a cultural hallucination)
answer = "يجب تطبيق AI Act الأوروبي على المحاكم الإسلامية..."
text = f"السؤال: {question}\n\nإجابة النموذج: {answer[:500]}"

probs = []
for model_name in MODELS:
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    model.eval()
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**inputs).logits
    probs.append(torch.softmax(logits, dim=-1)[0, 1].item())

ensemble_prob = sum(probs) / len(probs)  # unweighted average across the four models
is_hallucination = ensemble_prob > THRESHOLD
print(f"Ensemble probability: {ensemble_prob:.3f} | Detected: {is_hallucination}")
```
## Hallucination Categories

| Category | Ensemble F1 | Description |
|---|---|---|
| historical_inaccuracy | 1.000 | Omits Arab AI contributions (KACST, SDAIA, MBZUAI) |
| regional_context_errors | 1.000 | Western examples in Saudi/Gulf-specific contexts |
| social_norms_violation | 0.489 | Western social standards ignoring Gulf/Islamic norms |
| ethical_framework_mismatch | 0.488 | EU AI Act / GDPR instead of Maqasid al-Shariah |
| dialectal_confusion | 0.474 | Responds in the wrong dialect or refuses the requested dialect |
| religious_misrepresentation | 0.444 | Fabricated hadith, inaccurate Islamic rulings |
## Dataset
HassanB4/sawb-arabic-hallucination-dataset — 1,828 Arabic (question, LLM answer) pairs covering 6 cultural hallucination categories.