# Sawb — ARBERTv2 / CAMeLBERT-mix
Part of the Sawb Arabic Cultural Hallucination Detection Collection for ICAIRE 2026 Track 3.
## Overview
Sawb — ARBERTv2 / CAMeLBERT-mix is one of four Arabic encoder models in the Sawb 4-Model Ensemble — the primary competition submission achieving binary macro F1 = 0.9647 on the 457-example validation set. Each model is a binary classifier for detecting cultural hallucinations in Arabic LLM outputs.
The base model was pre-trained on a mixture of Modern Standard Arabic (MSA) and dialectal Arabic, giving the ensemble complementary coverage.
A cultural hallucination occurs when an LLM produces a response that is factually or culturally incorrect within Arab/Islamic contexts — misapplying Western legal frameworks (EU AI Act, GDPR) to Islamic jurisprudence, fabricating hadith or Islamic rulings, ignoring Arab institutional contributions to AI (KACST, SDAIA, MBZUAI, Vision 2030), responding in the wrong Arabic dialect, or using Western examples in Saudi/Gulf contexts.
## Ensemble (Primary Submission)
The full Sawb detect-then-explain pipeline:
- 4-Model Ensemble Detection: Average the hallucination probability from all four models (sawb-arabert, sawb-arabert-large, sawb-arbertv2 (this model), and sawb-marbertv2) at threshold θ = 0.30 → Binary Macro F1 = 0.9647
- Explanation Generation: DeepSeek API (exp16 dialectal few-shot prompt, 30 parallel threads) generates case-specific Arabic explanations for each detected hallucination
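The detection stage reduces to a simple decision rule: average the four per-model hallucination probabilities and compare the result against θ = 0.30. A minimal sketch of that rule (the probabilities below are illustrative, not real model outputs):

```python
def ensemble_decision(probs, threshold=0.30):
    """Average per-model hallucination probabilities and apply the threshold."""
    avg = sum(probs) / len(probs)
    return avg, avg > threshold

# Illustrative per-model probabilities for one (question, answer) pair
avg, flagged = ensemble_decision([0.82, 0.74, 0.91, 0.65])
print(f"avg={avg:.2f} flagged={flagged}")
```

Each example flagged by this rule is then passed to the explanation stage.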
## Model Architecture
| Property | Value |
|---|---|
| Base model | CAMeL-Lab/bert-base-arabic-camelbert-mix |
| Architecture | BertForSequenceClassification |
| Parameters | 125M |
| Labels | LABEL_1 = hallucination, LABEL_0 = not hallucination |
| Max sequence length | 512 tokens |
| Input format | `السؤال: {question}\n\nإجابة النموذج: {answer[:500]}` ("Question: {question}" / "Model answer: {answer[:500]}") |
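The input template from the table can be wrapped in a small helper; the function name `build_input` is ours for illustration, not part of the model repository:

```python
def build_input(question: str, answer: str, max_answer_chars: int = 500) -> str:
    """Format a (question, answer) pair using the Arabic template the model expects:
    'السؤال: {question}\n\nإجابة النموذج: {answer}' ("Question: ... / Model answer: ...").
    The answer is truncated to max_answer_chars characters, matching answer[:500]."""
    return f"السؤال: {question}\n\nإجابة النموذج: {answer[:max_answer_chars]}"
```

Note that the 500-character cut is applied to the raw answer string before tokenization; the tokenizer then truncates the full template to 512 tokens.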
## Training
| Hyperparameter | Value |
|---|---|
| Training examples | 1,828 (ICAIRE Track 3 competition data) |
| Epochs | 5 |
| Learning rate | 2×10⁻⁵ |
| Batch size | 8 per device (effective: 32 with gradient accumulation) |
| LR schedule | Cosine |
| Optimizer | AdamW |
| Model selection | Best validation macro F1 |
| Framework | Hugging Face Transformers |
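As a reading aid for the schedule row: cosine decay lowers the learning rate from 2×10⁻⁵ toward zero over training. The sketch below uses the standard cosine formula with no warmup; whether the actual run used warmup is not stated in the table, so treat the exact curve as an assumption:

```python
import math

def cosine_lr(step: int, total_steps: int, base_lr: float = 2e-5) -> float:
    """Cosine decay from base_lr at step 0 down to 0 at total_steps (no warmup)."""
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * step / total_steps))

# Rough step count: 1,828 examples / effective batch 32, over 5 epochs
total = (1828 // 32) * 5
print(cosine_lr(0, total), cosine_lr(total // 2, total), cosine_lr(total, total))
```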
## Evaluation Results
| Metric | Value |
|---|---|
| Solo Macro F1 (θ = 0.30) | 0.9457 |
| 4-Model Ensemble F1 (θ = 0.30) | 0.9647 ← primary submission |
| Evaluation set | 457 Arabic (question, LLM answer) pairs |
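Binary macro F1 at a fixed threshold averages the per-class F1 of LABEL_0 and LABEL_1. A self-contained sketch of that computation (the toy labels and probabilities are illustrative; in practice `sklearn.metrics.f1_score(y_true, y_pred, average="macro")` on the thresholded predictions gives the same value):

```python
def macro_f1(y_true, probs, threshold=0.30):
    """Binary macro F1: mean of per-class F1 over classes 0 and 1."""
    y_pred = [int(p > threshold) for p in probs]
    f1s = []
    for cls in (0, 1):
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        f1s.append(2 * tp / (2 * tp + fp + fn) if tp else 0.0)
    return sum(f1s) / 2

# Toy example: 4 labeled pairs scored with ensemble probabilities
print(macro_f1([1, 0, 1, 0], [0.9, 0.1, 0.6, 0.4]))
```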
## Usage — Solo Model

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("HassanB4/sawb-arbertv2")
model = AutoModelForSequenceClassification.from_pretrained("HassanB4/sawb-arbertv2")
model.eval()

# Question: "How are AI ethics principles applied in the Islamic judiciary?"
question = "كيف تُطبَّق مبادئ أخلاقيات الذكاء الاصطناعي في القضاء الإسلامي؟"
# Answer: "The European AI Act must be applied to Islamic courts..." (a cultural hallucination)
answer = "يجب تطبيق AI Act الأوروبي على المحاكم الإسلامية..."

text = f"السؤال: {question}\n\nإجابة النموذج: {answer[:500]}"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    logits = model(**inputs).logits
prob = torch.softmax(logits, dim=-1)[0, 1].item()  # P(LABEL_1 = hallucination)
is_hallucination = prob > 0.30
print(f"Hallucination probability: {prob:.3f} | Detected: {is_hallucination}")
```
## Usage — 4-Model Ensemble (Primary Submission)

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

MODELS = [
    "HassanB4/sawb-arabert",
    "HassanB4/sawb-arabert-large",
    "HassanB4/sawb-arbertv2",
    "HassanB4/sawb-marbertv2",
]
THRESHOLD = 0.30

# Question: "How are AI ethics principles applied in the Islamic judiciary?"
question = "كيف تُطبَّق مبادئ أخلاقيات الذكاء الاصطناعي في القضاء الإسلامي؟"
# Answer: "The European AI Act must be applied to Islamic courts..." (a cultural hallucination)
answer = "يجب تطبيق AI Act الأوروبي على المحاكم الإسلامية..."
text = f"السؤال: {question}\n\nإجابة النموذج: {answer[:500]}"

probs = []
for model_name in MODELS:
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    model.eval()
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**inputs).logits
    probs.append(torch.softmax(logits, dim=-1)[0, 1].item())

ensemble_prob = sum(probs) / len(probs)  # unweighted average across the four models
is_hallucination = ensemble_prob > THRESHOLD
print(f"Ensemble probability: {ensemble_prob:.3f} | Detected: {is_hallucination}")
```
## Hallucination Categories

| Category | Ensemble F1 | Description |
|---|---|---|
| historical_inaccuracy | 1.000 | Omits Arab AI contributions (KACST, SDAIA, MBZUAI) |
| regional_context_errors | 1.000 | Western examples in Saudi/Gulf-specific contexts |
| social_norms_violation | 0.489 | Western social standards ignoring Gulf/Islamic norms |
| ethical_framework_mismatch | 0.488 | EU AI Act / GDPR instead of Maqasid al-Shariah |
| dialectal_confusion | 0.474 | Responds in the wrong dialect or refuses the requested dialect |
| religious_misrepresentation | 0.444 | Fabricated hadith, inaccurate Islamic rulings |
## Dataset
HassanB4/sawb-arabic-hallucination-dataset — 1,828 Arabic (question, LLM answer) pairs covering 6 cultural hallucination categories.