Sawb — MARBERTv2 / CAMeLBERT-MSA-half

Part of the Sawb Arabic Cultural Hallucination Detection Collection for ICAIRE 2026 Track 3.

Overview

Sawb — MARBERTv2 / CAMeLBERT-MSA-half is one of four Arabic encoder models in the Sawb 4-Model Ensemble — the primary competition submission achieving binary macro F1 = 0.9647 on the 457-example validation set. Each model is a binary classifier for detecting cultural hallucinations in Arabic LLM outputs.

Its multi-dialect Arabic pre-training makes this model particularly valuable for detecting dialectal confusion hallucinations.

A cultural hallucination occurs when an LLM produces a response that is factually or culturally incorrect within Arab/Islamic contexts — misapplying Western legal frameworks (EU AI Act, GDPR) to Islamic jurisprudence, fabricating hadith or Islamic rulings, ignoring Arab institutional contributions to AI (KACST, SDAIA, MBZUAI, Vision 2030), responding in the wrong Arabic dialect, or using Western examples in Saudi/Gulf contexts.

Ensemble (Primary Submission)

The full Sawb detect-then-explain pipeline:

  1. 4-Model Ensemble Detection: Average the hallucination probability from all four models (sawb-arabert + sawb-arabert-large + sawb-arbertv2 + this model, sawb-marbertv2) at threshold θ = 0.30 → Binary Macro F1 = 0.9647
  2. Explanation Generation: DeepSeek API (exp16 dialectal few-shot prompt, 30 parallel threads) generates case-specific Arabic explanations for each detected hallucination
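
The two-step pipeline above can be sketched in plain Python. This is a minimal illustration, not the actual competition code: `explain` is a hypothetical stand-in for the DeepSeek API call, and the example IDs and probabilities are invented.

```python
from concurrent.futures import ThreadPoolExecutor

def detect(model_probs, threshold=0.30):
    """Step 1: average the four models' probabilities, then apply θ."""
    avg = sum(model_probs) / len(model_probs)
    return avg, avg > threshold

def explain(example):
    """Hypothetical stand-in for the DeepSeek API call (exp16 prompt)."""
    return f"explanation for example {example['id']}"

def pipeline(examples, max_workers=30):
    # Keep only examples the ensemble flags as hallucinations
    detected = [ex for ex in examples if detect(ex["model_probs"])[1]]
    # Step 2: generate explanations in parallel (30 threads, as in the card)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        explanations = list(pool.map(explain, detected))
    return detected, explanations
```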

Model Architecture

| Property | Value |
|---|---|
| Base model | `CAMeL-Lab/bert-base-arabic-camelbert-msa-half` |
| Architecture | `BertForSequenceClassification` |
| Parameters | 125M |
| Labels | `LABEL_1` = hallucination, `LABEL_0` = not hallucination |
| Max sequence length | 512 tokens |
| Input format | `السؤال: {question}\n\nإجابة النموذج: {answer[:500]}` |

Training

| Hyperparameter | Value |
|---|---|
| Training examples | 1,828 (ICAIRE Track 3 competition data) |
| Epochs | 5 |
| Learning rate | 2×10⁻⁵ |
| Batch size | 8 per device (effective 32 with gradient accumulation) |
| LR schedule | Cosine |
| Optimizer | AdamW |
| Model selection | Best validation macro F1 |
| Framework | Hugging Face Transformers |
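
For reference, the hyperparameters above map onto Hugging Face `TrainingArguments` field names as sketched below. Note that the gradient accumulation factor is not stated in the card; 4 steps is an inference from 8 per device × 4 = 32 effective (single-device assumption).

```python
# Training configuration sketch, keyed by TrainingArguments field names.
# gradient_accumulation_steps = 4 is an assumption, not stated in the card.
config = {
    "num_train_epochs": 5,
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 8,
    "gradient_accumulation_steps": 4,
    "lr_scheduler_type": "cosine",
    "metric_for_best_model": "macro_f1",
}

effective_batch = (config["per_device_train_batch_size"]
                   * config["gradient_accumulation_steps"])   # 32
steps_per_epoch = -(-1828 // effective_batch)                 # ceil(1828 / 32)
```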

Evaluation Results

| Metric | Value |
|---|---|
| Solo Macro F1 (θ = 0.30) | 0.9264 |
| 4-Model Ensemble F1 (θ = 0.30) | **0.9647** (primary submission) |
| Evaluation set | 457 Arabic (question, LLM answer) pairs |
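
Binary macro F1, the competition metric, is the unweighted mean of the per-class F1 scores after thresholding the probabilities at θ = 0.30. A minimal self-contained sketch with toy numbers (the probabilities and labels below are invented for illustration):

```python
def binary_macro_f1(y_true, y_pred):
    """Macro F1 = unweighted mean of the F1 for class 0 and class 1."""
    f1s = []
    for cls in (0, 1):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
        f1s.append(2 * tp / (2 * tp + fp + fn) if tp else 0.0)
    return sum(f1s) / 2

# Predictions come from thresholding the hallucination probability at θ = 0.30
probs = [0.91, 0.12, 0.45, 0.05, 0.78]    # toy model probabilities
labels = [1, 0, 1, 0, 0]                  # toy gold labels
preds = [int(p > 0.30) for p in probs]    # [1, 0, 1, 0, 1]
score = binary_macro_f1(labels, preds)
```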

Usage — Solo Model

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("HassanB4/sawb-marbertv2")
model = AutoModelForSequenceClassification.from_pretrained("HassanB4/sawb-marbertv2")
model.eval()

question = "كيف تُطبَّق مبادئ أخلاقيات الذكاء الاصطناعي في القضاء الإسلامي؟"
answer = "يجب تطبيق AI Act الأوروبي على المحاكم الإسلامية..."

# Training-time input format: question plus the first 500 characters of the answer
text = f"السؤال: {question}\n\nإجابة النموذج: {answer[:500]}"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

with torch.no_grad():
    logits = model(**inputs).logits

# Index 1 is LABEL_1 (hallucination); flag at the competition threshold θ = 0.30
prob = torch.softmax(logits, dim=-1)[0, 1].item()
is_hallucination = prob > 0.30
print(f"Hallucination probability: {prob:.3f} | Detected: {is_hallucination}")
```

Usage — 4-Model Ensemble (Primary Submission)

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

MODELS = [
    "HassanB4/sawb-arabert",
    "HassanB4/sawb-arabert-large",
    "HassanB4/sawb-arbertv2",
    "HassanB4/sawb-marbertv2",
]
THRESHOLD = 0.30

question = "كيف تُطبَّق مبادئ أخلاقيات الذكاء الاصطناعي في القضاء الإسلامي؟"
answer = "يجب تطبيق AI Act الأوروبي على المحاكم الإسلامية..."
text = f"السؤال: {question}\n\nإجابة النموذج: {answer[:500]}"

# Soft voting: collect each model's hallucination probability, then average
probs = []
for model_name in MODELS:
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    model.eval()
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**inputs).logits
    probs.append(torch.softmax(logits, dim=-1)[0, 1].item())

ensemble_prob = sum(probs) / len(probs)
is_hallucination = ensemble_prob > THRESHOLD
print(f"Ensemble probability: {ensemble_prob:.3f} | Detected: {is_hallucination}")
```
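
Note that the ensemble uses soft voting: probabilities are averaged before the threshold is applied, so confident models can outweigh borderline ones, unlike hard (majority) voting on per-model decisions. The per-model probabilities below are hypothetical, chosen to show where the two schemes diverge:

```python
probs = [0.45, 0.20, 0.25, 0.55]   # hypothetical per-model probabilities
# Hard voting at θ = 0.30 gives votes [1, 0, 0, 1]: a 2-of-4 tie.
# Soft voting (the Sawb approach) averages first, then thresholds once:
avg = sum(probs) / len(probs)       # 0.3625
detected = avg > 0.30               # True: the two confident models tip it
```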

Hallucination Categories

| Category | Ensemble F1 | Description |
|---|---|---|
| historical_inaccuracy | 1.000 | Omits Arab AI contributions (KACST, SDAIA, MBZUAI) |
| regional_context_errors | 1.000 | Western examples in Saudi/Gulf-specific contexts |
| social_norms_violation | 0.489 | Western social standards that ignore Gulf/Islamic norms |
| ethical_framework_mismatch | 0.488 | EU AI Act / GDPR applied instead of Maqasid al-Shariah |
| dialectal_confusion | 0.474 | Responds in the wrong dialect or refuses the requested dialect |
| religious_misrepresentation | 0.444 | Fabricated hadith or inaccurate Islamic rulings |

Dataset

HassanB4/sawb-arabic-hallucination-dataset — 1,828 Arabic (question, LLM answer) pairs covering 6 cultural hallucination categories.

Collection

Sawb Arabic Cultural Hallucination Detection Collection
