This repository is publicly accessible, but you have to accept the conditions to access its files and content.

Arabic NLI Binary Classifier — s01-camelbert-qa

Binary classifier for Arabic text that scores a model answer as faithful (0) or unfaithful (1). Fine-tuned for NLI-style binary classification.

Base model

CAMeL-Lab/bert-base-arabic-camelbert-mix

Input format

qa: [CLS] question [SEP] model_answer [SEP]. This is the official baseline format; gold_answer is not part of the input.
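To make the layout concrete, here is a minimal sketch of how a BERT-style pair encoding arranges the two texts. It works on pre-split token lists and uses whitespace-level tokens purely for illustration; the real tokenizer produces WordPiece tokens and adds the special tokens itself when given two texts. The helper name `build_qa_pair` and the sample tokens are hypothetical.

```python
def build_qa_pair(question_tokens, answer_tokens):
    """Lay tokens out as [CLS] question [SEP] model_answer [SEP].

    Illustrative only: a real tokenizer works on subword tokens.
    """
    tokens = ["[CLS]"] + list(question_tokens) + ["[SEP]"] + list(answer_tokens) + ["[SEP]"]
    # Segment 0 covers [CLS], the question, and the first [SEP];
    # segment 1 covers the answer and the final [SEP].
    segment_ids = [0] * (len(question_tokens) + 2) + [1] * (len(answer_tokens) + 1)
    return tokens, segment_ids


# Hypothetical example input, split on whitespace for readability.
tokens, segment_ids = build_qa_pair(["ما", "عاصمة", "مصر", "؟"], ["القاهرة"])
print(tokens)
print(segment_ids)
```

Note that gold_answer never appears in this layout, which matches the qa format described above.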

Dev results

AUC-ROC (official, full dev n=1300): 0.9263
AUC-ROC (clean dev, n=800, excludes the ~500 rows covering ~100 questions also seen in train): 0.8713
Macro F1 (official, threshold=0.50): 0.8595
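For readers who want to recompute these metrics from their own score files, the following is a dependency-free sketch of AUC-ROC and macro F1. It is not the official evaluation script; the function names and the toy labels and scores are assumptions made for illustration. The threshold rule (`score > 0.5` means unfaithful) mirrors the one in the Usage section.

```python
def auc_roc(labels, scores):
    """AUC-ROC as the probability that a random positive outscores a random negative.

    Ties count as half a win. Quadratic in the number of rows, which is fine
    for a dev set of about a thousand rows.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one row of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_f1(labels, scores, threshold=0.5):
    """Unweighted mean of the per-class F1 scores at a fixed threshold."""
    preds = [int(s > threshold) for s in scores]
    f1_per_class = []
    for cls in (0, 1):
        tp = sum(1 for y, p in zip(labels, preds) if y == cls and p == cls)
        fp = sum(1 for y, p in zip(labels, preds) if y != cls and p == cls)
        fn = sum(1 for y, p in zip(labels, preds) if y == cls and p != cls)
        denom = 2 * tp + fp + fn
        f1_per_class.append(2 * tp / denom if denom else 0.0)
    return sum(f1_per_class) / len(f1_per_class)


# Toy data, not taken from the dev set.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.6, 0.4, 0.9]
print(auc_roc(toy_labels, toy_scores), macro_f1(toy_labels, toy_scores))
```

AUC-ROC is threshold-free, while macro F1 depends on the 0.50 cut-off, so the two numbers can move independently when the score distribution shifts.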

Note: the official-dev number is inflated by ~500 dev rows whose questions also appear in the training set; AUC-ROC on that subset is 0.9961, which is close to memorization. The clean-dev AUC-ROC (0.8713) is the better estimate of generalization and the number to use when ranking future runs. Both figures exceed the published CAMeLBERT baseline (0.7093 dev AUC-ROC).
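A clean/leaked split of this kind can be sketched as follows. The card does not say how question overlap was detected, so the whitespace normalization in `normalize_question`, the row schema (dicts with a `question` key), and the function names are all assumptions.

```python
def normalize_question(text):
    """Collapse runs of whitespace so trivially different copies still match (assumed rule)."""
    return " ".join(text.split())


def split_by_question_overlap(train_rows, dev_rows):
    """Return (clean, leaked): dev rows whose question is unseen or seen in train."""
    train_questions = {normalize_question(row["question"]) for row in train_rows}
    clean, leaked = [], []
    for row in dev_rows:
        seen = normalize_question(row["question"]) in train_questions
        (leaked if seen else clean).append(row)
    return clean, leaked


# Toy rows, not taken from the actual dataset.
train = [{"question": "ما عاصمة مصر؟"}]
dev = [{"question": "ما  عاصمة مصر؟ "}, {"question": "من كتب المقدمة؟"}]
clean, leaked = split_by_question_overlap(train, dev)
print(len(clean), len(leaked))
```

Metrics are then computed on `clean` alone to get a number comparable to the clean-dev AUC-ROC reported above.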

Training data

Arabic training set: 4,705 (question, gold_answer, model_answer) triples drawn from 5 source LLMs and 13 knowledge domains.

Usage

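from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("HassanB4/s01-camelbert-qa")
model = AutoModelForSequenceClassification.from_pretrained("HassanB4/s01-camelbert-qa")

# Placeholder inputs for illustration only; replace with your own question and model answer.
question = "ما هي عاصمة مصر؟"
model_answer = "عاصمة مصر هي القاهرة."

# Passing two texts encodes them as [CLS] question [SEP] model_answer [SEP].
inputs = tokenizer(question, model_answer, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    logits = model(**inputs).logits
score = torch.softmax(logits, dim=-1)[0][1].item()  # unfaithfulness score (probability of label 1)
predicted_label = int(score > 0.5)  # 1 = unfaithful, 0 = faithful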
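If logits have been exported and need to be scored without torch, the same score can be reproduced in plain Python. This is a sketch under the assumption that each logits pair is ordered [faithful, unfaithful], matching the label mapping above; the function names are hypothetical.

```python
import math


def unfaithfulness_score(logits):
    """Softmax probability of label 1 from a [faithful, unfaithful] logit pair."""
    faithful, unfaithful = logits
    # With two classes, softmax reduces to a sigmoid of the logit difference.
    return 1.0 / (1.0 + math.exp(faithful - unfaithful))


def predict(logits, threshold=0.5):
    """Return (score, label) using the same strict > rule as the snippet above."""
    score = unfaithfulness_score(logits)
    return score, int(score > threshold)


# Made-up logits for illustration.
print(predict([-2.0, 2.0]))
print(predict([1.5, -0.5]))
```

Exposing `threshold` as a parameter makes it easy to trade precision against recall; 0.5 is simply the value used for the reported macro F1.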

Model size

0.1B params (Safetensors, tensor type F32)
