Arabic NLI Binary Classifier: s01-camelbert-qa
Binary Arabic text classifier fine-tuned for NLI-based classification. Labels: 0 = faithful, 1 = unfaithful.
Base model
CAMeL-Lab/bert-base-arabic-camelbert-mix
Input format
qa: [CLS] question [SEP] model_answer [SEP] (official baseline format; gold_answer is not used)
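As a rough illustration of the qa layout, the sketch below renders one row as a readable string. The function name `format_qa_input` and the example row are hypothetical and not part of the released code. In practice the special tokens do not need to be written by hand: passing the question and the model answer as a text pair to a BERT-style tokenizer inserts [CLS] and [SEP] automatically.

```python
from typing import Mapping


def format_qa_input(row: Mapping[str, str]) -> str:
    """Render the qa layout as text; gold_answer is deliberately not used."""
    return f"[CLS] {row['question']} [SEP] {row['model_answer']} [SEP]"


# Hypothetical row, made up for illustration only.
example = {
    "question": "ما عاصمة مصر؟",
    "gold_answer": "القاهرة",
    "model_answer": "الإسكندرية",
}
print(format_qa_input(example))
```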
Dev results
- AUC-ROC (official, full dev n=1300): 0.9263
- AUC-ROC (clean dev, n=800, excludes ~100 questions also seen in train): 0.8713
- Macro F1 (official, threshold=0.50): 0.8595
Note: the official-dev number is inflated by ~500 dev rows whose questions also appear in the training set (AUC-ROC 0.9961 on that subset, i.e. near-memorization). The clean-dev AUC-ROC (0.8713) is the honest generalization estimate and the number to use for ranking against future runs. Both already exceed the published CAMeLBERT baseline (0.7093 dev AUC-ROC).
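A minimal sketch of how a clean-dev split and its AUC-ROC could be computed, assuming dev rows are dicts with a "question" key and that overlap is detected by exact match after stripping whitespace. The helper names `split_clean_dev` and `auc_roc` are hypothetical; the actual evaluation script is not shown here and may match questions differently.

```python
from typing import Dict, List, Sequence, Tuple


def split_clean_dev(
    train_questions: Sequence[str], dev_rows: Sequence[Dict[str, str]]
) -> Tuple[List[Dict[str, str]], List[Dict[str, str]]]:
    """Split dev rows into (clean, leaked) by whether the question occurs in train."""
    seen = {q.strip() for q in train_questions}
    clean = [r for r in dev_rows if r["question"].strip() not in seen]
    leaked = [r for r in dev_rows if r["question"].strip() in seen]
    return clean, leaked


def auc_roc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC as the chance that a random positive outscores a random negative.

    Ties count as half a win. Quadratic in the number of rows, which is
    fine for a dev set of around a thousand examples.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

Reporting the metric separately on the clean and leaked subsets makes the gap between memorized and unseen questions visible.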
Training data
Arabic training set: 4,705 (question, gold_answer, model_answer) triples from 5 source LLMs across 13 knowledge domains.
Usage
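from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("HassanB4/s01-camelbert-qa")
model = AutoModelForSequenceClassification.from_pretrained("HassanB4/s01-camelbert-qa")

# Placeholder inputs: replace with the Arabic question and the model answer to check.
question = "..."
model_answer = "..."

# Passing the two strings as a pair yields [CLS] question [SEP] model_answer [SEP].
inputs = tokenizer(question, model_answer, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    logits = model(**inputs).logits
score = torch.softmax(logits, dim=-1)[0][1].item()  # unfaithfulness score (probability of label 1)
predicted_label = int(score > 0.5)  # 1 = unfaithful, 0 = faithful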
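The Macro F1 in the dev results uses the same 0.50 cut-off as `predicted_label`. A small sketch of that metric over a list of unfaithfulness scores follows; the function name `macro_f1` is hypothetical, and the official scorer may handle edge cases (such as a class with no predictions) differently.

```python
from typing import Sequence


def macro_f1(labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5) -> float:
    """Unweighted mean of the per-class F1 for labels 0 and 1 at a given threshold."""
    preds = [int(s > threshold) for s in scores]
    f1_per_class = []
    for cls in (0, 1):
        tp = sum(1 for y, p in zip(labels, preds) if y == cls and p == cls)
        fp = sum(1 for y, p in zip(labels, preds) if y != cls and p == cls)
        fn = sum(1 for y, p in zip(labels, preds) if y == cls and p != cls)
        denom = 2 * tp + fp + fn
        # Assumption: a class with no true or predicted members scores 0.0.
        f1_per_class.append(2 * tp / denom if denom else 0.0)
    return sum(f1_per_class) / len(f1_per_class)
```

Sweeping `threshold` on the dev set is one way to check whether 0.50 is a reasonable operating point for a given use case.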