Arabic NLI Binary Classifier: s01-camelbert-qa

Binary classifier for Arabic text, fine-tuned for NLI-based classification (0 = faithful, 1 = unfaithful).

Base model

CAMeL-Lab/bert-base-arabic-camelbert-mix

Input format

qa: [CLS] question [SEP] model_answer [SEP] (official baseline format; ignores gold_answer)
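
As a rough illustration of the layout above, the sketch below renders the pair format as a plain string. `format_qa_input` is a hypothetical helper introduced here for illustration, not part of the model's code; in practice a BERT-style tokenizer inserts the special tokens itself when it is given two text arguments.

```python
from typing import Optional


def format_qa_input(question: str, model_answer: str,
                    gold_answer: Optional[str] = None) -> str:
    """Hypothetical helper: render the 'qa' layout as a readable string.

    gold_answer is accepted only to show that this format discards it.
    """
    return f"[CLS] {question} [SEP] {model_answer} [SEP]"


# Made-up example pair; the gold answer never reaches the model input.
example = format_qa_input("ما هي عاصمة مصر؟", "القاهرة", gold_answer="القاهرة")
print(example)
```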

Dev results

  • AUC-ROC (official, full dev n=1300): 0.9263
  • AUC-ROC (clean dev, n=800, excludes the ~500 rows covering ~100 questions also seen in train): 0.8713
  • Macro F1 (official, threshold=0.50): 0.8595
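
For readers unfamiliar with the metric, here is a minimal sketch of macro F1 at a fixed threshold, written in plain Python. It assumes the same decision rule as the usage snippet (`score > threshold`); the function name and the toy labels and scores are made up for illustration and are not the evaluation script behind the numbers above.

```python
def macro_f1(labels, scores, threshold=0.5):
    """Unweighted mean of the per-class F1 scores for classes 0 and 1."""
    preds = [int(s > threshold) for s in scores]
    f1_per_class = []
    for cls in (0, 1):
        tp = sum(1 for y, p in zip(labels, preds) if y == cls and p == cls)
        fp = sum(1 for y, p in zip(labels, preds) if y != cls and p == cls)
        fn = sum(1 for y, p in zip(labels, preds) if y == cls and p != cls)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        f1_per_class.append(2 * tp / denom if denom else 0.0)
    return sum(f1_per_class) / len(f1_per_class)


# Toy data (not from the dev set): one error in each direction.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.6, 0.7, 0.4]
print(macro_f1(toy_labels, toy_scores))
```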

Note: the official-dev number is inflated by ~500 dev rows whose questions also appear in the training set (AUC-ROC 0.9961 on that subset, i.e. near-memorization). The clean-dev AUC-ROC (0.8713) is the honest generalization estimate and the number to use for ranking against future runs. Both already exceed the published CAMeLBERT baseline (0.7093 dev AUC-ROC).
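
The effect described in the note can be sketched as follows: split the dev rows by whether their question also occurs in train, then compute AUC-ROC on each part. This is an illustrative reconstruction under assumptions, not the script used for the reported numbers: overlap is detected here by exact string match on the question, and the row layout (`question`, `label`, `score`) and all toy values are made up.

```python
def auc_roc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation of AUC-ROC).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one example of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def split_clean(dev_rows, train_questions):
    """Separate dev rows into (clean, overlap) by exact question match."""
    seen = set(train_questions)
    clean = [r for r in dev_rows if r["question"] not in seen]
    overlap = [r for r in dev_rows if r["question"] in seen]
    return clean, overlap


def auc_of(rows):
    return auc_roc([r["label"] for r in rows], [r["score"] for r in rows])


# Toy rows: q1 also appears in train and is scored almost perfectly.
toy_dev = [
    {"question": "q1", "label": 1, "score": 0.99},
    {"question": "q1", "label": 0, "score": 0.01},
    {"question": "q2", "label": 1, "score": 0.40},
    {"question": "q2", "label": 0, "score": 0.55},
    {"question": "q3", "label": 1, "score": 0.80},
    {"question": "q3", "label": 0, "score": 0.30},
]
toy_clean, toy_overlap = split_clean(toy_dev, train_questions=["q1"])

print("full dev :", auc_of(toy_dev))
print("clean dev:", auc_of(toy_clean))
print("overlap  :", auc_of(toy_overlap))
```

On the toy rows the full-dev figure sits above the clean-dev figure because the overlapping question is ranked perfectly, which is the same pattern the note reports for the real dev set.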

Training data

Arabic training set: 4,705 Arabic (question, gold_answer, model_answer) triples from 5 source LLMs across 13 knowledge domains.

Usage

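from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("HassanB4/s01-camelbert-qa")
model = AutoModelForSequenceClassification.from_pretrained("HassanB4/s01-camelbert-qa")
model.eval()  # inference mode (disables dropout)

# Placeholder inputs for illustration; replace with your own Arabic text.
question = "ما هي عاصمة مصر؟"
model_answer = "عاصمة مصر هي القاهرة."

# Passing two texts encodes them as [CLS] question [SEP] model_answer [SEP].
inputs = tokenizer(question, model_answer, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    logits = model(**inputs).logits
score = torch.softmax(logits, dim=-1)[0][1].item()  # unfaithfulness score (probability of class 1)
predicted_label = int(score > 0.5)  # 1 = unfaithful, 0 = faithful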
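
To make the last two lines of the snippet concrete, the following plain-Python sketch reproduces the softmax and thresholding step on a single pair of logits. The function names and the example logits are hypothetical; the class order (index 0 = faithful, index 1 = unfaithful) follows the label definition in this card.

```python
import math


def unfaithfulness_score(logits):
    """Softmax probability of class 1 given [logit_faithful, logit_unfaithful]."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    return exps[1] / sum(exps)


def predict_label(score, threshold=0.5):
    """1 (unfaithful) only when the score is strictly above the threshold."""
    return int(score > threshold)


# Made-up logits, not real model output.
toy_score = unfaithfulness_score([-1.0, 2.0])
print(toy_score, predict_label(toy_score))
```

With two classes, this score equals the sigmoid of the logit difference, so raising or lowering the threshold trades precision against recall on the unfaithful class without rerunning the model.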

Model size: 0.1B params (Safetensors, tensor type F32)