Model Card for FA-AraBERT
FA-AraBERT is an Arabic binary text classification model designed to detect whether a user query is related to first aid. The classifier serves as the intent detection and safety filtering component of an MSA first-aid chatbot pipeline. Two models were developed and evaluated in this project, FA-AraBERTv2 and FA-AraBERTv0.2, both fine-tuned from AraBERT base models on the FALAH-Mix dataset (1,028 question-answer (QA) pairs: 924 non-first-aid and 104 first-aid). These classifiers were systematically compared under multiple training configurations to identify the most suitable model for deployment.
Model Details
Model Description
- Model name: FA-AraBERTv0.2
- Developed by: MABROUK Imane
- Supervised by: Dr. Rana R. Malhas (bigIR Research Group, Qatar University) & Dr. Imane Chlioui (INSEA)
- Task: Binary text classification
- Domain: First Aid / Emergency Care
- Labels: First-aid / Non first-aid
- Funded by: Self-funded academic project (Academic graduation project)
The FA-AraBERT classifier was developed as part of the PFE project titled:
Towards Building an Arabic First-Aid Chatbot using FA-AraBERT Classifier and FALAH Dataset.
- Shared by: MABROUK Imane
- Model type: Transformer-based text classification model (BERT architecture)
- Language(s) (NLP): MSA (Modern Standard Arabic)
- License: Apache 2.0
- Finetuned from model: AraBERTv02 (aubmindlab/bert-base-arabertv02)
Model Sources
- Repository: https://github.com/Imymab/FA-AraBERT-Classifier
Uses
Direct Use
This model can be used directly for:
- First-aid and emergency query detection
- Binary text classification (First-aid/Non First-aid)
Downstream Use
The model can be integrated into:
- Medical conversational agents
- Healthcare assistance tools for emergency guidance
Out-of-Scope Use
This model is not suitable for:
- Use in high-stakes clinical decision-making without human supervision
- Non-Arabic text processing
- Tasks requiring deep medical reasoning or long-context understanding
Bias, Risks, and Limitations
- The model is trained on a limited dataset (FALAH-Mix), which may not fully represent all medical domains.
- Class imbalance in the dataset may affect performance on underrepresented categories.
- The model may reflect biases present in existing Arabic medical datasets (AHD, MAQA).
- It is not a substitute for professional medical advice.
Recommendations
- Explore additional techniques to improve performance, particularly methods that address the class imbalance in the FALAH-Mix dataset (10% first-aid, 90% non-first-aid).
How to Get Started with the Model
Use the code below to get started with the model:
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "imaneumabderahmane/Arabertv02-classifier-FA"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Example query: "What is the first aid for first-degree burns?"
text = "ما هي الإسعافات الأولية لحروق الدرجة الأولى؟"

inputs = tokenizer(text, return_tensors="pt", truncation=True)
with torch.no_grad():
    outputs = model(**inputs)

# Predicted class index (first-aid vs. non-first-aid)
prediction = torch.argmax(outputs.logits, dim=-1).item()
print(prediction)
```
Training Details
Training Data
The FA-AraBERTv02 classifier was trained and evaluated on the FALAH-Mix dataset, which contains 1,028 Arabic question-answer pairs (924 non-first-aid QA pairs and 104 first-aid QA pairs). The dataset exhibits a strong class imbalance, with approximately 90% non-first-aid queries and 10% first-aid queries. The data was split into training, development, and test sets while preserving the original class distribution.
For more details about the FALAH and FALAH-Mix datasets, developed as part of the PFE project, please refer to: https://huggingface.co/datasets/imaneumabderahmane/FALAH
To mitigate class imbalance, the training set was augmented with additional first-aid samples from external datasets, including the Mayo Clinic First-Aid dataset (374 first-aid QA pairs) and the 68 first-aid QA pairs from the AHD dataset. This resulted in a balanced training set of 1,184 samples, while the development and test sets remained unchanged.
The following table presents the FALAH-Mix dataset before balancing the training set:
The following table presents the FALAH-Mix dataset after balancing the training set:
Note: The FALAH-Mix dataset was split according to the emergency labels. As a result, the training, development, and test sets each contain approximately 90% nonโfirst-aid QA pairs and 10% first-aid QA pairs. The balancing strategy was applied only to the training set in order to evaluate its impact on the classifier.
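The balancing strategy described above, augmenting only the training split with external first-aid samples while leaving the development and test sets untouched, can be sketched as follows. The record fields, helper name, and toy counts are illustrative assumptions, not the project's actual code:

```python
import random

def balance_with_external(train, external, seed=0):
    """Append external minority-class samples to the training split only,
    mirroring the FALAH-Mix strategy (dev/test splits stay unchanged)."""
    balanced = list(train) + list(external)
    random.Random(seed).shuffle(balanced)
    return balanced

# Toy illustration of the card's ~90/10 imbalance in miniature:
train = [{"text": f"q{i}", "label": "non-first-aid"} for i in range(9)] \
      + [{"text": "first aid for burns?", "label": "first-aid"}]
external = [{"text": f"fa{i}", "label": "first-aid"} for i in range(8)]

balanced = balance_with_external(train, external)
```

In the card's setup, the 374 Mayo Clinic QA pairs and 68 AHD QA pairs play the role of `external`, bringing the training set to 1,184 samples.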
Training Procedure
Preprocessing
- Arabic text normalization
- Translating questions from Arabic Dialect into Modern Standard Arabic using this model: https://huggingface.co/HamzaNaser/Dialects-to-MSA-Transformer
- Tokenization using AraBERT tokenizer
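The normalization step above is not specified in detail; a minimal sketch of a common Arabic normalization recipe (diacritic stripping, alef/yaa/taa-marbuta unification) might look like the following. The exact rules used for FA-AraBERT may differ:

```python
import re

# Tashkeel (harakat, tanween, sukun) plus tatweel
DIACRITICS = re.compile(r"[\u064B-\u0652\u0640]")

def normalize_arabic(text: str) -> str:
    text = DIACRITICS.sub("", text)        # strip diacritics and tatweel
    text = re.sub("[إأآ]", "ا", text)       # unify alef variants
    text = re.sub("ى", "ي", text)           # alef maqsura -> yaa
    text = re.sub("ة", "ه", text)           # taa marbuta -> haa
    return re.sub(r"\s+", " ", text).strip()
```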
Training Hyperparameters
The models were fine-tuned using supervised learning with the following configuration:
- Optimizer: AdamW
- Learning rate: 3 × 10⁻⁵
- Batch size: 16
- Epochs: 3
- Loss function: Cross-entropy
- Mixed-precision training enabled
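Under the Hugging Face Trainer API, this configuration could be expressed roughly as below; the output path is a placeholder, and cross-entropy is the Trainer's default loss for sequence classification, so it needs no explicit setting:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="fa-arabert-ckpt",     # placeholder checkpoint directory
    learning_rate=3e-5,
    per_device_train_batch_size=16,
    num_train_epochs=3,
    optim="adamw_torch",              # AdamW optimizer
    fp16=True,                        # mixed precision; needs a CUDA device
)
```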
Evaluation
Testing Data, Factors & Metrics
Testing Data
The test split from the FALAH-Mix dataset: https://huggingface.co/datasets/imaneumabderahmane/FALAH
Metrics
- Macro F1-score, chosen to account for class imbalance
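Macro F1 averages the per-class F1 scores with equal weight, so the minority first-aid class counts as much as the majority class. A minimal reference implementation, equivalent to scikit-learn's `f1_score(..., average="macro")`:

```python
def macro_f1(y_true, y_pred, labels=(0, 1)):
    """Per-class F1 scores averaged with equal weight per class."""
    f1s = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)
```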
Results
The following table summarizes the macro F1 scores obtained by FA-AraBERTv2 and FA-AraBERTv0.2 under different training configurations.
Base model: aubmindlab/bert-base-arabertv02