A2NLP at StanceNakba 2026: AraBERT-Based Arabic Stance Detection (Subtask B)
This repository contains the official submission of Team A2NLP to the StanceNakba 2026 Shared Task – Subtask B (Topic-Based Stance Detection), co-located with LREC-COLING 2026.
aomar85/A2NLP-STANCENAKBA2026-CROSS-TOPIC
Base model: aubmindlab/bert-base-arabertv02-twitter
This model is a fine-tuned version of aubmindlab/bert-base-arabertv02-twitter on the official StanceNakba 2026 Subtask B dataset. It achieves the following results on the evaluation set:
- Loss: 0.4543
- Accuracy: 0.8431
- Macro F1: 0.8412
- Weighted F1: 0.8425
- F1 Pro: 0.8732
- F1 Against: 0.8472
- F1 Neutral: 0.8033
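As a quick sanity check, the reported Macro F1 is the unweighted mean of the three per-class F1 scores listed above:

```python
# Macro F1 = unweighted mean of the per-class F1 scores (pro, against, neutral)
per_class_f1 = {"pro": 0.8732, "against": 0.8472, "neutral": 0.8033}
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)
print(round(macro_f1, 4))  # 0.8412, matching the reported Macro F1
```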
Model Description
This model performs Arabic stance classification with three labels:
pro, against, neutral
The architecture is based on BERT-base and fine-tuned using a prompt-based input formulation that explicitly conditions stance prediction on the topic.
Input Format
During training and inference, each instance is formatted as:
الهدف: {topic} [SEP] الموقف من: {sentence}
This prompt-based concatenation strategy was used to explicitly inject the topic context into the transformer encoder.
The model outputs a probability distribution over the three stance labels using a softmax classification head.
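A minimal, dependency-free sketch of the two steps above: building the topic-conditioned prompt (the Arabic template is reconstructed from this card; the exact wording may differ), and turning classifier logits into a probability distribution with a softmax. The example logits are illustrative only:

```python
import math

def format_instance(topic: str, sentence: str) -> str:
    # Prompt-based concatenation: the topic is injected before the
    # sentence so the encoder sees both in a single input sequence.
    return f"الهدف: {topic} [SEP] الموقف من: {sentence}"

def softmax(logits):
    # The classification head maps raw logits to a probability
    # distribution over the three stance labels.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

labels = ["pro", "against", "neutral"]
probs = softmax([2.1, 0.3, -1.0])          # example logits
prediction = labels[probs.index(max(probs))]
```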
Intended Use
Arabic stance detection in social media text.
Topic-conditioned stance classification.
Research and shared-task benchmarking.
The model is particularly suited for:
Political discourse analysis.
Arabic Twitter stance modeling.
Experimental NLP research on stance detection.
Limitations
The model was trained on Arabic Twitter-style text and may not generalize well to:
Formal Arabic prose
Long documents
Non-political domains
The model is sensitive to distribution shift, and performance may degrade on unseen topics.
No external data augmentation was used.
This model should not be used for high-stakes automated decision-making.
Training and Evaluation Data
The model was trained on the official StanceNakba Subtask B train/validation dataset provided by the shared task organizers.
Language: Arabic
Domain: Social media (Twitter-style text)
Labels: pro, against, neutral
No external labeled data were used.
Data Processing
The following preprocessing steps were applied:
Emoji normalization using emoji.demojize
Removal of non-Arabic characters
Removal of URLs, mentions, and hashtags
Diacritics removal
Arabic normalization:
Alef variants → ا
ى → ي
ة → ه
Removal of tatweel
Whitespace normalization
Duplicate sentence removal
Label encoding:
{"pro": 0, "against": 1, "neutral": 2}
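The cleaning steps above can be sketched with the standard library alone (the emoji.demojize step and dataset-level duplicate removal are omitted here to keep the example self-contained; the exact regexes used by the team may differ):

```python
import re

ARABIC_DIACRITICS = re.compile(r"[\u0610-\u061A\u064B-\u065F\u0670]")
TATWEEL = "\u0640"

def preprocess(text: str) -> str:
    """Sketch of the card's cleaning pipeline for one sentence."""
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)   # URLs
    text = re.sub(r"[@#]\S+", " ", text)                 # mentions, hashtags
    text = re.sub(r"[^\u0600-\u06FF\s]", " ", text)      # non-Arabic characters
    text = ARABIC_DIACRITICS.sub("", text)               # diacritics
    text = re.sub("[إأآ]", "ا", text)                    # Alef variants -> ا
    text = text.replace("ى", "ي").replace("ة", "ه")      # ى -> ي, ة -> ه
    text = text.replace(TATWEEL, "")                     # tatweel
    return re.sub(r"\s+", " ", text).strip()             # whitespace

LABEL2ID = {"pro": 0, "against": 1, "neutral": 2}
```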
Training procedure
Cross-Validation Strategy
The model was trained using:
5-fold Stratified Cross-Validation
StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
This repository corresponds to Fold 4, which achieved the best validation macro-F1 score.
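Stratified splitting keeps the label proportions of the full dataset in every fold. A dependency-free sketch of what the StratifiedKFold call above does (sklearn's actual implementation differs in detail):

```python
import random
from collections import defaultdict

def stratified_folds(labels, n_splits=5, seed=42):
    # Group indices by label, shuffle within each label, then deal
    # them round-robin so every fold preserves the label proportions.
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, y in enumerate(labels):
        by_label[y].append(idx)
    folds = [[] for _ in range(n_splits)]
    for idxs in by_label.values():
        rng.shuffle(idxs)
        for i, idx in enumerate(idxs):
            folds[i % n_splits].append(idx)
    return folds

# Toy label distribution for illustration
labels = ["pro"] * 50 + ["against"] * 30 + ["neutral"] * 20
folds = stratified_folds(labels, n_splits=5)
```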
Loss Function
To address class imbalance, weighted cross-entropy loss was used.
Class weights were computed using:
compute_class_weight(class_weight="balanced")
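The "balanced" heuristic assigns each class a weight of n_samples / (n_classes * count_c), so rarer classes contribute proportionally more to the loss. A pure-Python replica of that formula (the toy counts are illustrative, not the task's actual label distribution):

```python
from collections import Counter

def balanced_class_weights(labels, classes):
    # Same formula as sklearn's compute_class_weight(class_weight="balanced"):
    # weight_c = n_samples / (n_classes * count_c)
    counts = Counter(labels)
    n, k = len(labels), len(classes)
    return [n / (k * counts[c]) for c in classes]

weights = balanced_class_weights(
    ["pro"] * 50 + ["against"] * 30 + ["neutral"] * 20,
    classes=["pro", "against", "neutral"],
)
```

These weights are then passed to the cross-entropy loss so that misclassifying a minority-class example is penalized more heavily.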
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: adamw_torch_fused (betas=(0.9, 0.999), epsilon=1e-08, no additional optimizer arguments)
- lr_scheduler_type: linear
- num_epochs: 10
- mixed_precision_training: Native AMP
Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | Macro F1 | Weighted F1 | F1 Pro | F1 Against | F1 Neutral |
|---|---|---|---|---|---|---|---|---|---|
| 0.9504 | 0.9615 | 50 | 0.7297 | 0.7157 | 0.7151 | 0.7152 | 0.7143 | 0.7191 | 0.7119 |
| 0.6594 | 1.9231 | 100 | 0.5215 | 0.7990 | 0.8000 | 0.8009 | 0.8421 | 0.7939 | 0.7639 |
| 0.4548 | 2.8846 | 150 | 0.4543 | 0.8284 | 0.8277 | 0.8288 | 0.8613 | 0.8310 | 0.7907 |
| 0.3114 | 3.8462 | 200 | 0.4538 | 0.8431 | 0.8412 | 0.8425 | 0.8732 | 0.8472 | 0.8033 |
| 0.2346 | 4.8077 | 250 | 0.5162 | 0.8382 | 0.8360 | 0.8371 | 0.8552 | 0.8493 | 0.8034 |
| 0.1895 | 5.7692 | 300 | 0.5695 | 0.8186 | 0.8180 | 0.8195 | 0.8592 | 0.8244 | 0.7704 |
Framework versions
- Transformers 5.0.0
- Pytorch 2.9.0+cu128
- Datasets 4.0.0
- Tokenizers 0.22.2
Reproducibility
- 5-fold Stratified Cross-Validation
- Seed: 42
- Weighted Cross-Entropy Loss
- Early Stopping (patience=2)
- Best checkpoint selected by Macro-F1
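A hypothetical re-creation of this setup with the Transformers Trainer API; the argument names follow the library, the values come from this card, and output_dir and the metric name are assumptions:

```python
from transformers import TrainingArguments, EarlyStoppingCallback

args = TrainingArguments(
    output_dir="arabert-stance-fold4",   # hypothetical path
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=10,
    seed=42,
    lr_scheduler_type="linear",
    fp16=True,                           # Native AMP
    eval_strategy="steps",
    load_best_model_at_end=True,
    metric_for_best_model="macro_f1",    # best checkpoint selected by Macro-F1
)
# Passed to Trainer(callbacks=[...]) to stop after 2 evaluations without improvement
early_stopping = EarlyStoppingCallback(early_stopping_patience=2)
```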