A2NLP at StanceNakba 2026: AraBERT-Based Arabic Stance Detection (Subtask B)
This repository contains the official submission of Team A2NLP to the StanceNakba 2026 Shared Task, Subtask B (Topic-Based Stance Detection), co-located with LREC-COLING 2026.
Base model: aubmindlab/bert-base-arabertv02-twitter
Best Validation Macro-F1: 0.8434
Team A2NLP
A2NLP is a research team focusing on Arabic Natural Language Processing, with interests in stance detection, political discourse analysis, and transformer-based modeling.
A2NLP-StanceNakba2026-SubtaskB-AraBERTv02-Twitter
This model is a fine-tuned version of aubmindlab/bert-base-arabertv02-twitter on the official StanceNakba 2026 Subtask B dataset. It achieves the following results on the evaluation set:
- Loss: 0.4374
- Accuracy: 0.8452
- Macro F1: 0.8434
- Weighted F1: 0.8443
- F1 Pro: 0.8870
- F1 Against: 0.8346
- F1 Neutral: 0.8085
Model Description
This model performs Arabic stance classification with three labels: `pro`, `against`, and `neutral`.
The architecture is based on BERT-base and fine-tuned using a prompt-based input formulation that explicitly conditions stance prediction on the topic.
Input Format
During training and inference, each instance is formatted as:
الهدف: {topic} [SEP] الموقف من: {sentence}

(roughly: "Topic: {topic} [SEP] Stance toward: {sentence}")
This prompt-based concatenation strategy was used to explicitly inject the topic context into the transformer encoder.
The model outputs a probability distribution over the three stance labels using a softmax classification head.
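As a usage illustration, here is a minimal inference sketch in Python. The prompt string mirrors the (reconstructed) template above, `max_length=128` is an assumption, and the label order follows the encoding listed under Data Processing:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_ID = "aomar85/A2NLP-StanceNakba2026-SubtaskB-AraBERTv02-Twitter"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
model.eval()

# Label encoding from the Data Processing section of this card.
ID2LABEL = {0: "pro", 1: "against", 2: "neutral"}

def predict_stance(topic: str, sentence: str) -> dict:
    # Topic-conditioned prompt: topic and sentence share one input sequence.
    text = f"الهدف: {topic} [SEP] الموقف من: {sentence}"
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.softmax(logits, dim=-1).squeeze(0)
    # Probability distribution over the three stance labels (softmax head).
    return {ID2LABEL[i]: round(p.item(), 4) for i, p in enumerate(probs)}
```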
Intended Uses & Limitations
Intended Use
- Arabic stance detection in social media text.
- Topic-conditioned stance classification.
- Research and shared-task benchmarking.

The model is particularly suited for:
- Political discourse analysis.
- Arabic Twitter stance modeling.
- Experimental NLP research on stance detection.
Limitations
The model was trained on Arabic Twitter-style text and may not generalize well to:
- Formal Arabic prose
- Long documents
- Non-political domains

It is sensitive to distribution shift, and performance may degrade on unseen topics. No external data augmentation was used.

This model should not be used for high-stakes automated decision-making.
Training and Evaluation Data
The model was trained on the official StanceNakba Subtask B train/validation dataset provided by the shared task organizers.
- Language: Arabic
- Domain: Social media (Twitter-style text)
- Labels: `pro`, `against`, `neutral`
No external labeled data were used.
Data Processing
The following preprocessing steps were applied (a sketch follows this list):
- Emoji normalization using `emoji.demojize`
- Removal of non-Arabic characters
- Removal of URLs, mentions, and hashtags
- Diacritics removal
- Arabic character normalization:
  - Alef variants (أ, إ, آ) → ا
  - ى → ي
  - ة → ه
- Removal of tatweel
- Whitespace normalization
- Duplicate sentence removal
- Label encoding: `{"pro": 0, "against": 1, "neutral": 2}`
Training procedure
Cross-Validation Strategy
The model was trained using:
5-fold Stratified Cross-Validation
StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
This repository corresponds to Fold 4, which achieved the best validation macro-F1 score.
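A sketch of the fold construction, with placeholder data standing in for the preprocessed sentences and their encoded labels:

```python
from sklearn.model_selection import StratifiedKFold

# Placeholder data; in the actual pipeline these are the preprocessed
# sentences and their encoded labels (pro=0, against=1, neutral=2).
texts = [f"sentence {i}" for i in range(30)]
labels = [i % 3 for i in range(30)]

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
for fold, (train_idx, val_idx) in enumerate(skf.split(texts, labels)):
    # One model is fine-tuned per fold; this repository is fold 4,
    # the fold with the best validation macro-F1.
    print(f"fold {fold}: {len(train_idx)} train / {len(val_idx)} val")
```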
Loss Function
To address class imbalance, weighted cross-entropy loss was used.
Class weights were computed using:
compute_class_weight(class_weight="balanced")
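A common way to wire the weighted loss into the `Trainer` API is to override `compute_loss`; the sketch below assumes that pattern (the team's exact implementation is not published) and uses placeholder fold labels:

```python
import numpy as np
import torch
from sklearn.utils.class_weight import compute_class_weight
from transformers import Trainer

# Placeholder: integer-encoded labels of the current training fold.
train_labels = np.array([0, 0, 0, 0, 1, 1, 2])

weights = compute_class_weight(
    class_weight="balanced", classes=np.array([0, 1, 2]), y=train_labels
)
class_weights = torch.tensor(weights, dtype=torch.float)

class WeightedTrainer(Trainer):
    # Applies the class weights inside cross-entropy; a common pattern,
    # assumed here since the team's exact implementation is not published.
    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        loss_fct = torch.nn.CrossEntropyLoss(
            weight=class_weights.to(outputs.logits.device)
        )
        loss = loss_fct(outputs.logits, labels)
        return (loss, outputs) if return_outputs else loss
```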
Training hyperparameters
The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: adamw_torch_fused (AdamW) with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10
- mixed_precision_training: Native AMP
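Expressed as `TrainingArguments`, these settings look roughly as follows; the 50-step evaluation/save cadence is inferred from the results table below, and `output_dir` is a placeholder:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="a2nlp-stance-fold4",  # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=10,
    lr_scheduler_type="linear",
    optim="adamw_torch_fused",
    seed=42,
    fp16=True,              # native AMP mixed precision
    eval_strategy="steps",  # 50-step cadence inferred from the table below
    eval_steps=50,
    save_strategy="steps",
    save_steps=50,
    load_best_model_at_end=True,
    metric_for_best_model="macro_f1",
    greater_is_better=True,
)
```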
Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | Macro F1 | Weighted F1 | F1 Pro | F1 Against | F1 Neutral |
|---|---|---|---|---|---|---|---|---|---|
| 0.9856 | 1.1628 | 50 | 0.7085 | 0.7321 | 0.7342 | 0.7334 | 0.7963 | 0.6723 | 0.7339 |
| 0.6344 | 2.3256 | 100 | 0.4542 | 0.8333 | 0.8330 | 0.8334 | 0.8571 | 0.8254 | 0.8163 |
| 0.4127 | 3.4884 | 150 | 0.4393 | 0.8452 | 0.8434 | 0.8443 | 0.8870 | 0.8346 | 0.8085 |
| 0.2872 | 4.6512 | 200 | 0.4369 | 0.8274 | 0.8265 | 0.8276 | 0.8727 | 0.8189 | 0.7879 |
| 0.2028 | 5.8140 | 250 | 0.4617 | 0.8333 | 0.8331 | 0.8332 | 0.8522 | 0.8224 | 0.8246 |
Framework versions
- Transformers 5.0.0
- Pytorch 2.9.0+cu128
- Datasets 4.0.0
- Tokenizers 0.22.2
Reproducibility
- 5-fold Stratified Cross-Validation
- Seed: 42
- Weighted Cross-Entropy Loss
- Early Stopping (patience=2)
- Best checkpoint selected by Macro-F1 (see the metrics sketch below)
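A sketch of the metric computation and early-stopping setup consistent with these settings; the metric keys match `metric_for_best_model="macro_f1"` in the arguments sketch above:

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score
from transformers import EarlyStoppingCallback

def compute_metrics(eval_pred):
    logits, refs = eval_pred
    preds = np.argmax(logits, axis=-1)
    per_class = f1_score(refs, preds, average=None, labels=[0, 1, 2])
    return {
        "accuracy": accuracy_score(refs, preds),
        "macro_f1": f1_score(refs, preds, average="macro"),
        "weighted_f1": f1_score(refs, preds, average="weighted"),
        "f1_pro": per_class[0],
        "f1_against": per_class[1],
        "f1_neutral": per_class[2],
    }

# Stop after 2 evaluations without macro-F1 improvement, then reload
# the best checkpoint (load_best_model_at_end=True above).
callbacks = [EarlyStoppingCallback(early_stopping_patience=2)]
```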