A2NLP at StanceNakba 2026: AraBERT-Based Arabic Stance Detection (Subtask B)

This repository contains the official submission of Team A2NLP to the StanceNakba 2026 Shared Task – Subtask B (Topic-Based Stance Detection), co-located with LREC-COLING 2026.

aomar85/A2NLP-STANCENAKBA2026-CROSS-TOPIC

Base model: aubmindlab/bert-base-arabertv02-twitter

This model is a fine-tuned version of aubmindlab/bert-base-arabertv02-twitter on the official StanceNakba 2026 Subtask B dataset (see Training and Evaluation Data below). It achieves the following results on the evaluation set:

  • Loss: 0.4543
  • Accuracy: 0.8431
  • Macro F1: 0.8412
  • Weighted F1: 0.8425
  • F1 Pro: 0.8732
  • F1 Against: 0.8472
  • F1 Neutral: 0.8033

Model Description

This model performs Arabic stance classification with three labels:

  • pro
  • against
  • neutral

The architecture is based on BERT-base and fine-tuned using a prompt-based input formulation that explicitly conditions stance prediction on the topic.

Input Format

During training and inference, each instance is formatted as:

ุงู„ู‡ุฏู: {topic} [SEP] ุงู„ู…ูˆู‚ู ู…ู†: {sentence}

This prompt-based concatenation strategy was used to explicitly inject the topic context into the transformer encoder.

The model outputs a probability distribution over the three stance labels using a softmax classification head.
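The topic-conditioned input formulation above can be sketched as a small helper; this is a minimal sketch, not the authors' code, and the model-loading lines (shown as comments) assume the standard transformers API and the repository ID from this card.

```python
# Hypothetical helper reproducing the card's prompt-based input format.
# The Arabic template reads "Target: {topic} [SEP] Stance toward: {sentence}".

ID2LABEL = {0: "pro", 1: "against", 2: "neutral"}  # matches the card's label encoding


def format_input(topic: str, sentence: str) -> str:
    """Concatenate topic and sentence in the model's training-time prompt format."""
    return f"الهدف: {topic} [SEP] الموقف من: {sentence}"


# At inference time (sketch; requires the transformers library and model weights):
#   from transformers import AutoTokenizer, AutoModelForSequenceClassification
#   tok = AutoTokenizer.from_pretrained("aomar85/A2NLP-STANCENAKBA2026-CROSS-TOPIC")
#   model = AutoModelForSequenceClassification.from_pretrained(
#       "aomar85/A2NLP-STANCENAKBA2026-CROSS-TOPIC")
#   enc = tok(format_input(topic, sentence), return_tensors="pt", truncation=True)
#   probs = model(**enc).logits.softmax(dim=-1)  # distribution over pro/against/neutral
```

Whether the literal "[SEP]" string or the tokenizer's special token was used is taken from the template as printed above.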

Intended Use

  • Arabic stance detection in social media text.

  • Topic-conditioned stance classification.

  • Research and shared-task benchmarking.

The model is particularly suited for:

  • Political discourse analysis.

  • Arabic Twitter stance modeling.

  • Experimental NLP research on stance detection.

Limitations

  • The model was trained on Arabic Twitter-style text and may not generalize well to:

    • Formal Arabic prose

    • Long documents

    • Non-political domains

  • Sensitive to distribution shift.

  • Performance may degrade on unseen topics.

  • No external data augmentation was used.

This model should not be used for high-stakes automated decision-making.

Training and Evaluation Data

The model was trained on the official StanceNakba Subtask B train/validation dataset provided by the shared task organizers.

  • Language: Arabic

  • Domain: Social media (Twitter-style text)

  • Labels: pro, against, neutral

No external labeled data were used.

Data Processing

The following preprocessing steps were applied:

  • Emoji normalization using emoji.demojize

  • Removal of non-Arabic characters

  • Removal of URLs, mentions, and hashtags

  • Diacritics removal

  • Arabic normalization:

    • Alef variants → ا

    • ى → ي

    • ة → ه

    • Removal of tatweel

  • Whitespace normalization

  • Duplicate sentence removal

  • Label encoding:

    {"pro": 0, "against": 1, "neutral": 2}
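The preprocessing steps above can be sketched with the standard library alone; this is an illustrative reimplementation, not the authors' script, and the emoji.demojize step is only noted in a comment since it needs the third-party emoji package.

```python
import re

DIACRITICS = re.compile("[\u064B-\u0652]")  # Arabic tashkeel marks
URLS_MENTIONS_TAGS = re.compile(r"https?://\S+|www\.\S+|@\w+|#\w+")

LABEL2ID = {"pro": 0, "against": 1, "neutral": 2}  # label encoding from the card


def preprocess(text: str) -> str:
    # text = emoji.demojize(text)                # emoji normalization (needs `emoji` pkg)
    text = URLS_MENTIONS_TAGS.sub(" ", text)     # drop URLs, mentions, hashtags
    text = DIACRITICS.sub("", text)              # remove diacritics
    text = text.replace("\u0640", "")            # remove tatweel
    text = re.sub("[\u0622\u0623\u0625]", "\u0627", text)  # alef variants -> ا
    text = text.replace("\u0649", "\u064A")      # ى -> ي
    text = text.replace("\u0629", "\u0647")      # ة -> ه
    text = re.sub(r"[^\u0621-\u064A\s]", " ", text)  # strip non-Arabic characters
    return re.sub(r"\s+", " ", text).strip()     # whitespace normalization
```

Duplicate-sentence removal would then be a set-based pass over the cleaned strings.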

Training procedure

Cross-Validation Strategy

The model was trained using:

  • 5-fold Stratified Cross-Validation

  • StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

This repository corresponds to Fold 4, which achieved the best validation macro-F1 score.
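The cross-validation setup can be sketched as follows; the toy labels stand in for the real Subtask B data, which is not redistributed here, and scikit-learn is assumed.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Toy stand-in for the Subtask B training data (real texts/labels assumed).
labels = np.array(["pro", "against", "neutral"] * 10)
texts = np.array([f"sentence_{i}" for i in range(len(labels))])

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
splits = list(skf.split(texts, labels))

# This card corresponds to "Fold 4" in the authors' numbering; each of the
# five (train_idx, val_idx) pairs preserves the overall label proportions.
train_idx, val_idx = splits[4]
```

Stratification matters here because the three stance classes are imbalanced; a plain KFold could leave a fold short of one class.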


Loss Function

To address class imbalance, weighted cross-entropy loss was used.

Class weights were computed using:

compute_class_weight(class_weight="balanced")
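A minimal sketch of the weight computation, using toy imbalanced label counts (the real Subtask B label array is assumed); the torch line is commented because it needs PyTorch installed.

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Toy imbalanced labels using the card's ids: 0=pro, 1=against, 2=neutral.
y_train = np.array([0] * 60 + [1] * 30 + [2] * 10)

weights = compute_class_weight(class_weight="balanced",
                               classes=np.array([0, 1, 2]),
                               y=y_train)
# "balanced" weight for class c = n_samples / (n_classes * n_c),
# so rarer classes (neutral here) get proportionally larger weights.

# The weights would then feed a weighted cross-entropy loss, e.g.:
#   loss_fn = torch.nn.CrossEntropyLoss(
#       weight=torch.tensor(weights, dtype=torch.float))
```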

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: AdamW (adamw_torch_fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 10
  • mixed_precision_training: Native AMP
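The hyperparameters above map onto a transformers TrainingArguments configuration roughly as follows; this is a sketch, with the output directory and the 50-step evaluation/save cadence (read off the results table) as assumptions, and argument names may differ slightly across transformers versions.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="a2nlp-stancenakba-fold4",  # hypothetical path
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch_fused",
    lr_scheduler_type="linear",
    num_train_epochs=10,
    fp16=True,                             # native AMP mixed precision
    eval_strategy="steps",                 # evaluation every 50 steps, per the table
    eval_steps=50,
    save_steps=50,
    load_best_model_at_end=True,
    metric_for_best_model="macro_f1",      # best checkpoint selected by Macro-F1
    greater_is_better=True,
)
```

Early stopping with patience 2 (see Reproducibility) would be attached via transformers.EarlyStoppingCallback(early_stopping_patience=2) on the Trainer.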

Training results

| Training Loss | Epoch  | Step | Validation Loss | Accuracy | Macro F1 | Weighted F1 | F1 Pro | F1 Against | F1 Neutral |
|--------------:|-------:|-----:|----------------:|---------:|---------:|------------:|-------:|-----------:|-----------:|
| 0.9504        | 0.9615 | 50   | 0.7297          | 0.7157   | 0.7151   | 0.7152      | 0.7143 | 0.7191     | 0.7119     |
| 0.6594        | 1.9231 | 100  | 0.5215          | 0.7990   | 0.8000   | 0.8009      | 0.8421 | 0.7939     | 0.7639     |
| 0.4548        | 2.8846 | 150  | 0.4543          | 0.8284   | 0.8277   | 0.8288      | 0.8613 | 0.8310     | 0.7907     |
| 0.3114        | 3.8462 | 200  | 0.4538          | 0.8431   | 0.8412   | 0.8425      | 0.8732 | 0.8472     | 0.8033     |
| 0.2346        | 4.8077 | 250  | 0.5162          | 0.8382   | 0.8360   | 0.8371      | 0.8552 | 0.8493     | 0.8034     |
| 0.1895        | 5.7692 | 300  | 0.5695          | 0.8186   | 0.8180   | 0.8195      | 0.8592 | 0.8244     | 0.7704     |

Framework versions

  • Transformers 5.0.0
  • PyTorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2

Reproducibility

  • 5-fold Stratified Cross-Validation
  • Seed: 42
  • Weighted Cross-Entropy Loss
  • Early Stopping (patience=2)
  • Best checkpoint selected by Macro-F1