NorBERT4 Norwegian QA Embedding Model

This is NorBERT4-NLI-QA, a Norwegian sentence embedding model optimized for question-answering and semantic retrieval tasks.

Model Description

This model is the result of a two-stage curriculum-learning approach:

  • Stage 1 (V1): Fine-tuned on 569k Norwegian NLI samples for semantic understanding
  • Stage 2 (This model): Further fine-tuned on 103k Norwegian/Danish QA and paraphrase samples

Training Details

Stage 2 Training Configuration

  • Base model: thivy/norbert4-base-nli-norwegian
  • Datasets: NorQuAD (3.8k), NorOpenBookQA (2.9k), PAWS-X NO (21.8k), Supervised-DA (74.5k)
  • Total samples: ~103,000
  • Training objective: MultipleNegativesRankingLoss with in-batch negatives
  • Max sequence length: 128 tokens
  • Batch size: 32
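
The training objective above, MultipleNegativesRankingLoss with in-batch negatives, treats each (question, answer) pair in a batch as a positive and every other answer in the same batch as a negative. A minimal NumPy sketch of that objective (scale factor and normalization follow the common convention; this is an illustration, not the exact training code):

```python
import numpy as np

def mnrl_loss(anchors, positives, scale=20.0):
    """Multiple-negatives ranking loss over one batch.

    anchors:   (batch, dim) question embeddings
    positives: (batch, dim) answer embeddings; row i is the positive for
               anchor i, and every other row acts as an in-batch negative.
    """
    # L2-normalize so dot products become cosine similarities
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    scores = scale * (a @ p.T)  # (batch, batch) similarity matrix
    # Cross-entropy where the correct "class" for row i is column i
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))
```

With a batch size of 32, each question is therefore contrasted against 31 negatives for free, which is why larger batches generally help this loss.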

Hyperparameters (tuned to reduce overfitting)

  • Learning rate: 5.0e-6 (75% lower than baseline)
  • Warmup: 0.0 (removed to prevent early peaking)
  • Weight decay: 0.015 (50% stronger regularization)
  • Gradient clipping: 1.0
  • Early stopping patience: 5
  • LR scheduler: Cosine decay

Performance Metrics

NDCG@10 Progression:

Step   NDCG@10   Change
 500   0.8781    Best
1000   0.8720    -0.69%
1500   0.8693    -1.00%
2000   0.8695    -0.98%
2500   0.8677    -1.18%

Evaluation Metrics (at step 500, the best checkpoint):

  • NDCG@10: 0.8781
  • MRR@10: 0.8640
  • MAP@100: 0.8658
  • Accuracy@1: 0.8331
  • Accuracy@10: 0.9219
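
For readers unfamiliar with these retrieval metrics: given the 1-based rank at which each query's relevant document is retrieved, MRR@k averages the reciprocal ranks and Accuracy@k counts hits in the top k. A small reference implementation (not tied to this model's evaluation harness):

```python
def mrr_at_k(ranks, k=10):
    """Mean reciprocal rank. `ranks` holds the 1-based rank of the first
    relevant document per query, or None when nothing relevant was found."""
    return sum(1.0 / r for r in ranks if r is not None and r <= k) / len(ranks)

def accuracy_at_k(ranks, k):
    """Fraction of queries whose relevant document appears in the top k."""
    return sum(1 for r in ranks if r is not None and r <= k) / len(ranks)
```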

Training Stability:

  • Eval loss: Decreased by 3.3% (0.0911 → 0.0881)
  • Best checkpoint: Step 500 (19.7% through training)
  • Final degradation: 1.18% (lowest among all variants tested)

Usage

from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim

# Load model
model = SentenceTransformer('thivy/norbert4-norwegian-qa')

# Encode sentences
sentences = [
    "Hva er hovedstaden i Norge?",
    "Oslo er hovedstaden i Norge.",
    "Bergen er en by på vestlandet."
]
embeddings = model.encode(sentences)

# Cosine similarity between the question and each candidate answer
similarities = cos_sim(embeddings[0], embeddings[1:])
print(similarities)
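
The same pattern extends to retrieval: encode a query and a corpus once, then rank corpus entries by cosine similarity. A minimal ranking sketch over precomputed embeddings (the toy vectors below stand in for `model.encode` output):

```python
import numpy as np

def rank_corpus(query_emb, corpus_embs, top_k=3):
    """Return (corpus_index, cosine_similarity) pairs, best match first."""
    q = query_emb / np.linalg.norm(query_emb)
    c = corpus_embs / np.linalg.norm(corpus_embs, axis=1, keepdims=True)
    sims = c @ q                      # cosine similarity per corpus entry
    order = np.argsort(-sims)[:top_k] # highest similarity first
    return [(int(i), float(sims[i])) for i in order]
```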

Intended Use

This model is designed for:

  • ✅ Norwegian question-answering systems
  • ✅ Semantic search and retrieval
  • ✅ Document similarity
  • ✅ Paraphrase detection
  • ✅ Cross-lingual tasks (Norwegian, Danish, Swedish)

Training Data

Stage 1 (NLI - Base Model)

  • NorNLI Combined: 569,000 samples
  • Format: Premise-hypothesis pairs with entailment labels

Stage 2 (QA & Paraphrase - This Model)

  • NorQuAD: 3,808 Norwegian question-answer pairs
  • NorOpenBookQA: 2,886 Norwegian QA samples
  • PAWS-X Norwegian: 21,829 paraphrase pairs
  • Supervised-DA: 74,560 Danish sentence pairs

Limitations

  • Optimized primarily for Norwegian text (with Danish/Swedish support)
  • Maximum sequence length: 128 tokens
  • Best performance on question-answering and retrieval tasks
  • May require domain adaptation for specialized domains
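
Because inputs beyond 128 tokens are truncated, long documents are often split into overlapping chunks before encoding. A hypothetical word-level chunker sketching that workaround (real token counts depend on the NorBERT4 tokenizer, so word limits here are only a rough proxy):

```python
def chunk_words(text, max_words=96, stride=48):
    """Split text into overlapping word-window chunks.

    max_words is kept below the 128-token limit to leave headroom for
    subword tokenization; stride controls the overlap between chunks.
    """
    words = text.split()
    chunks = []
    for start in range(0, max(len(words) - stride, 1), stride):
        chunks.append(" ".join(words[start:start + max_words]))
    return chunks
```

Each chunk can then be encoded separately and the chunk with the highest similarity used as the document's score.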

Model Card Authors

Thivyesh Ahilathasan

Citation

If you use this model, please cite:

@misc{norbert4-nli-qa,
  author = {Ahilathasan, Thivyesh},
  title = {NorBERT4 Norwegian QA Embedding Model},
  year = {2026},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/thivy/norbert4-norwegian-qa}}
}