# NorBERT4 Norwegian QA Embedding Model

This is NorBERT4-NLI-QA, a Norwegian sentence embedding model optimized for question-answering and semantic retrieval tasks.

## Model Description

This model is the result of a two-stage curriculum learning approach:
- Stage 1 (V1): Fine-tuned on 569k Norwegian NLI samples for semantic understanding
- Stage 2 (This model): Further fine-tuned on 103k Norwegian/Danish QA and paraphrase samples
## Training Details

### Stage 2 Training Configuration

- Base model: thivy/norbert4-base-nli-norwegian
- Datasets: NorQuAD (3.8k), NorOpenBookQA (2.9k), PAWS-X NO (21.8k), Supervised-DA (74.5k)
- Total samples: ~103,000
- Training objective: MultipleNegativesRankingLoss with in-batch negatives
- Max sequence length: 128 tokens
- Batch size: 32
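MultipleNegativesRankingLoss treats row *i* of the positives as the correct match for anchor *i* and every other positive in the batch as a negative. The following is a minimal plain-NumPy sketch of that objective, not the sentence-transformers implementation; the similarity scale of 20 is an assumption matching the library's default:

```python
import numpy as np

def mnr_loss(anchors, positives, scale=20.0):
    """MultipleNegativesRankingLoss with in-batch negatives (NumPy sketch).

    Row i of `positives` is the positive for row i of `anchors`; all other
    rows act as negatives. Cross-entropy is taken over each row of the
    scaled cosine-similarity matrix, with the diagonal as the correct "class".
    """
    # L2-normalise so the dot product equals cosine similarity
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    sims = scale * (a @ p.T)  # (batch, batch) scaled cosine similarities
    log_probs = sims - np.log(np.exp(sims).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

rng = np.random.default_rng(0)
batch = rng.normal(size=(4, 8))
# Matched anchor/positive pairs give a near-zero loss; mismatched pairs a large one
loss_matched = mnr_loss(batch, batch)
loss_mismatched = mnr_loss(batch, batch[[1, 2, 3, 0]])
print(loss_matched, loss_mismatched)
```

With a batch size of 32, each training pair thus gets 31 "free" negatives per step, which is why larger batches usually help this loss.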
### Hyperparameters (Anti-Overfitting Optimized)

- Learning rate: 5.0e-6 (75% lower than baseline)
- Warmup: 0.0 (removed to prevent early peaking)
- Weight decay: 0.015 (50% stronger regularization)
- Gradient clipping: 1.0
- Early stopping patience: 5
- LR scheduler: cosine decay
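With warmup removed, the cosine schedule starts at the full learning rate and decays smoothly toward zero, which is consistent with the best checkpoint landing early in training. A sketch of the schedule shape (the total-step count of 2,500 is illustrative, chosen to match the last evaluation step reported below):

```python
import math

def cosine_lr(step, total_steps, base_lr=5.0e-6, warmup_steps=0):
    """Cosine-decay learning-rate schedule. With warmup_steps=0 (as in
    stage 2), training begins at the full base_lr immediately."""
    if step < warmup_steps:
        return base_lr * (step / warmup_steps)  # linear warmup (unused here)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))

# LR starts at 5e-6 and decays monotonically to 0 by the final step
lrs = [cosine_lr(s, total_steps=2500) for s in (0, 500, 1250, 2500)]
print(lrs)
```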
## Performance Metrics

**NDCG@10 progression:**

| Step | NDCG@10 | Change vs. step 500 |
|---|---|---|
| 500 | 0.8781 | Best |
| 1000 | 0.8720 | -0.69% |
| 1500 | 0.8693 | -1.00% |
| 2000 | 0.8695 | -0.98% |
| 2500 | 0.8677 | -1.18% |
**Evaluation metrics (at step 500, the best checkpoint):**
- NDCG@10: 0.8781
- MRR@10: 0.8640
- MAP@100: 0.8658
- Accuracy@1: 0.8331
- Accuracy@10: 0.9219
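For reference, MRR@k and (binary-relevance) NDCG@k can be computed from a ranked relevance list as below. This is a generic sketch of the metric definitions, not the evaluation code used for the numbers above:

```python
import numpy as np

def mrr_at_k(ranked_relevance, k=10):
    """Reciprocal rank of the first relevant hit within the top k results."""
    for rank, rel in enumerate(ranked_relevance[:k], start=1):
        if rel:
            return 1.0 / rank
    return 0.0

def ndcg_at_k(ranked_relevance, k=10):
    """Binary-relevance NDCG@k: DCG of the ranking divided by the ideal DCG."""
    rels = np.asarray(ranked_relevance[:k], dtype=float)
    discounts = 1.0 / np.log2(np.arange(2, rels.size + 2))  # 1/log2(rank+1)
    dcg = float((rels * discounts).sum())
    idcg = float((np.sort(rels)[::-1] * discounts).sum())
    return dcg / idcg if idcg > 0 else 0.0

# One query whose single relevant document was ranked 2nd
print(mrr_at_k([0, 1, 0, 0]))            # 0.5
print(round(ndcg_at_k([0, 1, 0, 0]), 4))
```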
**Training stability:**
- Eval loss: Decreased by 3.3% (0.0911 → 0.0881)
- Best checkpoint: Step 500 (19.7% through training)
- Final degradation: 1.18% (lowest among all variants tested)
## Usage

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim

# Load the model from the Hugging Face Hub
model = SentenceTransformer('thivy/norbert4-norwegian-qa')

# Encode a question and two candidate sentences
sentences = [
    "Hva er hovedstaden i Norge?",    # "What is the capital of Norway?"
    "Oslo er hovedstaden i Norge.",   # "Oslo is the capital of Norway."
    "Bergen er en by på vestlandet."  # "Bergen is a city in western Norway."
]
embeddings = model.encode(sentences)

# Compute cosine similarities between the question and the candidates
similarities = cos_sim(embeddings[0], embeddings[1:])
print(similarities)
```
## Intended Use

This model is designed for:
- ✅ Norwegian question-answering systems
- ✅ Semantic search and retrieval
- ✅ Document similarity
- ✅ Paraphrase detection
- ✅ Cross-lingual tasks (Norwegian, Danish, Swedish)
## Training Data

### Stage 1 (NLI - Base Model)

- NorNLI Combined: 569,000 samples
- Format: premise-hypothesis pairs with entailment labels

### Stage 2 (QA & Paraphrase - This Model)

- NorQuAD: 3,808 Norwegian question-answer pairs
- NorOpenBookQA: 2,886 Norwegian QA samples
- PAWS-X Norwegian: 21,829 paraphrase pairs
- Supervised-DA: 74,560 Danish sentence pairs
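All four stage-2 datasets reduce to (anchor, positive) text pairs so that MultipleNegativesRankingLoss can use the rest of the batch as negatives. A hypothetical sketch of that mapping (the field names `question`/`context` and `sentence1`/`sentence2` are illustrative; the real dataset schemas may differ):

```python
def qa_to_pair(record):
    """Map a QA record to an (anchor, positive) pair: the question
    should retrieve the passage containing its answer."""
    return (record["question"], record["context"])

def paraphrase_to_pair(record):
    """Map a paraphrase record to an (anchor, positive) pair of
    semantically equivalent sentences."""
    return (record["sentence1"], record["sentence2"])

pairs = [
    qa_to_pair({"question": "Hva er hovedstaden i Norge?",
                "context": "Oslo er hovedstaden i Norge."}),
    paraphrase_to_pair({"sentence1": "Bergen ligger på Vestlandet.",
                        "sentence2": "Byen Bergen ligger vest i Norge."}),
]
print(pairs)
```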
## Limitations
- Optimized primarily for Norwegian text (with Danish/Swedish support)
- Maximum sequence length: 128 tokens
- Best performance on question-answering and retrieval tasks
- May require domain adaptation for specialized domains
## Model Card Authors
Thivyesh Ahilathasan
## Citation

If you use this model, please cite:

```bibtex
@misc{norbert4-nli-qa,
  author       = {Ahilathasan, Thivyesh},
  title        = {NorBERT4 Norwegian QA Embedding Model},
  year         = {2026},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/thivy/norbert4-norwegian-qa}}
}
```