Update README.md
README.md CHANGED
@@ -35,12 +35,6 @@ This model is the result of a 2-stage curriculum learning approach:
 - **Stage 1 (V1)**: Fine-tuned on 569k Norwegian NLI samples for semantic understanding
 - **Stage 2 (This model)**: Further fine-tuned on 103k Norwegian/Danish QA and paraphrase samples
 
-**Key Features:**
-- ✅ **Lowest overfitting**: Only 1.18% NDCG@10 degradation (vs 2.5-3.4% for other configurations)
-- ✅ **Decreasing eval loss**: -3.3% improvement during training (only configuration to improve generalization)
-- ✅ **Optimal hyperparameters**: Very low learning rate (5e-6), strong regularization, no warmup
-- ✅ **Production-ready**: Stable performance from step 500 onwards
-
 ## Training Details
 
 ### Stage 2 Training Configuration
@@ -126,17 +120,6 @@ This model is designed for:
 - **PAWS-X Norwegian**: 21,829 paraphrase pairs
 - **Supervised-DA**: 74,560 Danish sentence pairs
 
-## Evaluation & Comparison
-
-This model was selected from 4 Stage 2 variants based on overfitting analysis:
-
-| Variant | Best NDCG@10 | Degradation | Eval Loss Δ | Verdict |
-|---------|--------------|-------------|-------------|---------|
-| **This model (V3)** | **0.8781** | **1.18%** ✓ | **-3.3%** ✓ | **Winner** |
-| Original | 0.8750 | 2.52% | +8.2% ⚠️ | High eval loss increase |
-| Filtered | 0.9099 | 2.65% | +6.5% ⚠️ | Unfair dev set |
-| V2 (no warmup) | 0.8692 | 3.36% | +1.1% ⚠️ | Poor peak performance |
-
 ## Limitations
 
 - Optimized primarily for Norwegian text (with Danish/Swedish support)