Upload README.md with huggingface_hub
README.md
CHANGED
@@ -41,7 +41,7 @@ $$\mathcal{L}_{\text{total}} = \mathcal{L}_{\text{CE}} + \frac{\lambda}{B \cdot
with **λ = 0.1**. This soft regularization reduces divergence errors at inference time at zero architectural cost.
-

---