Forseti 31B
Forseti 31B is a Swedish government-domain language model built by continued pre-training (CPT) of Gemma 4 31B on 2.5 billion tokens drawn from 260+ Swedish government sources.
Evaluation (EuroEval Swedish, canonical Modal A100)
| Benchmark | Base (Gemma 4) | Forseti CPT | Delta |
|---|---|---|---|
| swerec (MCC) | 69.7 | 78.4 | +8.8 |
| suc3 (micro-F1) | 56.5 | 77.2 | +20.7 |
| scala-sv (MCC) | 50.4 | 74.3 | +23.9 |
| multi-wiki-qa-sv (F1) | 73.1 | 78.5 | +5.4 |
| swedn (chrF3++) | 28.5 | 33.7 | +5.2 |
| mmlu-sv (MCC) | 63.1 | 72.8 | +9.7 |
| hellaswag-sv (acc) | 58.6 | 66.5 | +7.9 |
| Average | 57.1 | 68.8 | +11.7 |
All seven Swedish EuroEval benchmarks improved over the base model, and none regressed.
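The deltas and averages in the table can be rechecked with a few lines of Python. This is a minimal sketch that uses only the rounded scores printed above, so small rounding differences are expected: recomputed this way, swerec comes out at +8.7 rather than the listed +8.8, presumably because the table was derived from unrounded scores.

```python
# Scores copied from the table above: (base Gemma 4, Forseti CPT).
scores = {
    "swerec": (69.7, 78.4),
    "suc3": (56.5, 77.2),
    "scala-sv": (50.4, 74.3),
    "multi-wiki-qa-sv": (73.1, 78.5),
    "swedn": (28.5, 33.7),
    "mmlu-sv": (63.1, 72.8),
    "hellaswag-sv": (58.6, 66.5),
}

# Per-benchmark improvement, rounded to one decimal like the table.
deltas = {name: round(cpt - base, 1) for name, (base, cpt) in scores.items()}

# Unweighted mean over the seven benchmarks.
base_avg = round(sum(base for base, _ in scores.values()) / len(scores), 1)
cpt_avg = round(sum(cpt for _, cpt in scores.values()) / len(scores), 1)

for name, delta in deltas.items():
    print(f"{name:18s} {delta:+.1f}")
print(f"average: {base_avg} -> {cpt_avg}")
```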
Training Details
| Parameter | Value |
|---|---|
| Base model | google/gemma-4-31B (text-only, 30.7B params) |
| Training tokens | 2.5B (Swedish government text) |
| Replay buffer | FineWeb-2 Swedish (50% mix, reasoning-filtered) |
| Learning rate | 7.5e-6 (cosine decay) |
| Steps | 20,000 |
| Hardware | 2x LUMI nodes (16 AMD MI250X GCDs) |
| Framework | PyTorch FSDP2 + Liger fused cross-entropy |
| Training time | ~168 GPU-hours |
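
For readers who want to picture the learning-rate schedule, the sketch below evaluates a plain cosine decay from the listed peak of 7.5e-6 over 20,000 steps. The table does not state a warmup phase or a final learning rate, so the absence of warmup and the decay to zero are assumptions made for illustration, not the actual training configuration.

```python
import math

PEAK_LR = 7.5e-6      # from the Training Details table
TOTAL_STEPS = 20_000  # from the Training Details table
MIN_LR = 0.0          # assumption: the final learning rate is not stated


def cosine_lr(step: int) -> float:
    """Learning rate at a given step under plain cosine decay (no warmup assumed)."""
    # Clamp progress to [0, 1] so steps past the end stay at MIN_LR.
    progress = min(max(step / TOTAL_STEPS, 0.0), 1.0)
    return MIN_LR + 0.5 * (PEAK_LR - MIN_LR) * (1.0 + math.cos(math.pi * progress))


for step in (0, 5_000, 10_000, 15_000, 20_000):
    print(f"step {step:>6}: lr = {cosine_lr(step):.3e}")
```

Under these assumptions the rate is at half its peak exactly at the midpoint of training and reaches zero at the final step.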
Corpus
The training corpus covers 260+ Swedish government sources, including parliamentary records (Riksdag), government proposals (propositioner), legal texts (SFS, case law), agency publications (Socialstyrelsen, Skatteverket, etc.), and municipal documents.
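The 50% replay mix listed under Training Details can be pictured as interleaving government documents with general Swedish web text. The sketch below is one simple way to do that at the document level. The function name `mix_with_replay`, the per-document sampling, and the stop-at-first-exhausted-source rule are all illustrative assumptions; the card does not say whether the actual mix was applied per document, per token, or per batch.

```python
import random
from typing import Iterable, Iterator


def mix_with_replay(
    domain_docs: Iterable[str],
    replay_docs: Iterable[str],
    replay_fraction: float = 0.5,  # 50% replay, as listed in the table
    seed: int = 0,
) -> Iterator[str]:
    """Yield documents, drawing from the replay source with the given probability.

    Hypothetical helper: stops as soon as either source runs out.
    """
    rng = random.Random(seed)
    domain_iter, replay_iter = iter(domain_docs), iter(replay_docs)
    while True:
        source = replay_iter if rng.random() < replay_fraction else domain_iter
        try:
            yield next(source)
        except StopIteration:
            return


# Toy stand-ins for the government corpus and the FineWeb-2 Swedish replay buffer.
gov = [f"gov-{i}" for i in range(1000)]
web = [f"web-{i}" for i in range(1000)]
mixed = list(mix_with_replay(gov, web))
web_share = sum(doc.startswith("web-") for doc in mixed) / len(mixed)
print(f"{len(mixed)} documents, replay share = {web_share:.2f}")
```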
Instruction-Tuned Version
See Forseti 31B IT for the instruction-tuned variant (+0.7 on the EuroEval average, +4.9 on hellaswag-sv).
GGUF Quantizations
For local deployment, see Forseti 31B GGUF (Q4_K_M and Q8_0).
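To judge which quantization fits a given machine, a rough weight-size estimate is often enough. The sketch below multiplies the 30.7B text-only parameter count by an approximate bits-per-weight figure. Q8_0 stores 32 int8 weights plus one fp16 scale per block, which works out to 8.5 bits per weight. The 4.85 figure for Q4_K_M is an approximate average and an assumption here, since it varies with the tensor mix. Actual file sizes for this repository may differ.

```python
PARAMS = 30.7e9  # text-only parameter count from the Training Details table

# Approximate bits per weight for each quantization.
# Q8_0: 34 bytes per block of 32 weights = 8.5 bits per weight.
# Q4_K_M: rough average only (assumption); depends on the tensor mix.
BITS_PER_WEIGHT = {"Q4_K_M": 4.85, "Q8_0": 8.5}


def approx_size_gb(params: float, bits_per_weight: float) -> float:
    """Estimated weight size in decimal gigabytes, excluding KV cache and overhead."""
    return params * bits_per_weight / 8 / 1e9


for name, bpw in BITS_PER_WEIGHT.items():
    print(f"{name}: ~{approx_size_gb(PARAMS, bpw):.1f} GB of weights")
```

The estimate covers weights only. Context length adds KV-cache memory on top, so some headroom beyond these figures is needed.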
License
Apache 2.0 (inherited from Gemma 4).
Citation
@misc{forseti2026,
  title={Forseti: A Swedish Government Domain Language Model},
  author={Martini, Alexandro},
  year={2026},
  url={https://huggingface.co/noterat/forseti-31b}
}