Forseti 31B

Forseti 31B is a Swedish government domain language model created through continued pre-training (CPT) of Gemma 4 31B on 2.5 billion tokens from 260+ Swedish government sources.
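
A minimal loading sketch, assuming the checkpoint at the repository named in the citation URL (noterat/forseti-31b) loads through the generic transformers AutoModelForCausalLM path; the card itself does not confirm this. The bfloat16 dtype, device_map="auto" (which needs the accelerate package), and the helper name complete are illustrative choices, not part of the release. Because this is a continued-pre-training checkpoint rather than a chat model, the sketch does plain text completion without a chat template.

```python
MODEL_ID = "noterat/forseti-31b"  # repo id taken from the citation URL


def complete(prompt: str, max_new_tokens: int = 64) -> str:
    """Greedy text completion (hypothetical helper).

    Loads the full checkpoint on every call, so cache the model yourself
    if you need more than a one-off test.
    """
    # Imported lazily so the module can be inspected without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # assumption: bf16 weights fit your hardware
        device_map="auto",           # requires the `accelerate` package
    )

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs, max_new_tokens=max_new_tokens, do_sample=False
        )

    # Drop the prompt tokens and decode only the continuation.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling complete("Enligt regeringens proposition ska") would return the model's continuation of that Swedish prefix. A 31B model in bfloat16 needs roughly 60 GB of accelerator memory for the weights alone, so smaller setups may prefer the quantized variants.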

Evaluation (EuroEval Swedish, canonical Modal A100)

Benchmark                Base (Gemma 4)   Forseti CPT   Delta
swerec (MCC)             69.7             78.4          +8.8
suc3 (micro-F1)          56.5             77.2          +20.7
scala-sv (MCC)           50.4             74.3          +23.9
multi-wiki-qa-sv (F1)    73.1             78.5          +5.4
swedn (chrF3++)          28.5             33.7          +5.2
mmlu-sv (MCC)            63.1             72.8          +9.7
hellaswag-sv (acc)       58.6             66.5          +7.9
Average                  57.1             68.8          +11.7

All seven Swedish EuroEval benchmarks improved over the base model, so no regression was observed on this suite.
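
The deltas and averages can be checked with a few lines of arithmetic. The sketch below copies the rounded scores from the table above and recomputes the per-benchmark differences and the unweighted means; the function names are illustrative. Working from rounded scores, swerec comes out as +8.7 rather than the reported +8.8, which is presumably a rounding effect of computing the delta from unrounded scores.

```python
# Scores copied from the evaluation table: benchmark -> (base, Forseti CPT, reported delta)
SCORES = {
    "swerec": (69.7, 78.4, 8.8),
    "suc3": (56.5, 77.2, 20.7),
    "scala-sv": (50.4, 74.3, 23.9),
    "multi-wiki-qa-sv": (73.1, 78.5, 5.4),
    "swedn": (28.5, 33.7, 5.2),
    "mmlu-sv": (63.1, 72.8, 9.7),
    "hellaswag-sv": (58.6, 66.5, 7.9),
}


def recompute_deltas(scores):
    """Return benchmark -> (delta recomputed from rounded scores, reported delta)."""
    return {
        name: (round(cpt - base, 1), reported)
        for name, (base, cpt, reported) in scores.items()
    }


def averages(scores):
    """Unweighted mean of the base and CPT columns, rounded to one decimal."""
    n = len(scores)
    base_avg = sum(base for base, _, _ in scores.values()) / n
    cpt_avg = sum(cpt for _, cpt, _ in scores.values()) / n
    return round(base_avg, 1), round(cpt_avg, 1)


if __name__ == "__main__":
    for name, (recomputed, reported) in recompute_deltas(SCORES).items():
        print(f"{name:18s} recomputed {recomputed:+.1f}  reported {reported:+.1f}")
    print("averages (base, CPT):", averages(SCORES))
```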

Training Details

Parameter         Value
Base model        google/gemma-4-31B (text-only, 30.7B params)
Training tokens   2.5B Swedish government
Replay buffer     FineWeb-2 Swedish (50% mix, reasoning-filtered)
Learning rate     7.5e-6 (cosine decay)
Steps             20,000
Hardware          2x LUMI nodes (16 AMD MI250X GCDs)
Framework         PyTorch FSDP2 + Liger fused cross-entropy
Training time     ~168 GPU-hours
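
To make the learning-rate row concrete, here is a sketch of a cosine decay from the stated peak of 7.5e-6 over the stated 20,000 steps. The card does not mention a warmup phase or a minimum learning rate, so WARMUP_STEPS = 0 and MIN_LR = 0.0 are assumptions, and the function is a generic cosine schedule rather than the project's actual training code.

```python
import math

PEAK_LR = 7.5e-6       # from the training details table
TOTAL_STEPS = 20_000   # from the training details table
MIN_LR = 0.0           # assumption: the card does not state a floor
WARMUP_STEPS = 0       # assumption: the card does not mention warmup


def cosine_lr(step, peak_lr=PEAK_LR, total_steps=TOTAL_STEPS,
              min_lr=MIN_LR, warmup_steps=WARMUP_STEPS):
    """Learning rate at `step` under linear warmup followed by cosine decay."""
    if step < warmup_steps:
        # Linear ramp up to the peak during warmup.
        return peak_lr * (step + 1) / warmup_steps
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for step in (0, 5_000, 10_000, 15_000, 20_000):
        print(f"step {step:6d}: lr = {cosine_lr(step):.3e}")
```

Under these assumptions the rate starts at the peak, passes half the peak (3.75e-6) at step 10,000, and reaches zero at the final step.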

Corpus

The training corpus covers 260+ Swedish government sources, including parliamentary records (Riksdag), government proposals (propositioner), legal texts (SFS, case law), agency publications (Socialstyrelsen, Skatteverket, etc.), and municipal documents.

Instruction-Tuned Version

See Forseti 31B IT for the instruction-tuned variant (+0.7 on the EuroEval average, +4.9 on hellaswag).

GGUF Quantizations

For local deployment, see Forseti 31B GGUF (Q4_K_M and Q8_0).

License

Apache 2.0 (inherited from Gemma 4).

Citation

@misc{forseti2026,
  title={Forseti: A Swedish Government Domain Language Model},
  author={Martini, Alexandro},
  year={2026},
  url={https://huggingface.co/noterat/forseti-31b}
}