Apertus: Democratizing Open and Compliant LLMs for Global Language Environments
Paper: arXiv:2509.14233
This model is a fine-tuned version of the Apertus 8B Instruct model, further trained using the RLVR (Reinforcement Learning with Verifiable Rewards) framework on the GSM8K dataset. The base Apertus models are introduced in the paper Apertus: Democratizing Open and Compliant LLMs for Global Language Environments.
Project Page: https://www.swiss-ai.org/apertus
Code Repository: https://github.com/swiss-ai/apertus-tech-report
GSM8K validation accuracy improved from 46.41% to 66.23%.
Training was performed on a single GPU node with 4× NVIDIA H100 (95 GB) GPUs and took approximately 5 hours.
| Rollouts | |
|---|---|
| num_unique_prompts_rollout | 32 |
| num_samples_per_prompt_rollout | 8 |
| temperature | 0.8 |

| Optimization | |
|---|---|
| learning_rate | 3.0e-7 |
| beta | 0.01 |
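With 8 samples drawn per prompt, RLVR implementations typically estimate advantages by normalizing the verifiable rewards within each prompt's group, as in GRPO-style training. A minimal sketch under that assumption (the exact advantage computation in this run is not specified here):

```python
from statistics import mean, stdev

def group_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Normalize the rewards of one prompt's sample group to
    zero mean and (approximately) unit variance."""
    mu = mean(rewards)
    sigma = stdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Example: 3 of the 8 sampled answers for one prompt were verified correct.
rewards = [1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0]
advantages = group_advantages(rewards)
# Correct samples get positive advantages, incorrect ones negative.
```

The `beta` value above is presumably the coefficient of a KL regularization term that keeps the policy close to the reference model during optimization.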
This work builds upon and was inspired by the following contributions:
Base model: swiss-ai/Apertus-8B-2509