GPT-1900 D34 Math RL

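A 3.29B-parameter GPT-1900 model trained with reinforcement learning on math problems. The training mix is 30% GSM8K and 70% MATH. Answers are verified with SymPy, and a separate format reward is applied. Training chain: base -> physics CLM expanded -> v3 SFT safe -> R1 SFT -> math RL.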
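As a rough illustration of how a SymPy-verified correctness reward might be combined with a format reward, here is a minimal sketch. It is not the training code for this model. The `\boxed{...}` answer format, the `format_bonus` value, and the function names `extract_answer`, `answers_match`, and `reward` are all assumptions made for the example. If SymPy is not installed, the sketch falls back to exact string comparison.

```python
import re

try:
    import sympy
except ImportError:  # keep the sketch runnable without SymPy
    sympy = None

# Assumed answer format: the final answer appears inside \boxed{...}.
BOXED = re.compile(r"\\boxed\{([^{}]*)\}")


def extract_answer(completion):
    """Return the content of the last \\boxed{...} in the text, or None."""
    matches = BOXED.findall(completion)
    return matches[-1].strip() if matches else None


def answers_match(predicted, reference):
    """Check equality by string match first, then symbolically via SymPy."""
    if predicted.strip() == reference.strip():
        return True
    if sympy is None:
        return False
    try:
        # Note: sympify evaluates its input, so only use it on trusted strings.
        diff = sympy.sympify(predicted) - sympy.sympify(reference)
        return bool(sympy.simplify(diff) == 0)
    except Exception:
        # Unparseable answers count as incorrect.
        return False


def reward(completion, reference, format_bonus=0.1):
    """Hypothetical reward: format bonus plus 1.0 for a verified answer."""
    answer = extract_answer(completion)
    if answer is None:
        return 0.0  # no well-formed answer, so no reward at all
    correct = answers_match(answer, reference)
    return format_bonus + (1.0 if correct else 0.0)


if __name__ == "__main__":
    print(reward(r"The total is \boxed{42}", "42"))
    print(reward("The total is 42", "42"))
```

Symbolic comparison lets equivalent forms of an answer (for example, a fraction and its simplified form) earn the correctness reward, which plain string matching would miss.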

Training

  • Training: Math RL (step 60), from r1-reasoning-sft
  • Parameters: 3.29B
  • Architecture: Custom GPT with RoPE, QK-norm, ReLU², value embeddings
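
Two of the architecture components listed above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the model's implementation: ReLU² is the squared ReLU activation, and QK-norm is shown here as RMS normalization of the query and key vectors before the dot product. The choice of RMS normalization, the `eps` value, the `1/sqrt(d)` scaling, and the function names are assumptions.

```python
import math


def relu_squared(x):
    """ReLU² activation: zero for negative inputs, x squared otherwise."""
    return max(0.0, x) ** 2


def rms_normalize(vec, eps=1e-6):
    """Scale a vector to unit root-mean-square (assumed form of the norm)."""
    rms = math.sqrt(sum(v * v for v in vec) / len(vec) + eps)
    return [v / rms for v in vec]


def qk_norm_score(q, k):
    """Attention logit computed from normalized query and key vectors."""
    qn, kn = rms_normalize(q), rms_normalize(k)
    return sum(a * b for a, b in zip(qn, kn)) / math.sqrt(len(q))


if __name__ == "__main__":
    print(relu_squared(-2.0), relu_squared(3.0))
    # Rescaling the query leaves the logit (almost) unchanged.
    print(qk_norm_score([1.0, 2.0], [3.0, 4.0]))
    print(qk_norm_score([10.0, 20.0], [3.0, 4.0]))
```

Normalizing queries and keys bounds the magnitude of the attention logits regardless of how large the raw vectors grow, which is the usual motivation for QK-norm.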