mhla/gpt1900-d34-math-rl

GPT-1900 D34 Math RL

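A 3.29B-parameter GPT-1900 checkpoint trained with reinforcement learning on math problems. The RL data mix is GSM8K (30%) and MATH (70%). Answers are verified with SymPy, and a format reward is applied. Training chain: base -> physics CLM expanded -> v3 SFT safe -> R1 SFT -> math RL.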
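To make the reward setup concrete, here is a minimal sketch of how a 30/70 data mix, SymPy answer verification, and a format reward could fit together. It is not the training code for this checkpoint. The `\boxed{...}` answer format, the reward values, and the names `sample_source`, `extract_answer`, `answers_match`, and `reward` are all assumptions made for illustration.

```python
import random
import re

import sympy

# Assumption: the model is expected to put its final answer in \boxed{...}.
BOXED = re.compile(r"\\boxed\{([^{}]*)\}")


def sample_source(rng: random.Random) -> str:
    """Pick the dataset for the next prompt with the 30/70 mix from the card."""
    return rng.choices(["gsm8k", "math"], weights=[0.3, 0.7], k=1)[0]


def extract_answer(completion: str):
    """Return the content of the last \\boxed{...} span, or None if absent."""
    matches = BOXED.findall(completion)
    return matches[-1].strip() if matches else None


def answers_match(predicted: str, reference: str) -> bool:
    """Treat two answers as equal if SymPy simplifies their difference to 0."""
    try:
        diff = sympy.sympify(predicted) - sympy.sympify(reference)
        return sympy.simplify(diff) == 0
    except Exception:
        # Unparseable input: fall back to an exact string comparison.
        return predicted.strip() == reference.strip()


def reward(completion: str, reference: str, format_bonus: float = 0.1) -> float:
    """Hypothetical reward: format bonus for a boxed answer, plus 1.0 if correct."""
    answer = extract_answer(completion)
    if answer is None:
        return 0.0
    correct = 1.0 if answers_match(answer, reference) else 0.0
    return correct + format_bonus
```

Under these assumptions, `reward("... \\boxed{2/4}", "1/2")` earns both the correctness and the format component, because SymPy reduces `2/4 - 1/2` to zero. Note that `sympy.sympify` evaluates its input string, so a real pipeline would want to sandbox or restrict what it parses.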

Training

Training: Math RL (step 60), from r1-reasoning-sft
Parameters: 3.29B
Architecture: Custom GPT with RoPE, QK-norm, ReLU², value embeddings
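
For readers unfamiliar with the listed components, the following pure-Python sketch illustrates ReLU², QK-norm, and RoPE on plain lists of floats. It is an illustration, not this model's implementation: the parameter-free RMS normalization used for QK-norm, the interleaved pairing and `base=10000.0` used for RoPE, and all function names are assumptions. Value embeddings are not shown.

```python
import math


def relu_squared(x: float) -> float:
    """ReLU² activation: max(x, 0) squared."""
    r = max(x, 0.0)
    return r * r


def rms_norm(v, eps=1e-6):
    """Parameter-free RMS normalization (assumed form of QK-norm)."""
    scale = math.sqrt(sum(x * x for x in v) / len(v) + eps)
    return [x / scale for x in v]


def rope(v, position, base=10000.0):
    """Rotate each pair (v[i], v[i+1]) by an angle that grows with position."""
    out = []
    for i in range(0, len(v), 2):
        theta = position * base ** (-i / len(v))
        c, s = math.cos(theta), math.sin(theta)
        out += [v[i] * c - v[i + 1] * s, v[i] * s + v[i + 1] * c]
    return out


def attention_score(q, k, q_pos, k_pos):
    """Scaled dot product of normalized, rotated query and key vectors."""
    q = rope(rms_norm(q), q_pos)
    k = rope(rms_norm(k), k_pos)
    return sum(a * b for a, b in zip(q, k)) / math.sqrt(len(q))
```

Because the rotation preserves vector length, a parameter-free norm gives the same result whether it is applied before or after RoPE. The score depends only on the offset `q_pos - k_pos`, so `attention_score(q, k, 5, 3)` and `attention_score(q, k, 12, 10)` agree up to floating-point error.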


Collection including mhla/gpt1900-d34-math-rl: GPT-1900 Drafts (49 items, updated Mar 29). Experimental and intermediate GPT-1900 checkpoints. Working artifacts, not for general use.