math_model / config.json

Commit History

Upload exp4b: SFT→DPO v2 (L3-5, ≤5/8, beta=0.1, lr=5e-7)
32ac537
verified

jdecim committed on

Upload math SFT checkpoint (smoke test)
2b172d7
verified

jdecim committed on