math_model / README.md

Commit History

Update math model
72927ef
verified

jdecim committed on

Push exp8 GRPO best (step 750), gen temp=0.7 for pass@8
ab2047d
verified

jdecim committed on

exp6: SFT(NuminaMath) -> DPO v2, T=0.3
5184a78
verified

jdecim committed on

sft_mixlong_full: best Hard100 p@8, temperature=0.3
a421196
verified

jdecim committed on

Push DPO checkpoint with T=0.3 (optimal for pass@8 on CI gate)
ac3517e
verified

jdecim committed on

Fix README: add base_model, fix model references
6af8d3a
verified

jdecim committed on

Upload exp4b: SFT→DPO v2 (L3-5, ≤5/8, beta=0.1, lr=5e-7)
32ac537
verified

jdecim committed on

Upload math SFT checkpoint
f13b0a9
verified

jdecim committed on

Upload math SFT checkpoint (smoke test)
ed23471
verified

jdecim committed on

Upload math SFT checkpoint (smoke test)
2b172d7
verified

jdecim committed on