Commit History

Update README with detailed data pipeline and reproduction steps
bab0696
verified

NotoriousH2 committed on

Add model card README
9adcdbb
verified

NotoriousH2 committed on

Add eval.py
013f3b2
verified

NotoriousH2 committed on

Add train_grpo.py
b50d571
verified

NotoriousH2 committed on

Add train_rs_sft.py
12dd0e7
verified

NotoriousH2 committed on

Add rs_sample.py
24e2849
verified

NotoriousH2 committed on

Add train_sft.py
d304eb5
verified

NotoriousH2 committed on

SFT + RS-SFT + GRPO (500 steps, beta=0.04). GSM8K ~46.2%
cbc68ed
verified

NotoriousH2 committed on

initial commit
ae7d925
verified

NotoriousH2 committed on