gemma-3-1b-it-Math-GRPO / tokenizer_config.json
NotoriousH2
SFT + RS-SFT + GRPO (500 steps, beta=0.04). GSM8K ~46.2%
cbc68ed verified
1.16 MB