gemma-3-1b-it-Math-RS-SFT / generation_config.json
NotoriousH2
SFT + Rejection Sampling SFT (5x teacher replay). GSM8K avg ~46.6%, best 48.9%
8e96ac1 verified
217 Bytes
{
  "bos_token_id": 2,
  "cache_implementation": "hybrid",
  "do_sample": true,
  "eos_token_id": [
    1,
    1,
    106
  ],
  "pad_token_id": 1,
  "top_k": 64,
  "top_p": 0.95,
  "transformers_version": "4.57.3"
}
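
The config above lists token id `1` twice under `eos_token_id`. As a set of stop ids this is equivalent to `[1, 106]`, so a consumer may want to normalize it on load. The following is a minimal sketch using only the Python standard library; the helper name `load_generation_settings` and the inlined `CONFIG_TEXT` constant are illustrative assumptions, not part of this repository or of the `transformers` API.

```python
import json

# Config text copied from the file above; inlined here only so the
# sketch is self-contained (an assumption, normally read from disk).
CONFIG_TEXT = """
{
  "bos_token_id": 2,
  "cache_implementation": "hybrid",
  "do_sample": true,
  "eos_token_id": [1, 1, 106],
  "pad_token_id": 1,
  "top_k": 64,
  "top_p": 0.95,
  "transformers_version": "4.57.3"
}
"""


def load_generation_settings(text: str) -> dict:
    """Hypothetical helper: parse the JSON and normalize eos_token_id."""
    cfg = json.loads(text)
    eos = cfg.get("eos_token_id", [])
    # The field may be a single int or a list; treat both uniformly.
    if isinstance(eos, int):
        eos = [eos]
    # dict.fromkeys drops duplicates while preserving first-seen order.
    cfg["eos_token_id"] = list(dict.fromkeys(eos))
    return cfg


settings = load_generation_settings(CONFIG_TEXT)
print(settings["eos_token_id"])  # [1, 106]
print(settings["do_sample"], settings["top_k"], settings["top_p"])
```

With `do_sample` set to `true`, decoding samples from the distribution restricted by `top_k` (64) and `top_p` (0.95) rather than picking the most likely token greedily, so repeated runs on the same prompt can produce different outputs.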