math_model / generation_config.json
jdecim
Push exp8 GRPO best (step 750), gen temp=0.7 for pass@8
ab2047d verified
212 Bytes
{
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "temperature": 0.7,
  "top_k": 20,
  "top_p": 0.95,
  "transformers_version": "5.8.0",
  "bos_token_id": 151643
}
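To make the sampling fields concrete, here is a minimal pure-Python sketch of what `temperature`, `top_k`, and `top_p` do to a next-token distribution. It is an illustration under stated assumptions, not the library's implementation: the filter order (temperature, then top-k, then top-p) is assumed, the helper `filtered_distribution` and the six-entry `toy_logits` vocabulary are hypothetical, and the token-id fields (`eos_token_id`, `pad_token_id`, `bos_token_id`) are parsed but not used.

```python
import json
import math

# Sampling fields copied from the config above; the string itself is just a
# convenient way to keep this sketch self-contained.
CONFIG_TEXT = """
{
  "do_sample": true,
  "eos_token_id": [151645, 151643],
  "pad_token_id": 151643,
  "temperature": 0.7,
  "top_k": 20,
  "top_p": 0.95,
  "bos_token_id": 151643
}
"""


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def filtered_distribution(logits, temperature, top_k, top_p):
    """Hypothetical helper: return {token_index: probability} after filtering.

    Assumed order: temperature scaling, then top-k, then top-p (nucleus).
    """
    # Temperature below 1.0 sharpens the distribution.
    scaled = [x / temperature for x in logits]

    # Top-k: keep only the k highest-scoring token indices.
    order = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)
    kept = order[:top_k]
    probs = softmax([scaled[i] for i in kept])

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    nucleus = []
    cumulative = 0.0
    for index, prob in zip(kept, probs):
        nucleus.append((index, prob))
        cumulative += prob
        if cumulative >= top_p:
            break

    # Renormalize so the surviving probabilities sum to 1.
    total = sum(prob for _, prob in nucleus)
    return {index: prob / total for index, prob in nucleus}


config = json.loads(CONFIG_TEXT)
toy_logits = [5.0, 4.0, 3.0, 1.0, 0.0, -2.0]  # hypothetical 6-token vocabulary
dist = filtered_distribution(
    toy_logits, config["temperature"], config["top_k"], config["top_p"]
)

for index, prob in dist.items():
    print(f"token {index}: {prob:.4f}")
```

With only six toy tokens, `top_k: 20` removes nothing, so the `top_p` cutoff does all of the pruning here. On a real vocabulary the top-k step would first cut the candidates to 20 before the nucleus filter applies.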