SmolLM2-360M-Think-R18 / generation_config.json

Commit History

DuoNeural Think Instillation R18 — dead-prompt filtered GRPO, +0.030 over post-SFT
e86cf4d
verified

DuoNeural committed on