dhara-250m / generation_config.json
dhara-250m tri-mode (AR + block-diffusion + self-speculation), annealed, ~60B tokens
bd52fdd verified
93 Bytes
{
  "_from_model_config": true,
  "eos_token_id": 49154,
  "transformers_version": "5.8.1"
}
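
As a rough illustration of how the fields above might be consumed, the sketch below parses the same JSON with Python's standard library and uses `eos_token_id` to cut a list of generated token ids at the first end-of-sequence token. The helper `truncate_at_eos` and the sample token ids are hypothetical and are not part of this repository; a real pipeline would normally let the generation library handle stopping.

```python
import json

# The config text, copied from the file above.
RAW_CONFIG = """
{
  "_from_model_config": true,
  "eos_token_id": 49154,
  "transformers_version": "5.8.1"
}
"""

config = json.loads(RAW_CONFIG)
eos_token_id = config["eos_token_id"]


def truncate_at_eos(token_ids, eos_id):
    """Hypothetical helper: drop the first EOS token and everything after it."""
    if eos_id in token_ids:
        return token_ids[: token_ids.index(eos_id)]
    return list(token_ids)


# Made-up token ids, for illustration only.
sample = [101, 202, 49154, 303]
print(truncate_at_eos(sample, eos_token_id))
```

The file sets only the end-of-sequence id, so any sampling settings (temperature, top-k, and so on) would have to be supplied by the caller or fall back to library defaults.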