ci-2layer-llama2-7b / generation_config.json
KD-distilled 2-layer student against Llama-2-7B teacher (alpaca-cleaned, 1500 steps, T=2.0, KL loss)
f0597fe verified
{
  "_from_model_config": true,
  "bos_token_id": 1,
  "do_sample": true,
  "eos_token_id": 2,
  "pad_token_id": 32000,
  "temperature": 0.9,
  "top_p": 0.6,
  "transformers_version": "5.5.0"
}
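
The values above can be sanity-checked before use. The following is a minimal sketch using only the Python standard library; `check_generation_config` is a hypothetical helper written for illustration, not part of `transformers`. It assumes the base Llama-2 vocabulary size of 32000, under which `pad_token_id` 32000 falls one past the last valid index and would only be valid if a padding token was added and the embedding matrix resized to at least 32001 rows. Whether that was done for this student model is not stated in the repository text shown here.

```python
import json

# The config from this file, embedded as a string so the sketch is self-contained.
CONFIG_TEXT = """
{
  "_from_model_config": true,
  "bos_token_id": 1,
  "do_sample": true,
  "eos_token_id": 2,
  "pad_token_id": 32000,
  "temperature": 0.9,
  "top_p": 0.6,
  "transformers_version": "5.5.0"
}
"""


def check_generation_config(cfg, vocab_size):
    """Return a list of warnings for a generation config dict (hypothetical helper)."""
    warnings = []
    # Sampling parameters only matter when do_sample is enabled.
    if cfg.get("do_sample"):
        if not cfg.get("temperature", 1.0) > 0:
            warnings.append("temperature must be positive when sampling")
        if not 0 < cfg.get("top_p", 1.0) <= 1:
            warnings.append("top_p must be in (0, 1]")
    # Special token ids must index into the embedding matrix.
    for key in ("bos_token_id", "eos_token_id", "pad_token_id"):
        token_id = cfg.get(key)
        if token_id is not None and token_id >= vocab_size:
            warnings.append(
                f"{key}={token_id} is outside a vocabulary of size {vocab_size}"
            )
    return warnings


config = json.loads(CONFIG_TEXT)

# Assumption: 32000 is the base Llama-2 vocabulary size; 32001 models a
# vocabulary with one added padding token.
for size in (32000, 32001):
    print(size, check_generation_config(config, size))
```

With `do_sample` enabled, `temperature` 0.9 and `top_p` 0.6 both pass the range checks; the only warning in this sketch comes from `pad_token_id` when the vocabulary is assumed to be unresized.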