gpt-oss-20b-WFP8-AFP8-KVFP8 / generation_config.json
Commit 73fc8ea (verified) by XuebinWang: KV cache quantization in FP8 (#1)
172 Bytes
{
  "bos_token_id": 199998,
  "do_sample": true,
  "eos_token_id": [
    200002,
    199999,
    200012
  ],
  "pad_token_id": 199999,
  "transformers_version": "4.55.1"
}
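The config above lists three end-of-sequence IDs, and its `pad_token_id` is also one of them. The sketch below shows one way a consumer might read these fields using only the standard library. The helper name `stop_token_ids` and the embedded `CONFIG_TEXT` string are hypothetical, introduced for illustration; this is not how any particular inference library loads the file.

```python
from __future__ import annotations

import json

# Copy of the config shown above, embedded so the example is self-contained.
CONFIG_TEXT = """
{
  "bos_token_id": 199998,
  "do_sample": true,
  "eos_token_id": [200002, 199999, 200012],
  "pad_token_id": 199999,
  "transformers_version": "4.55.1"
}
"""


def stop_token_ids(config: dict) -> set[int]:
    """Hypothetical helper: normalize `eos_token_id` to a set of ints.

    The field may be absent, a single int, or a list of ints, so all
    three shapes are handled here.
    """
    eos = config.get("eos_token_id")
    if eos is None:
        return set()
    if isinstance(eos, int):
        return {eos}
    return set(eos)


config = json.loads(CONFIG_TEXT)
stops = stop_token_ids(config)

# The padding ID is shared with one of the stop IDs in this config.
pad_is_stop = config["pad_token_id"] in stops

print(sorted(stops), pad_is_stop)
```

Normalizing to a set keeps the stop check a single membership test regardless of whether the file stores one ID or several.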