amd/gpt-oss-20b-WFP8-AFP8-KVFP8

gpt-oss-20b-WFP8-AFP8-KVFP8 / generation_config.json

XuebinWang

KV cache quantization in FP8 (#1)

73fc8ea verified 6 months ago

172 Bytes

{
  "bos_token_id": 199998,
  "do_sample": true,
  "eos_token_id": [
    200002,
    199999,
    200012
  ],
  "pad_token_id": 199999,
  "transformers_version": "4.55.1"
}
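
A minimal sketch of how these fields might be read and sanity-checked, using only Python's standard `json` module. The helper name `stop_token_ids` is hypothetical and not part of any library; the config text is copied from the file above. It normalizes `eos_token_id`, which here is a list but may be a single integer in other configs, and checks whether `pad_token_id` is also one of the stop tokens.

```python
import json

# Contents of generation_config.json, copied from the file above.
CONFIG_TEXT = """
{
  "bos_token_id": 199998,
  "do_sample": true,
  "eos_token_id": [200002, 199999, 200012],
  "pad_token_id": 199999,
  "transformers_version": "4.55.1"
}
"""


def stop_token_ids(config: dict) -> list:
    """Hypothetical helper: return eos_token_id as a list of ints,
    whether the config stores it as a single int or as a list."""
    eos = config.get("eos_token_id", [])
    return [eos] if isinstance(eos, int) else list(eos)


config = json.loads(CONFIG_TEXT)
stops = stop_token_ids(config)

# In this file the padding token is also one of the three stop tokens.
pad_is_stop = config["pad_token_id"] in stops

print(stops)        # [200002, 199999, 200012]
print(pad_is_stop)  # True
```

Because `pad_token_id` matches one of the `eos_token_id` entries, code that masks padding by token ID alone would presumably also mask that stop token, so an explicit attention mask is the safer choice when batching.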