amd/gpt-oss-20b-WFP8-AFP8-KVFP8

Repository size: 22.1 GB

1 contributor

History: 5 commits

Latest commit (XuebinWang): update readme with disclaimer (#4), 2431671 (verified), 5 months ago

File                              Size           Updated       Last commit
.gitattributes                    1.57 kB        7 months ago  KV cache quantization in FP8 (#1)
LICENSE                           11.4 kB        5 months ago  update README (results etc) and upload LICENSE and USAGE_POLICY (#2)
README.md                         6.97 kB        5 months ago  update readme with disclaimer (#4)
USAGE_POLICY                      200 Bytes      5 months ago  update README (results etc) and upload LICENSE and USAGE_POLICY (#2)
chat_template.jinja               16.7 kB        7 months ago  KV cache quantization in FP8 (#1)
config.json                       9.37 kB        7 months ago  KV cache quantization in FP8 (#1)
generation_config.json            172 Bytes      7 months ago  KV cache quantization in FP8 (#1)
model-00001-of-00005.safetensors  4.99 GB (xet)  7 months ago  KV cache quantization in FP8 (#1)
model-00002-of-00005.safetensors  5 GB (xet)     7 months ago  KV cache quantization in FP8 (#1)
model-00003-of-00005.safetensors  4.99 GB (xet)  7 months ago  KV cache quantization in FP8 (#1)
model-00004-of-00005.safetensors  4.99 GB (xet)  7 months ago  KV cache quantization in FP8 (#1)
model-00005-of-00005.safetensors  2.11 GB (xet)  7 months ago  KV cache quantization in FP8 (#1)
model.safetensors.index.json      624 kB         7 months ago  KV cache quantization in FP8 (#1)
special_tokens_map.json           323 Bytes      7 months ago  KV cache quantization in FP8 (#1)
tokenizer.json                    27.9 MB (xet)  7 months ago  KV cache quantization in FP8 (#1)
tokenizer_config.json             4.22 kB        7 months ago  KV cache quantization in FP8 (#1)