Commit History

Add Serving mode on Blackwell section — clarify weight-only FP4 reality on GB10+vLLM
d529b4d
verified

Kaleto committed on

Remove misleading W4A16 quality claim — measured no-op in modelopt+MARLIN path
21120da
verified

Kaleto committed on

Update model card for NVFP4A16 (W4A16) switch + re-bench numbers
5e32066
verified

Kaleto committed on

Switch to NVFP4A16 (W4A16) — weight-only quant, 2-4x better KLD than W4A4 per community feedback
3913a87
verified

Kaleto committed on

Add model-00008-of-00008.safetensors
a27f353
verified

Kaleto committed on

Add model-00007-of-00008.safetensors
15fab2d
verified

Kaleto committed on

Add model-00006-of-00008.safetensors
8907ecf
verified

Kaleto committed on

Add model-00005-of-00008.safetensors
1e7653c
verified

Kaleto committed on

Add model-00004-of-00008.safetensors
1872d39
verified

Kaleto committed on

Add model-00003-of-00008.safetensors
d579b5e
verified

Kaleto committed on

Add model-00002-of-00008.safetensors
0504382
verified

Kaleto committed on

Add model-00001-of-00008.safetensors
628ca61
verified

Kaleto committed on

Add tokenizer.json
e592b8a
verified

Kaleto committed on

Add tokenizer_config.json
63b60f5
verified

Kaleto committed on

Add model.safetensors.index.json
e32713d
verified

Kaleto committed on

Add hf_quant_config.json
1cb7234
verified

Kaleto committed on

Add generation_config.json
2203f45
verified

Kaleto committed on

Add config.json
74d99a8
verified

Kaleto committed on

Initial release: DeepSeek-R1-Distill-Llama-70B-NVFP4
1fcc983
verified

Kaleto committed on

initial commit
2c23dab
verified

Kaleto committed on