Commit History

Add Serving mode on Blackwell section — clarify weight-only FP4 reality on GB10+vLLM
c28de06
verified

Kaleto committed on

Remove misleading W4A16 quality claim — measured no-op in modelopt+MARLIN path
1c38f73
verified

Kaleto committed on

Update model card for NVFP4A16 (W4A16) switch + re-bench numbers
c0c2491
verified

Kaleto committed on

Switch to NVFP4A16 (W4A16) — weight-only quant, 2-4x better KLD than W4A4 per community feedback
d05ccfd
verified

Kaleto committed on

Add MARLIN-tuned performance numbers (+22% short-context vs stock)
7184735
verified

Kaleto committed on

Add link to KaletoAI/distrib-nvfp4 GitHub repo
c33921a
verified

Kaleto committed on

Model card corrections: drop misleading FlashInferCutlassNvFp4LinearKernel marker, credit Avarok-Cybersecurity + saricles + RedHatAI, add stock-vLLM disclaimer, add future-work section
00d92fb
verified

Kaleto committed on

upload model-00012-of-00012.safetensors
1f8bd3f
verified

Kaleto committed on

upload model-00011-of-00012.safetensors
0ab67b9
verified

Kaleto committed on

upload model-00010-of-00012.safetensors
f212a10
verified

Kaleto committed on

upload model-00009-of-00012.safetensors
c98efe7
verified

Kaleto committed on

upload model-00008-of-00012.safetensors
e2edef0
verified

Kaleto committed on

upload model-00007-of-00012.safetensors
a5608a1
verified

Kaleto committed on

upload model-00006-of-00012.safetensors
da6669c
verified

Kaleto committed on

upload model-00005-of-00012.safetensors
890f315
verified

Kaleto committed on

upload model-00004-of-00012.safetensors
8222dac
verified

Kaleto committed on

upload model-00003-of-00012.safetensors
b91675b
verified

Kaleto committed on

upload model-00002-of-00012.safetensors
227ad6e
verified

Kaleto committed on

upload model-00001-of-00012.safetensors
b4d8c33
verified

Kaleto committed on

upload tokenizer.json
0b528e9
verified

Kaleto committed on

upload model-input_scales.safetensors
0c33920
verified

Kaleto committed on

upload model.safetensors.index.json
713795a
verified

Kaleto committed on

upload tokenizer_config.json
844e8fa
verified

Kaleto committed on

upload special_tokens_map.json
a5eceaf
verified

Kaleto committed on

upload hf_quant_config.json
586cdae
verified

Kaleto committed on

upload generation_config.json
2071fe7
verified

Kaleto committed on

upload config.json
6963f04
verified

Kaleto committed on

upload README.md
736080a
verified

Kaleto committed on

upload NOTICE
7b5bed7
verified

Kaleto committed on

upload LICENSE
cee0d4f
verified

Kaleto committed on

upload .gitattributes
8d44d44
verified

Kaleto committed on

initial commit
5d0d6e3
verified

Kaleto committed on