Commit History

Remove throughput from model card
0bf916a
verified

haanjack committed on

Update to attn-excl variant: <1pct accuracy loss, 3x compression, 1.2x faster
1a10d45
verified

haanjack committed on

Add raw lm-eval results (MMLU + KMMLU, 5-shot, vLLM backend)
00bb97c
verified

haanjack committed on

Update benchmark results with vLLM-based evaluation
8be2651
verified

haanjack committed on

Add W4A4 weight/activation quantization details
fadd06e
verified

haanjack committed on

Remove --enforce-eager from vLLM usage (graph compilation works)
1b64d9a
verified

haanjack committed on

Solar-Open-100B MXFP4 quantized with quanto (Quark file2file)
8efd1fc
verified

haanjack committed on

initial commit
f94391e
verified

haanjack committed on