fraQtl/Llama-3.2-3B-optimized

kv-cache-optimized

Commit History

Add arXiv paper link

06fb73d
verified

Zenalyze committed 1 day ago

Clarify: KV cache optimized, not smaller file

228619a
verified

Zenalyze committed 3 days ago

Update card with correct k=32 numbers

d7dedab
verified

Zenalyze committed 5 days ago

Update to k=32 results

0277873
verified

Zenalyze committed 5 days ago

Update model card with runtime comparison

847903c
verified

Zenalyze committed 5 days ago

fraQtl compressed: k=32 INT3, delta=+0.4671

96b6596
verified

Zenalyze committed 5 days ago

fraQtl compressed: k=16 INT3, delta=+0.7151

667ffd6
verified

Zenalyze committed 5 days ago

initial commit

da4545b
verified

Zenalyze committed 5 days ago