fraQtl/Llama-3.2-3B-optimized
Tags: Safetensors, llama, fraqtl, kv-cache-optimized, inference
arXiv: 2604.11501
License: other
Branch: main
Commit History
06fb73d  Add arXiv paper link (Zenalyze, verified; committed 1 day ago)
228619a  Clarify: KV cache optimized, not smaller file (Zenalyze, verified; committed 3 days ago)
d7dedab  Update card with correct k=32 numbers (Zenalyze, verified; committed 5 days ago)
0277873  Update to k=32 results (Zenalyze, verified; committed 5 days ago)
847903c  Update model card with runtime comparison (Zenalyze, verified; committed 5 days ago)
96b6596  fraQtl compressed: k=32 INT3, delta=+0.4671 (Zenalyze, verified; committed 5 days ago)
667ffd6  fraQtl compressed: k=16 INT3, delta=+0.7151 (Zenalyze, verified; committed 5 days ago)
da4545b  initial commit (Zenalyze, verified; committed 5 days ago)