harshithsaiv/kv-cache-compression

memory-efficient

inference-optimization

4-bit precision

mixed-precision

kv-cache-compression / kernel

2 contributors

History: 7 commits

feat: complete honest 4-method benchmark both models

5e16ca3 about 2 months ago

quant_cache.py           4.39 kB    feat: complete honest 4-method benchmark both models    about 2 months ago
quant_cache_triton.py    11.2 kB    feat: complete honest 4-method benchmark both models    about 2 months ago