harshithsaiv/kv-cache-compression

Tags: memory-efficient, inference-optimization, 4-bit precision, mixed-precision

kv-cache-compression / kernel

Commit History

feat: complete honest 4-method benchmark both models

5e16ca3

harshithsaiv committed on May 3

feat: true Triton 4-bit kernel with real bit packing

35feffe

harshithsaiv committed on May 3

feat: Complete working pytorch version of kernel

4b2bdf2

harshithsaiv committed on May 3

feat: Testing pure pytorch

e23db09

harshithsaiv committed on May 3

chore: clean the quant_cache

6a962fc

harshithsaiv committed on May 3

chore: libdevice not present in the current version

bc4bbbe

harshithsaiv committed on May 3

feat: Initial Triton kernel

91c163e

harshithsaiv committed on May 3