GuidedQuant: Large Language Model Quantization via Exploiting End Loss Guidance Paper • 2505.07004 • Published May 11
KVzip: Query-Agnostic KV Cache Compression with Context Reconstruction Paper • 2505.23416 • Published May 29