CompactAttention: Accelerating Chunked Prefill with Block-Union KV Selection • Paper • 2605.16839 • Published 4 days ago • 10
LRAgent: Efficient KV Cache Sharing for Multi-LoRA LLM Agents • Paper • 2602.01053 • Published Feb 1 • 8
Token Sparse Attention: Efficient Long-Context Inference with Interleaved Token Selection • Paper • 2602.03216 • Published Feb 3 • 13