Elastic Attention: Test-time Adaptive Sparsity Ratios for Efficient Transformers Paper • 2601.17367 • Published 6 days ago • 33
Innovator-VL: A Multimodal Large Language Model for Scientific Discovery Paper • 2601.19325 • Published 3 days ago • 62
Towards Pixel-Level VLM Perception via Simple Points Prediction Paper • 2601.19228 • Published 3 days ago • 14
EvolVE: Evolutionary Search for LLM-based Verilog Generation and Optimization Paper • 2601.18067 • Published 4 days ago • 3
DRPG (Decompose, Retrieve, Plan, Generate): An Agentic Framework for Academic Rebuttal Paper • 2601.18081 • Published 4 days ago • 7
Fast KVzip: Efficient and Accurate LLM Inference with Gated KV Eviction Paper • 2601.17668 • Published 5 days ago • 3
Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models Paper • 2401.07159 • Published Jan 13, 2024 • 1
KV-Distill: Nearly Lossless Learnable Context Compression for LLMs Paper • 2503.10337 • Published Mar 13, 2025 • 1
LMCache: An Efficient KV Cache Layer for Enterprise-Scale LLM Inference Paper • 2510.09665 • Published Oct 8, 2025 • 1
Plug-and-Play 1.x-Bit KV Cache Quantization for Video Large Language Models Paper • 2503.16257 • Published Mar 20, 2025 • 27
SALAD: Achieve High-Sparsity Attention via Efficient Linear Attention Tuning for Video Diffusion Transformer Paper • 2601.16515 • Published 7 days ago • 15
Unsloth Dynamic 2.0 Quants Collection • New 2.0 version of our Dynamic GGUF + Quants. Dynamic 2.0 achieves superior accuracy & SOTA quantization performance. • 69 items • Updated 3 days ago • 326
360Anything: Geometry-Free Lifting of Images and Videos to 360° Paper • 2601.16192 • Published 7 days ago • 8
VideoMaMa: Mask-Guided Video Matting via Generative Prior Paper • 2601.14255 • Published 9 days ago • 13