V_1: Unifying Generation and Self-Verification for Parallel Reasoners Paper • 2603.04304 • Published 9 days ago • 14
SLA2: Sparse-Linear Attention with Learnable Routing and QAT Paper • 2602.12675 • Published 29 days ago • 54
SpargeAttention2: Trainable Sparse Attention via Hybrid Top-k+Top-p Masking and Distillation Fine-Tuning Paper • 2602.13515 • Published 28 days ago • 43
Quant VideoGen: Auto-Regressive Long Video Generation via 2-Bit KV-Cache Quantization Paper • 2602.02958 • Published Feb 3 • 34