Residual Context Diffusion Collection Residual Context Diffusion (RCD): Repurposing discarded signals as structured priors for high-performance reasoning in dLLMs. • 12 items • Updated 19 days ago • 2
V_1: Unifying Generation and Self-Verification for Parallel Reasoners Paper • 2603.04304 • Published Mar 4 • 14
SLA2: Sparse-Linear Attention with Learnable Routing and QAT Paper • 2602.12675 • Published Feb 13 • 59
SpargeAttention2: Trainable Sparse Attention via Hybrid Top-k+Top-p Masking and Distillation Fine-Tuning Paper • 2602.13515 • Published Feb 13 • 45
Quant VideoGen: Auto-Regressive Long Video Generation via 2-Bit KV-Cache Quantization Paper • 2602.02958 • Published Feb 3 • 34
Jet-RL: Enabling On-Policy FP8 Reinforcement Learning with Unified Training and Rollout Precision Flow Paper • 2601.14243 • Published Jan 20 • 24
Arbitrage: Efficient Reasoning via Advantage-Aware Speculation Paper • 2512.05033 • Published Dec 4, 2025 • 17
ParallelBench: Understanding the Trade-offs of Parallel Decoding in Diffusion LLMs Paper • 2510.04767 • Published Oct 6, 2025 • 28
AdaSPEC: Selective Knowledge Distillation for Efficient Speculative Decoders Paper • 2510.19779 • Published Oct 22, 2025 • 62
SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse-Linear Attention Paper • 2509.24006 • Published Sep 28, 2025 • 119
SageAttention3: Microscaling FP4 Attention for Inference and An Exploration of 8-Bit Training Paper • 2505.11594 • Published May 16, 2025 • 77