EfficientRollout: System-Aware Self-Speculative Decoding for RL Rollouts Paper • 2606.18967 • Published 9 days ago • 24
Learning, Fast and Slow: Towards LLMs That Adapt Continually Paper • 2605.12484 • Published May 12 • 18
Nietzsche6700/NW1110-draft-Llama-3.1-8B-Instruct-target-Llama-3.1-70B-Instruct-732 2B • Updated Mar 27 • 5
Squeezed Attention: Accelerating Long Context Length LLM Inference Paper • 2411.09688 • Published Nov 14, 2024 • 1
Arbitrage: Efficient Reasoning via Advantage-Aware Speculation Paper • 2512.05033 • Published Dec 4, 2025 • 17
AdaSPEC: Selective Knowledge Distillation for Efficient Speculative Decoders Paper • 2510.19779 • Published Oct 22, 2025 • 62