From Next-Token to Next-Block: A Principled Adaptation Path for Diffusion LLMs • Paper • 2512.06776 • Published 5 days ago • 20
Beyond Real: Imaginary Extension of Rotary Position Embeddings for Long-Context LLMs • Paper • 2512.07525 • Published 4 days ago • 52
The Path Not Taken: RLVR Provably Learns Off the Principals • Paper • 2511.08567 • Published about 1 month ago • 32
To Think or Not to Think: The Hidden Cost of Meta-Training with Excessive CoT Examples • Paper • 2512.05318 • Published 7 days ago • 2
Reasoning Under 1 Billion: Memory-Augmented Reinforcement Learning for Large Language Models • Paper • 2504.02273 • Published Apr 3 • 7
Structured Linear CDEs: Maximally Expressive and Parallel-in-Time Sequence Models • Paper • 2505.17761 • Published May 23 • 1
The Automated LLM Speedrunning Benchmark: Reproducing NanoGPT Improvements • Paper • 2506.22419 • Published Jun 27 • 15
Tiled Flash Linear Attention: More Efficient Linear RNN and xLSTM Kernels • Paper • 2503.14376 • Published Mar 18 • 1
WPO: Enhancing RLHF with Weighted Preference Optimization • Paper • 2406.11827 • Published Jun 17, 2024 • 16