Value-Aware Stochastic KV Cache Eviction for Reasoning Models Paper • 2606.03928 • Published 25 days ago • 8
Convergent Evolution: How Different Language Models Learn Similar Number Representations Paper • 2604.20817 • Published Apr 22 • 8
FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization Paper • 2603.19835 • Published Mar 20 • 352