Nemotron 3 Ultra: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning • Paper • 2606.15007 • Published 19 days ago • 16
MaxProof: Scaling Mathematical Proof with Generative-Verifier RL and Population-Level Test-Time Scaling • Paper • 2606.13473 • Published 20 days ago • 91
ResearchClawBench: A Benchmark for End-to-End Autonomous Scientific Research • Paper • 2606.07591 • Published May 28 • 98
Toward Generalist Autonomous Research via Hypothesis-Tree Refinement • Paper • 2606.11926 • Published 21 days ago • 123
SWE-Explore: Benchmarking How Coding Agents Explore Repositories • Paper • 2606.07297 • Published 26 days ago • 121
Imaginative Perception Tokens Enhance Spatial Reasoning in Multimodal Language Models • Paper • 2606.03988 • Published 28 days ago • 126
VibeThinker-3B: Exploring the Frontier of Verifiable Reasoning in Small Language Models • Paper • 2606.16140 • Published 16 days ago • 120
GRAM-R^2: Self-Training Generative Foundation Reward Models for Reward Reasoning • Paper • 2509.02492 • Published Sep 2, 2025 • 2
Nudging Beyond the Comfort Zone: Efficient Strategy-Guided Exploration for RLVR • Paper • 2605.15726 • Published May 15 • 35
Stop When Reasoning Converges: Semantic-Preserving Early Exit for Reasoning Models • Paper • 2605.17672 • Published May 17 • 23
ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference • Paper • 2511.10645 • Published Nov 13, 2025 • 14
ParoQuant • Collection • Pairwise Rotation Quantization for Efficient Reasoning LLM Inference • 24 items • Updated 22 days ago • 27
Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text • Paper • 2601.22975 • Published Jan 30 • 113