Reasoning Cache: Continual Improvement Over Long Horizons via Short-Horizon RL Paper • 2602.03773 • Published Feb 3 • 13
Numina-Lean-Agent: An Open and General Agentic Reasoning System for Formal Mathematics Paper • 2601.14027 • Published Jan 20 • 14
Kimina-Prover Preview: Towards Large Formal Reasoning Models with Reinforcement Learning Paper • 2504.11354 • Published Apr 15, 2025 • 6
Kimina-Prover Preview: Towards Large Formal Reasoning Models with Reinforcement Learning Paper • 2504.11354 • Published Apr 15, 2025 • 6
Kimina-Prover Preview: Towards Large Formal Reasoning Models with Reinforcement Learning Paper • 2504.11354 • Published Apr 15, 2025 • 6
SmolVLM: Redefining small and efficient multimodal models Paper • 2504.05299 • Published Apr 7, 2025 • 207
Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning Paper • 2503.07572 • Published Mar 10, 2025 • 48