NatureBench: Can Coding Agents Match the Published SOTA of Nature-Family Papers? Paper • 2606.24530 • Published 5 days ago • 60
view article Article PhysicsIntern: from an Autonomous Benchmark-runner to a Research Sidekick dlouapre • 16 days ago • 7
ResearchClawBench: A Benchmark for End-to-End Autonomous Scientific Research Paper • 2606.07591 • Published about 1 month ago • 97
COLLEAGUE.SKILL: Automated AI Skill Generation via Expert Knowledge Distillation Paper • 2605.31264 • Published 30 days ago • 120
AutoResearchClaw: Self-Reinforcing Autonomous Research with Human-AI Collaboration Paper • 2605.20025 • Published May 19 • 190
Teaching Thinking Models to Reason with Tools: A Full-Pipeline Recipe for Tool-Integrated Reasoning Paper • 2605.06326 • Published May 7 • 26
PRBench: End-to-end Paper Reproduction in Physics Research Paper • 2603.27646 • Published Mar 29 • 29
Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale Paper • 2603.25040 • Published Mar 26 • 134
LongCat-Flash-Prover: Advancing Native Formal Reasoning via Agentic Tool-Integrated Reinforcement Learning Paper • 2603.21065 • Published Mar 22 • 78
OpenSeeker: Democratizing Frontier Search Agents by Fully Open-Sourcing Training Data Paper • 2603.15594 • Published Mar 16 • 150
MOOSE-Star: Unlocking Tractable Training for Scientific Discovery by Breaking the Complexity Barrier Paper • 2603.03756 • Published Mar 4 • 90
Cognitive Foundations for Reasoning and Their Manifestation in LLMs Paper • 2511.16660 • Published Nov 20, 2025 • 11
DataChef: Cooking Up Optimal Data Recipes for LLM Adaptation via Reinforcement Learning Paper • 2602.11089 • Published Feb 11 • 18
P1-VL: Bridging Visual Perception and Scientific Reasoning in Physics Olympiads Paper • 2602.09443 • Published Feb 10 • 59
InternAgent-1.5: A Unified Agentic Framework for Long-Horizon Autonomous Scientific Discovery Paper • 2602.08990 • Published Feb 9 • 79