Scaling the Horizon, Not the Parameters: Reaching Trillion-Parameter Performance with a 35B Agent Paper • 2606.30616 • Published 2 days ago • 66
Translation as a Bridging Action: Transferring Manipulation Skills from Humans to Robots Paper • 2606.28133 • Published 5 days ago • 35
ResearchClawBench: A Benchmark for End-to-End Autonomous Scientific Research Paper • 2606.07591 • Published May 28 • 99
Beyond Accuracy: Unveiling Inefficiency Patterns in Tool-Integrated Reasoning Paper • 2604.05404 • Published Apr 7 • 46
From Perception to Action: An Interactive Benchmark for Vision Reasoning Paper • 2602.21015 • Published Feb 24 • 24
InternAgent-1.5: A Unified Agentic Framework for Long-Horizon Autonomous Scientific Discovery Paper • 2602.08990 • Published Feb 9 • 79
Toward Efficient Agents: Memory, Tool learning, and Planning Paper • 2601.14192 • Published Jan 20 • 57
Agent Data Protocol: Unifying Datasets for Diverse, Effective Fine-tuning of LLM Agents Paper • 2510.24702 • Published Oct 28, 2025 • 32
Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows Paper • 2512.16969 • Published Dec 18, 2025 • 121
RoboTracer: Mastering Spatial Trace with Reasoning in Vision-Language Models for Robotics Paper • 2512.13660 • Published Dec 15, 2025 • 37
Geometrically-Constrained Agent for Spatial Reasoning Paper • 2511.22659 • Published Nov 27, 2025 • 41
Don't Just Fine-tune the Agent, Tune the Environment Paper • 2510.10197 • Published Oct 11, 2025 • 30
BEAR: Benchmarking and Enhancing Multimodal Language Models for Atomic Embodied Capabilities Paper • 2510.08759 • Published Oct 9, 2025 • 46
LLMs Learn to Deceive Unintentionally: Emergent Misalignment in Dishonesty from Misaligned Samples to Biased Human-AI Interactions Paper • 2510.08211 • Published Oct 9, 2025 • 23
StockBench: Can LLM Agents Trade Stocks Profitably In Real-world Markets? Paper • 2510.02209 • Published Oct 2, 2025 • 57
Understand Before You Generate: Self-Guided Training for Autoregressive Image Generation Paper • 2509.15185 • Published Sep 18, 2025 • 29