Tool-R0: Self-Evolving LLM Agents for Tool-Learning from Zero Data • Paper • 2602.21320 • Published 16 days ago • 12
Humans and LLMs Diverge on Probabilistic Inferences • Paper • 2602.23546 • Published 14 days ago • 12
ReMix: Reinforcement routing for mixtures of LoRAs in LLM finetuning • Paper • 2603.10160 • Published 1 day ago • 13
Memex(RL): Scaling Long-Horizon LLM Agents via Indexed Experience Memory • Paper • 2603.04257 • Published 8 days ago • 19
Large Multimodal Models as General In-Context Classifiers • Paper • 2602.23229 • Published 14 days ago • 22
How Controllable Are Large Language Models? A Unified Evaluation across Behavioral Granularities • Paper • 2603.02578 • Published 9 days ago • 23
Reasoning Models Struggle to Control their Chains of Thought • Paper • 2603.05706 • Published 7 days ago • 26
MemSifter: Offloading LLM Memory Retrieval via Outcome-Driven Proxy Reasoning • Paper • 2603.03379 • Published 9 days ago • 28
Progressive Residual Warmup for Language Model Pretraining • Paper • 2603.05369 • Published 7 days ago • 32
Planning in 8 Tokens: A Compact Discrete Tokenizer for Latent World Model • Paper • 2603.05438 • Published 7 days ago • 34
AgentVista: Evaluating Multimodal Agents in Ultra-Challenging Realistic Visual Scenarios • Paper • 2602.23166 • Published 14 days ago • 40
How Far Can Unsupervised RLVR Scale LLM Training? • Paper • 2603.08660 • Published 3 days ago • 44
DARE: Aligning LLM Agents with the R Statistical Ecosystem via Distribution-Aware Retrieval • Paper • 2603.04743 • Published 7 days ago • 47
Thinking to Recall: How Reasoning Unlocks Parametric Knowledge in LLMs • Paper • 2603.09906 • Published 2 days ago • 49
Lost in Stories: Consistency Bugs in Long Story Generation by LLMs • Paper • 2603.05890 • Published 6 days ago • 80
Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders • Paper • 2603.06569 • Published 6 days ago • 99
T2S-Bench & Structure-of-Thought: Benchmarking and Prompting Comprehensive Text-to-Structure Reasoning • Paper • 2603.03790 • Published 8 days ago • 113
Heterogeneous Agent Collaborative Reinforcement Learning • Paper • 2603.02604 • Published 9 days ago • 170