TUA-Bench: A Benchmark for General-Purpose Terminal-Use Agents Paper • 2606.28480 • Published 7 days ago • 44
LatentSkill: From In-Context Textual Skills to In-Weight Latent Skills for LLM Agents Paper • 2606.06087 • Published 29 days ago • 66
AffordanceVLA: A Vision-Language-Action Model Empowering Action Generation through Affordance-Aware Understanding Paper • 2606.06155 • Published 29 days ago • 10
Neural Networks Provably Learn Spectral Representations for Group Composition Paper • 2606.02993 • Published about 1 month ago • 6
Small RL Controller, Large Language Model: RL-Guided Adaptive Sampling for Test-Time Scaling Paper • 2606.03102 • Published about 1 month ago • 14
Speculative Pipeline Decoding: Higher-Accuracy and Zero-Bubble Speculation via Pipeline Parallelism Paper • 2605.30852 • Published May 29 • 10
ZeroUnlearn: Few-Shot Knowledge Unlearning in Large Language Models Paper • 2605.18879 • Published May 20 • 8
Agent Explorative Policy Optimization for Multimodal Agentic Reasoning Paper • 2605.28774 • Published May 27 • 93
Share More, Search Less: Collaborative Parallel Thinking for Efficient Test-Time Scaling Paper • 2605.27030 • Published May 26 • 32
SkillOpt: Executive Strategy for Self-Evolving Agent Skills Paper • 2605.23904 • Published May 22 • 249
You Only Need Minimal RLVR Training: Extrapolating LLMs via Rank-1 Trajectories Paper • 2605.21468 • Published May 20 • 51
G-Zero: Self-Play for Open-Ended Generation from Zero Data Paper • 2605.09959 • Published May 11 • 17
LLMs Improving LLMs: Agentic Discovery for Test-Time Scaling Paper • 2605.08083 • Published May 8 • 70
Beyond Semantic Similarity: Rethinking Retrieval for Agentic Search via Direct Corpus Interaction Paper • 2605.05242 • Published May 3 • 126
Nonsense Helps: Prompt Space Perturbation Broadens Reasoning Exploration Paper • 2605.05566 • Published May 7 • 38