Rethinking Generative Recommender Tokenizer: Recsys-Native Encoding and Semantic Quantization Beyond LLMs Paper • 2602.02338 • Published 6 days ago • 40
LatentMem: Customizing Latent Memory for Multi-Agent Systems Paper • 2602.03036 • Published 5 days ago • 12
Reinforcement World Model Learning for LLM-based Agents Paper • 2602.05842 • Published 3 days ago • 17
Length-Unbiased Sequence Policy Optimization: Revealing and Controlling Response Length Variation in RLVR Paper • 2602.05261 • Published 3 days ago • 45
Dr. Kernel: Reinforcement Learning Done Right for Triton Kernel Generations Paper • 2602.05885 • Published 3 days ago • 20
SAGE: Benchmarking and Improving Retrieval for Deep Research Agents Paper • 2602.05975 • Published 2 days ago • 10
Training Data Efficiency in Multimodal Process Reward Models Paper • 2602.04145 • Published 4 days ago • 73
TIDE: Trajectory-based Diagnostic Evaluation of Test-Time Improvement in LLM Agents Paper • 2602.02196 • Published 6 days ago • 31
VLS: Steering Pretrained Robot Policies via Vision-Language Models Paper • 2602.03973 • Published 4 days ago • 20
WideSeek-R1: Exploring Width Scaling for Broad Information Seeking via Multi-Agent Reinforcement Learning Paper • 2602.04634 • Published 4 days ago • 85
HySparse: A Hybrid Sparse Attention Architecture with Oracle Token Selection and KV Cache Sharing Paper • 2602.03560 • Published 5 days ago • 40
Research on World Models Is Not Merely Injecting World Knowledge into Specific Tasks Paper • 2602.01630 • Published 6 days ago • 46
AOrchestra: Automating Sub-Agent Creation for Agentic Orchestration Paper • 2602.03786 • Published 4 days ago • 81
Rethinking the Trust Region in LLM Reinforcement Learning Paper • 2602.04879 • Published 3 days ago • 29