Generate, Filter, Control, Replay: A Comprehensive Survey of Rollout Strategies for LLM Reinforcement Learning Paper • 2605.02913 • Published about 1 month ago • 5
Skills-Coach: A Self-Evolving Skill Optimizer via Training-Free GRPO Paper • 2604.27488 • Published 8 days ago • 4
ARIS: Autonomous Research via Adversarial Multi-Agent Collaboration Paper • 2605.03042 • Published 4 days ago • 91
MedSkillAudit: A Domain-Specific Audit Framework for Medical Research Agent Skills Paper • 2604.20441 • Published 16 days ago • 2
Lightning Unified Video Editing via In-Context Sparse Attention Paper • 2605.04569 • Published 2 days ago • 11
Awaking Spatial Intelligence in Unified Multimodal Understanding and Generation Paper • 2605.04128 • Published 3 days ago • 8
Rethinking Reasoning-Intensive Retrieval: Evaluating and Advancing Retrievers in Agentic Search Systems Paper • 2605.04018 • Published 3 days ago • 27
OpenSearch-VL: An Open Recipe for Frontier Multimodal Search Agents Paper • 2605.05185 • Published 2 days ago • 80
D-OPSD: On-Policy Self-Distillation for Continuously Tuning Step-Distilled Diffusion Models Paper • 2605.05204 • Published 2 days ago • 19
Stream-T1: Test-Time Scaling for Streaming Video Generation Paper • 2605.04461 • Published 2 days ago • 93
Stream-R1: Reliability-Perplexity Aware Reward Distillation for Streaming Video Generation Paper • 2605.03849 • Published 3 days ago • 108
SVGS: Enhancing Gaussian Splatting Using Primitives with Spatially Varying Colors Paper • 2411.18966 • Published 4 days ago • 7
HeavySkill: Heavy Thinking as the Inner Skill in Agentic Harness Paper • 2605.02396 • Published 4 days ago • 16
Beyond SFT-to-RL: Pre-alignment via Black-Box On-Policy Distillation for Multimodal RL Paper • 2604.28123 • Published 7 days ago • 40
SymptomAI: Towards a Conversational AI Agent for Everyday Symptom Assessment Paper • 2605.04012 • Published 3 days ago • 8
OpenSeeker-v2: Pushing the Limits of Search Agents with Informative and High-Difficulty Trajectories Paper • 2605.04036 • Published 3 days ago • 55
Persistent Visual Memory: Sustaining Perception for Deep Generation in LVLMs Paper • 2605.00814 • Published 7 days ago • 19
T^2PO: Uncertainty-Guided Exploration Control for Stable Multi-Turn Agentic Reinforcement Learning Paper • 2605.02178 • Published 4 days ago • 6
Motion-Aware Caching for Efficient Autoregressive Video Generation Paper • 2605.01725 • Published 5 days ago • 5