IV-CoT: Implicit Visual Chain-of-Thought for Structure-Aware Text-to-Image Generation • Paper • 2606.24849 • Published 3 days ago • 13
FLAT: Feedforward Latent Triangle Splatting for Geometrically Accurate Scene Generation • Paper • 2606.24876 • Published 3 days ago • 16
Guava: An Effective and Universal Harness for Embodied Manipulation • Paper • 2606.18363 • Published 10 days ago • 28
From Trainee to Trainer: LLM-Designed Training Environment for RL with Multi-Agent Reasoning • Paper • 2606.17682 • Published 10 days ago • 26
FORT-Searcher: Synthesizing Shortcut-Resistant Search Tasks for Training Deep Search Agents • Paper • 2606.12087 • Published 16 days ago • 75
WeaveBench: A Long-Horizon, Real-World Benchmark for Computer-Use Agents with Hybrid Interfaces • Paper • 2606.09426 • Published 18 days ago • 102
i1: A Simple and Fully Open Recipe for Strong Text-to-Image Models • Paper • 2606.11289 • Published 17 days ago • 16
World Pilot: Steering Vision-Language-Action Models with World-Action Priors • Paper • 2606.12403 • Published 16 days ago • 26
Rethinking the Divergence Regularization in LLM RL • Paper • 2606.09821 • Published 18 days ago • 33
VideoMDM: Towards 3D Human Motion Generation From 2D Supervision • Paper • 2606.13364 • Published 15 days ago • 20
InternVideo3: Agentify Foundation Models with Multimodal Contextual Reasoning • Paper • 2606.12195 • Published 16 days ago • 23
OmniVideo-100K: A Dataset for Audio-Visual Reasoning through Structured Scripts and Evidence Chains • Paper • 2606.14702 • Published 14 days ago • 31
World Model Self-Distillation: Training World Models to Solve General Tasks • Paper • 2606.12072 • Published 16 days ago • 14
MMAE: A Massive Multitask Audio Editing Benchmark • Paper • 2606.07229 • Published 21 days ago • 45
LatentSkill: From In-Context Textual Skills to In-Weight Latent Skills for LLM Agents • Paper • 2606.06087 • Published 22 days ago • 64
Flow-DPPO: Divergence Proximal Policy Optimization for Flow Matching Models • Paper • 2606.11025 • Published 17 days ago • 41
FlashMemory-DeepSeek-V4: Lightning Index Ultra-Long Context via Lookahead Sparse Attention • Paper • 2606.09079 • Published 18 days ago • 62