MVTrack4Gen: Multi-View Point Tracking as Geometric Supervision for 4D Video Generation Paper • 2606.26087 • Published 1 day ago • 24
ShutterMuse: Capture-Time Photography Guidance with MLLMs Paper • 2606.25763 • Published 1 day ago • 36
FLAT: Feedforward Latent Triangle Splatting for Geometrically Accurate Scene Generation Paper • 2606.24876 • Published 3 days ago • 15
AOHP: An Open-Source OS-Level Agent Harness for Personalized, Efficient and Secure Interaction Paper • 2606.23449 • Published 4 days ago • 26
NatureBench: Can Coding Agents Match the Published SOTA of Nature-Family Papers? Paper • 2606.24530 • Published 3 days ago • 54
Qwen-AgentWorld: Language World Models for General Agents Paper • 2606.24597 • Published 3 days ago • 107
PlanBench-XL: Evaluating Long-Horizon Planning of LLM Tool-Use Agents in Large-Scale Tool Ecosystems Paper • 2606.22388 • Published 5 days ago • 90
KaLM-Reranker-V1: Fast but Not Late Interaction for Compressed Document Reranking Paper • 2606.22807 • Published 4 days ago • 44
PerceptionDLM: Parallel Region Perception with Multimodal Diffusion Language Models Paper • 2606.19534 • Published 9 days ago • 61
DragMesh-2: Physically Plausible Dexterous Hand-Object Interaction with Articulated Objects Paper • 2606.15133 • Published 13 days ago • 72
Moebius: 0.2B Lightweight Image Inpainting Framework with 10B-Level Performance Paper • 2606.19195 • Published 9 days ago • 135
PhoneHarness: Harnessing Phone-Use Agents through Mixed GUI, CLI, and Tool Actions Paper • 2606.14832 • Published 14 days ago • 12
Guava: An Effective and Universal Harness for Embodied Manipulation Paper • 2606.18363 • Published 10 days ago • 28
Beyond the Current Observation: Evaluating Multimodal Large Language Models in Controllable Non-Markov Games Paper • 2606.19338 • Published 9 days ago • 46
MolmoMotion: Forecasting Point Trajectories in 3D with Language Instruction Paper • 2606.18558 • Published 9 days ago • 50