ImageWAM: Do World Action Models Really Need Video Generation, or Just Image Editing? Paper • 2606.19531 • Published 8 days ago • 18
DreamX-World 1.0: A General-Purpose Interactive World Model Paper • 2606.16993 • Published 10 days ago • 110
JoyAI-VL-Interaction: Real-Time Vision-Language Interaction Intelligence Paper • 2606.14777 • Published 15 days ago • 200
InterleaveThinker: Reinforcing Agentic Interleaved Generation Paper • 2606.13679 • Published 14 days ago • 80
Lens: Rethinking Training Efficiency for Foundational Text-to-Image Models Paper • 2605.21573 • Published May 20 • 111
Video2GUI: Synthesizing Large-Scale Interaction Trajectories for Generalized GUI Agent Pretraining Paper • 2605.14747 • Published May 14 • 147
InsightTok: Improving Text and Face Fidelity in Discrete Tokenization for Autoregressive Image Generation Paper • 2605.14333 • Published May 14 • 35
HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds Paper • 2604.14268 • Published Apr 15 • 127
Seedance 2.0: Advancing Video Generation for World Complexity Paper • 2604.14148 • Published Apr 15 • 166
Strips as Tokens: Artist Mesh Generation with Native UV Segmentation Paper • 2604.09132 • Published Apr 10 • 56
Omni123: Exploring 3D Native Foundation Models with Limited 3D Data by Unifying Text to 2D and 3D Generation Paper • 2604.02289 • Published Apr 2 • 15
WorldCam: Interactive Autoregressive 3D Gaming Worlds with Camera Pose as a Unifying Geometric Representation Paper • 2603.16871 • Published Mar 17 • 61
HiFi-Inpaint: Towards High-Fidelity Reference-Based Inpainting for Generating Detail-Preserving Human-Product Images Paper • 2603.02210 • Published Mar 2 • 30
Code2World: A GUI World Model via Renderable Code Generation Paper • 2602.09856 • Published Feb 10 • 201
Agent Banana: High-Fidelity Image Editing with Agentic Thinking and Tooling Paper • 2602.09084 • Published Feb 9 • 30