Code-as-Room: Generating 3D Rooms from Top-Down View Images via Agentic Code Synthesis Paper • 2605.18451 • Published 1 day ago • 34
RoboVIP: Multi-View Video Generation with Visual Identity Prompting Augments Robot Manipulation Paper • 2601.05241 • Published Jan 8 • 24
MesaTask: Towards Task-Driven Tabletop Scene Generation via 3D Spatial Reasoning Paper • 2509.22281 • Published Sep 26, 2025 • 33
InternScenes: A Large-scale Simulatable Indoor Scene Dataset with Realistic Layouts Paper • 2509.10813 • Published Sep 13, 2025 • 31
CountGD_Multi-Modal_Open-World_Counting: Count objects in images using text and example boxes Space • Running on T4 • Featured • 123