Dense Reward for Multi-View 3D Reasoning with Global Maps and Local Views Paper • 2606.23557 • Published 4 days ago • 5
WaymoQA: A Multi-View Visual Question Answering Dataset for Safety-Critical Reasoning in Autonomous Driving Paper • 2511.20022 • Published Feb 11
GameCraft-Bench: Can Agents Build Playable Games End-to-End in a Real Game Engine? Paper • 2606.17861 • Published 10 days ago • 55
Mitigating Perceptual Judgment Bias in Multimodal LLM-as-a-Judge via Perceptual Perturbation and Reward Modeling Paper • 2606.02578 • Published 25 days ago • 6
Scribble-Guided Diffusion for Training-free Text-to-Image Generation Paper • 2409.08026 • Published Sep 12, 2024
3D-Aware Vision-Language Models Fine-Tuning with Geometric Distillation Paper • 2506.09883 • Published Jun 11, 2025 • 1
What "Not" to Detect: Negation-Aware VLMs via Structured Reasoning and Token Merging Paper • 2510.13232 • Published Oct 15, 2025 • 1
FiT3D: Visualize and compare 2D and 3D-aware feature representations of images Space • Running on Zero • 25
DreamCatalyst: Fast and High-Quality 3D Editing via Controlling Editability and Identity Preservation Paper • 2407.11394 • Published Jul 16, 2024 • 12