AIGeeksGroup/3D-R1

Add model card

#1

by nielsr HF Staff - opened Aug 1, 2025

base: refs/heads/main ← from: refs/pr/1

Files changed (1)

README.md +15 -0

@@ -1 +1,16 @@
+---
+pipeline_tag: image-text-to-text
+library_name: transformers
+license: apache-2.0
+---
+# 3D-R1: Enhancing Reasoning in 3D VLMs for Unified Scene Understanding
+**3D-R1** is a foundation model designed to enhance the reasoning capabilities of 3D Vision-Language Models (VLMs) for unified scene understanding. It addresses limitations of existing 3D VLMs in three ways: it uses Scene-30K, a high-quality synthetic dataset; it incorporates RLHF policies with novel reward functions (perception, semantic similarity, and format); and it introduces a dynamic view selection strategy. Together, these aim to make reasoning more robust and to improve generalization in 3D scene understanding.
+The model was presented in the paper:
+-   **Paper**: [3D-R1: Enhancing Reasoning in 3D VLMs for Unified Scene Understanding](https://huggingface.co/papers/2507.23478)
+For more details, visit the project page and code repository:
+-   **Project Page**: [https://aigeeksgroup.github.io/3D-R1](https://aigeeksgroup.github.io/3D-R1)
+-   **Code**: [https://github.com/AIGeeksGroup/3D-R1](https://github.com/AIGeeksGroup/3D-R1)
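
The metadata block this PR adds at the top of README.md can be sanity-checked before merging. The following is a minimal sketch, assuming the front matter stays a flat list of `key: value` pairs between two `---` lines, as it is in this diff. The helper name `parse_front_matter` and the `README_HEAD` string are illustrative and not part of the repository; a real check would more likely use a YAML parser.

```python
# Head of the README as added in this PR (copied from the diff above).
README_HEAD = """---
pipeline_tag: image-text-to-text
library_name: transformers
license: apache-2.0
---
# 3D-R1: Enhancing Reasoning in 3D VLMs for Unified Scene Understanding
"""


def parse_front_matter(text: str) -> dict[str, str]:
    """Return flat key/value pairs from a leading '---' block (hypothetical helper).

    Only handles simple 'key: value' lines; nested YAML is out of scope.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter at all
    meta: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta  # closing delimiter reached
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}  # opening delimiter without a closing one


metadata = parse_front_matter(README_HEAD)
for key, value in metadata.items():
    print(f"{key} = {value}")
```

The three keys recovered this way are the ones the diff introduces: `pipeline_tag`, `library_name`, and `license`.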