haoningwu/SceneGen

haoningwu committed on Dec 15, 2025

Commit 7995e55 (verified) · 1 parent: be4749b

Update README.md

Files changed (1): README.md +61 -22

@@ -5,11 +5,11 @@ language:
 - en
 ---
-# SceneGen: Single-Image 3D Scene Generation in One Feedforward Pass
-This repository contains the official PyTorch implementation of SceneGen: https://arxiv.org/abs/2508.15769/. Feel free to reach out for discussions!
-**Now the Inference Code and Pretrained Models are released!**
 <div align="center">
    <img src="./assets/SceneGen.png">
@@ -19,6 +19,9 @@ This repository contains the official PyTorch implementation of SceneGen: https:
 [Project Page](https://mengmouxu.github.io/SceneGen/) · [Paper](https://arxiv.org/abs/2508.15769/) · [Checkpoints](https://huggingface.co/haoningwu/SceneGen/)
 ## ⏩ News
 - [2025.8] The inference code and checkpoints are released.
 - [2025.8] Our pre-print paper has been released on arXiv.
@@ -82,7 +85,7 @@ This script launches a Gradio web interface for interactive scene generation.
   >
   > ### 🗃️ Step 2: Manage Cache
   > 1.  Click **"Add to Cache"** when satisfied with the segmentation.
-  > 2.  Repeat Step 1-2 for multiple images.
   > 3.  Use **"Delete Selected"** or **"Clear All"** to manage cached images.
   >
   > ### 🎮 Step 3: Generate Scene
@@ -92,6 +95,11 @@ This script launches a Gradio web interface for interactive scene generation.
   >
   > **💡 Pro Tip:**  Try the examples below to get started quickly!
 ### Pre-segmented Image Inference
 This script processes a directory of pre-segmented images.
 - **Input**: The input folder structure should be similar to `assets/masked_image_test`, containing segmented scene images.
@@ -102,33 +110,64 @@ This script processes a directory of pre-segmented images.
   ```
 ## 📚 Dataset
-To be updated soon...
 ## 🏋️‍♂️ Training
-To be updated soon...
-## Evaluation
-To be updated soon...
 ## 📜 Citation
 If you use this code and data for your research or project, please cite:
-    @article{meng2025scenegen,
-      author    = {Meng, Yanxu and Wu, Haoning and Zhang, Ya and Xie, Weidi},
-      title     = {SceneGen: Single-Image 3D Scene Generation in One Feedforward Pass},
-      journal   = {arXiv preprint arXiv:2508.15769},
-      year      = {2025},
-    }
 ## TODO
 - [x] Release Paper
 - [x] Release Checkpoints & Inference Code
-- [ ] Release Training Code
-- [ ] Release Evaluation Code
-- [ ] Release Data Processing Code
 ## Acknowledgements
 Many thanks to the code bases from [TRELLIS](https://github.com/microsoft/TRELLIS), [DINOv2](https://github.com/facebookresearch/dinov2), and [VGGT](https://github.com/facebookresearch/vggt).
 ## Contact
-If you have any questions, please feel free to contact [meng-mou-xu@sjtu.edu.cn](mailto:meng-mou-xu@sjtu.edu.cn) and [haoningwu3639@gmail.com](mailto:haoningwu3639@gmail.com).

 - en
 ---
+# SceneGen: Single-Image 3D Scene Generation in One Feedforward Pass (3DV 2026)
+This repository contains the official PyTorch implementation of SceneGen: https://arxiv.org/abs/2508.15769/.
+**Now the Training, Inference Code, and Pretrained Models have all been released! Feel free to reach out for discussions!**
 <div align="center">
    <img src="./assets/SceneGen.png">
 [Project Page](https://mengmouxu.github.io/SceneGen/) · [Paper](https://arxiv.org/abs/2508.15769/) · [Checkpoints](https://huggingface.co/haoningwu/SceneGen/)
 ## ⏩ News
+- [2025.11] Evaluation code has been released.
+- [2025.11] Glad to share that SceneGen has been accepted to 3DV 2026.
+- [2025.9] Our training code and data processing code are released.
 - [2025.8] The inference code and checkpoints are released.
 - [2025.8] Our pre-print paper has been released on arXiv.
   >
   > ### 🗃️ Step 2: Manage Cache
   > 1.  Click **"Add to Cache"** when satisfied with the segmentation.
+  > 2.  Repeat Steps 1-2 for multiple images.
   > 3.  Use **"Delete Selected"** or **"Clear All"** to manage cached images.
   >
   > ### 🎮 Step 3: Generate Scene
   >
   > **💡 Pro Tip:**  Try the examples below to get started quickly!
+https://github.com/user-attachments/assets/d0d53506-70cd-4bd3-a6ab-2f9b5b16f4d8
+*Click the image above to watch the demo video*
 ### Pre-segmented Image Inference
 This script processes a directory of pre-segmented images.
 - **Input**: The input folder structure should be similar to `assets/masked_image_test`, containing segmented scene images.
   ```
 ## 📚 Dataset
+To train and evaluate SceneGen, we use the [3D-FUTURE](https://tianchi.aliyun.com/dataset/98063) dataset. Please download and preprocess the dataset as follows:
+1. Download the 3D-FUTURE dataset from [here](https://tianchi.aliyun.com/dataset/98063), which requires applying for access.
+2. Follow the [TRELLIS](https://github.com/microsoft/TRELLIS) data processing instructions to preprocess the dataset. Make sure to follow their directory structure for compatibility and fully generate the necessary files and ``metadata.csv``.
+3. Run the ``dataset_toolkits/build_metadata_scene.py`` script to create the scene-level metadata file:
+    ```sh
+    python dataset_toolkits/build_metadata_scene.py 3D-FUTURE \
+        --output_dir <path_to_3D-FUTURE> \
+        --set <train or test> \
+        --vggt_ckpt checkpoints/VGGT-1B --save_mask
+    ```
+    This will generate a `metadata_scene.csv` file or a `metadata_scene_test.csv` file in the specified dataset directory.
+4. For evaluation, run the ``dataset_toolkits/build_scene.sh`` script to render a scene image for each scene (this requires Blender to be installed and the configs in the script to be set correctly):
+    ```sh
+    bash dataset_toolkits/build_scene.sh
+    ```
+    This will create a `scene_test_render` folder in the dataset directory containing the rendered images of the test scenes with Blender, which will be further used for evaluation.
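Before moving on to training or evaluation, it can help to confirm that the preprocessing steps above produced the files they describe. The following is a minimal sketch, not part of the SceneGen code base: the helper name `missing_dataset_outputs` is hypothetical, and the mapping of `train` to `metadata_scene.csv` and `test` to `metadata_scene_test.csv` is an assumption inferred from the file names.

```python
from pathlib import Path

# Assumed mapping from the --set value to the metadata file name. The steps
# above list both file names but do not state which split produces which.
METADATA_BY_SPLIT = {
    "train": "metadata_scene.csv",
    "test": "metadata_scene_test.csv",
}


def missing_dataset_outputs(dataset_root, split):
    """Return the expected preprocessing outputs that are absent under dataset_root."""
    root = Path(dataset_root)
    expected = [root / METADATA_BY_SPLIT[split]]
    if split == "test":
        # Step 4 renders the test scenes into this folder for evaluation.
        expected.append(root / "scene_test_render")
    return [str(path.relative_to(root)) for path in expected if not path.exists()]
```

Calling `missing_dataset_outputs("<path_to_3D-FUTURE>", "test")` returns an empty list once both the scene-level metadata and the rendered test scenes are in place.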
 ## 🏋️‍♂️ Training
+With the processed 3D-FUTURE dataset and the pretrained `ss_flow_img_dit_L_16l8_fp16.safetensors` model checkpoint from [TRELLIS](https://huggingface.co/microsoft/TRELLIS-image-large) correctly placed in the `checkpoints/scenegen/ckpts` directory, you can train SceneGen using the following command:
+```
+bash scripts/train.sh
+```
+For detailed training configurations, please refer to `configs/generation/ss_scenegen_flow_img_train.json` and change the parameters as needed.
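One way to change parameters without hand-editing the JSON file is to apply overrides programmatically and write the result to a copy. This is a minimal sketch under stated assumptions: the helpers `apply_overrides` and `write_variant` are hypothetical, and the dotted keys in the usage note are placeholders, since the actual keys in `configs/generation/ss_scenegen_flow_img_train.json` are not listed here.

```python
import copy
import json
from pathlib import Path


def apply_overrides(config, overrides):
    """Return a copy of config with dotted-key overrides applied.

    Intermediate dictionaries are created when a parent key is missing.
    The input config is left unmodified.
    """
    result = copy.deepcopy(config)
    for dotted_key, value in overrides.items():
        *parents, leaf = dotted_key.split(".")
        node = result
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = value
    return result


def write_variant(src_path, dst_path, overrides):
    """Load a JSON config, apply overrides, and write the result to dst_path."""
    config = json.loads(Path(src_path).read_text())
    updated = apply_overrides(config, overrides)
    Path(dst_path).write_text(json.dumps(updated, indent=2))
```

For example, `write_variant(src, dst, {"trainer.args.max_steps": 1000})` would write a modified copy, where `trainer.args.max_steps` is an illustrative key only. How `scripts/train.sh` selects its config file is not stated here, so check the script before relying on a copied config.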
+## 🧪 Evaluation
+To generate the 3D scenes on the 3D-FUTURE test set using the SceneGen model, use the following command:
+```
+bash scenegen_eval.sh
+```
+This runs the `scenegen_eval.py` script to generate the normalized scenes.
+To evaluate the trained SceneGen model on the 3D-FUTURE test set, use the following command:
+```
+cd evalscene
+bash eval_scenegen.sh
+```
+Make sure that the processed 3D-FUTURE dataset and the rendered images are in place as described in the Dataset section, and that the evaluation configs in `evalscene/configs/test/scene_evaluation_scenegen.yaml` are set correctly. The evaluation script then computes metrics between the normalized generated scenes and the ground truth.
+Some packages used in the evaluation require additional installation. Please install the packages: `torchmetrics`, `lpips`, `clip`, and `probreg` via pip.
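A quick way to see which of these packages still need installing is to probe for them before launching the evaluation. This is a minimal sketch, assuming each package's import name matches the name listed above; the helper `missing_packages` is hypothetical and not part of the repository.

```python
import importlib.util

# Package names as listed for the evaluation; import names are assumed to match.
EVAL_PACKAGES = ["torchmetrics", "lpips", "clip", "probreg"]


def missing_packages(names=EVAL_PACKAGES):
    """Return the top-level modules from names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

Running `missing_packages()` in the evaluation environment returns the subset that is not yet importable; an empty list means all four were found.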
 ## 📜 Citation
 If you use this code and data for your research or project, please cite:
+```
+   @inproceedings{meng2026scenegen,
+     author    = {Meng, Yanxu and Wu, Haoning and Zhang, Ya and Xie, Weidi},
+     title     = {SceneGen: Single-Image 3D Scene Generation in One Feedforward Pass},
+     booktitle = {International Conference on 3D Vision 2026},
+     year      = {2026},
+   }
+```
 ## TODO
 - [x] Release Paper
 - [x] Release Checkpoints & Inference Code
+- [x] Release Training Code
+- [x] Release Data Processing Code
+- [x] Release Evaluation Code
 ## Acknowledgements
 Many thanks to the code bases from [TRELLIS](https://github.com/microsoft/TRELLIS), [DINOv2](https://github.com/facebookresearch/dinov2), and [VGGT](https://github.com/facebookresearch/vggt).
 ## Contact
+If you have any questions, please feel free to contact [meng-mou-xu@sjtu.edu.cn](mailto:meng-mou-xu@sjtu.edu.cn) and [haoningwu3639@gmail.com](mailto:haoningwu3639@gmail.com).