---
license: mit
pipeline_tag: image-to-3d
---

# Gen-3Diffusion: Realistic Image-to-3D Generation via 2D & 3D Diffusion Synergy

This repository contains the `Gen-3Diffusion` model, which achieves realistic image-to-3D generation by leveraging a pre-trained 2D diffusion model and a 3D diffusion model, as presented in the paper:

[**Gen-3Diffusion: Realistic Image-to-3D Generation via 2D & 3D Diffusion Synergy**](https://huggingface.co/papers/2412.06698)

Project Page: [https://yuxuan-xue.com/gen-3diffusion](https://yuxuan-xue.com/gen-3diffusion)

Code: [https://github.com/YuxuanSnow/Gen3Diffusion](https://github.com/YuxuanSnow/Gen3Diffusion)

![](https://github.com/YuxuanSnow/Gen3Diffusion/blob/main/assets/teaser_video.gif)

## Key Insight :raised_hands:

- 2D foundation models are powerful, but their output lacks 3D consistency!
- 3D generative models can reconstruct 3D representations, but they generalize poorly!
- How can 2D foundation models be combined with 3D generative models?
  - Both are diffusion-based generative models => **they can be synchronized at each diffusion step**
  - The 2D foundation model helps 3D generation => **it provides a strong prior on 3D shape**
  - The 3D representation guides 2D diffusion sampling => **the rendered output of the 3D reconstruction is used for reverse sampling, where 3D consistency is guaranteed**

## Install

Gen-3Diffusion uses the same Conda environment as Human-3Diffusion. Skip this step if you have already installed it.

```bash
# Conda environment
conda create -n gen3diffusion python=3.10
conda activate gen3diffusion
pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu121
pip install xformers==0.0.22.post4 --index-url https://download.pytorch.org/whl/cu121

# Gaussian Opacity Fields
git clone https://github.com/YuxuanSnow/gaussian-opacity-fields.git
cd gaussian-opacity-fields && pip install submodules/diff-gaussian-rasterization
pip install submodules/simple-knn/ && cd ..
export CPATH=/usr/local/cuda-12.1/targets/x86_64-linux/include:$CPATH

# Dependencies
pip install -r requirements.txt

# TSDF Fusion (Mesh extraction) Dependencies
pip install --user numpy opencv-python scikit-image numba
pip install --user pycuda
pip install scipy==1.11
```

## Pretrained Weights

Our pretrained weights can be downloaded from Hugging Face.

```bash
mkdir checkpoints_obj && cd checkpoints_obj
wget https://huggingface.co/yuxuanx/gen3diffusion/resolve/main/model.safetensors
wget https://huggingface.co/yuxuanx/gen3diffusion/resolve/main/model_1.safetensors
wget https://huggingface.co/yuxuanx/gen3diffusion/resolve/main/pifuhd.pt
cd ..
```

The avatar reconstruction module is the same as in Human-3Diffusion. Skip this step if you have already installed Human-3Diffusion.

```bash
mkdir checkpoints_avatar && cd checkpoints_avatar
wget https://huggingface.co/yuxuanx/human3diffusion/resolve/main/model.safetensors
wget https://huggingface.co/yuxuanx/human3diffusion/resolve/main/model_1.safetensors
wget https://huggingface.co/yuxuanx/human3diffusion/resolve/main/pifuhd.pt
cd ..
```

## Inference

```bash
# given one image of an object, generate a 3D-GS object
# the subject should be centered in a square image, please crop properly
# recentering plays a huge role in object reconstruction.
# Please adjust the recentering if the reconstruction doesn't work well
python infer.py --test_imgs test_imgs_obj --output output_obj --checkpoints checkpoints_obj

# given the generated 3D-GS, perform TSDF mesh extraction
python infer_mesh.py --test_imgs test_imgs_obj --output output_obj --checkpoints checkpoints_obj --mesh_quality high
```

```bash
# given one image of a human, generate a 3D-GS avatar
# the subject should be centered in a square image, please crop properly
python infer.py --test_imgs test_imgs_avatar --output output_avatar --checkpoints checkpoints_avatar

# given the generated 3D-GS, perform TSDF mesh extraction
python infer_mesh.py --test_imgs test_imgs_avatar --output output_avatar --checkpoints checkpoints_avatar --mesh_quality high
```

## Citation :writing_hand:

```bibtex
@article{xue2024gen3diffusion,
  title   = {{Gen-3Diffusion: Realistic Image-to-3D Generation via 2D \& 3D Diffusion Synergy}},
  author  = {Xue, Yuxuan and Xie, Xianghui and Marin, Riccardo and Pons-Moll, Gerard},
  journal = {Arxiv},
  year    = {2024},
}
```
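
## Input Preparation Sketch

The Inference section asks for a square image with the subject centered. The snippet below is a minimal sketch of how such a crop box could be computed from a subject bounding box. It is not the recentering logic used by `infer.py`. The helper names `square_crop_box` and `padding_for_box`, the `margin_px` parameter, and the example numbers are all hypothetical and chosen only for illustration.

```python
def square_crop_box(bbox, margin_px=0):
    """Return a square (left, top, right, bottom) box centered on bbox.

    bbox is the subject's bounding box in pixels: (left, top, right, bottom).
    margin_px is extra space added on every side (hypothetical parameter).
    The returned box may extend beyond the image, in which case the image
    needs padding before cropping.
    """
    left, top, right, bottom = bbox
    if right <= left or bottom <= top:
        raise ValueError("bbox must have positive width and height")
    # The longer side of the subject decides the size of the square.
    side = max(right - left, bottom - top) + 2 * margin_px
    # Integer arithmetic keeps the box exactly square.
    x0 = (left + right - side) // 2
    y0 = (top + bottom - side) // 2
    return (x0, y0, x0 + side, y0 + side)


def padding_for_box(box, width, height):
    """Return (left, top, right, bottom) padding so box fits in the image."""
    left, top, right, bottom = box
    return (
        max(0, -left),
        max(0, -top),
        max(0, right - width),
        max(0, bottom - height),
    )


if __name__ == "__main__":
    # Illustrative numbers: a 400x500 image with a tall subject.
    box = square_crop_box((100, 50, 300, 450), margin_px=20)
    print(box)
    print(padding_for_box(box, width=400, height=500))
```

An imaging library can then pad the image by the reported amounts and crop to the box. If reconstruction quality is poor, adjusting the margin (and thus how much of the frame the subject fills) is one knob to try, in line with the recentering advice above.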