---
license: other
license_name: snap-non-commercial-license
license_link: LICENSE
datasets:
- allenai/objaverse
language:
- en
pipeline_tag: image-to-3d
---
## Model Details

GTR is a large 3D reconstruction model that takes multi-view images as input and generates high-quality meshes with faithful texture reconstruction within seconds.

## Model Description

- **Developed by:** [Snap Research](https://github.com/snap-research)
- **License:** [snap-non-commercial-license](https://huggingface.co/snap-research/gtr/blob/main/LICENSE)

## Model Sources

- **Repository:** [https://github.com/snap-research/snap_gtr/tree/main](https://github.com/snap-research/snap_gtr/tree/main)
- **Paper:** [https://arxiv.org/abs/2406.05649](https://arxiv.org/abs/2406.05649)

## How to Get Started with the Model

Use the code below to get started with the model.

### Installation

We recommend using `Python>=3.10`, `PyTorch==2.7.0`, and `CUDA>=12.4`.

```bash
conda create --name gtr python=3.10
conda activate gtr
pip install -U pip

pip install torch==2.7.0 torchvision==0.22.0 torchmetrics==1.2.1 --index-url https://download.pytorch.org/whl/cu124
pip install -U xformers --index-url https://download.pytorch.org/whl/cu124

pip install -r requirements.txt
```

### How to Use

Please download the model checkpoint from [here](https://drive.google.com/file/d/1ITVqdDLmY5EISj4vsZ2O4sN5mZv9fUfB/view?usp=sharing) and put it under the `ckpts/` directory.

We provide multi-view grid data examples under `./examples/`, generated with [Zero123++](https://github.com/SUDO-AI-3D/zero123plus). Our inference script loads the pretrained checkpoint, runs fast texture refinement, reconstructs the textured mesh from the multi-view grid data, and exports it. The output folder will contain three files: the exported mesh in `.obj` format, a rotating GIF of the mesh, and a rotating GIF of the NeRF rendering.
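The exported `.obj` is a plain-text Wavefront file, so it can be inspected without any mesh library; a minimal sketch (ours, not from the repo) that counts vertices and faces:

```python
def count_obj(path: str) -> tuple[int, int]:
    """Count vertex ('v') and face ('f') records in a Wavefront .obj file."""
    n_vertices = n_faces = 0
    with open(path) as fh:
        for line in fh:
            if line.startswith("v "):
                n_vertices += 1
            elif line.startswith("f "):
                n_faces += 1
    return n_vertices, n_faces
```

For anything beyond a quick sanity check, a library such as `trimesh` is more convenient.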

To run inference on multi-view data from other sources, simply adjust the camera parameters [here](https://github.sc-corp.net/Snapchat/GTR/blob/main/scripts/prepare_mv.py#L153-L157) to match your multi-view data.
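For orientation, Zero123++-style grids contain six fixed views; the values below are our assumption about that pose layout (evenly spaced azimuths with alternating elevations, as in Zero123++ v1.2) and should be verified against `scripts/prepare_mv.py` rather than taken as ground truth:

```python
# Hypothetical multi-view pose layout (Zero123++-style; verify against the repo
# before using these numbers with your own data).
azimuths = [30.0, 90.0, 150.0, 210.0, 270.0, 330.0]   # degrees, 60 deg apart
elevations = [20.0, -10.0, 20.0, -10.0, 20.0, -10.0]  # degrees, alternating

poses = list(zip(azimuths, elevations))  # one (azimuth, elevation) per view
```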

```bash
# Preprocessing
python3 scripts/prepare_mv.py --in_dir ./examples/cute_horse.png --out_dir ./examples/cute_horse

# Inference
python3 scripts/inference.py --ckpt_path ckpts/full_checkpoint.pth --in_dir ./examples/cute_horse --out_dir ./outputs/cute_horse
```
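To process several inputs, the two commands above can be driven in a loop; a sketch (our suggestion, assuming the example layout shown above):

```python
# Hypothetical batch driver (our sketch, not part of the GTR repo).
# Assumes the repo checkout with scripts/ and ckpts/full_checkpoint.pth.
import subprocess
from pathlib import Path

CKPT = "ckpts/full_checkpoint.pth"

for img in sorted(Path("examples").glob("*.png")):
    mv_dir = Path("examples") / img.stem   # multi-view grid from preprocessing
    out_dir = Path("outputs") / img.stem   # mesh + GIF output
    subprocess.run(["python3", "scripts/prepare_mv.py",
                    "--in_dir", str(img), "--out_dir", str(mv_dir)], check=True)
    subprocess.run(["python3", "scripts/inference.py", "--ckpt_path", CKPT,
                    "--in_dir", str(mv_dir), "--out_dir", str(out_dir)], check=True)
```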

## Citation

**BibTeX:**

```bibtex
@article{zhuang2024gtr,
  title={GTR: Improving Large 3D Reconstruction Models through Geometry and Texture Refinement},
  author={Zhuang, Peiye and Han, Songfang and Wang, Chaoyang and Siarohin, Aliaksandr and Zou, Jiaxu and Vasilkovsky, Michael and Shakhrai, Vladislav and Korolev, Sergey and Tulyakov, Sergey and Lee, Hsin-Ying},
  journal={arXiv preprint arXiv:2406.05649},
  year={2024}
}
```