microsoft/TRELLIS-image-large

@@ -7,10 +7,109 @@ language:
 ---
 # TRELLIS Image Large
-<!-- Provide a quick summary of what the model is/does. -->
-The image conditioned version of TRELLIS, a large 3D genetive model. It was introduced in the paper [Structured 3D Latents for Scalable and Versatile 3D Generation](https://huggingface.co/papers/2412.01506).
-Project page: https://trellis3d.github.io/
-Code: https://github.com/Microsoft/TRELLIS

+**TRELLIS Image Large** generates 3D assets from images. It takes an image (`.jpg`, `.png`) as input and produces textured meshes (`.glb`) and 3D Gaussian splats (`.ply`) as output.
+## 🗒️ Model Details
+- Name: TRELLIS-image-large
+- Type: [Image-to-3D](https://huggingface.co/tasks/image-to-3d)
+- Size: 1.2 billion parameters
+- Code: https://github.com/microsoft/TRELLIS
+- Paper: https://arxiv.org/abs/2412.01506
+- Training Data: [TRELLIS-500K](https://github.com/microsoft/TRELLIS#-dataset)
+## 💡 Usage
+### Minimal Example
+Here is an example of how to use the pretrained models for 3D asset generation.
+```python
+import os
+# os.environ['ATTN_BACKEND'] = 'xformers'   # Can be 'flash-attn' or 'xformers', default is 'flash-attn'
+os.environ['SPCONV_ALGO'] = 'native'        # Can be 'native' or 'auto', default is 'auto'.
+                                            # 'auto' is faster but will do benchmarking at the beginning.
+                                            # Recommended to set to 'native' if run only once.
+import imageio
+from PIL import Image
+from trellis.pipelines import TrellisImageTo3DPipeline
+from trellis.utils import render_utils, postprocessing_utils
+# Load a pipeline from a local model folder or from the Hugging Face Hub.
+pipeline = TrellisImageTo3DPipeline.from_pretrained("microsoft/TRELLIS-image-large")
+pipeline.cuda()
+# Load an image
+image = Image.open("assets/example_image/T.png")
+# Run the pipeline
+outputs = pipeline.run(
+    image,
+    seed=1,
+    # Optional parameters
+    # sparse_structure_sampler_params={
+    #     "steps": 12,
+    #     "cfg_strength": 7.5,
+    # },
+    # slat_sampler_params={
+    #     "steps": 12,
+    #     "cfg_strength": 3,
+    # },
+)
+# outputs is a dictionary containing generated 3D assets in different formats:
+# - outputs['gaussian']: a list of 3D Gaussians
+# - outputs['radiance_field']: a list of radiance fields
+# - outputs['mesh']: a list of meshes
+# Render the outputs
+video = render_utils.render_video(outputs['gaussian'][0])['color']
+imageio.mimsave("sample_gs.mp4", video, fps=30)
+video = render_utils.render_video(outputs['radiance_field'][0])['color']
+imageio.mimsave("sample_rf.mp4", video, fps=30)
+video = render_utils.render_video(outputs['mesh'][0])['normal']
+imageio.mimsave("sample_mesh.mp4", video, fps=30)
+# GLB files can be extracted from the outputs
+glb = postprocessing_utils.to_glb(
+    outputs['gaussian'][0],
+    outputs['mesh'][0],
+    # Optional parameters
+    simplify=0.95,          # Ratio of triangles to remove in the simplification process
+    texture_size=1024,      # Size of the texture used for the GLB
+)
+glb.export("sample.glb")
+# Save Gaussians as PLY files
+outputs['gaussian'][0].save_ply("sample.ply")
+```
+After running the code, you will get the following files:
+- sample_gs.mp4: a video showing the 3D Gaussian representation
+- sample_rf.mp4: a video showing the Radiance Field representation
+- sample_mesh.mp4: a video showing the mesh representation
+- sample.glb: a GLB file containing the extracted textured mesh
+- sample.ply: a PLY file containing the 3D Gaussian representation
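To sanity-check the exported files without opening a 3D viewer, their headers can be read directly. The sketch below is not part of the TRELLIS codebase: `read_ply_vertex_count` and `read_glb_header` are hypothetical helper names. It assumes that `save_ply` writes a standard PLY header with one `vertex` element per Gaussian, and that `sample.glb` is a binary glTF container with the usual 12-byte header (magic, version, total length).

```python
import struct
from pathlib import Path


def read_ply_vertex_count(path):
    """Return the `element vertex` count declared in a PLY header.

    Assumption: each Gaussian is stored as one vertex element.
    """
    with open(path, "rb") as f:
        if f.readline().strip() != b"ply":
            raise ValueError(f"{path} does not start with the PLY magic line")
        for raw in f:
            line = raw.strip()
            if line.startswith(b"element vertex"):
                return int(line.split()[-1])
            if line == b"end_header":
                break  # stop before the binary payload
    raise ValueError(f"{path} has no 'element vertex' line in its header")


def read_glb_header(path):
    """Return (version, total_length_in_bytes) from a GLB header."""
    with open(path, "rb") as f:
        header = f.read(12)
    if len(header) < 12:
        raise ValueError(f"{path} is too short to be a GLB file")
    # Little-endian: 4-byte magic, uint32 version, uint32 total length.
    magic, version, length = struct.unpack("<4sII", header)
    if magic != b"glTF":
        raise ValueError(f"{path} does not start with the glTF magic bytes")
    return version, length


if __name__ == "__main__":
    # File names follow the minimal example above.
    ply_path, glb_path = Path("sample.ply"), Path("sample.glb")
    if ply_path.exists():
        print("Gaussians in PLY:", read_ply_vertex_count(ply_path))
    if glb_path.exists():
        version, length = read_glb_header(glb_path)
        print(f"GLB version {version}, declared size {length} bytes")
```

Comparing the declared GLB length with the file size on disk is a cheap way to catch a truncated export.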
+## ⚖️ License
+TRELLIS models and the majority of the code are licensed under the [MIT License](LICENSE). The following submodules may have different licenses:
+- [**diffoctreerast**](https://github.com/JeffreyXiang/diffoctreerast): We developed a CUDA-based real-time differentiable octree renderer for rendering radiance fields as part of this project. This renderer is derived from the [diff-gaussian-rasterization](https://github.com/graphdeco-inria/diff-gaussian-rasterization) project and is available under its own [LICENSE](https://github.com/JeffreyXiang/diffoctreerast/blob/master/LICENSE).
+- [**Modified Flexicubes**](https://github.com/MaxtirError/FlexiCubes): This project uses a modified version of [Flexicubes](https://github.com/nv-tlabs/FlexiCubes) that supports vertex attributes. This modified version is covered by the FlexiCubes [LICENSE](https://github.com/nv-tlabs/FlexiCubes/blob/main/LICENSE.txt).
+## 📜 Citation
+If you find this work helpful, please consider citing our paper:
+```bibtex
+@article{xiang2024structured,
+    title   = {Structured 3D Latents for Scalable and Versatile 3D Generation},
+    author  = {Xiang, Jianfeng and Lv, Zelong and Xu, Sicheng and Deng, Yu and Wang, Ruicheng and Zhang, Bowen and Chen, Dong and Tong, Xin and Yang, Jiaolong},
+    journal = {arXiv preprint arXiv:2412.01506},
+    year    = {2024}
+}
+```