---
library_name: trellis
pipeline_tag: image-to-3d
license: mit
language:
- en
---
# TRELLIS Image Large

**TRELLIS Image Large** generates 3D objects from images. The inputs are images (`.jpg`, `.png`) and the outputs are meshes (`.glb`) and splats (`.ply`).

## Model Details

- Name: TRELLIS-image-large
- Type: [Image-to-3D](https://huggingface.co/tasks/image-to-3d)
- Size: 1.2 billion parameters
- Code: https://github.com/microsoft/TRELLIS
- Paper: https://arxiv.org/abs/2412.01506
- Training Data: [TRELLIS-500K](https://github.com/microsoft/TRELLIS#-dataset)

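As a rough aid for planning hardware, the parameter count above can be turned into a weight-memory estimate. This is only a back-of-the-envelope sketch: it assumes all weights are stored at one precision and ignores activations, intermediate latents, and renderer buffers, so actual GPU usage will be higher. The helper name `weight_memory_gib` is made up for this illustration and is not part of the TRELLIS code.

```python
# Rough weight-memory estimate from the parameter count listed above.
# Assumption: every parameter is stored at a single precision; activations,
# intermediate latents, and renderer buffers are not counted.
PARAM_COUNT = 1.2e9  # "1.2 billion parameters" from the model details

BYTES_PER_PARAM = {"float32": 4, "float16": 2}


def weight_memory_gib(param_count: float, dtype: str) -> float:
    """Return the approximate size of the weights in GiB for the given dtype."""
    return param_count * BYTES_PER_PARAM[dtype] / 2**30


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: ~{weight_memory_gib(PARAM_COUNT, dtype):.1f} GiB")
```
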
## Usage

### Minimal Example

The following example uses the pretrained model to generate a 3D asset from a single image.

```python
import os
# os.environ['ATTN_BACKEND'] = 'xformers'   # Can be 'flash-attn' or 'xformers', default is 'flash-attn'
os.environ['SPCONV_ALGO'] = 'native'        # Can be 'native' or 'auto', default is 'auto'.
                                            # 'auto' is faster but will do benchmarking at the beginning.
                                            # Recommended to set to 'native' if run only once.

import imageio
from PIL import Image
from trellis.pipelines import TrellisImageTo3DPipeline
from trellis.utils import render_utils, postprocessing_utils

# Load a pipeline from a model folder or a Hugging Face model hub.
pipeline = TrellisImageTo3DPipeline.from_pretrained("JeffreyXiang/TRELLIS-image-large")
pipeline.cuda()

# Load an image
image = Image.open("assets/example_image/T.png")

# Run the pipeline
outputs = pipeline.run(
    image,
    seed=1,
    # Optional parameters
    # sparse_structure_sampler_params={
    #     "steps": 12,
    #     "cfg_strength": 7.5,
    # },
    # slat_sampler_params={
    #     "steps": 12,
    #     "cfg_strength": 3,
    # },
)
# outputs is a dictionary containing generated 3D assets in different formats:
# - outputs['gaussian']: a list of 3D Gaussians
# - outputs['radiance_field']: a list of radiance fields
# - outputs['mesh']: a list of meshes

# Render the outputs
video = render_utils.render_video(outputs['gaussian'][0])['color']
imageio.mimsave("sample_gs.mp4", video, fps=30)
video = render_utils.render_video(outputs['radiance_field'][0])['color']
imageio.mimsave("sample_rf.mp4", video, fps=30)
video = render_utils.render_video(outputs['mesh'][0])['normal']
imageio.mimsave("sample_mesh.mp4", video, fps=30)

# GLB files can be extracted from the outputs
glb = postprocessing_utils.to_glb(
    outputs['gaussian'][0],
    outputs['mesh'][0],
    # Optional parameters
    simplify=0.95,          # Ratio of triangles to remove in the simplification process
    texture_size=1024,      # Size of the texture used for the GLB
)
glb.export("sample.glb")

# Save Gaussians as PLY files
outputs['gaussian'][0].save_ply("sample.ply")
```
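
The optional sampler settings that are commented out in the example above can be collected in plain dictionaries and passed as keyword arguments. The sketch below shows one way to do that. The helper `sampler_params` and the name `run_kwargs` are hypothetical and not part of the TRELLIS API; the keys and values are copied from the commented-out lines, and whether those values match the pipeline defaults is not stated here.

```python
from typing import Any, Dict


def sampler_params(steps: int, cfg_strength: float) -> Dict[str, Any]:
    """Build one sampler-settings dict with the keys used in the minimal example."""
    if steps < 1:
        raise ValueError("steps must be a positive integer")
    return {"steps": steps, "cfg_strength": cfg_strength}


# Values copied from the commented-out lines of the minimal example.
run_kwargs = {
    "seed": 1,
    "sparse_structure_sampler_params": sampler_params(steps=12, cfg_strength=7.5),
    "slat_sampler_params": sampler_params(steps=12, cfg_strength=3),
}

# With a loaded pipeline and image, the call would then read:
# outputs = pipeline.run(image, **run_kwargs)
```

Keeping the settings in one dictionary makes it easier to log or vary them between runs without editing the call itself.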

After running the minimal example, you will get the following files:

- `sample_gs.mp4`: a video showing the 3D Gaussian representation
- `sample_rf.mp4`: a video showing the radiance field representation
- `sample_mesh.mp4`: a video showing the mesh representation
- `sample.glb`: a GLB file containing the extracted textured mesh
- `sample.ply`: a PLY file containing the 3D Gaussian representation

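To confirm that a run produced everything listed above, a small standard-library check along the following lines may help. It is a sketch rather than part of TRELLIS: `check_outputs` and `EXPECTED_OUTPUTS` are names invented here, and the check only looks at file presence, size, and the leading bytes that the GLB and PLY formats define. It does not validate the contents any further.

```python
from pathlib import Path
from typing import Dict

# File names produced by the minimal example.
EXPECTED_OUTPUTS = [
    "sample_gs.mp4",
    "sample_rf.mp4",
    "sample_mesh.mp4",
    "sample.glb",
    "sample.ply",
]

# Leading bytes defined by the file formats themselves:
# binary glTF starts with the ASCII magic "glTF", PLY with the line "ply".
MAGIC_BYTES = {".glb": b"glTF", ".ply": b"ply"}


def check_outputs(directory: str = ".") -> Dict[str, str]:
    """Return a status ("ok", "missing", "empty", "bad header") per expected file."""
    report = {}
    for name in EXPECTED_OUTPUTS:
        path = Path(directory) / name
        if not path.is_file():
            report[name] = "missing"
            continue
        if path.stat().st_size == 0:
            report[name] = "empty"
            continue
        magic = MAGIC_BYTES.get(path.suffix)
        if magic is not None:
            with path.open("rb") as handle:
                if handle.read(len(magic)) != magic:
                    report[name] = "bad header"
                    continue
        report[name] = "ok"
    return report


if __name__ == "__main__":
    for name, status in check_outputs().items():
        print(f"{name}: {status}")
```
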
## License

TRELLIS models and the majority of the code are licensed under the [MIT License](LICENSE). The following submodules may have different licenses:

- [**diffoctreerast**](https://github.com/JeffreyXiang/diffoctreerast): We developed a CUDA-based real-time differentiable octree renderer for rendering radiance fields as part of this project. This renderer is derived from the [diff-gaussian-rasterization](https://github.com/graphdeco-inria/diff-gaussian-rasterization) project and is available under its own [LICENSE](https://github.com/JeffreyXiang/diffoctreerast/blob/master/LICENSE).
- [**Modified Flexicubes**](https://github.com/MaxtirError/FlexiCubes): This project uses a modified version of [Flexicubes](https://github.com/nv-tlabs/FlexiCubes) that supports vertex attributes. The modified version is licensed under the Flexicubes [LICENSE](https://github.com/nv-tlabs/FlexiCubes/blob/main/LICENSE.txt).

## Citation

If you find this work helpful, please consider citing our paper:

```bibtex
@article{xiang2024structured,
    title   = {Structured 3D Latents for Scalable and Versatile 3D Generation},
    author  = {Xiang, Jianfeng and Lv, Zelong and Xu, Sicheng and Deng, Yu and Wang, Ruicheng and Zhang, Bowen and Chen, Dong and Tong, Xin and Yang, Jiaolong},
    journal = {arXiv preprint arXiv:2412.01506},
    year    = {2024}
}
```