| --- |
| license: mit |
| tags: |
| - image-to-3d |
| - interior-design |
| - 3d-generation |
| - scene-reconstruction |
| - gaussian-splatting |
| - pbr-materials |
| - ml-intern |
| language: |
| - en |
| pipeline_tag: image-to-3d |
| library_name: interiorgen3d |
| --- |
| |
| # InteriorGen3D |
|
|
| ## Single 2D Interior Image → High-Quality Editable 3D Interior Scene |
|
|
| **InteriorGen3D** is a production-grade AI system that converts a single interior photograph into a fully editable, photorealistic 3D scene. Unlike existing image-to-3D models (TRELLIS, Hunyuan3D-2, TripoSR), which are object-centric, InteriorGen3D specializes in **room-scale interior reconstruction** with semantic decomposition. |
|
|
| ## Key Features |
|
|
| - **Interior-Specialized**: Trained on 3D-FRONT + Structured3D room data |
| - **Semantic Decomposition**: Each furniture piece is a separate, editable 3D object |
| - **Physics-Consistent**: Manhattan-world geometry, gravity-aware placement |
| - **PBR Materials**: Albedo + metallic + roughness maps |
| - **Multi-Format Export**: GLB, FBX, OBJ, USDZ |
| - **Editable**: Move, rotate, delete, replace individual objects |
| - **Relightable**: Change environment lighting without re-generation |
| - **Fast**: <30s on A100, <60s on RTX 4090 |
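To make the "Physics-Consistent" bullet concrete, here is a minimal sketch of two constraints it names: snapping a wall orientation to the Manhattan-world axes and resting an object on the floor. The `PlacedObject` record, both function names, and the choice of y as the up axis are assumptions made for illustration; they are not taken from the InteriorGen3D codebase, which may solve placement quite differently (for example, through a joint optimization).

```python
from dataclasses import dataclass


@dataclass
class PlacedObject:
    """Hypothetical per-object record; y is assumed to be the up axis."""
    name: str
    min_y: float  # lowest point of the object's bounding box
    max_y: float  # highest point of the object's bounding box


def snap_yaw_to_manhattan(yaw_deg: float) -> float:
    """Snap a wall's yaw angle to the nearest multiple of 90 degrees, in [0, 360)."""
    return (round(yaw_deg / 90.0) * 90.0) % 360.0


def drop_to_floor(obj: PlacedObject, floor_y: float = 0.0) -> PlacedObject:
    """Translate the object vertically so its lowest point rests on the floor."""
    offset = floor_y - obj.min_y
    return PlacedObject(obj.name, obj.min_y + offset, obj.max_y + offset)


# A sofa estimated to float 25 cm above the floor is moved down onto it.
sofa = PlacedObject("sofa", min_y=0.25, max_y=1.0)
grounded = drop_to_floor(sofa)
print(grounded.min_y, grounded.max_y)  # 0.0 0.75

# A wall estimated at 93 degrees is aligned with the nearest room axis.
print(snap_yaw_to_manhattan(93.0))  # 90.0
```

Both helpers are pure functions, so they can be applied per object or per wall without touching the rest of the scene.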
|
|
| ## Architecture |
|
|
| 5-Stage Pipeline: |
| 1. **Scene Understanding**: Depth Anything V2 + SAM2 + SpatialLM |
| 2. **Room Structure**: Manhattan-world constrained wall/floor/ceiling meshes |
| 3. **Object Generation**: Multi-view diffusion → TRELLIS SLAT → PBR textures |
| 4. **Scene Composition**: Physics optimization + Gaussian splat preview |
| 5. **Export**: GLB/FBX/OBJ/USDZ with scene hierarchy |
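The five stages above run in a fixed order, each one building on the results of the previous ones. The sketch below shows one way such a staged pipeline could be wired together. Every name in it (`STAGE_ORDER`, `SceneState`, `run_pipeline`, and the stub stages) is hypothetical and exists only to illustrate the control flow; the real stages wrap the models listed above and are not reproduced here.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical stage identifiers mirroring the five stages listed above.
STAGE_ORDER = [
    "scene_understanding",
    "room_structure",
    "object_generation",
    "scene_composition",
    "export",
]


@dataclass
class SceneState:
    """Accumulates intermediate results as the stages run (illustrative only)."""
    image_path: str
    artifacts: Dict[str, object] = field(default_factory=dict)
    completed: List[str] = field(default_factory=list)


Stage = Callable[[SceneState], object]


def run_pipeline(image_path: str, stages: Dict[str, Stage]) -> SceneState:
    """Run every stage in order, storing each stage's output under its name."""
    state = SceneState(image_path=image_path)
    for name in STAGE_ORDER:
        if name not in stages:
            raise KeyError(f"missing stage: {name}")
        state.artifacts[name] = stages[name](state)
        state.completed.append(name)
    return state


def make_stub(label: str) -> Stage:
    """Build a placeholder stage that only reports how many stages ran before it."""
    def stage(state: SceneState) -> str:
        return f"{label} output ({len(state.completed)} prior stages)"
    return stage


stages = {name: make_stub(name) for name in STAGE_ORDER}
state = run_pipeline("living_room.jpg", stages)
print(state.completed)
```

Passing a shared state object keeps later stages free to read earlier results, such as scene composition reading the room structure, without hard-coding that dependency into the runner.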
|
|
| ## Model Comparison |
|
|
| | System | Geo Quality | Texture | Speed | Scene Understanding | Editability | |
| |--------|-------------|---------|-------|--------------------:|-------------| |
| | TRELLIS.2 | 9.5/10 | 9/10 | 3-60s | ❌ | ⭐⭐⭐ | |
| | Hunyuan3D-2.1 | 8/10 | 9.5/10 | 30-60s | ❌ | ⭐⭐ | |
| | SF3D | 7.5/10 | 7/10 | 0.5s | ❌ | ⭐⭐ | |
| | **InteriorGen3D** | 8/10 | 8/10 | 30s | ✅ | ⭐⭐⭐⭐⭐ | |
|
|
| ## Usage |
|
|
| ```python |
| from interiorgen3d.pipeline.main_pipeline import InteriorGen3DPipeline |
| from interiorgen3d.config.pipeline_config import PipelineConfig |
| |
| config = PipelineConfig.for_rtx4090() |
| pipeline = InteriorGen3DPipeline(config) |
| pipeline.load_models() |
| |
| result = pipeline.generate("living_room.jpg", output_dir="./output") |
| ``` |
|
|
| ## Training Data |
|
|
| - **3D-FRONT**: 18,968 rooms with furniture arrangements |
| - **Structured3D**: 21,835 rooms with panoramic renders |
| - **SpatialLM-Dataset**: 54,778 rooms with structured annotations |
| - **Objaverse** (filtered): ~50K interior furniture objects |
| - **Hypersim**: 461 photorealistic interior scenes |
|
|
| ## Hardware Requirements |
|
|
| | Platform | Performance | VRAM | |
| |----------|-------------|------| |
| | H100 | <10s | 80GB | |
| | A100 | <30s | 80GB | |
| | RTX 4090 | <45s | 24GB | |
| | RTX 3090 | <60s | 24GB | |
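As a convenience, the table above can be encoded as a small lookup that maps a GPU name string (for example, the value returned by `torch.cuda.get_device_name()`) to the listed latency bound and VRAM. This is a sketch only: `HardwareProfile`, `PROFILES`, and `match_profile` are hypothetical names, the figures are copied from the table rather than measured, and matching by substring is an assumption about how device names are reported.

```python
from typing import NamedTuple, Optional


class HardwareProfile(NamedTuple):
    platform: str
    max_seconds: int  # upper bound on generation time from the table
    vram_gb: int      # VRAM listed for the platform


# Rows copied from the hardware table above.
PROFILES = [
    HardwareProfile("H100", 10, 80),
    HardwareProfile("A100", 30, 80),
    HardwareProfile("RTX 4090", 45, 24),
    HardwareProfile("RTX 3090", 60, 24),
]


def match_profile(device_name: str) -> Optional[HardwareProfile]:
    """Return the first table row whose platform name occurs in the device name."""
    normalized = device_name.upper()
    for profile in PROFILES:
        if profile.platform in normalized:
            return profile
    return None  # GPU not covered by the table


print(match_profile("NVIDIA GeForce RTX 4090"))
print(match_profile("Tesla T4"))
```

Returning `None` for unlisted GPUs leaves the caller to decide whether to fall back to a conservative preset or stop early.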
|
|
| ## Research Foundation |
|
|
| Built on: TRELLIS/TRELLIS.2 (Microsoft, arXiv:2412.01506, 2512.14692) • Hunyuan3D-2.1 (Tencent, arXiv:2506.15442) • SpatialLM (arXiv:2506.07491) • Depth Anything V2 (arXiv:2406.09414) • SF3D (arXiv:2408.00653) • 3DGS (arXiv:2308.04079) |
|
|
| ## License |
|
|
| MIT: free for commercial and non-commercial use. |
|
|
| <!-- ml-intern-provenance --> |
| ## Generated by ML Intern |
|
|
| This model repository was generated by [ML Intern](https://github.com/huggingface/ml-intern), an agent for machine learning research and development on the Hugging Face Hub. |
|
|
| - Try ML Intern: https://smolagents-ml-intern.hf.space |
| - Source code: https://github.com/huggingface/ml-intern |
|
|