# InteriorFusion Inference Optimization Guide
## Target Platforms
### RTX 4090 (24GB VRAM): Consumer Desktop
```bash
# FP16 inference with PBR materials disabled (preview mode)
python -m interiorfusion.infer \
--image room.jpg \
--output ./output/ \
--model-size L \
--device cuda \
--dtype float16 \
--no-pbr # Disable PBR for faster generation
# Expected: ~12s for full scene with GLB+PLY output
```
**Optimizations**:
- FP16 inference throughout pipeline
- Skip material generation for preview mode
- Use `torch.compile()` on DiT forward pass
- Flash Attention 2 for transformer attention
- Batch multi-view generation (6 views simultaneously)
### A100 (80GB VRAM): Cloud / Datacenter
```bash
# Full quality generation
python -m interiorfusion.infer \
--image room.jpg \
--output ./output/ \
--model-size XL \
--device cuda \
--dtype bfloat16
# Expected: ~8s for full scene with all formats
```
**Optimizations**:
- BF16 precision (better numerical stability than FP16)
- Batch size 4 for parallel room generation
- CUDA Graphs for repeated operations
- Persistent CUDA cache
### H100 (80GB VRAM): Latest Datacenter
```bash
# Maximum quality (BF16 requested; FP8 via Transformer Engine is listed under Optimizations)
python -m interiorfusion.infer \
--image room.jpg \
--output ./output/ \
--model-size XL \
--device cuda \
--dtype bfloat16
# Expected: ~5s full pipeline
```
**Optimizations**:
- FP8 via Transformer Engine
- Hardware-accelerated attention
- NVLink for multi-GPU distribution
### Apple Silicon (MLX)
```bash
# MLX-optimized inference
python -m interiorfusion.infer \
--image room.jpg \
--output ./output/ \
--model-size S \
--device mps \
--dtype float32
# Expected: ~30s on M3 Max (36GB unified memory)
```
**Optimizations**:
- MLX graph compilation
- Unified memory avoids CPU-GPU copies
- Model quantization to 4-bit via GPTQ
### Edge / Mobile
```bash
# Core pipeline only (depth + layout)
python -m interiorfusion.infer \
--image room.jpg \
--output ./output/ \
--model-size S \
--device cpu \
--no-pbr --no-gaussian \
--formats glb
# Expected: ~5s depth+layout, scene sent to cloud for 3D generation
```
**Optimizations**:
- Core inference on-device (depth + segmentation)
- Cloud offloading for 3D generation
- Streaming mesh chunks
- Aggressive quantization (INT4)
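
The "streaming mesh chunks" idea can be illustrated with a generic chunked reader. This is a minimal sketch only: the guide does not specify the transport, the chunk format, or the chunk size, so `iter_chunks` and the 64 KiB default are assumptions made for illustration, not part of InteriorFusion.

```python
import io
from typing import BinaryIO, Iterator


def iter_chunks(stream: BinaryIO, chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield successive chunks of at most chunk_size bytes until EOF.

    Hypothetical helper: the real streaming protocol is not described
    in this guide.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # empty bytes object signals end of stream
            break
        yield chunk


# Usage: an in-memory stand-in for a mesh file received from the cloud.
payload = io.BytesIO(b"x" * 150_000)
sizes = [len(chunk) for chunk in iter_chunks(payload)]
```

Reading in fixed-size chunks keeps peak memory bounded by `chunk_size` rather than by the full mesh size, which matters on memory-constrained edge devices.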
## Quantization Strategies
| Method | Model Size | Speedup | Quality Impact | VRAM Reduction |
|--------|-----------|---------|---------------|---------------|
| FP32 (baseline) | 100% | 1× | n/a | 100% |
| FP16 | 50% | 1.8× | Minimal | 50% |
| BF16 | 50% | 1.8× | Minimal | 50% |
| INT8 (SmoothQuant) | 25% | 2.5× | Low | 25% |
| FP8 (TE) | 25% | 3× | Low | 25% |
| GPTQ-4bit | 12.5% | 3.5× | Medium | 12.5% |
| AWQ-4bit | 12.5% | 3.2× | Low | 12.5% |
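
The "Model Size" column follows directly from bits per weight relative to FP32. The sketch below reproduces that arithmetic and turns it into a rough weight-memory estimate. It counts weights only and ignores activations, KV/attention buffers, and quantization metadata such as scales and zero-points, so real VRAM use will be higher. The 2-billion-parameter figure in the usage lines is a made-up example, not a published InteriorFusion model size.

```python
# Bits stored per weight for each method in the table above.
BITS_PER_WEIGHT = {
    "FP32": 32,
    "FP16": 16,
    "BF16": 16,
    "INT8": 8,
    "FP8": 8,
    "GPTQ-4bit": 4,
    "AWQ-4bit": 4,
}


def relative_size(method: str) -> float:
    """Weight storage as a fraction of the FP32 baseline."""
    return BITS_PER_WEIGHT[method] / BITS_PER_WEIGHT["FP32"]


def weight_memory_gb(num_params: int, method: str) -> float:
    """Approximate weight memory in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BITS_PER_WEIGHT[method] / 8 / 1e9


# Usage with a hypothetical 2B-parameter model.
fp16_gb = weight_memory_gb(2_000_000_000, "FP16")
int4_fraction = relative_size("GPTQ-4bit")
```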
## Export Formats
| Format | Size | Viewer | Game Engine | AR/VR | Notes |
|--------|------|--------|------------|-------|-------|
| **GLB** | ~5-50MB | ✅ (Web) | ✅ (UE/Unity) | ✅ (WebXR) | Recommended default |
| **FBX** | ~10-100MB | ⚠️ (Limited) | ✅ (UE/Unity/Maya) | ⚠️ | For animation/rigging |
| **OBJ** | ~5-30MB | ✅ (Universal) | ✅ (All) | ⚠️ | Legacy, no PBR |
| **USDZ** | ~5-50MB | ✅ (iOS AR) | ⚠️ (UE via plugin) | ✅ (ARKit) | Apple's format |
| **PLY (3DGS)** | ~10-500MB | ✅ (Gaussian viewers) | ⚠️ (UE5 plugin) | ⚠️ | For splatting render |
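
Since GLB is the recommended default, a quick header check can catch truncated or mislabeled exports before they reach a viewer or engine. The sketch below relies only on the glTF 2.0 binary container layout: a 12-byte little-endian header holding the magic `glTF`, the container version, and the total file length. The function name `read_glb_header` is hypothetical and not part of InteriorFusion.

```python
import struct
from typing import Tuple

GLB_MAGIC = 0x46546C67  # the ASCII bytes "glTF" read as a little-endian uint32


def read_glb_header(data: bytes) -> Tuple[int, int]:
    """Return (version, declared_length) from the first 12 bytes of a GLB file."""
    if len(data) < 12:
        raise ValueError("too short to contain a GLB header")
    magic, version, length = struct.unpack_from("<III", data, 0)
    if magic != GLB_MAGIC:
        raise ValueError("missing 'glTF' magic; not a GLB file")
    return version, length


# Usage: a synthetic header-only buffer stands in for a real export.
header = b"glTF" + struct.pack("<II", 2, 12)
version, declared_length = read_glb_header(header)
```

For a real file, comparing `declared_length` with the size on disk is a cheap way to detect an incomplete download or an interrupted export.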
## ComfyUI Integration
Install the custom nodes:
```bash
cd ComfyUI/custom_nodes
git clone https://github.com/stevee00/ComfyUI-InteriorFusion
```
Available nodes:
- `InteriorFusion: Generate Scene`: Full pipeline
- `InteriorFusion: Generate Object`: Single furniture
- `InteriorFusion: Apply Material`: PBR material
- `InteriorFusion: Export Mesh`: Format conversion
## Blender Integration
Install the addon:
```bash
# In Blender: Edit > Preferences > Add-ons > Install
# Select blender_plugin/interiorfusion_blender.py
```
Features:
- Generate 3D scene from reference image
- Import with PBR materials
- Interactive object editing
- Export to game engines
## Unreal Engine Integration
1. Export GLB from InteriorFusion
2. Import via glTF importer (UE5 built-in)
3. Materials auto-convert to Unreal PBR
4. Use Gaussian Splatting plugin for real-time preview
Plugins needed:
- `glTFRuntime` for runtime GLB loading
- `MLSLabsGaussianSplattingRenderer` for 3DGS
## Unity Integration
1. Export GLB or FBX from InteriorFusion
2. Import into Unity project
3. Materials map to Unity Standard/URP/HDRP
4. Use GaussianSplatting package for 3DGS
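
Before loading a splat file into an engine plugin, it can be useful to know how many Gaussians it contains, since that drives both file size and render cost. The sketch below reads the standard PLY text header and returns the `element vertex` count. It assumes the exported PLY follows the common 3DGS convention of one vertex element per Gaussian, which this guide does not confirm; `count_ply_vertices` is a hypothetical helper.

```python
import io
from typing import BinaryIO


def count_ply_vertices(stream: BinaryIO) -> int:
    """Parse a PLY header and return the declared number of vertex elements."""
    if stream.readline().strip() != b"ply":
        raise ValueError("not a PLY file")
    count = None
    for raw in iter(stream.readline, b""):  # stops at EOF
        line = raw.strip()
        if line.startswith(b"element vertex"):
            count = int(line.split()[2].decode("ascii"))
        elif line == b"end_header":
            break
    else:
        # Loop ran to EOF without seeing end_header.
        raise ValueError("PLY header is not terminated by end_header")
    if count is None:
        raise ValueError("header has no 'element vertex' line")
    return count


# Usage: a tiny synthetic file with three points (12 bytes each).
sample = io.BytesIO(
    b"ply\n"
    b"format binary_little_endian 1.0\n"
    b"element vertex 3\n"
    b"property float x\n"
    b"property float y\n"
    b"property float z\n"
    b"end_header\n" + b"\x00" * 36
)
num_gaussians = count_ply_vertices(sample)
```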
## Performance Targets
| Platform | Target Time | Target VRAM | Output Quality |
|----------|------------|-------------|---------------|
| RTX 4090 | < 15s | < 20GB | Production |
| A100 | < 8s | < 72GB | Maximum |
| H100 | < 5s | < 72GB | Maximum |
| M3 Max | < 30s | < 36GB | Production |
| RTX 3060 | < 60s | < 10GB | Preview |
| Edge (CPU) | < 10s (depth only) | < 4GB | Core only |
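
The targets above can be encoded as data so that benchmark runs are checked automatically. This is a minimal sketch: the thresholds are copied from the table, while the `Target` class, the `meets_target` function, and the measured values in the usage lines are hypothetical and only illustrate the comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    max_seconds: float   # upper bound on generation time
    max_vram_gb: float   # upper bound on peak memory


# Thresholds taken from the Performance Targets table.
TARGETS = {
    "RTX 4090": Target(15, 20),
    "A100": Target(8, 72),
    "H100": Target(5, 72),
    "M3 Max": Target(30, 36),
    "RTX 3060": Target(60, 10),
    "Edge (CPU)": Target(10, 4),
}


def meets_target(platform: str, seconds: float, peak_vram_gb: float) -> bool:
    """True if both measurements are strictly below the platform's targets."""
    target = TARGETS[platform]
    return seconds < target.max_seconds and peak_vram_gb < target.max_vram_gb


# Usage with made-up measurements.
ok_4090 = meets_target("RTX 4090", seconds=12.0, peak_vram_gb=18.5)
ok_h100 = meets_target("H100", seconds=5.0, peak_vram_gb=60.0)
```

The comparison is strict because the table states every target as "<", so a run that lands exactly on the threshold does not pass.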