# InteriorFusion Inference Optimization Guide

## Target Platforms

### RTX 4090 (24GB VRAM) — Consumer Desktop

```bash
# FP16 inference in preview mode
python -m interiorfusion.infer \
    --image room.jpg \
    --output ./output/ \
    --model-size L \
    --device cuda \
    --dtype float16 \
    --no-pbr  # Disable PBR for faster generation

# Expected: ~12s for full scene with GLB+PLY output
```

**Optimizations**:
- FP16 inference throughout pipeline
- Skip material generation for preview mode
- Use `torch.compile()` on DiT forward pass
- Flash Attention 2 for transformer attention
- Batch multi-view generation (6 views simultaneously)

### A100 (80GB VRAM) — Cloud / Datacenter

```bash
# Full quality generation
python -m interiorfusion.infer \
    --image room.jpg \
    --output ./output/ \
    --model-size XL \
    --device cuda \
    --dtype bfloat16

# Expected: ~8s for full scene with all formats
```

**Optimizations**:
- BF16 precision (better numerical stability than FP16)
- Batch size 4 for parallel room generation
- CUDA Graphs for repeated operations
- Persistent CUDA cache

### H100 (80GB VRAM) — Latest Datacenter

```bash
# Maximum quality with Transformer Engine
python -m interiorfusion.infer \
    --image room.jpg \
    --output ./output/ \
    --model-size XL \
    --device cuda \
    --dtype bfloat16

# Expected: ~5s full pipeline
```

**Optimizations**:
- FP8 via Transformer Engine
- Hardware-accelerated attention
- NVLink for multi-GPU distribution

### Apple Silicon (MLX)

```bash
# MLX-optimized inference
python -m interiorfusion.infer \
    --image room.jpg \
    --output ./output/ \
    --model-size S \
    --device mps \
    --dtype float32

# Expected: ~30s on M3 Max (36GB unified memory)
```

**Optimizations**:
- MLX graph compilation
- Unified memory avoids CPU-GPU copies
- Model quantization to 4-bit via GPTQ

### Edge / Mobile

```bash
# Core pipeline only (depth + layout)
python -m interiorfusion.infer \
    --image room.jpg \
    --output ./output/ \
    --model-size S \
    --device cpu \
    --no-pbr --no-gaussian \
    --formats glb

# Expected: ~5s depth+layout, scene sent to cloud for 3D generation
```

**Optimizations**:
- Core inference on-device (depth + segmentation)
- Cloud offloading for 3D generation
- Streaming mesh chunks
- Aggressive quantization (INT4)

## Quantization Strategies

| Method | Model Size | Speedup | Quality Impact | VRAM Reduction |
|--------|-----------|---------|---------------|---------------|
| FP32 (baseline) | 100% | 1× | — | 100% |
| FP16 | 50% | 1.8× | Minimal | 50% |
| BF16 | 50% | 1.8× | Minimal | 50% |
| INT8 (SmoothQuant) | 25% | 2.5× | Low | 25% |
| FP8 (TE) | 25% | 3× | Low | 25% |
| GPTQ-4bit | 12.5% | 3.5× | Medium | 12.5% |
| AWQ-4bit | 12.5% | 3.2× | Low | 12.5% |

## Export Formats

| Format | Size | Viewer | Game Engine | AR/VR | Notes |
|--------|------|--------|------------|-------|-------|
| **GLB** | ~5-50MB | ✅ (Web) | ✅ (UE/Unity) | ✅ (WebXR) | Recommended default |
| **FBX** | ~10-100MB | ⚠️ (Limited) | ✅ (UE/Unity/Maya) | ⚠️ | For animation/rigging |
| **OBJ** | ~5-30MB | ✅ (Universal) | ✅ (All) | ⚠️ | Legacy, no PBR |
| **USDZ** | ~5-50MB | ✅ (iOS AR) | ⚠️ (UE via plugin) | ✅ (ARKit) | Apple's format |
| **PLY (3DGS)** | ~10-500MB | ✅ (Gaussian viewers) | ⚠️ (UE5 plugin) | ⚠️ | For splatting render |

## ComfyUI Integration

Install the custom nodes:

```bash
cd ComfyUI/custom_nodes
git clone https://github.com/stevee00/ComfyUI-InteriorFusion
```

Available nodes:
- `InteriorFusion: Generate Scene` — Full pipeline
- `InteriorFusion: Generate Object` — Single furniture
- `InteriorFusion: Apply Material` — PBR material
- `InteriorFusion: Export Mesh` — Format conversion

## Blender Integration

Install the addon:

```bash
# In Blender: Edit > Preferences > Add-ons > Install
# Select blender_plugin/interiorfusion_blender.py
```

Features:
- Generate 3D scene from reference image
- Import with PBR materials
- Interactive object editing
- Export to game engines

## Unreal Engine Integration

1. Export GLB from InteriorFusion
2. Import via glTF importer (UE5 built-in)
3. Materials auto-convert to Unreal PBR
4. Use Gaussian Splatting plugin for real-time preview

Plugins needed:
- `glTFRuntime` for runtime GLB loading
- `MLSLabsGaussianSplattingRenderer` for 3DGS

## Unity Integration

1. Export GLB or FBX from InteriorFusion
2. Import into Unity project
3. Materials map to Unity Standard/URP/HDRP
4. Use GaussianSplatting package for 3DGS

## Performance Targets

| Platform | Target Time | Target VRAM | Output Quality |
|----------|------------|-------------|---------------|
| RTX 4090 | < 15s | < 20GB | Production |
| A100 | < 8s | < 72GB | Maximum |
| H100 | < 5s | < 72GB | Maximum |
| M3 Max | < 30s | < 36GB | Production |
| RTX 3060 | < 60s | < 10GB | Preview |
| Edge (CPU) | < 10s (depth only) | < 4GB | Core only |
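
The performance targets can be turned into an automated check for benchmark runs. The snippet below is a minimal sketch, not part of InteriorFusion: the names `Target`, `TARGETS`, and `check_run` are hypothetical, the limits are copied from the Performance Targets table, and the measured time and peak memory are assumed to come from your own benchmark harness.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    max_seconds: float  # generation time must stay strictly below this
    max_vram_gb: float  # peak memory must stay strictly below this


# Limits copied from the Performance Targets table (hypothetical structure).
TARGETS = {
    "RTX 4090": Target(15, 20),
    "A100": Target(8, 72),
    "H100": Target(5, 72),
    "M3 Max": Target(30, 36),
    "RTX 3060": Target(60, 10),
    "Edge (CPU)": Target(10, 4),  # time target covers depth only
}


def check_run(platform: str, seconds: float, peak_vram_gb: float) -> list[str]:
    """Return human-readable target violations; an empty list means the run passed."""
    target = TARGETS[platform]
    problems = []
    if seconds >= target.max_seconds:
        problems.append(
            f"time {seconds:.1f}s is not under {target.max_seconds:g}s"
        )
    if peak_vram_gb >= target.max_vram_gb:
        problems.append(
            f"memory {peak_vram_gb:.1f}GB is not under {target.max_vram_gb:g}GB"
        )
    return problems


if __name__ == "__main__":
    # Example measurements (made up for illustration).
    print(check_run("RTX 4090", seconds=12.0, peak_vram_gb=18.5))
    print(check_run("A100", seconds=8.4, peak_vram_gb=60.0))
```

On CUDA devices, one way to obtain the peak memory figure is `torch.cuda.max_memory_allocated()`, converted from bytes to gigabytes. Wall-clock time can be taken with `time.perf_counter()` around the inference call.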