# InteriorFusion Inference Optimization Guide

## Target Platforms

### RTX 4090 (24GB VRAM) - Consumer Desktop
```bash
# FP16 inference with PBR material generation disabled
python -m interiorfusion.infer \
    --image room.jpg \
    --output ./output/ \
    --model-size L \
    --device cuda \
    --dtype float16 \
    --no-pbr  # Disable PBR for faster generation

# Expected: ~12s for full scene with GLB+PLY output
```

**Optimizations**:
- FP16 inference throughout pipeline
- Skip material generation for preview mode
- Use `torch.compile()` on DiT forward pass
- Flash Attention 2 for transformer attention
- Batch multi-view generation (6 views simultaneously)

### A100 (80GB VRAM) - Cloud / Datacenter
```bash
# Full quality generation
python -m interiorfusion.infer \
    --image room.jpg \
    --output ./output/ \
    --model-size XL \
    --device cuda \
    --dtype bfloat16

# Expected: ~8s for full scene with all formats
```

**Optimizations**:
- BF16 precision (better numerical stability than FP16)
- Batch size 4 for parallel room generation
- CUDA Graphs for repeated operations
- Persistent CUDA cache
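
The batch size of 4 can be illustrated by grouping input images before they are handed to the model. This is a minimal sketch: the `batched` helper and the file names are hypothetical, and the source does not show how InteriorFusion itself accepts a batch, so only the grouping step is covered.

```python
from typing import Iterator, List, Sequence


def batched(items: Sequence[str], batch_size: int = 4) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# Hypothetical input list: ten room photos to process.
rooms = [f"room_{i}.jpg" for i in range(10)]
batches = list(batched(rooms))

for index, batch in enumerate(batches):
    print(index, batch)
```

The last batch is allowed to be smaller than 4, so no input is dropped or padded.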

### H100 (80GB VRAM) - Latest Datacenter
```bash
# Maximum quality (BF16 shown here; FP8 via Transformer Engine is listed under Optimizations)
python -m interiorfusion.infer \
    --image room.jpg \
    --output ./output/ \
    --model-size XL \
    --device cuda \
    --dtype bfloat16

# Expected: ~5s full pipeline
```

**Optimizations**:
- FP8 via Transformer Engine
- Hardware-accelerated attention
- NVLink for multi-GPU distribution

### Apple Silicon (MLX)
```bash
# MLX-optimized inference
python -m interiorfusion.infer \
    --image room.jpg \
    --output ./output/ \
    --model-size S \
    --device mps \
    --dtype float32

# Expected: ~30s on M3 Max (36GB unified memory)
```

**Optimizations**:
- MLX graph compilation
- Unified memory avoids CPU-GPU copies
- Model quantization to 4-bit via GPTQ

### Edge / Mobile
```bash
# Core pipeline only (depth + layout)
python -m interiorfusion.infer \
    --image room.jpg \
    --output ./output/ \
    --model-size S \
    --device cpu \
    --no-pbr --no-gaussian \
    --formats glb

# Expected: ~5s depth+layout, scene sent to cloud for 3D generation
```

**Optimizations**:
- Core inference on-device (depth + segmentation)
- Cloud offloading for 3D generation
- Streaming mesh chunks
- Aggressive quantization (INT4)
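
The per-platform commands above differ only in a handful of flags, so they can be generated from one table of presets. The sketch below is an illustration, not part of InteriorFusion: `build_infer_command` and `PRESETS` are hypothetical names, and only flags that appear in the commands above are used.

```python
import shlex
from typing import List, Optional


def build_infer_command(
    image: str,
    output: str,
    model_size: str,
    device: str,
    dtype: Optional[str] = None,
    no_pbr: bool = False,
    no_gaussian: bool = False,
    formats: Optional[str] = None,
) -> List[str]:
    """Assemble the argument list for `python -m interiorfusion.infer`."""
    cmd = [
        "python", "-m", "interiorfusion.infer",
        "--image", image,
        "--output", output,
        "--model-size", model_size,
        "--device", device,
    ]
    if dtype is not None:
        cmd += ["--dtype", dtype]
    if no_pbr:
        cmd.append("--no-pbr")
    if no_gaussian:
        cmd.append("--no-gaussian")
    if formats is not None:
        # Passed through verbatim, as in `--formats glb` above.
        cmd += ["--formats", formats]
    return cmd


# Flag combinations copied from the platform sections above.
PRESETS = {
    "rtx4090": dict(model_size="L", device="cuda", dtype="float16", no_pbr=True),
    "a100": dict(model_size="XL", device="cuda", dtype="bfloat16"),
    "h100": dict(model_size="XL", device="cuda", dtype="bfloat16"),
    "apple": dict(model_size="S", device="mps", dtype="float32"),
    "edge": dict(model_size="S", device="cpu", no_pbr=True,
                 no_gaussian=True, formats="glb"),
}

edge_cmd = build_infer_command("room.jpg", "./output/", **PRESETS["edge"])
print(shlex.join(edge_cmd))
```

Returning a list rather than a single string means the result can be passed directly to `subprocess.run(edge_cmd, check=True)` without shell quoting concerns.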

## Quantization Strategies

| Method | Model Size | Speedup | Quality Impact | VRAM Reduction |
|--------|-----------|---------|---------------|---------------|
| FP32 (baseline) | 100% | 1× | None | 100% |
| FP16 | 50% | 1.8× | Minimal | 50% |
| BF16 | 50% | 1.8× | Minimal | 50% |
| INT8 (SmoothQuant) | 25% | 2.5× | Low | 25% |
| FP8 (TE) | 25% | 3× | Low | 25% |
| GPTQ-4bit | 12.5% | 3.5× | Medium | 12.5% |
| AWQ-4bit | 12.5% | 3.2× | Low | 12.5% |
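
The Model Size column follows directly from bits per weight relative to 32-bit floats. A small sketch of that arithmetic is shown below; the function names are hypothetical, and the one-billion-parameter count is an illustrative placeholder, not a figure for any InteriorFusion model.

```python
# Bits stored per weight for each method in the table above.
BITS_PER_WEIGHT = {
    "FP32": 32,
    "FP16": 16,
    "BF16": 16,
    "INT8": 8,
    "FP8": 8,
    "GPTQ-4bit": 4,
    "AWQ-4bit": 4,
}


def relative_size(method: str) -> float:
    """Weight storage as a fraction of the FP32 baseline."""
    return BITS_PER_WEIGHT[method] / 32


def weight_memory_gib(num_params: int, method: str) -> float:
    """Approximate memory for the weights alone, in GiB."""
    return num_params * BITS_PER_WEIGHT[method] / 8 / 2**30


# Illustrative parameter count only (assumption, not a real model size).
EXAMPLE_PARAMS = 1_000_000_000

for method in BITS_PER_WEIGHT:
    print(f"{method:10s} {relative_size(method):6.1%} "
          f"{weight_memory_gib(EXAMPLE_PARAMS, method):6.2f} GiB")
```

This counts weights only. Activations, caches, and the per-group scales that 4-bit formats store alongside the weights add to the real footprint, so measured VRAM will be somewhat higher than these estimates.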

## Export Formats

| Format | Size | Viewer | Game Engine | AR/VR | Notes |
|--------|------|--------|------------|-------|-------|
| **GLB** | ~5-50MB | ✅ (Web) | ✅ (UE/Unity) | ✅ (WebXR) | Recommended default |
| **FBX** | ~10-100MB | ⚠️ (Limited) | ✅ (UE/Unity/Maya) | ⚠️ | For animation/rigging |
| **OBJ** | ~5-30MB | ✅ (Universal) | ✅ (All) | ⚠️ | Legacy, no PBR |
| **USDZ** | ~5-50MB | ✅ (iOS AR) | ⚠️ (UE via plugin) | ✅ (ARKit) | Apple's format |
| **PLY (3DGS)** | ~10-500MB | ✅ (Gaussian viewers) | ⚠️ (UE5 plugin) | ⚠️ | For splatting render |
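
The table can also be read as a lookup: given the targets a project needs, which formats have full support? The sketch below encodes the ✅ and ⚠️ cells as `"full"` and `"limited"`; the `FORMAT_SUPPORT` dictionary and `fully_supported` helper are hypothetical names introduced for illustration.

```python
from typing import Dict, List

# Support levels transcribed from the table above:
# a check mark becomes "full", a warning sign becomes "limited".
FORMAT_SUPPORT: Dict[str, Dict[str, str]] = {
    "GLB": {"viewer": "full", "game_engine": "full", "ar_vr": "full"},
    "FBX": {"viewer": "limited", "game_engine": "full", "ar_vr": "limited"},
    "OBJ": {"viewer": "full", "game_engine": "full", "ar_vr": "limited"},
    "USDZ": {"viewer": "full", "game_engine": "limited", "ar_vr": "full"},
    "PLY": {"viewer": "full", "game_engine": "limited", "ar_vr": "limited"},
}


def fully_supported(*targets: str) -> List[str]:
    """Return formats with full support for every requested target."""
    return [
        name
        for name, support in FORMAT_SUPPORT.items()
        if all(support[target] == "full" for target in targets)
    ]


print(fully_supported("game_engine", "ar_vr"))
print(fully_supported("viewer", "ar_vr"))
```

Under this encoding GLB is the only format with full support in all three columns, which matches its "Recommended default" note.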

## ComfyUI Integration

Install the custom nodes:
```bash
cd ComfyUI/custom_nodes
git clone https://github.com/stevee00/ComfyUI-InteriorFusion
```

Available nodes:
- `InteriorFusion: Generate Scene` - Full pipeline
- `InteriorFusion: Generate Object` - Single furniture
- `InteriorFusion: Apply Material` - PBR material
- `InteriorFusion: Export Mesh` - Format conversion

## Blender Integration

Install the addon:
```bash
# In Blender: Edit > Preferences > Add-ons > Install
# Select blender_plugin/interiorfusion_blender.py
```

Features:
- Generate 3D scene from reference image
- Import with PBR materials
- Interactive object editing
- Export to game engines

## Unreal Engine Integration

1. Export GLB from InteriorFusion
2. Import via glTF importer (UE5 built-in)
3. Materials auto-convert to Unreal PBR
4. Use Gaussian Splatting plugin for real-time preview

Plugins needed:
- `glTFRuntime` for runtime GLB loading
- `MLSLabsGaussianSplattingRenderer` for 3DGS

## Unity Integration

1. Export GLB or FBX from InteriorFusion
2. Import into Unity project
3. Materials map to Unity Standard/URP/HDRP
4. Use GaussianSplatting package for 3DGS

## Performance Targets

| Platform | Target Time | Target VRAM | Output Quality |
|----------|------------|-------------|---------------|
| RTX 4090 | < 15s | < 20GB | Production |
| A100 | < 8s | < 72GB | Maximum |
| H100 | < 5s | < 72GB | Maximum |
| M3 Max | < 30s | < 36GB | Production |
| RTX 3060 | < 60s | < 10GB | Preview |
| Edge (CPU) | < 10s (depth only) | < 4GB | Core only |
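
These targets are easy to check automatically once a run has been timed. The following is a minimal sketch, assuming the wall-clock time and peak memory are measured elsewhere; `Target`, `TARGETS`, and `meets_target` are hypothetical names, and the measurements in the usage lines are made-up inputs, not benchmark results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    max_seconds: float   # upper bound on generation time
    max_vram_gb: float   # upper bound on peak memory


# Limits transcribed from the Performance Targets table.
TARGETS = {
    "RTX 4090": Target(15, 20),
    "A100": Target(8, 72),
    "H100": Target(5, 72),
    "M3 Max": Target(30, 36),
    "RTX 3060": Target(60, 10),
    "Edge (CPU)": Target(10, 4),
}


def meets_target(platform: str, seconds: float, vram_gb: float) -> bool:
    """True if a measured run is strictly under both limits for the platform."""
    target = TARGETS[platform]
    return seconds < target.max_seconds and vram_gb < target.max_vram_gb


# Hypothetical measurements, for illustration only.
print(meets_target("RTX 4090", seconds=12.0, vram_gb=18.5))
print(meets_target("A100", seconds=8.0, vram_gb=40.0))
```

The comparison is strict because the table states each limit as "<", so a run that takes exactly 8 s on an A100 does not count as meeting the target.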