# Open3DForge — Full Build Plan

> **Project:** Personal image-to-game-ready 3D asset pipeline
> **Owner:** Basel · Solo dev for "What Remains" (UE5.7)
> **Hosting:** Single HF Pro Space, ZeroGPU H200, 25 min/day quota
> **SDK:** Gradio 6.x (currently 6.14.0)
> **Repo:** `Reverb/open3dforge`

---

## Architectural Decisions (Locked In)

These were debated and decided during early development. Don't relitigate them mid-build.

1. **One Space, not multiple.** Option B: vendor TRELLIS.2 + Hunyuan3D-2 + UniRig into this Space rather than orchestrating across multiple Spaces via `gradio_client`. Larger repo, but single deployment, no inter-Space latency, no auth juggling.

2. **HF Space only, no local fallback.** Don't build a path for running on the RTX 4070. Quota is enough for personal use.

3. **Gradio 6.x.** Match the `sdk_version` in `README.md`. No upper bound in `requirements.txt`.

4. **UE5-first export defaults.** DirectX normals, ORM packing, cm units, Z-up, `SM_/SK_/T_` naming.

5. **Drop the custom website.** Standard Gradio tabs. `gradio.Server` is not worth the work for one user.

6. **Drop CHORD.** Research-only license. Use TRELLIS.2's own metallic/roughness volume attributes instead, which are already correct and license-compatible.

7. **nvdiffrast for all baking.** Not Blender headless. Fast (~2-5s per bake), GPU-based, fits inside `@spaces.GPU`, no apt install needed beyond what TRELLIS already requires.

8. **Solo-user workspace pattern.** One `workspace/current/` folder. No session IDs, no multi-tenancy.

---

## Reference Code: Pipeline Stage Order

```
INPUT: 1-4 images
   │
   ├── rembg (background removal, CPU)
   │
   ▼
[Stage 1] GENERATION                          [GPU]
   ├── TRELLIS.2 (hard surface) or Hunyuan3D-2 (organic)
   ├── SAVE high_poly.glb (kept for normal baking)
   └── Extract: albedo + metallic + roughness volume attrs
   │
   ▼
[Stage 2] POST-PROCESSING                     [CPU + GPU baking]
   2A  Mesh repair             pymeshfix          CPU
   2B  Geometry cleanup        PyMeshLab          CPU
   2C  Decimation                                 CPU
        ├── Preview: fast-simplification
        └── Final:   PyMeshLab quality pass
   2D  Symmetry (characters)   PyMeshLab          CPU
   2E  UV unwrap               xatlas             CPU
        (texels_per_unit packing)
   2F  Normal bake             nvdiffrast         GPU
        DX + GL outputs
   2G  Albedo bake             nvdiffrast         GPU
        vertex color → UV atlas
   2H  Material bake           nvdiffrast         GPU
        TRELLIS volume attrs → UV
   2I  AO bake                 nvdiffrast         GPU
        ray-occlusion → UV
   2J  Texture inpaint (opt)   SDXL inpaint       GPU
        hidden UV regions
   2K  Channel pack            numpy              CPU
        Unreal ORM / Unity MetSmooth
   2L  LOD generation          PyMeshLab          CPU
        LOD0/LOD1/LOD2 only (UE5 HLOD handles billboards)
   2M  Collision mesh          CoACD              CPU
   2N  Pivot correction        trimesh            CPU
   2O  Scale validation        trimesh            CPU
   │
   ▼
[Stage 3] AUTO-RIGGING (Optional)             [GPU]
   UniRig → rigged.glb / rigged.fbx
   │
   ▼
[Stage 4] EXPORT
   UE5 preset (default) → DX normals + ORM
   Naming: SM_/SK_/T_ convention
   → export_AssetName_UE5.zip
```

---

## Milestone Plan

Each milestone is sized to be one focused work session. Push at the end of each, verify, then move on.

---

### ✅ Milestone 1 — Foundation (COMPLETE)

**Status:** Deployed and verified. ZeroGPU smoke test passes.

What got built:
- HF Space scaffolded with Gradio 6 + ZeroGPU
- 5 tabs: Generate / Post-Process / Auto-Rig / Export / Presets + Diagnostics
- `workspace/` folder pattern (current/exports/presets/history)
- `src/workspace.py` — AssetState, preset save/load
- `src/quota.py` — daily quota tracking
- `src/ui_helpers.py` — status bar, asset summary, viewer model picker
- `gr.Model3D` viewer wired up
- Pipeline stubs returning placeholder messages
- Diagnostics tab with `@spaces.GPU` smoke test

Files in repo:
```
open3dforge/
├── README.md             ← sdk_version: 6.14.0
├── requirements.txt      ← gradio>=5.0, spaces, numpy, pillow
├── .gitignore
├── app.py                ← main entry, all UI wiring
├── src/
│   ├── __init__.py
│   ├── workspace.py
│   ├── quota.py
│   └── ui_helpers.py
└── workspace/
    ├── current/.gitkeep
    ├── exports/.gitkeep
    ├── presets/.gitkeep
    └── history/.gitkeep
```
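
For orientation, the solo-user state object in `src/workspace.py` could look roughly like the sketch below. The field names are the ones this plan uses in later milestones; the defaults, the `_STATE` singleton and the `reset_state()` helper are illustrative assumptions rather than the deployed code.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class AssetState:
    """Paths and stats for the single asset currently being worked on."""
    high_poly_glb: Optional[Path] = None   # kept for normal baking
    raw_gen_glb: Optional[Path] = None     # generator output at the user's target
    repaired_glb: Optional[Path] = None
    cleaned_glb: Optional[Path] = None
    low_poly_glb: Optional[Path] = None
    face_count: int = 0
    vertex_count: int = 0
    model_used: str = ""


# Assumption: one user and one asset, so a module-level singleton is enough.
_STATE = AssetState()


def get_state() -> AssetState:
    """Return the shared state object that every stage reads and writes."""
    return _STATE


def reset_state() -> AssetState:
    """Hypothetical helper: start from a clean state on a new generation."""
    global _STATE
    _STATE = AssetState()
    return _STATE
```

Because every stage goes through `get_state()`, a stage that fills in `low_poly_glb` is immediately visible to the viewer and summary refresh.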

---

### 🟡 Milestone 2 — Stage 1: TRELLIS.2 Generation (NEXT)

**Goal:** Real image-to-3D generation working end-to-end. Upload image → get a GLB in the viewer.

**Approach:** Option B — duplicate the microsoft/TRELLIS.2 Space, merge its contents into our repo, then refactor app.py to integrate with our tab structure.

**Step-by-step:**

1. **Duplicate microsoft/TRELLIS.2 Space** to get a known-good baseline:
   - On HF: `huggingface.co/spaces/microsoft/TRELLIS.2` → Duplicate this Space → name it `open3dforge-trellis-staging`
   - This is a *staging* copy — we don't deploy it, we just clone it locally for the merge
   - Confirm it builds and runs in your duplicate before touching anything

2. **Clone both repos locally:**
   ```bash
   git clone https://huggingface.co/spaces/Reverb/open3dforge
   git clone https://huggingface.co/spaces/baselanaya/open3dforge-trellis-staging
   ```

3. **Copy TRELLIS.2 assets into open3dforge:**
   ```bash
   cp -r open3dforge-trellis-staging/trellis2/ open3dforge/
   cp -r open3dforge-trellis-staging/assets/ open3dforge/
   cp open3dforge-trellis-staging/autotune_cache.json open3dforge/
   cp open3dforge-trellis-staging/packages.txt open3dforge/
   ```
   This gives us the vendored `trellis2/` Python package, HDRI envmaps, FlexGemm cache, and apt deps.

4. **Merge requirements.txt:**
   Combine the TRELLIS.2 requirements with our existing ones. Add to `requirements.txt`:
   ```
   # TRELLIS.2 deps (from microsoft/TRELLIS.2 Space)
   torch
   torchvision
   opencv-python-headless  # provides the cv2 module
   imageio
   imageio-ffmpeg
   rembg
   # plus the custom wheels they install at build time
   ```
   Treat the list above as indicative only. Inspect the resolved `requirements.txt` in the staging duplicate first, then copy its entries verbatim (pins included) and append them to ours.

5. **Refactor app.py to integrate the TRELLIS handlers:**
   - Move TRELLIS pipeline init to module level. ZeroGPU expects models to be created and moved to CUDA at import time, outside any `@spaces.GPU` function
   - Wrap their `image_to_3d` + `extract_glb` functions as the implementation of our existing `stub_generate` handler
   - Update the Generate tab to match TRELLIS parameter names (resolution, ss_sampling_steps, etc.)
   - Hide most TRELLIS knobs behind the "Advanced" accordion; expose only Quality preset + Seed at top level
   - Keep our quality presets (Fast/Balanced/Hero) mapping to their parameter sets
   - Hook the output GLB into `workspace.get_state().raw_gen_glb` and save the high-poly separately

6. **Critical: save `high_poly.glb` before decimation.** TRELLIS's `extract_glb` calls `o_voxel.postprocess.to_glb(decimation_target=...)`. We need to call it once with no decimation (or a very high target like 16M faces — the nvdiffrast limit they use) to get the high-poly we'll bake from in Stage 2, then call it again with the user's chosen decimation_target for the working low-poly.

7. **Update workspace state on success:**
   ```python
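   from pathlib import Path

   from src import workspace

   # `mesh` is the working low-poly already loaded by the handler
   state = workspace.get_state()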
   state.high_poly_glb = Path("workspace/current/high_poly.glb")
   state.raw_gen_glb = Path("workspace/current/raw_gen.glb")
   state.face_count = len(mesh.faces)
   state.vertex_count = len(mesh.vertices)
   state.model_used = "TRELLIS.2"
   ```

8. **Test:**
   - Push to Space
   - Wait for build (~10-15 min due to CUDA wheels compiling)
   - Upload a test image
   - Confirm the GLB appears in the viewer
   - Confirm Diagnostics quota tracker shows time consumed
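
The two-pass extraction from step 6 can be sketched as follows. Here `extract` is a hypothetical callable standing in for a wrapper around TRELLIS's `extract_glb`; its real signature has to be checked in the staging duplicate, and the byte-returning interface is an assumption made to keep the sketch self-contained.

```python
from pathlib import Path
from typing import Callable, Tuple

HIGH_POLY_TARGET = 16_000_000  # very high target, effectively "no decimation"


def export_high_and_low(
    extract: Callable[[int], bytes],
    user_target: int,
    out_dir: Path,
) -> Tuple[Path, Path]:
    """Run the extractor twice and write both GLBs.

    `extract` (hypothetical) takes a decimation target and returns GLB bytes.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    high_path = out_dir / "high_poly.glb"
    raw_path = out_dir / "raw_gen.glb"
    # Pass 1: the bake source. Written first so a failure in pass 2 cannot lose it.
    high_path.write_bytes(extract(HIGH_POLY_TARGET))
    # Pass 2: the working mesh at the user's chosen target.
    raw_path.write_bytes(extract(min(user_target, HIGH_POLY_TARGET)))
    return high_path, raw_path
```

Both passes run inside the same GPU call, so the extra extraction costs quota; that cost is worth measuring during the step 8 test.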

**Quality presets to wire up (map to TRELLIS params):**

| Preset | resolution | ss_steps | shape_steps | tex_steps | Expected time |
|---|---|---|---|---|---|
| Fast | 512 | 8 | 8 | 8 | ~30s |
| Balanced | 1024 | 12 | 12 | 12 | ~60s |
| Hero | 1536 | 16 | 16 | 16 | ~90s |
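
One way to hold this table in code is a plain dictionary plus a resolver that applies overrides from the "Advanced" accordion. The keys below mirror the table columns; the actual TRELLIS.2 keyword names may differ (the plan mentions `ss_sampling_steps`), so treat them as placeholders. `resolve_params` is a hypothetical helper name.

```python
from typing import Dict

# Values copied from the preset table; key names are placeholders.
QUALITY_PRESETS: Dict[str, Dict[str, int]] = {
    "Fast":     {"resolution": 512,  "ss_steps": 8,  "shape_steps": 8,  "tex_steps": 8},
    "Balanced": {"resolution": 1024, "ss_steps": 12, "shape_steps": 12, "tex_steps": 12},
    "Hero":     {"resolution": 1536, "ss_steps": 16, "shape_steps": 16, "tex_steps": 16},
}


def resolve_params(preset: str, **overrides: int) -> Dict[str, int]:
    """Return a copy of the preset with any per-run overrides applied."""
    if preset not in QUALITY_PRESETS:
        raise ValueError(f"unknown preset {preset!r}; expected one of {sorted(QUALITY_PRESETS)}")
    params = dict(QUALITY_PRESETS[preset])  # copy so the table itself is never mutated
    unknown = set(overrides) - set(params)
    if unknown:
        raise ValueError(f"unknown parameter(s): {sorted(unknown)}")
    params.update(overrides)
    return params
```

For example, `resolve_params("Fast", tex_steps=12)` keeps the Fast geometry settings but spends more steps on texture.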

**Risk mitigation:**
- TRELLIS.2 build can fail in many ways (CUDA wheel compilation, flash-attn install). If a build fails, check the build logs for which wheel failed. The staging duplicate is the reference — if it built there, the issue is in *your* merge.
- Don't move anything into `@spaces.GPU` functions that should be at module level. Pipeline init goes at module level.

---

### Milestone 2b — Hunyuan3D-2 Alternative Generator

**Goal:** Second generator option for organic shapes (characters, creatures).

**Approach:** Same duplicate-and-vendor pattern as Milestone 2.

1. Duplicate `tencent/Hunyuan3D-2` to staging Space
2. Clone, copy the `hy3dgen/` package into open3dforge
3. Merge requirements (most overlap with TRELLIS.2 — torch, diffusers)
4. Add Hunyuan pipeline init at module level
5. The model dropdown in the Generate tab routes between `image_to_3d_trellis()` and `image_to_3d_hunyuan()`
6. Hunyuan needs 16GB VRAM. That fits alongside TRELLIS.2 in the H200's ~70GB, but the starting assumption is to keep only one model resident at a time via lazy module-level guards

**Decision deferred to this milestone:** Whether to keep both models in VRAM at module load (faster, more memory) or lazy-load per call (slower first call, less memory). Test both.
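
The lazy-load option could be prototyped with a small registry like the one below. This is a sketch under assumptions: `register_loader` and `get_pipeline` are hypothetical names, and the loaders are whatever functions build the TRELLIS.2 or Hunyuan pipeline. In the real app the eviction step would probably also need to release CUDA memory (for example with `torch.cuda.empty_cache()`).

```python
from typing import Any, Callable, Dict, Optional

_LOADERS: Dict[str, Callable[[], Any]] = {}  # model name -> function that builds it
_active_name: Optional[str] = None
_active_pipeline: Any = None


def register_loader(name: str, loader: Callable[[], Any]) -> None:
    """Register how to build a pipeline without building it yet."""
    _LOADERS[name] = loader


def get_pipeline(name: str) -> Any:
    """Return the requested pipeline, evicting the other one if it is resident."""
    global _active_name, _active_pipeline
    if name not in _LOADERS:
        raise KeyError(f"no loader registered for {name!r}")
    if _active_name != name:
        _active_pipeline = None            # drop the old model before loading the new one
        _active_pipeline = _LOADERS[name]()
        _active_name = name
    return _active_pipeline
```

Repeated calls with the same name reuse the resident model, so only a switch between generators pays the load cost.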

---

### Milestone 3 — Stage 2A-2C: Mesh Cleanup

**Goal:** Working CPU-side mesh repair, cleanup, and decimation with live preview.

**Dependencies to add:**
```
trimesh[easy]
pymeshfix
pymeshlab
fast-simplification
```

**Files to create:**
- `src/stages/__init__.py`
- `src/stages/stage2_repair.py` — pymeshfix wrapper
- `src/stages/stage2_cleanup.py` — PyMeshLab filter chain
- `src/stages/stage2_decimate.py` — both fast-simplification (preview) and PyMeshLab (final)

**UI work in app.py:**
- Wire the existing checkboxes/sliders in Tab 2 to call the real implementations
- Live preview: slider `.change()` event fires `fast-simplification`, updates face count display
- Run button: actually runs full pipeline on the current GLB

**Workspace state updates:**
- `state.repaired_glb`, `state.cleaned_glb`, `state.low_poly_glb` all get populated as steps complete

**Test criteria:**
- Generate a TRELLIS asset (50k faces)
- Run repair → no errors
- Run cleanup → no errors
- Set decimation slider to 10k → live preview updates face count
- Click "Run final" → produces low_poly.glb at 10k faces
- Viewer auto-refreshes to show the cleaned mesh

---

### Milestone 4 — Stage 2D-2E: Symmetry + UV Unwrap

**Goal:** Symmetry enforcement + xatlas UV unwrapping with consistent texel density.

**Dependencies to add:**
```
xatlas
```

**Files to create:**
- `src/stages/stage2_symmetry.py` — PyMeshLab `apply_filter_mesh_symmetrize`
- `src/stages/stage2_uv.py` — xatlas with `texels_per_unit` packing

**UI work:**
- Symmetry: off / bilateral-X / bilateral-Y / radial dropdown
- UV: atlas resolution, texels_per_unit, padding

**Test criteria:**
- Run on a human-character GLB → symmetry produces clean mirror
- UV unwrap produces `unwrapped.glb` with valid UV0 coords visible if you inspect via trimesh
- No overlapping UV islands (check with PyMeshLab's quality measure)

---

### Milestone 5 — Stage 2F: Normal Baking with nvdiffrast

**Goal:** High-poly → low-poly normal map baking, GPU-accelerated, 2-5 second bakes.

**Dependencies to add:**
- `nvdiffrast` (already installed via TRELLIS.2 wheels — verify in the staging duplicate)

**Files to create:**
- `src/stages/stage2_bake_normal.py` — full nvdiffrast pipeline

**Algorithm (from the plan doc):**
```python
import nvdiffrast.torch as dr
import spaces


@spaces.GPU(duration=60)
def bake_normal_map(high_poly_path, low_poly_path, uv_coords, map_size=2048):
    ctx = dr.RasterizeCudaContext()
    # 1. UV → clip space
    # 2. Rasterize low-poly UVs → per-pixel world position + tri ID
    # 3. For each pixel: nearest-on-surface from high-poly
    # 4. Sample high-poly normal at that point
    # 5. Transform to tangent space (low-poly tangent frame)
    # 6. Pack RGB [0,1], save PNG
    # 7. Dilate edges past UV island boundaries
```

**Output:** Two PNGs — `normal_gl.png` and `normal_dx.png` (DX has Y-flipped green channel).
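
The packing step and the GL-to-DX conversion are simple enough to sketch with NumPy alone. The function names are hypothetical; the sketch assumes tangent-space normals arrive as a float array of shape `(H, W, 3)` in `[-1, 1]`.

```python
import numpy as np


def pack_normals(normals: np.ndarray) -> np.ndarray:
    """Map tangent-space normals from [-1, 1] to 8-bit RGB."""
    rgb = (np.clip(normals, -1.0, 1.0) * 0.5 + 0.5) * 255.0
    return np.round(rgb).astype(np.uint8)


def gl_to_dx(normal_gl: np.ndarray) -> np.ndarray:
    """Invert the green channel of an 8-bit GL normal map; the input is not modified."""
    normal_dx = normal_gl.copy()
    normal_dx[..., 1] = 255 - normal_dx[..., 1]
    return normal_dx
```

A flat surface (normal pointing straight along +Z) packs to roughly `(128, 128, 255)`, which is the bluish-purple base color to expect when eyeballing the result.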

**Test criteria:**
- Run on TRELLIS character output (50k high-poly → 10k low-poly)
- Bake completes in <10 seconds
- Open the normal map in any image viewer — should be bluish/purple with surface detail visible
- Both DX and GL versions are produced
- Quota shows 5-10 seconds consumed

---

### Milestone 6 — Stage 2G-2I: Albedo, Material, AO Baking

**Goal:** Three more nvdiffrast bakes producing the full PBR texture set.

**Files to create:**
- `src/stages/stage2_bake_albedo.py`
- `src/stages/stage2_bake_material.py` — uses TRELLIS.2's stored metallic+roughness attrs
- `src/stages/stage2_bake_ao.py` — ray-occlusion in hemisphere

**Key reuse:** Same nvdiffrast rasterization pattern as Milestone 5 — refactor that code into a shared helper `_rasterize_uv_atlas()` in `src/stages/_baking_helpers.py`.

**Workspace state:** All texture paths populated on the AssetState.

**Test criteria:**
- All five maps (normal, albedo, metallic, roughness, AO) viewable as PNG thumbnails in Tab 2
- Total Stage 2 baking time < 30 seconds for a Balanced-quality asset

---

### Milestone 7 — Stage 2J: SDXL Inpainting for Hidden UVs

**Goal:** Detect stretched/synthetic UV regions and inpaint them with SDXL.

**Dependencies to add:**
```
diffusers
accelerate
safetensors
```

**Files to create:**
- `src/stages/stage2_inpaint.py`
  - `detect_hidden_regions(albedo, uvs, faces)` — variance analysis
  - `inpaint_hidden_uvs(...)` — SDXL inpainting pipeline

**UI:** Toggle off by default (costs ~30s quota). Prompt input. Strength slider.

**Test criteria:**
- Generate an asset with a clear "back side" (e.g., a humanoid character)
- Without inpainting: back of character has visible texture stretching
- With inpainting: back is plausibly filled in
- Quota cost: ~30s per inpaint

---

### Milestone 8 — Stage 2K-2O: Finalization Steps

**Goal:** Channel packing, LODs, collision, pivot, scale — all CPU-side, fast.

**Dependencies to add:**
```
coacd==1.0.4
```

**Files to create:**
- `src/stages/stage2_channel_pack.py` — numpy ORM / MetallicSmoothness packing
- `src/stages/stage2_lods.py` — PyMeshLab quality-aware LOD0/1/2
- `src/stages/stage2_collision.py` — CoACD with `trimesh.convex_hull` fallback
- `src/stages/stage2_pivot.py` — bottom_center / geometric_center / custom
- `src/stages/stage2_scale.py` — height presets, UE5 cm units
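
As a minimal sketch of the channel-pack module, assuming each input map is a single-channel `uint8` array of the same resolution: ORM places AO in R, roughness in G and metallic in B. The Unity variant shown here (metallic in R, smoothness in A) follows the common metallic-smoothness layout and should be checked against the target render pipeline. Function names are hypothetical.

```python
import numpy as np


def pack_orm(ao: np.ndarray, roughness: np.ndarray, metallic: np.ndarray) -> np.ndarray:
    """Stack three (H, W) uint8 maps into one (H, W, 3) texture: R=AO, G=roughness, B=metallic."""
    if not (ao.shape == roughness.shape == metallic.shape):
        raise ValueError("AO, roughness and metallic maps must share one resolution")
    return np.stack([ao, roughness, metallic], axis=-1).astype(np.uint8)


def pack_metallic_smoothness(metallic: np.ndarray, roughness: np.ndarray) -> np.ndarray:
    """Unity-style RGBA: metallic in R, smoothness (inverted roughness) in A."""
    if metallic.shape != roughness.shape:
        raise ValueError("metallic and roughness maps must share one resolution")
    zeros = np.zeros(metallic.shape, dtype=np.uint8)
    # Widen before subtracting so the inversion cannot wrap around.
    smoothness = (255 - roughness.astype(np.int16)).astype(np.uint8)
    return np.stack([metallic.astype(np.uint8), zeros, zeros, smoothness], axis=-1)
```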

**UI:** All controls already scaffolded in Milestone 1's Post-Process tab. Just wire to real implementations.

**Test criteria:**
- ORM packed as RGB with AO/Roughness/Metallic in correct channels
- LOD0/LOD1/LOD2 all generated, all share same UV layout
- Collision mesh has <1% of the triangle count of LOD0
- Pivot at bottom_center for a generated human character results in feet at the world origin (Y=0 in the Y-up GLB, which becomes Z=0 after the Z-up UE5 export)
- Scale: human asset is 1.8m tall = 180cm in UE5 export
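
The pivot and scale checks can be illustrated on bare vertex tuples, without trimesh. This sketch assumes Y is up (as in GLB) and that scaling is uniform; the helper names are hypothetical.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def bottom_center_pivot(vertices: List[Vec3]) -> List[Vec3]:
    """Translate so the bounding box is centered on X/Z and rests on Y = 0."""
    xs, ys, zs = zip(*vertices)
    cx = (min(xs) + max(xs)) / 2.0
    cz = (min(zs) + max(zs)) / 2.0
    y0 = min(ys)
    return [(x - cx, y - y0, z - cz) for x, y, z in vertices]


def scale_to_height(vertices: List[Vec3], target_height: float) -> List[Vec3]:
    """Uniformly scale about the origin so the Y extent equals target_height."""
    ys = [v[1] for v in vertices]
    height = max(ys) - min(ys)
    if height <= 0:
        raise ValueError("mesh has zero height; cannot scale")
    s = target_height / height
    return [(x * s, y * s, z * s) for x, y, z in vertices]
```

Applying the pivot first and the scale second keeps the feet at zero, because scaling about the origin leaves the ground plane in place.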

---

### Milestone 9 — Stage 3: UniRig Auto-Rigging

**Goal:** Generate a skeleton + skinning weights for character meshes.

**Approach:** Same vendor-the-Space pattern as Milestone 2.

1. Duplicate `MohamedRashad/UniRig` Space → staging
2. Verify it builds in the staging duplicate
3. Copy `UniRig/` package into our repo
4. Merge requirements
5. Wire to the Auto-Rig tab handler
6. Output: rigged FBX (UE5 default) or GLB

**Test criteria:**
- Run on a humanoid character (after full Stage 2 processing)
- Output FBX imports into UE5 as a Skeletal Mesh
- Drag into Mixamo → animations auto-attach correctly

---

### Milestone 10 — Stage 4: UE5 Export

**Goal:** Bundle everything into a UE5-ready zip with proper naming and packing.

**Dependencies to add:**
```
pygltflib
```

**Files to create:**
- `src/stages/stage4_export.py`
  - `export_ue5(asset_state, asset_name, asset_type) → zip_path`
  - Handles FBX conversion (trimesh has no FBX exporter, so this needs a separate converter; tool TBD)
  - Applies naming convention (`SM_`, `SK_`, `T_`)
  - Writes ORM-packed textures to correct paths
  - Zip + drop in `workspace/exports/`

**Engine presets (only UE5 fully implemented):**
- UE5: FBX, DX normals, ORM, Z-up, cm — the default
- Unity HDRP: FBX, GL normals, MetallicSmoothness, Y-up, m — stub for later
- Godot/Blender/Web: stubs
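
The naming and bundling part of `export_ue5` might look like the sketch below, using only the standard library. The `SM_`/`SK_`/`T_` prefixes and the `export_AssetName_UE5.zip` pattern come from this plan; the texture suffixes (`_D`, `_N`, `_ORM`) are an assumption based on a common UE convention, and all function names are hypothetical.

```python
import zipfile
from pathlib import Path
from typing import Dict

MESH_PREFIX = {"static": "SM_", "skeletal": "SK_"}
# Assumption: suffixes are not specified in the plan; adjust to taste.
TEXTURE_SUFFIX = {"albedo": "_D", "normal_dx": "_N", "orm": "_ORM"}


def ue5_mesh_name(asset_name: str, asset_type: str) -> str:
    """Build the mesh filename, e.g. SM_Crate.fbx for a static mesh."""
    return f"{MESH_PREFIX[asset_type]}{asset_name}.fbx"


def ue5_texture_name(asset_name: str, kind: str) -> str:
    """Build a texture filename, e.g. T_Crate_ORM.png."""
    return f"T_{asset_name}{TEXTURE_SUFFIX[kind]}.png"


def bundle_ue5_zip(files: Dict[str, Path], asset_name: str, exports_dir: Path) -> Path:
    """Write {archive name: source path} pairs into export_<AssetName>_UE5.zip."""
    exports_dir.mkdir(parents=True, exist_ok=True)
    zip_path = exports_dir / f"export_{asset_name}_UE5.zip"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for arcname, src in files.items():
            zf.write(src, arcname=arcname)
    return zip_path
```

Keeping the renaming inside the archive step means files in `workspace/current/` can keep their stage-oriented names.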

**Test criteria:**
- Export a character → unzip → 6-7 files following naming convention
- Import to UE5: drag-drop the zip's contents → no warnings, materials auto-create from textures
- Both Static Mesh and Skeletal Mesh paths work

---

### Milestone 11 — Presets System

**Goal:** Save and load named parameter configurations across tabs.

**Files to update:**
- `src/workspace.py` — already has `save_preset/load_preset/delete_preset`, just needs the JSON schema fleshed out
- `app.py` — wire the Presets tab's Save button to actually read all current tab values

**Schema:**
```json
{
  "name": "character_UE5_hero",
  "stage1": { ... },
  "stage2": { ... },
  "stage3": { ... },
  "stage4": { ... }
}
```

**Ship five default presets:**
- `character_UE5_hero.json`
- `character_UE5_npc.json`
- `prop_UE5_hero.json`
- `prop_UE5_standard.json`
- `environment_UE5_background.json`
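
A minimal sketch of saving and loading a preset in the schema above, assuming one JSON file per preset in `workspace/presets/`. The existing `save_preset`/`load_preset` in `src/workspace.py` may have different signatures; the ones here are illustrative.

```python
import json
from pathlib import Path
from typing import Any, Dict

STAGE_KEYS = ("stage1", "stage2", "stage3", "stage4")


def save_preset(presets_dir: Path, name: str, stages: Dict[str, Dict[str, Any]]) -> Path:
    """Write <name>.json; stages that are not supplied are stored as empty dicts."""
    unknown = set(stages) - set(STAGE_KEYS)
    if unknown:
        raise ValueError(f"unknown stage key(s): {sorted(unknown)}")
    preset = {"name": name, **{key: stages.get(key, {}) for key in STAGE_KEYS}}
    presets_dir.mkdir(parents=True, exist_ok=True)
    path = presets_dir / f"{name}.json"
    path.write_text(json.dumps(preset, indent=2), encoding="utf-8")
    return path


def load_preset(presets_dir: Path, name: str) -> Dict[str, Any]:
    """Read <name>.json and check that every top-level key is present."""
    preset = json.loads((presets_dir / f"{name}.json").read_text(encoding="utf-8"))
    missing = [key for key in ("name", *STAGE_KEYS) if key not in preset]
    if missing:
        raise ValueError(f"preset {name!r} is missing key(s): {missing}")
    return preset
```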

---

### Milestone 12 — Polish & Production Hardening

- Error handling on every stage (don't crash the app, show clear error in UI)
- Progress bars during long ops (`gr.Progress(track_tqdm=True)`)
- Quota cost shown *before* each GPU operation (warning if it would exceed remaining)
- Game-ready checklist passes shown before allowing Export
- Asset history sidebar (last 5 generated assets with thumbnails)
- Session cleanup of `workspace/current/` on new generation
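
For the error-handling item, one possible pattern is a decorator that turns any exception into a status string for the UI while still logging the full traceback. `safe_stage` is a hypothetical name, and the `(result, status)` return shape is an assumption about how handlers feed the status bar.

```python
import functools
import traceback
from typing import Any, Callable, Optional, Tuple


def safe_stage(stage_name: str) -> Callable:
    """Decorator: run a stage and return (result, status) instead of raising."""
    def decorator(fn: Callable[..., Any]) -> Callable[..., Tuple[Optional[Any], str]]:
        @functools.wraps(fn)
        def wrapper(*args: Any, **kwargs: Any) -> Tuple[Optional[Any], str]:
            try:
                return fn(*args, **kwargs), f"{stage_name}: OK"
            except Exception as exc:
                traceback.print_exc()  # full trace goes to stderr, i.e. the container logs
                return None, f"{stage_name} failed: {type(exc).__name__}: {exc}"
        return wrapper
    return decorator
```

A stage wrapped with `@safe_stage("Repair")` then never crashes the app; the handler only has to route the status string to the status bar.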

---

## Working with Claude Code

When you continue in Claude Code, you'll have the full repo locally. Key things to remember:

### Project conventions

1. **Each stage = its own module** in `src/stages/`. Don't dump pipeline logic into `app.py`.
2. **Workspace state is the single source of truth.** Every stage reads from and writes to `workspace.get_state()`.
3. **GPU functions live where they're needed**, not all in app.py. The `@spaces.GPU` decorator works in any file as long as `spaces` is imported.
4. **No `if __name__ == "__main__":` on `demo.launch()`.** HF Spaces imports app.py at module level.
5. **Gradio 6 specifics:**
   - `theme` and `css` go in `launch()`, not `Blocks()`
   - `show_api` is gone — use `footer_links=["gradio", "settings"]`
   - `api_visibility` replaces `api_name=False` on events
6. **The 3 global components** (`viewer`, `summary`, `status_bar`) get refreshed via `_global_refresh()` chained off every pipeline action button. Don't forget to add new buttons to that list.

### Useful commands

```bash
# Pull the latest Space state
cd open3dforge
git pull

# Make changes, syntax-check before push
python -c "import ast; ast.parse(open('app.py').read())"

# Push to deploy
git add -A
git commit -m "Milestone N: <stage>"
git push

# Watch build/runtime logs at:
# https://huggingface.co/spaces/Reverb/open3dforge?logs=container
```

### Common HF Space build failures (we've hit these)

| Symptom | Cause | Fix |
|---|---|---|
| `Cannot install gradio<X and gradio==Y` | `sdk_version` in README conflicts with requirements.txt pin | Remove version pin in requirements.txt or update README's sdk_version |
| `Blocks.launch() got an unexpected keyword argument 'X'` | Gradio 6 removed parameter | Check Gradio 6 migration guide for replacement |
| `When localhost is not accessible` | `demo.launch()` wrapped in `if __name__ == "__main__"` | Move to module level |
| CUDA wheel compile failures | Mismatched torch/CUDA versions | Match TRELLIS.2 staging duplicate's exact pins |
| OOM during model load | Multiple large models loaded at module level | Lazy-load with module-level guards inside `@spaces.GPU` |

### Useful resources

- **Gradio 6 migration guide:** https://www.gradio.app/main/guides/gradio-6-migration-guide
- **ZeroGPU docs:** https://huggingface.co/docs/hub/spaces-zerogpu
- **TRELLIS.2 reference Space:** https://huggingface.co/spaces/microsoft/TRELLIS.2
- **Hunyuan3D-2 reference Space:** https://huggingface.co/spaces/tencent/Hunyuan3D-2
- **UniRig reference Space:** https://huggingface.co/spaces/MohamedRashad/UniRig

---

## Constraints to Remember

- **Daily quota:** 1500s (25 min) of H200 time per day. Plan asset iteration accordingly.
- **VRAM budget:** ~70GB per workload. TRELLIS.2 alone is 24GB; UniRig is 8GB; SDXL inpaint is 8GB. Don't load all at once.
- **Function timeout:** Default `@spaces.GPU` duration is 60s. Override with `duration=N` for longer ops (Stage 1 generation, AO bake high quality).
- **Build time:** With TRELLIS.2 vendored + CUDA wheels, expect 10-15 min builds. Cache hits will be ~3 min.
- **Repo size:** Will grow large with vendored models + HDRIs. Git LFS may be needed for the autotune_cache.json (~1MB) and wheel files (~100MB+). HF Spaces handles this via Xet storage automatically.
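
The daily budget above lends itself to simple local bookkeeping. The sketch below is not the code in `src/quota.py`; it assumes usage is stored in a small JSON file and that the counter resets at UTC midnight, which may not match how Hugging Face actually resets the quota.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional

DAILY_BUDGET_S = 1500.0  # 25 min of GPU time per day


class QuotaTracker:
    """Local estimate of GPU seconds used today (not the authoritative HF counter)."""

    def __init__(self, path: Path, budget_s: float = DAILY_BUDGET_S) -> None:
        self.path = path
        self.budget_s = budget_s

    @staticmethod
    def _today(now: Optional[datetime] = None) -> str:
        return (now or datetime.now(timezone.utc)).strftime("%Y-%m-%d")

    def _load(self, now: Optional[datetime] = None) -> dict:
        today = self._today(now)
        if self.path.exists():
            data = json.loads(self.path.read_text(encoding="utf-8"))
            if data.get("day") == today:
                return data
        return {"day": today, "used_s": 0.0}  # new day (or no file): start from zero

    def remaining(self, now: Optional[datetime] = None) -> float:
        return max(0.0, self.budget_s - self._load(now)["used_s"])

    def can_afford(self, estimate_s: float, now: Optional[datetime] = None) -> bool:
        return estimate_s <= self.remaining(now)

    def record(self, seconds: float, now: Optional[datetime] = None) -> float:
        """Add measured GPU seconds and return what is left for the day."""
        data = self._load(now)
        data["used_s"] += seconds
        self.path.parent.mkdir(parents=True, exist_ok=True)
        self.path.write_text(json.dumps(data), encoding="utf-8")
        return self.remaining(now)
```

Calling `can_afford()` with a stage's expected duration before launching it gives the "warn before exceeding" behavior without any GPU call.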

---

*Plan version 3.0 — May 15, 2026*
*Last action completed: Milestone 1 deployed, ZeroGPU smoke test passing*
*Next action: Milestone 2 — duplicate microsoft/TRELLIS.2 staging Space, merge into open3dforge*