# Open3DForge - Full Build Plan
> **Project:** Personal image-to-game-ready 3D asset pipeline
> **Owner:** Basel · Solo dev for "What Remains" (UE5.7)
> **Hosting:** Single HF Pro Space, ZeroGPU H200, 25 min/day quota
> **SDK:** Gradio 6.x (currently 6.14.0)
> **Repo:** `Reverb/open3dforge`
---
## Architectural Decisions (Locked In)
These were debated and decided during early development. Don't relitigate them mid-build.
1. **One Space, not multiple.** Option B: vendor TRELLIS.2 + Hunyuan3D-2 + UniRig into this Space rather than orchestrating across multiple Spaces via `gradio_client`. Larger repo, but single deployment, no inter-Space latency, no auth juggling.
2. **HF Space only, no local fallback.** Don't build a path for running on the RTX 4070. Quota is enough for personal use.
3. **Gradio 6.x.** Match the `sdk_version` in `README.md`. No upper bound in `requirements.txt`.
4. **UE5-first export defaults.** DirectX normals, ORM packing, cm units, Z-up, `SM_/SK_/T_` naming.
5. **Drop the custom website.** Standard Gradio tabs. `gradio.Server` is not worth the work for one user.
6. **Drop CHORD.** Research-only license. Use TRELLIS.2's own metallic/roughness volume attributes instead, which are already correct and license-compatible.
7. **nvdiffrast for all baking.** Not Blender headless. Fast (~2-5s per bake), GPU-based, fits inside `@spaces.GPU`, no apt install needed beyond what TRELLIS already requires.
8. **Solo-user workspace pattern.** One `workspace/current/` folder. No session IDs, no multi-tenancy.
---
## Reference Code: Pipeline Stage Order
```
INPUT: 1-4 images
  │
  ├── rembg (background removal, CPU)
  │
  ▼
[Stage 1] GENERATION                                   [GPU]
  ├── TRELLIS.2 (hard surface) or Hunyuan3D-2 (organic)
  ├── SAVE high_poly.glb (kept for normal baking)
  └── Extract: albedo + metallic + roughness volume attrs
  │
  ▼
[Stage 2] POST-PROCESSING                              [CPU + GPU baking]
  2A  Mesh repair             pymeshfix                CPU
  2B  Geometry cleanup        PyMeshLab                CPU
  2C  Decimation                                       CPU
        ├── Preview: fast-simplification
        └── Final: PyMeshLab quality pass
  2D  Symmetry (characters)   PyMeshLab                CPU
  2E  UV unwrap               xatlas                   CPU
        (texels_per_unit packing)
  2F  Normal bake             nvdiffrast               GPU
        DX + GL outputs
  2G  Albedo bake             nvdiffrast               GPU
        vertex color → UV atlas
  2H  Material bake           nvdiffrast               GPU
        TRELLIS volume attrs → UV
  2I  AO bake                 nvdiffrast               GPU
        ray-occlusion → UV
  2J  Texture inpaint (opt)   SDXL inpaint             GPU
        hidden UV regions
  2K  Channel pack            numpy                    CPU
        Unreal ORM / Unity MetSmooth
  2L  LOD generation          PyMeshLab                CPU
        LOD0/LOD1/LOD2 only (UE5 HLOD handles billboards)
  2M  Collision mesh          CoACD                    CPU
  2N  Pivot correction        trimesh                  CPU
  2O  Scale validation        trimesh                  CPU
  │
  ▼
[Stage 3] AUTO-RIGGING (Optional)                      [GPU]
  UniRig → rigged.glb / rigged.fbx
  │
  ▼
[Stage 4] EXPORT
  UE5 preset (default) → DX normals + ORM
  Naming: SM_/SK_/T_ convention
  → export_AssetName_UE5.zip
```
---
## Milestone Plan
Each milestone is sized to be one focused work session. Push at the end of each, verify, then move on.
---
### ✅ Milestone 1 - Foundation (COMPLETE)
**Status:** Deployed and verified. ZeroGPU smoke test passes.
What got built:
- HF Space scaffolded with Gradio 6 + ZeroGPU
- 5 tabs: Generate / Post-Process / Auto-Rig / Export / Presets + Diagnostics
- `workspace/` folder pattern (current/exports/presets/history)
- `src/workspace.py`: AssetState, preset save/load
- `src/quota.py`: daily quota tracking
- `src/ui_helpers.py`: status bar, asset summary, viewer model picker
- `gr.Model3D` viewer wired up
- Pipeline stubs returning placeholder messages
- Diagnostics tab with `@spaces.GPU` smoke test
Files in repo:
```
open3dforge/
├── README.md              ← sdk_version: 6.14.0
├── requirements.txt       ← gradio>=5.0, spaces, numpy, pillow
├── .gitignore
├── app.py                 ← main entry, all UI wiring
├── src/
│   ├── __init__.py
│   ├── workspace.py
│   ├── quota.py
│   └── ui_helpers.py
└── workspace/
    ├── current/.gitkeep
    ├── exports/.gitkeep
    ├── presets/.gitkeep
    └── history/.gitkeep
```
---
### 🟡 Milestone 2 - Stage 1: TRELLIS.2 Generation (NEXT)
**Goal:** Real image-to-3D generation working end-to-end. Upload image → get a GLB in the viewer.
**Approach:** Option B: duplicate the microsoft/TRELLIS.2 Space, merge its contents into our repo, then refactor `app.py` to integrate with our tab structure.
**Step-by-step:**
1. **Duplicate microsoft/TRELLIS.2 Space** to get a known-good baseline:
- On HF: `huggingface.co/spaces/microsoft/TRELLIS.2` → Duplicate this Space → name it `open3dforge-trellis-staging`
- This is a *staging* copy: we don't deploy it, we just clone it locally for the merge
- Confirm it builds and runs in your duplicate before touching anything
2. **Clone both repos locally:**
```bash
git clone https://huggingface.co/spaces/Reverb/open3dforge
git clone https://huggingface.co/spaces/baselanaya/open3dforge-trellis-staging
```
3. **Copy TRELLIS.2 assets into open3dforge:**
```bash
cp -r open3dforge-trellis-staging/trellis2/ open3dforge/
cp -r open3dforge-trellis-staging/assets/ open3dforge/
cp open3dforge-trellis-staging/autotune_cache.json open3dforge/
cp open3dforge-trellis-staging/packages.txt open3dforge/
```
This gives us the vendored `trellis2/` Python package, HDRI envmaps, FlexGemm cache, and apt deps.
4. **Merge requirements.txt:**
Combine the TRELLIS.2 requirements with our existing ones. Add to `requirements.txt`:
```
# TRELLIS.2 deps (from microsoft/TRELLIS.2 Space)
torch
torchvision
opencv-python-headless  # pip package name; imported as cv2
imageio
imageio-ffmpeg
rembg
# plus the custom wheels they install at build time
```
Copy theirs verbatim and add to ours. Inspect the resolved `requirements.txt` in the staging duplicate first.
5. **Refactor app.py to integrate the TRELLIS handlers:**
- Move TRELLIS pipeline init to module level (per ZeroGPU rules, the model must be placed on CUDA at module level)
- Wrap their `image_to_3d` + `extract_glb` functions as the implementation of our existing `stub_generate` handler
- Update the Generate tab to match TRELLIS parameter names (resolution, ss_sampling_steps, etc.)
- Hide most TRELLIS knobs behind the "Advanced" accordion; expose only Quality preset + Seed at top level
- Keep our quality presets (Fast/Balanced/Hero) mapping to their parameter sets
- Hook the output GLB into `workspace.get_state().raw_gen_glb` and save the high-poly separately
6. **Critical: save `high_poly.glb` before decimation.** TRELLIS's `extract_glb` calls `o_voxel.postprocess.to_glb(decimation_target=...)`. We need to call it once with no decimation (or a very high target like 16M faces, the nvdiffrast limit they use) to get the high-poly we'll bake from in Stage 2, then call it again with the user's chosen `decimation_target` for the working low-poly.
7. **Update workspace state on success:**
```python
from pathlib import Path

state = workspace.get_state()
state.high_poly_glb = Path("workspace/current/high_poly.glb")
state.raw_gen_glb = Path("workspace/current/raw_gen.glb")
state.face_count = len(mesh.faces)
state.vertex_count = len(mesh.vertices)
state.model_used = "TRELLIS.2"
```
8. **Test:**
- Push to Space
- Wait for build (~10-15 min due to CUDA wheels compiling)
- Upload a test image
- Confirm the GLB appears in the viewer
- Confirm Diagnostics quota tracker shows time consumed
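Step 6 is the easiest one to get wrong, so here is a minimal sketch of the two-pass export. It is not TRELLIS.2's API: `export_glb` is a hypothetical wrapper around whatever ends up calling `to_glb`, and `save_both_meshes` and `HIGH_POLY_TARGET` are illustration names. The output file names follow the state fields in step 7.

```python
from pathlib import Path
from typing import Callable

# Assumption: the "very high target" from step 6 (16M faces), so the first
# pass is effectively undecimated.
HIGH_POLY_TARGET = 16_000_000


def save_both_meshes(
    export_glb: Callable[[int], bytes],
    user_target: int,
    out_dir: Path,
) -> tuple[Path, Path]:
    """Write the high-poly bake source first, then the decimated working mesh.

    `export_glb(decimation_target)` is a hypothetical wrapper returning GLB bytes.
    """
    out_dir.mkdir(parents=True, exist_ok=True)

    # Pass 1: kept untouched for the normal bake in Stage 2.
    high_path = out_dir / "high_poly.glb"
    high_path.write_bytes(export_glb(HIGH_POLY_TARGET))

    # Pass 2: the user's target, clamped so it never exceeds the high-poly target.
    raw_path = out_dir / "raw_gen.glb"
    raw_path.write_bytes(export_glb(min(user_target, HIGH_POLY_TARGET)))

    return high_path, raw_path
```

Writing the high-poly first means a failure in the second pass still leaves a usable bake source on disk.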
**Quality presets to wire up (map to TRELLIS params):**
| Preset | resolution | ss_steps | shape_steps | tex_steps | Expected time |
|---|---|---|---|---|---|
| Fast | 512 | 8 | 8 | 8 | ~30s |
| Balanced | 1024 | 12 | 12 | 12 | ~60s |
| Hero | 1536 | 16 | 16 | 16 | ~90s |
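One way to keep the table above in code is a small lookup that the Generate handler reads. This is only a sketch: the field names mirror the table's column headers and may not match the real TRELLIS.2 keyword arguments, and `QualityPreset`, `QUALITY_PRESETS`, and `resolve_preset` are hypothetical names.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class QualityPreset:
    resolution: int
    ss_steps: int
    shape_steps: int
    tex_steps: int


# Values copied from the preset table. The key names are assumptions and need
# checking against the actual TRELLIS.2 parameter names before wiring up.
QUALITY_PRESETS = {
    "Fast": QualityPreset(512, 8, 8, 8),
    "Balanced": QualityPreset(1024, 12, 12, 12),
    "Hero": QualityPreset(1536, 16, 16, 16),
}


def resolve_preset(name: str, **overrides: int) -> dict:
    """Return a preset as a dict, with any Advanced-accordion overrides applied."""
    params = asdict(QUALITY_PRESETS[name])
    unknown = set(overrides) - set(params)
    if unknown:
        raise KeyError(f"unknown parameter(s): {sorted(unknown)}")
    params.update(overrides)
    return params
```

Rejecting unknown override keys catches typos in the UI wiring early instead of silently ignoring a slider.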
**Risk mitigation:**
- TRELLIS.2 build can fail in many ways (CUDA wheel compilation, flash-attn install). If a build fails, check the build logs for which wheel failed. The staging duplicate is the reference: if it built there, the issue is in *your* merge.
- Don't move anything into `@spaces.GPU` functions that should be at module level. Pipeline init goes at module level.
---
### Milestone 2b - Hunyuan3D-2 Alternative Generator
**Goal:** Second generator option for organic shapes (characters, creatures).
**Approach:** Same duplicate-and-vendor pattern as Milestone 2.
1. Duplicate `tencent/Hunyuan3D-2` to a staging Space
2. Clone, copy the `hy3dgen/` package into open3dforge
3. Merge requirements (most overlap with TRELLIS.2: torch, diffusers)
4. Add Hunyuan pipeline init at module level
5. The model dropdown in the Generate tab routes between `image_to_3d_trellis()` and `image_to_3d_hunyuan()`
6. Hunyuan needs 16GB VRAM. That fits alongside TRELLIS in the H200's 70GB, but start with only one model loaded at a time via lazy module-level guards
**Decision deferred to this milestone:** Whether to keep both models in VRAM at module load (faster, more memory) or lazy-load per call (slower first call, less memory). Test both.
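For the lazy-load option, a module-level guard could look roughly like the sketch below. It keeps at most one model resident and swaps on demand. `get_model` and the `loaders` mapping are hypothetical; the real version would presumably also need to release GPU memory when evicting (for example by deleting the pipeline and calling `torch.cuda.empty_cache()`), which is left out so the sketch stays dependency-free.

```python
import threading
from typing import Any, Callable, Optional

_lock = threading.Lock()
_loaded_name: Optional[str] = None
_loaded_model: Any = None


def get_model(name: str, loaders: dict[str, Callable[[], Any]]) -> Any:
    """Return the requested model, evicting whichever other model was resident."""
    global _loaded_name, _loaded_model
    with _lock:
        if _loaded_name != name:
            # Drop the old reference before loading, so both are never held at once.
            _loaded_model = None
            _loaded_model = loaders[name]()
            _loaded_name = name
        return _loaded_model
```

Repeated calls with the same name reuse the resident model, so only a switch between generators pays the load cost.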
---
### Milestone 3 - Stage 2A-2C: Mesh Cleanup
**Goal:** Working CPU-side mesh repair, cleanup, and decimation with live preview.
**Dependencies to add:**
```
trimesh[easy]
pymeshfix
pymeshlab
fast-simplification
```
**Files to create:**
- `src/stages/__init__.py`
- `src/stages/stage2_repair.py`: pymeshfix wrapper
- `src/stages/stage2_cleanup.py`: PyMeshLab filter chain
- `src/stages/stage2_decimate.py`: both fast-simplification (preview) and PyMeshLab (final)
**UI work in app.py:**
- Wire the existing checkboxes/sliders in Tab 2 to call the real implementations
- Live preview: slider `.change()` event fires `fast-simplification`, updates face count display
- Run button: actually runs full pipeline on the current GLB
**Workspace state updates:**
- `state.repaired_glb`, `state.cleaned_glb`, `state.low_poly_glb` all get populated as steps complete
**Test criteria:**
- Generate a TRELLIS asset (50k faces)
- Run repair → no errors
- Run cleanup → no errors
- Set decimation slider to 10k → live preview updates face count
- Click "Run final" → produces low_poly.glb at 10k faces
- Viewer auto-refreshes to show the cleaned mesh
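The decimation slider works in face counts, while fast-simplification's `simplify()` takes a `target_reduction` fraction (worth confirming against the installed version). A small conversion helper, sketched here under the hypothetical name `target_reduction`, bridges the two:

```python
def target_reduction(current_faces: int, target_faces: int) -> float:
    """Fraction of faces to remove so that about `target_faces` remain.

    Returns 0.0 when the mesh is already at or below the target.
    """
    if current_faces <= 0:
        raise ValueError("current_faces must be positive")
    if target_faces >= current_faces:
        return 0.0
    # Keep at least one face so the fraction stays below 1.0.
    return 1.0 - max(target_faces, 1) / current_faces
```

For the test case above (50k faces down to 10k) this asks the simplifier to remove 80% of the faces.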
---
### Milestone 4 - Stage 2D-2E: Symmetry + UV Unwrap
**Goal:** Symmetry enforcement + xatlas UV unwrapping with consistent texel density.
**Dependencies to add:**
```
xatlas
```
**Files to create:**
- `src/stages/stage2_symmetry.py`: PyMeshLab `apply_filter_mesh_symmetrize`
- `src/stages/stage2_uv.py`: xatlas with `texels_per_unit` packing
**UI work:**
- Symmetry: off / bilateral-X / bilateral-Y / radial dropdown
- UV: atlas resolution, texels_per_unit, padding
**Test criteria:**
- Run on a human-character GLB → symmetry produces clean mirror
- UV unwrap produces `unwrapped.glb` with valid UV0 coords visible if you inspect via trimesh
- No overlapping UV islands (check with PyMeshLab's quality measure)
---
### Milestone 5 - Stage 2F: Normal Baking with nvdiffrast
**Goal:** High-poly → low-poly normal map baking, GPU-accelerated, 2-5 second bakes.
**Dependencies to add:**
- `nvdiffrast` (already installed via TRELLIS.2 wheels; verify in the staging duplicate)
**Files to create:**
- `src/stages/stage2_bake_normal.py`: full nvdiffrast pipeline
**Algorithm (from the plan doc):**
```python
import nvdiffrast.torch as dr
import spaces


@spaces.GPU(duration=60)
def bake_normal_map(high_poly_path, low_poly_path, uv_coords, map_size=2048):
    """Bake a tangent-space normal map from the high-poly onto the low-poly UVs."""
    ctx = dr.RasterizeCudaContext()
    # 1. UV → clip space
    # 2. Rasterize low-poly UVs → per-pixel world position + tri ID
    # 3. For each pixel: nearest-on-surface from high-poly
    # 4. Sample high-poly normal at that point
    # 5. Transform to tangent space (low-poly tangent frame)
    # 6. Pack RGB [0,1], save PNG
    # 7. Dilate edges past UV island boundaries
    raise NotImplementedError("steps 1-7 are the work of this milestone")
```
**Output:** Two PNGs: `normal_gl.png` and `normal_dx.png` (DX has Y-flipped green channel).
**Test criteria:**
- Run on TRELLIS character output (50k high-poly → 10k low-poly)
- Bake completes in <10 seconds
- Open the normal map in any image viewer: it should be bluish/purple with surface detail visible
- Both DX and GL versions are produced
- Quota shows 5-10 seconds consumed
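The GL/DX pair only differs in the green channel, so the DX map can be derived from the GL one instead of baking twice. A dependency-free sketch on raw interleaved 8-bit RGB bytes follows; `encode_normal` and `gl_to_dx` are hypothetical names, and the real stage module would more likely do the same thing on a numpy or torch array.

```python
def encode_normal(n: tuple[float, float, float]) -> tuple[int, int, int]:
    """Map a unit tangent-space normal from [-1, 1] per component to 8-bit RGB."""
    return tuple(round((c * 0.5 + 0.5) * 255) for c in n)


def gl_to_dx(rgb: bytes) -> bytes:
    """Invert the green channel of interleaved 8-bit RGB pixels (OpenGL to DirectX)."""
    if len(rgb) % 3:
        raise ValueError("expected interleaved RGB data (length divisible by 3)")
    out = bytearray(rgb)
    # Every third byte starting at index 1 is a green value.
    out[1::3] = bytes(255 - g for g in rgb[1::3])
    return bytes(out)
```

The flip is its own inverse, so the same function also converts a DX map back to GL.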
---
### Milestone 6 - Stage 2G-2I: Albedo, Material, AO Baking
**Goal:** Three more nvdiffrast bakes producing the full PBR texture set.
**Files to create:**
- `src/stages/stage2_bake_albedo.py`
- `src/stages/stage2_bake_material.py`: uses TRELLIS.2's stored metallic+roughness attrs
- `src/stages/stage2_bake_ao.py`: ray-occlusion in hemisphere
**Key reuse:** Same nvdiffrast rasterization pattern as Milestone 5. Refactor that code into a shared helper `_rasterize_uv_atlas()` in `src/stages/_baking_helpers.py`.
**Workspace state:** All texture paths populated on the AssetState.
**Test criteria:**
- All five maps (normal, albedo, metallic, roughness, AO) viewable as PNG thumbnails in Tab 2
- Total Stage 2 baking time < 30 seconds for a Balanced-quality asset
---
### Milestone 7 - Stage 2J: SDXL Inpainting for Hidden UVs
**Goal:** Detect stretched/synthetic UV regions and inpaint them with SDXL.
**Dependencies to add:**
```
diffusers
accelerate
safetensors
```
**Files to create:**
- `src/stages/stage2_inpaint.py`
  - `detect_hidden_regions(albedo, uvs, faces)`: variance analysis
  - `inpaint_hidden_uvs(...)`: SDXL inpainting pipeline
**UI:** Toggle off by default (costs ~30s quota). Prompt input. Strength slider.
**Test criteria:**
- Generate an asset with a clear "back side" (e.g., a humanoid character)
- Without inpainting: back of character has visible texture stretching
- With inpainting: back is plausibly filled in
- Quota cost: ~30s per inpaint
---
### Milestone 8 - Stage 2K-2O: Finalization Steps
**Goal:** Channel packing, LODs, collision, pivot, scale. All CPU-side, fast.
**Dependencies to add:**
```
coacd==1.0.4
```
**Files to create:**
- `src/stages/stage2_channel_pack.py`: numpy ORM / MetallicSmoothness packing
- `src/stages/stage2_lods.py`: PyMeshLab quality-aware LOD0/1/2
- `src/stages/stage2_collision.py`: CoACD with `trimesh.convex_hull` fallback
- `src/stages/stage2_pivot.py`: bottom_center / geometric_center / custom
- `src/stages/stage2_scale.py`: height presets, UE5 cm units
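Channel packing is just interleaving three grayscale maps. The sketch below shows the ORM layout (R = ambient occlusion, G = roughness, B = metallic) on raw 8-bit buffers. It is written without numpy to stay self-contained, although the plan calls for numpy in the real module; `pack_orm` and `roughness_to_smoothness` are hypothetical names.

```python
def pack_orm(ao: bytes, roughness: bytes, metallic: bytes) -> bytes:
    """Interleave three 8-bit grayscale maps into RGB: R=AO, G=roughness, B=metallic."""
    if not (len(ao) == len(roughness) == len(metallic)):
        raise ValueError("all three maps must have the same pixel count")
    out = bytearray(len(ao) * 3)
    out[0::3] = ao
    out[1::3] = roughness
    out[2::3] = metallic
    return bytes(out)


def roughness_to_smoothness(roughness: bytes) -> bytes:
    """Unity-style smoothness is the inverse of roughness."""
    return bytes(255 - r for r in roughness)
```

The length check matters because all three bakes must share one atlas resolution before they can be packed.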
**UI:** All controls already scaffolded in Milestone 1's Post-Process tab. Just wire to real implementations.
**Test criteria:**
- ORM packed as RGB with AO/Roughness/Metallic in correct channels
- LOD0/LOD1/LOD2 all generated, all share same UV layout
- Collision mesh has <1% of the triangle count of LOD0
- Pivot at bottom_center for a generated human character puts the feet at the world origin, at height 0 on the up axis (Y=0 in the Y-up GLB, Z=0 after the Z-up UE5 export)
- Scale: human asset is 1.8m tall = 180cm in UE5 export
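The pivot and scale criteria come down to a little bounding-box arithmetic. Below is a minimal sketch on plain vertex tuples; `bottom_center_offset`, `scale_to_height_cm`, and the `up_axis` parameter are assumptions for illustration, and the real modules would read the bounds from trimesh instead.

```python
Vec3 = tuple[float, float, float]


def bottom_center_offset(vertices: list[Vec3], up_axis: int = 1) -> Vec3:
    """Translation that moves the bounding-box bottom center to the origin.

    `up_axis` is 1 for Y-up data (glTF) and 2 for Z-up data.
    """
    mins = [min(v[i] for v in vertices) for i in range(3)]
    maxs = [max(v[i] for v in vertices) for i in range(3)]
    # Center the two horizontal axes, then drop the lowest point to zero.
    offset = [-(lo + hi) / 2.0 for lo, hi in zip(mins, maxs)]
    offset[up_axis] = -mins[up_axis]
    return tuple(offset)


def scale_to_height_cm(vertices: list[Vec3], target_height_m: float, up_axis: int = 1) -> float:
    """Uniform scale factor that makes the mesh `target_height_m` tall, in centimetres."""
    heights = [v[up_axis] for v in vertices]
    extent = max(heights) - min(heights)
    if extent <= 0:
        raise ValueError("mesh has no extent along the up axis")
    return target_height_m * 100.0 / extent
```

A mesh that is 2 units tall needs a factor of 90 to come out at 180 cm, which is the 1.8 m human case above.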
---
### Milestone 9 - Stage 3: UniRig Auto-Rigging
**Goal:** Generate a skeleton + skinning weights for character meshes.
**Approach:** Same vendor-the-Space pattern as Milestone 2.
1. Duplicate `MohamedRashad/UniRig` Space → staging
2. Verify it builds in the staging duplicate
3. Copy `UniRig/` package into our repo
4. Merge requirements
5. Wire to the Auto-Rig tab handler
6. Output: rigged FBX (UE5 default) or GLB
**Test criteria:**
- Run on a humanoid character (after full Stage 2 processing)
- Output FBX imports into UE5 as a Skeletal Mesh
- Drag into Mixamo → animations auto-attach correctly
---
### Milestone 10 - Stage 4: UE5 Export
**Goal:** Bundle everything into a UE5-ready zip with proper naming and packing.
**Dependencies to add:**
```
pygltflib
```
**Files to create:**
- `src/stages/stage4_export.py`
  - `export_ue5(asset_state, asset_name, asset_type) → zip_path`
  - Handles FBX conversion (note: trimesh has no FBX exporter, so this step needs a separate converter)
  - Applies naming convention (`SM_`, `SK_`, `T_`)
  - Writes ORM-packed textures to correct paths
  - Zip + drop in `workspace/exports/`
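The naming and bundling part of `export_ue5` can be sketched independently of the mesh conversion. The `SM_`/`SK_`/`T_` prefixes and the `export_AssetName_UE5.zip` pattern come from this plan; the `static`/`skeletal` type keys, the `T_<Name>_<Kind>.png` texture pattern, and the helper names `ue5_names` and `bundle` are assumptions.

```python
import zipfile
from pathlib import Path

# Assumption: asset_type is one of these two strings.
MESH_PREFIX = {"static": "SM_", "skeletal": "SK_"}


def ue5_names(asset_name: str, asset_type: str, texture_kinds: list[str]) -> dict[str, str]:
    """Build UE5-style file names: SM_/SK_ for the mesh, T_ for each texture."""
    mesh = f"{MESH_PREFIX[asset_type]}{asset_name}.fbx"
    textures = {kind: f"T_{asset_name}_{kind}.png" for kind in texture_kinds}
    return {"mesh": mesh, **textures}


def bundle(files: dict[str, Path], asset_name: str, export_dir: Path) -> Path:
    """Zip already-written files under their UE5 archive names."""
    export_dir.mkdir(parents=True, exist_ok=True)
    zip_path = export_dir / f"export_{asset_name}_UE5.zip"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for arcname, src in files.items():
            zf.write(src, arcname=arcname)
    return zip_path
```

Keeping the names in one function means the Static Mesh and Skeletal Mesh paths cannot drift apart in how they label textures.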
**Engine presets (only UE5 fully implemented):**
- UE5: FBX, DX normals, ORM, Z-up, cm (the default)
- Unity HDRP: FBX, GL normals, MetallicSmoothness, Y-up, m (stub for later)
- Godot/Blender/Web: stubs
**Test criteria:**
- Export a character → unzip → 6-7 files following naming convention
- Import to UE5: drag-drop the zip's contents → no warnings, materials auto-create from textures
- Both Static Mesh and Skeletal Mesh paths work
---
### Milestone 11 - Presets System
**Goal:** Save and load named parameter configurations across tabs.
**Files to update:**
- `src/workspace.py`: already has `save_preset/load_preset/delete_preset`, just needs the JSON schema fleshed out
- `app.py`: wire the Presets tab's Save button to actually read all current tab values
**Schema:**
```json
{
"name": "character_UE5_hero",
"stage1": { ... },
"stage2": { ... },
"stage3": { ... },
"stage4": { ... }
}
```
**Ship five default presets:**
- `character_UE5_hero.json`
- `character_UE5_npc.json`
- `prop_UE5_hero.json`
- `prop_UE5_standard.json`
- `environment_UE5_background.json`
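A minimal round-trip for that schema might look like the following. It is a standalone sketch, not the existing `save_preset`/`load_preset` in `src/workspace.py` (their signatures are not shown in this plan), so the names `write_preset` and `read_preset` are deliberately different and hypothetical.

```python
import json
from pathlib import Path

STAGE_KEYS = ("stage1", "stage2", "stage3", "stage4")


def write_preset(preset: dict, presets_dir: Path) -> Path:
    """Validate the top-level keys, then write <name>.json into presets_dir."""
    missing = [key for key in ("name", *STAGE_KEYS) if key not in preset]
    if missing:
        raise ValueError(f"preset is missing keys: {missing}")
    presets_dir.mkdir(parents=True, exist_ok=True)
    path = presets_dir / f"{preset['name']}.json"
    path.write_text(json.dumps(preset, indent=2), encoding="utf-8")
    return path


def read_preset(name: str, presets_dir: Path) -> dict:
    """Load a preset by name from presets_dir."""
    return json.loads((presets_dir / f"{name}.json").read_text(encoding="utf-8"))
```

Validating before writing keeps half-filled presets out of `workspace/presets/`, where they would otherwise fail later on load.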
---
### Milestone 12 - Polish & Production Hardening
- Error handling on every stage (don't crash the app, show clear error in UI)
- Progress bars during long ops (`gr.Progress(track_tqdm=True)`)
- Quota cost shown *before* each GPU operation (warning if it would exceed remaining)
- Game-ready checklist passes shown before allowing Export
- Asset history sidebar (last 5 generated assets with thumbnails)
- Session cleanup of `workspace/current/` on new generation
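The pre-flight quota warning is simple enough to sketch now. The 1500 s figure is the daily quota stated in this plan; `quota_warning` is a hypothetical helper, and how `src/quota.py` actually exposes the used time is not assumed here.

```python
from typing import Optional

DAILY_QUOTA_S = 1500.0  # 25 min/day of H200 time, per this plan


def quota_warning(
    used_s: float,
    estimated_cost_s: float,
    quota_s: float = DAILY_QUOTA_S,
) -> Optional[str]:
    """Return a warning string if the operation would exceed today's quota, else None."""
    remaining = max(quota_s - used_s, 0.0)
    if estimated_cost_s > remaining:
        return (
            f"This step needs ~{estimated_cost_s:.0f}s of GPU time "
            f"but only {remaining:.0f}s remain today."
        )
    return None
```

Returning `None` for the normal case lets the UI show the message only when there is something to warn about.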
---
## Working with Claude Code
When you continue in Claude Code, you'll have the full repo locally. Key things to remember:
### Project conventions
1. **Each stage = its own module** in `src/stages/`. Don't dump pipeline logic into `app.py`.
2. **Workspace state is the single source of truth.** Every stage reads from and writes to `workspace.get_state()`.
3. **GPU functions live where they're needed**, not all in app.py. The `@spaces.GPU` decorator works in any file as long as `spaces` is imported.
4. **No `if __name__ == "__main__":` on `demo.launch()`.** HF Spaces imports app.py at module level.
5. **Gradio 6 specifics:**
- `theme` and `css` go in `launch()`, not `Blocks()`
- `show_api` is gone; use `footer_links=["gradio", "settings"]`
- `api_visibility` replaces `api_name=False` on events
6. **The 3 global components** (`viewer`, `summary`, `status_bar`) get refreshed via `_global_refresh()` chained off every pipeline action button. Don't forget to add new buttons to that list.
### Useful commands
```bash
# Pull the latest Space state
cd open3dforge
git pull
# Make changes, syntax-check before push
python -c "import ast; ast.parse(open('app.py').read())"
# Push to deploy
git add -A
git commit -m "Milestone N: <stage>"
git push
# Watch build/runtime logs at:
# https://huggingface.co/spaces/Reverb/open3dforge?logs=container
```
### Common HF Space build failures (we've hit these)
| Symptom | Cause | Fix |
|---|---|---|
| `Cannot install gradio<X and gradio==Y` | `sdk_version` in README conflicts with requirements.txt pin | Remove version pin in requirements.txt or update README's sdk_version |
| `Blocks.launch() got an unexpected keyword argument 'X'` | Gradio 6 removed parameter | Check Gradio 6 migration guide for replacement |
| `When localhost is not accessible` | `demo.launch()` wrapped in `if __name__ == "__main__"` | Move to module level |
| CUDA wheel compile failures | Mismatched torch/CUDA versions | Match TRELLIS.2 staging duplicate's exact pins |
| OOM during model load | Multiple large models loaded at module level | Lazy-load with module-level guards inside `@spaces.GPU` |
### Useful resources
- **Gradio 6 migration guide:** https://www.gradio.app/main/guides/gradio-6-migration-guide
- **ZeroGPU docs:** https://huggingface.co/docs/hub/spaces-zerogpu
- **TRELLIS.2 reference Space:** https://huggingface.co/spaces/microsoft/TRELLIS.2
- **Hunyuan3D-2 reference Space:** https://huggingface.co/spaces/tencent/Hunyuan3D-2
- **UniRig reference Space:** https://huggingface.co/spaces/MohamedRashad/UniRig
---
## Constraints to Remember
- **Daily quota:** 1500s (25 min) of H200 time per day. Plan asset iteration accordingly.
- **VRAM budget:** ~70GB per workload. TRELLIS.2 alone is 24GB; UniRig is 8GB; SDXL inpaint is 8GB. Don't load all at once.
- **Function timeout:** Default `@spaces.GPU` duration is 60s. Override with `duration=N` for longer ops (Stage 1 generation, AO bake high quality).
- **Build time:** With TRELLIS.2 vendored + CUDA wheels, expect 10-15 min builds. Cache hits will be ~3 min.
- **Repo size:** Will grow large with vendored models + HDRIs. Git LFS may be needed for the autotune_cache.json (~1MB) and wheel files (~100MB+). HF Spaces handles this via Xet storage automatically.
---
*Plan version 3.0, May 15, 2026*
*Last action completed: Milestone 1 deployed, ZeroGPU smoke test passing*
*Next action: Milestone 2 (duplicate microsoft/TRELLIS.2 as a staging Space, merge into open3dforge)*