
	# Open3DForge — Full Build Plan

	> Project: Personal image-to-game-ready 3D asset pipeline
	> Owner: Basel · Solo dev for "What Remains" (UE5.7)
	> Hosting: Single HF Pro Space, ZeroGPU H200, 25 min/day quota
	> SDK: Gradio 6.x (currently 6.14.0)
	> Repo: `Reverb/open3dforge`

	---

	## Architectural Decisions (Locked In)

	These were debated and decided during early development. Don't relitigate them mid-build.

	1. One Space, not multiple. Option B: vendor TRELLIS.2 + Hunyuan3D-2 + UniRig into this Space rather than orchestrating across multiple Spaces via `gradio_client`. Larger repo, but single deployment, no inter-Space latency, no auth juggling.

	2. HF Space only, no local fallback. Don't build a path for running on the RTX 4070. Quota is enough for personal use.

	3. Gradio 6.x. Match the `sdk_version` in `README.md`. No upper bound in `requirements.txt`.

	4. UE5-first export defaults. DirectX normals, ORM packing, cm units, Z-up, `SM_/SK_/T_` naming.

	5. Drop the custom website. Standard Gradio tabs. `gradio.Server` is not worth the work for one user.

	6. Drop CHORD. Research-only license. Use TRELLIS.2's own metallic/roughness volume attributes instead, which are already correct and license-compatible.

	7. nvdiffrast for all baking. Not Blender headless. Fast (~2-5s per bake), GPU-based, fits inside `@spaces.GPU`, no apt install needed beyond what TRELLIS already requires.

	8. Solo-user workspace pattern. One `workspace/current/` folder. No session IDs, no multi-tenancy.

	---

	## Reference Code: Pipeline Stage Order

	```
	INPUT: 1-4 images
	│
	├── rembg (background removal, CPU)
	│
	▼
	[Stage 1] GENERATION [GPU]
	├── TRELLIS.2 (hard surface) or Hunyuan3D-2 (organic)
	├── SAVE high_poly.glb (kept for normal baking)
	└── Extract: albedo + metallic + roughness volume attrs
	│
	▼
	[Stage 2] POST-PROCESSING [CPU + GPU baking]
	  2A  Mesh repair             pymeshfix       CPU
	  2B  Geometry cleanup        PyMeshLab       CPU
	  2C  Decimation                              CPU
	      ├── Preview: fast-simplification
	      └── Final: PyMeshLab quality pass
	  2D  Symmetry (characters)   PyMeshLab       CPU
	  2E  UV unwrap               xatlas          CPU
	      (texels_per_unit packing)
	  2F  Normal bake             nvdiffrast      GPU
	      DX + GL outputs
	  2G  Albedo bake             nvdiffrast      GPU
	      vertex color → UV atlas
	  2H  Material bake           nvdiffrast      GPU
	      TRELLIS volume attrs → UV
	  2I  AO bake                 nvdiffrast      GPU
	      ray-occlusion → UV
	  2J  Texture inpaint (opt)   SDXL inpaint    GPU
	      hidden UV regions
	  2K  Channel pack            numpy           CPU
	      Unreal ORM / Unity MetSmooth
	  2L  LOD generation          PyMeshLab       CPU
	      LOD0/LOD1/LOD2 only (UE5 HLOD handles billboards)
	  2M  Collision mesh          CoACD           CPU
	  2N  Pivot correction        trimesh         CPU
	  2O  Scale validation        trimesh         CPU
	│
	▼
	[Stage 3] AUTO-RIGGING (Optional) [GPU]
	UniRig → rigged.glb / rigged.fbx
	│
	▼
	[Stage 4] EXPORT
	UE5 preset (default) → DX normals + ORM
	Naming: SM_/SK_/T_ convention
	→ export_AssetName_UE5.zip
	```

	---

	## Milestone Plan

	Each milestone is sized to be one focused work session. Push at the end of each, verify, then move on.

	---

	### ✅ Milestone 1 — Foundation (COMPLETE)

	Status: Deployed and verified. ZeroGPU smoke test passes.

	What got built:
	- HF Space scaffolded with Gradio 6 + ZeroGPU
	- 5 tabs: Generate / Post-Process / Auto-Rig / Export / Presets + Diagnostics
	- `workspace/` folder pattern (current/exports/presets/history)
	- `src/workspace.py` — AssetState, preset save/load
	- `src/quota.py` — daily quota tracking
	- `src/ui_helpers.py` — status bar, asset summary, viewer model picker
	- `gr.Model3D` viewer wired up
	- Pipeline stubs returning placeholder messages
	- Diagnostics tab with `@spaces.GPU` smoke test

	Files in repo:
	```
	open3dforge/
	├── README.md          ← sdk_version: 6.14.0
	├── requirements.txt   ← gradio>=5.0, spaces, numpy, pillow
	├── .gitignore
	├── app.py             ← main entry, all UI wiring
	├── src/
	│   ├── __init__.py
	│   ├── workspace.py
	│   ├── quota.py
	│   └── ui_helpers.py
	└── workspace/
	    ├── current/.gitkeep
	    ├── exports/.gitkeep
	    ├── presets/.gitkeep
	    └── history/.gitkeep
	```

	---

	### 🟡 Milestone 2 — Stage 1: TRELLIS.2 Generation (NEXT)

	Goal: Real image-to-3D generation working end-to-end. Upload image → get a GLB in the viewer.

	Approach: Option B — duplicate the microsoft/TRELLIS.2 Space, merge its contents into our repo, then refactor app.py to integrate with our tab structure.

	Step-by-step:

	1. Duplicate microsoft/TRELLIS.2 Space to get a known-good baseline:
	- On HF: `huggingface.co/spaces/microsoft/TRELLIS.2` → Duplicate this Space → name it `open3dforge-trellis-staging`
	- This is a staging copy — we don't deploy it, we just clone it locally for the merge
	- Confirm it builds and runs in your duplicate before touching anything

	2. Clone both repos locally:
	```bash
	git clone https://huggingface.co/spaces/Reverb/open3dforge
	git clone https://huggingface.co/spaces/baselanaya/open3dforge-trellis-staging
	```

	3. Copy TRELLIS.2 assets into open3dforge:
	```bash
	cp -r open3dforge-trellis-staging/trellis2/ open3dforge/
	cp -r open3dforge-trellis-staging/assets/ open3dforge/
	cp open3dforge-trellis-staging/autotune_cache.json open3dforge/
	cp open3dforge-trellis-staging/packages.txt open3dforge/
	```
	This gives us the vendored `trellis2/` Python package, HDRI envmaps, FlexGemm cache, and apt deps.

	4. Merge requirements.txt:
	Combine the TRELLIS.2 requirements with our existing ones. Add to `requirements.txt`:
	```
	# TRELLIS.2 deps (from microsoft/TRELLIS.2 Space)
	torch
	torchvision
	opencv-python-headless  # provides the cv2 module
	imageio
	imageio-ffmpeg
	rembg
	# plus the custom wheels they install at build time
	```
	Treat the list above as indicative only. Inspect the resolved `requirements.txt` in the staging duplicate first, then copy its entries verbatim and append them to ours.

	5. Refactor app.py to integrate the TRELLIS handlers:
	- Move TRELLIS pipeline init to module level (per ZeroGPU rules — must be on CUDA at module-level)
	- Wrap their `image_to_3d` + `extract_glb` functions as the implementation of our existing `stub_generate` handler
	- Update the Generate tab to match TRELLIS parameter names (resolution, ss_sampling_steps, etc.)
	- Hide most TRELLIS knobs behind the "Advanced" accordion; expose only Quality preset + Seed at top level
	- Keep our quality presets (Fast/Balanced/Hero) mapping to their parameter sets
	- Hook the output GLB into `workspace.get_state().raw_gen_glb` and save the high-poly separately

	6. Critical: save `high_poly.glb` before decimation. TRELLIS's `extract_glb` calls `o_voxel.postprocess.to_glb(decimation_target=...)`. We need to call it once with no decimation (or a very high target like 16M faces — the nvdiffrast limit they use) to get the high-poly we'll bake from in Stage 2, then call it again with the user's chosen decimation_target for the working low-poly.

	7. Update workspace state on success:
	```python
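	from pathlib import Path

	from src import workspace

	# `mesh` is assumed to be the loaded low-poly (e.g. a trimesh mesh of raw_gen.glb)
	state = workspace.get_state()
	state.high_poly_glb = Path("workspace/current/high_poly.glb")
	state.raw_gen_glb = Path("workspace/current/raw_gen.glb")
	state.face_count = len(mesh.faces)
	state.vertex_count = len(mesh.vertices)
	state.model_used = "TRELLIS.2"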
	```

	8. Test:
	- Push to Space
	- Wait for build (~10-15 min due to CUDA wheels compiling)
	- Upload a test image
	- Confirm the GLB appears in the viewer
	- Confirm Diagnostics quota tracker shows time consumed
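Step 6 above (saving the high-poly before decimation) could be sketched as a small two-pass wrapper. This is a sketch under assumptions: `export_fn` is a hypothetical caller-supplied callable standing in for the TRELLIS.2 export call, and it is assumed to take a face-count target and return GLB bytes. The real `to_glb` signature should be checked in the staging duplicate before wiring this in.

```python
from pathlib import Path
from typing import Callable

# 16M faces is the "effectively no decimation" target mentioned in step 6.
HIGH_POLY_TARGET = 16_000_000


def export_two_pass(
    export_fn: Callable[[int], bytes],
    low_poly_target: int,
    out_dir: Path,
) -> tuple[Path, Path]:
    """Export the same generation twice: near-undecimated, then decimated.

    `export_fn` is a hypothetical stand-in for the TRELLIS.2 export call.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    high_path = out_dir / "high_poly.glb"
    raw_path = out_dir / "raw_gen.glb"

    # Pass 1: keep the high-poly for Stage 2 baking before anything is decimated.
    high_path.write_bytes(export_fn(HIGH_POLY_TARGET))
    # Pass 2: the working low-poly at the user's chosen target.
    raw_path.write_bytes(export_fn(low_poly_target))
    return high_path, raw_path
```

Writing the high-poly first means a failure in the second pass still leaves the bake source on disk.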

	Quality presets to wire up (map to TRELLIS params):

	| Preset | resolution | ss_steps | shape_steps | tex_steps | Expected time |
	|---|---|---|---|---|---|
	| Fast | 512 | 8 | 8 | 8 | ~30s |
	| Balanced | 1024 | 12 | 12 | 12 | ~60s |
	| Hero | 1536 | 16 | 16 | 16 | ~90s |
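The preset table could live in code as a small lookup, so the Generate tab only passes a label around. This is a minimal sketch: the field names mirror the table's column headers, the `QualityPreset` class and `resolve_preset` helper are hypothetical, and the actual TRELLIS.2 parameter names (for example `ss_sampling_steps`) may differ and should be mapped at the call site.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityPreset:
    resolution: int
    ss_steps: int
    shape_steps: int
    tex_steps: int
    expected_seconds: int  # rough estimate from the table, not a measurement


QUALITY_PRESETS = {
    "Fast": QualityPreset(512, 8, 8, 8, 30),
    "Balanced": QualityPreset(1024, 12, 12, 12, 60),
    "Hero": QualityPreset(1536, 16, 16, 16, 90),
}


def resolve_preset(name: str) -> QualityPreset:
    """Look up a preset by its UI label, failing loudly on unknown names."""
    try:
        return QUALITY_PRESETS[name]
    except KeyError:
        raise ValueError(f"Unknown quality preset: {name!r}") from None
```

The `expected_seconds` field can double as the `duration` estimate shown to the user before a GPU call.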

	Risk mitigation:
	- TRELLIS.2 build can fail in many ways (CUDA wheel compilation, flash-attn install). If a build fails, check the build logs for which wheel failed. The staging duplicate is the reference — if it built there, the issue is in your merge.
	- Don't move anything into `@spaces.GPU` functions that should be at module level. Pipeline init goes at module level.

	---

	### Milestone 2b — Hunyuan3D-2 Alternative Generator

	Goal: Second generator option for organic shapes (characters, creatures).

	Approach: Same duplicate-and-vendor pattern as Milestone 2.

	1. Duplicate `tencent/Hunyuan3D-2` to staging Space
	2. Clone, copy the `hy3dgen/` package into open3dforge
	3. Merge requirements (most overlap with TRELLIS.2 — torch, diffusers)
	4. Add Hunyuan pipeline init at module level
	5. The model dropdown in the Generate tab routes between `image_to_3d_trellis()` and `image_to_3d_hunyuan()`
	6. Hunyuan needs 16GB VRAM. That fits alongside TRELLIS.2 within the ~70GB ZeroGPU budget, but start by loading only one model at a time via lazy module-level guards.

	Decision deferred to this milestone: Whether to keep both models in VRAM at module load (faster, more memory) or lazy-load per call (slower first call, less memory). Test both.
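The lazy-load option could be prototyped with a cache that keeps at most one generator resident. This is a sketch under assumptions: `ExclusiveModelCache` and the loader callables are hypothetical names, and a real version would likely also release CUDA memory (for example with `torch.cuda.empty_cache()`) before loading the next model.

```python
from typing import Any, Callable, Optional


class ExclusiveModelCache:
    """Keep at most one generator loaded and swap on demand."""

    def __init__(self, loaders: dict[str, Callable[[], Any]]):
        self._loaders = loaders
        self._name: Optional[str] = None
        self._model: Any = None

    def get(self, name: str) -> Any:
        if name not in self._loaders:
            raise KeyError(f"No loader registered for {name!r}")
        if self._name != name:
            # Drop the previous model first so both are never resident together.
            self._model = None
            self._model = self._loaders[name]()
            self._name = name
        return self._model
```

Repeated calls for the same model reuse the loaded instance, so only a model switch pays the load cost. Timing that switch is one way to compare against keeping both models in VRAM.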

	---

	### Milestone 3 — Stage 2A-2C: Mesh Cleanup

	Goal: Working CPU-side mesh repair, cleanup, and decimation with live preview.

	Dependencies to add:
	```
	trimesh[easy]
	pymeshfix
	pymeshlab
	fast-simplification
	```

	Files to create:
	- `src/stages/__init__.py`
	- `src/stages/stage2_repair.py` — pymeshfix wrapper
	- `src/stages/stage2_cleanup.py` — PyMeshLab filter chain
	- `src/stages/stage2_decimate.py` — both fast-simplification (preview) and PyMeshLab (final)

	UI work in app.py:
	- Wire the existing checkboxes/sliders in Tab 2 to call the real implementations
	- Live preview: slider `.change()` event fires `fast-simplification`, updates face count display
	- Run button: actually runs full pipeline on the current GLB

	Workspace state updates:
	- `state.repaired_glb`, `state.cleaned_glb`, `state.low_poly_glb` all get populated as steps complete

	Test criteria:
	- Generate a TRELLIS asset (50k faces)
	- Run repair → no errors
	- Run cleanup → no errors
	- Set decimation slider to 10k → live preview updates face count
	- Click "Run final" → produces low_poly.glb at 10k faces
	- Viewer auto-refreshes to show the cleaned mesh
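The decimation slider works in face counts, while fast-simplification's `simplify` is understood to take a reduction fraction (verify against the installed version). A minimal sketch of the conversion, using a hypothetical helper name:

```python
def target_reduction(current_faces: int, target_faces: int) -> float:
    """Fraction of faces to remove to go from current_faces to target_faces."""
    if current_faces <= 0:
        raise ValueError("current_faces must be positive")
    # Clamp so a target above the current count means "remove nothing".
    target_faces = max(0, min(target_faces, current_faces))
    return 1.0 - target_faces / current_faces
```

For the test case above, going from 50k to 10k faces means removing 80% of the faces.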

	---

	### Milestone 4 — Stage 2D-2E: Symmetry + UV Unwrap

	Goal: Symmetry enforcement + xatlas UV unwrapping with consistent texel density.

	Dependencies to add:
	```
	xatlas
	```

	Files to create:
	- `src/stages/stage2_symmetry.py` — PyMeshLab `apply_filter_mesh_symmetrize`
	- `src/stages/stage2_uv.py` — xatlas with `texels_per_unit` packing

	UI work:
	- Symmetry: off / bilateral-X / bilateral-Y / radial dropdown
	- UV: atlas resolution, texels_per_unit, padding

	Test criteria:
	- Run on a human-character GLB → symmetry produces clean mirror
	- UV unwrap produces `unwrapped.glb` with valid UV0 coordinates (inspect them via trimesh)
	- No overlapping UV islands (check with PyMeshLab's quality measure)

	---

	### Milestone 5 — Stage 2F: Normal Baking with nvdiffrast

	Goal: High-poly → low-poly normal map baking, GPU-accelerated, 2-5 second bakes.

	Dependencies to add:
	- `nvdiffrast` (already installed via TRELLIS.2 wheels — verify in the staging duplicate)

	Files to create:
	- `src/stages/stage2_bake_normal.py` — full nvdiffrast pipeline

	Algorithm (from the plan doc):
	```python
	import nvdiffrast.torch as dr
	import spaces

	@spaces.GPU(duration=60)
	def bake_normal_map(high_poly_path, low_poly_path, uv_coords, map_size=2048):
	    """Bake a tangent-space normal map from the high-poly onto the low-poly UVs."""
	    ctx = dr.RasterizeCudaContext()
	    # 1. UV → clip space
	    # 2. Rasterize low-poly UVs → per-pixel world position + tri ID
	    # 3. For each pixel: nearest-on-surface from high-poly
	    # 4. Sample high-poly normal at that point
	    # 5. Transform to tangent space (low-poly tangent frame)
	    # 6. Pack RGB [0,1], save PNG
	    # 7. Dilate edges past UV island boundaries
	    raise NotImplementedError("steps 1-7 are still to be written in Milestone 5")
	```

	Output: Two PNGs — `normal_gl.png` and `normal_dx.png` (DX has Y-flipped green channel).
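Deriving the DX map from the GL one only needs the green channel inverted. The sketch below works on a tightly packed 8-bit RGB buffer to stay dependency-free; `gl_to_dx_normal` is a hypothetical helper name, and the same flip applies to the green plane of an image array.

```python
def gl_to_dx_normal(rgb: bytes) -> bytes:
    """Convert a tightly packed 8-bit RGB OpenGL normal map to DirectX convention.

    Only the green (Y) channel is inverted; red and blue are untouched.
    """
    if len(rgb) % 3 != 0:
        raise ValueError("expected tightly packed RGB data")
    out = bytearray(rgb)
    # Every third byte starting at index 1 is a green sample.
    out[1::3] = bytes(255 - g for g in out[1::3])
    return bytes(out)
```

Applying the function twice returns the original buffer, which makes a cheap sanity check for the bake output.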

	Test criteria:
	- Run on TRELLIS character output (50k high-poly → 10k low-poly)
	- Bake completes in <10 seconds
	- Open the normal map in any image viewer — should be bluish/purple with surface detail visible
	- Both DX and GL versions are produced
	- Quota shows 5-10 seconds consumed

	---

	### Milestone 6 — Stage 2G-2I: Albedo, Material, AO Baking

	Goal: Three more nvdiffrast bakes producing the full PBR texture set.

	Files to create:
	- `src/stages/stage2_bake_albedo.py`
	- `src/stages/stage2_bake_material.py` — uses TRELLIS.2's stored metallic+roughness attrs
	- `src/stages/stage2_bake_ao.py` — ray-occlusion in hemisphere

	Key reuse: Same nvdiffrast rasterization pattern as Milestone 5 — refactor that code into a shared helper `_rasterize_uv_atlas()` in `src/stages/_baking_helpers.py`.

	Workspace state: All texture paths populated on the AssetState.

	Test criteria:
	- All five maps (normal, albedo, metallic, roughness, AO) viewable as PNG thumbnails in Tab 2
	- Total Stage 2 baking time < 30 seconds for a Balanced-quality asset

	---

	### Milestone 7 — Stage 2J: SDXL Inpainting for Hidden UVs

	Goal: Detect stretched/synthetic UV regions and inpaint them with SDXL.

	Dependencies to add:
	```
	diffusers
	accelerate
	safetensors
	```

	Files to create:
	- `src/stages/stage2_inpaint.py`
	- `detect_hidden_regions(albedo, uvs, faces)` — variance analysis
	- `inpaint_hidden_uvs(...)` — SDXL inpainting pipeline

	UI: Toggle off by default (costs ~30s quota). Prompt input. Strength slider.

	Test criteria:
	- Generate an asset with a clear "back side" (e.g., a humanoid character)
	- Without inpainting: back of character has visible texture stretching
	- With inpainting: back is plausibly filled in
	- Quota cost: ~30s per inpaint

	---

	### Milestone 8 — Stage 2K-2O: Finalization Steps

	Goal: Channel packing, LODs, collision, pivot, scale — all CPU-side, fast.

	Dependencies to add:
	```
	coacd==1.0.4
	```

	Files to create:
	- `src/stages/stage2_channel_pack.py` — numpy ORM / MetallicSmoothness packing
	- `src/stages/stage2_lods.py` — PyMeshLab quality-aware LOD0/1/2
	- `src/stages/stage2_collision.py` — CoACD with `trimesh.convex_hull` fallback
	- `src/stages/stage2_pivot.py` — bottom_center / geometric_center / custom
	- `src/stages/stage2_scale.py` — height presets, UE5 cm units
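The pivot and scale modules listed above reduce to simple vertex arithmetic. This is a sketch under assumptions: vertices are plain `(x, y, z)` tuples, the up axis defaults to Z (index 2) to match the UE5 Z-up default, source units are metres, and both function names are hypothetical. The real modules would operate on trimesh meshes.

```python
Vec3 = tuple[float, float, float]


def bottom_center_pivot(vertices: list[Vec3], up_axis: int = 2) -> list[Vec3]:
    """Translate so the bounding box is centred horizontally and rests on up=0."""
    mins = [min(v[i] for v in vertices) for i in range(3)]
    maxs = [max(v[i] for v in vertices) for i in range(3)]
    offset = [(mins[i] + maxs[i]) / 2 for i in range(3)]
    offset[up_axis] = mins[up_axis]  # lowest point lands on the ground plane
    return [tuple(v[i] - offset[i] for i in range(3)) for v in vertices]


def scale_to_height_cm(
    vertices: list[Vec3], target_height_m: float, up_axis: int = 2
) -> list[Vec3]:
    """Uniformly scale so the up-axis extent equals target_height_m, in cm."""
    heights = [v[up_axis] for v in vertices]
    extent = max(heights) - min(heights)
    if extent <= 0:
        raise ValueError("mesh has no extent along the up axis")
    factor = target_height_m * 100.0 / extent
    return [tuple(c * factor for c in v) for v in vertices]
```

Running the pivot step before scaling keeps the lowest point at zero, since a uniform scale about the origin leaves it in place.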

	UI: All controls already scaffolded in Milestone 1's Post-Process tab. Just wire to real implementations.

	Test criteria:
	- ORM packed as RGB with AO/Roughness/Metallic in correct channels
	- LOD0/LOD1/LOD2 all generated, all share same UV layout
	- Collision mesh has less than 1% of LOD0's triangle count
	- Pivot at bottom_center for a generated human character puts the feet on the ground plane at the world origin (Y=0 in glTF's Y-up space, Z=0 after the Z-up UE5 export)
	- Scale: human asset is 1.8m tall = 180cm in UE5 export
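The ORM criterion above comes down to channel order: R = ambient occlusion, G = roughness, B = metallic. The sketch below interleaves raw 8-bit buffers to stay dependency-free; `pack_orm` is a hypothetical helper, and the numpy version planned for `stage2_channel_pack.py` would stack the same three planes in the same order.

```python
def pack_orm(ao: bytes, roughness: bytes, metallic: bytes) -> bytes:
    """Interleave three 8-bit single-channel maps into one RGB buffer.

    Channel order follows the Unreal ORM convention:
    R = ambient occlusion, G = roughness, B = metallic.
    """
    if not (len(ao) == len(roughness) == len(metallic)):
        raise ValueError("all three maps must have the same pixel count")
    packed = bytearray(3 * len(ao))
    packed[0::3] = ao
    packed[1::3] = roughness
    packed[2::3] = metallic
    return bytes(packed)
```

A quick check is to pack three flat maps with distinct values and confirm each pixel reads back as (AO, roughness, metallic).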

	---

	### Milestone 9 — Stage 3: UniRig Auto-Rigging

	Goal: Generate a skeleton + skinning weights for character meshes.

	Approach: Same vendor-the-Space pattern as Milestone 2.

	1. Duplicate `MohamedRashad/UniRig` Space → staging
	2. Verify it builds in the staging duplicate
	3. Copy `UniRig/` package into our repo
	4. Merge requirements
	5. Wire to the Auto-Rig tab handler
	6. Output: rigged FBX (UE5 default) or GLB

	Test criteria:
	- Run on a humanoid character (after full Stage 2 processing)
	- Output FBX imports into UE5 as a Skeletal Mesh
	- Drag into Mixamo → animations auto-attach correctly

	---

	### Milestone 10 — Stage 4: UE5 Export

	Goal: Bundle everything into a UE5-ready zip with proper naming and packing.

	Dependencies to add:
	```
	pygltflib
	```

	Files to create:
	- `src/stages/stage4_export.py`
	- `export_ue5(asset_state, asset_name, asset_type) → zip_path`
	- Handles FBX conversion (trimesh has no FBX exporter, so this step needs a separate converter; tool still to be chosen)
	- Applies naming convention (`SM_`, `SK_`, `T_`)
	- Writes ORM-packed textures to correct paths
	- Zip + drop in `workspace/exports/`
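The naming convention could be centralized in a few helpers so every exported file goes through the same rules. This is a sketch with hypothetical function names. The `static`/`skeletal` type labels and the texture suffixes (such as `ORM`) are assumptions, not part of the plan.

```python
import re

MESH_PREFIX = {"static": "SM_", "skeletal": "SK_"}


def sanitize(asset_name: str) -> str:
    """Replace spaces with underscores and drop other non-alphanumeric characters."""
    cleaned = re.sub(r"[^A-Za-z0-9_]", "", asset_name.replace(" ", "_"))
    if not cleaned:
        raise ValueError("asset name is empty after sanitizing")
    return cleaned


def mesh_name(asset_name: str, asset_type: str) -> str:
    """Prefix the asset with SM_ (static mesh) or SK_ (skeletal mesh)."""
    return MESH_PREFIX[asset_type] + sanitize(asset_name)


def texture_name(asset_name: str, suffix: str) -> str:
    # Suffixes such as "ORM" are an assumed convention, not from the plan.
    return f"T_{sanitize(asset_name)}_{suffix}"
```

For example, an asset called "Rusty Barrel" exported as a static mesh would become `SM_Rusty_Barrel`, with its packed texture named `T_Rusty_Barrel_ORM`.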

	Engine presets (only UE5 fully implemented):
	- UE5: FBX, DX normals, ORM, Z-up, cm — the default
	- Unity HDRP: FBX, GL normals, MetallicSmoothness, Y-up, m — stub for later
	- Godot/Blender/Web: stubs

	Test criteria:
	- Export a character → unzip → 6-7 files following naming convention
	- Import to UE5: drag-drop the zip's contents → no warnings, materials auto-create from textures
	- Both Static Mesh and Skeletal Mesh paths work

	---

	### Milestone 11 — Presets System

	Goal: Save and load named parameter configurations across tabs.

	Files to update:
	- `src/workspace.py` — already has `save_preset/load_preset/delete_preset`, just needs the JSON schema fleshed out
	- `app.py` — wire the Presets tab's Save button to actually read all current tab values

	Schema:
	```json
	{
	  "name": "character_UE5_hero",
	  "stage1": { ... },
	  "stage2": { ... },
	  "stage3": { ... },
	  "stage4": { ... }
	}
	```

	Ship five default presets:
	- `character_UE5_hero.json`
	- `character_UE5_npc.json`
	- `prop_UE5_hero.json`
	- `prop_UE5_standard.json`
	- `environment_UE5_background.json`
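A round-trip for the schema above might look like the following. This is a standalone illustration, not the existing code in `src/workspace.py`: the signatures are assumptions, and the per-stage contents are left as opaque dicts because the plan has not defined them yet.

```python
import json
from pathlib import Path

STAGE_KEYS = ("stage1", "stage2", "stage3", "stage4")


def save_preset(presets_dir: Path, name: str, stages: dict) -> Path:
    """Write a preset as <name>.json with one dict per stage."""
    unknown = set(stages) - set(STAGE_KEYS)
    if unknown:
        raise ValueError(f"unexpected stage keys: {sorted(unknown)}")
    # Missing stages are stored as empty dicts so the file always has all keys.
    payload = {"name": name, **{key: stages.get(key, {}) for key in STAGE_KEYS}}
    presets_dir.mkdir(parents=True, exist_ok=True)
    path = presets_dir / f"{name}.json"
    path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return path


def load_preset(path: Path) -> dict:
    """Read a preset and make sure every top-level key is present."""
    data = json.loads(path.read_text(encoding="utf-8"))
    missing = [key for key in ("name", *STAGE_KEYS) if key not in data]
    if missing:
        raise ValueError(f"preset {path.name} is missing keys: {missing}")
    return data
```

Validating on load keeps a hand-edited preset file from silently dropping a whole stage.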

	---

	### Milestone 12 — Polish & Production Hardening

	- Error handling on every stage (don't crash the app, show clear error in UI)
	- Progress bars during long ops (`gr.Progress(track_tqdm=True)`)
	- Quota cost shown before each GPU operation (warning if it would exceed remaining)
	- Game-ready checklist results shown before Export is enabled
	- Asset history sidebar (last 5 generated assets with thumbnails)
	- Session cleanup of `workspace/current/` on new generation

	---

	## Working with Claude Code

	When you continue in Claude Code, you'll have the full repo locally. Key things to remember:

	### Project conventions

	1. Each stage = its own module in `src/stages/`. Don't dump pipeline logic into `app.py`.
	2. Workspace state is the single source of truth. Every stage reads from and writes to `workspace.get_state()`.
	3. GPU functions live where they're needed, not all in app.py. The `@spaces.GPU` decorator works in any file as long as `spaces` is imported.
	4. No `if __name__ == "__main__":` on `demo.launch()`. HF Spaces imports app.py at module level.
	5. Gradio 6 specifics:
	- `theme` and `css` go in `launch()`, not `Blocks()`
	- `show_api` is gone — use `footer_links=["gradio", "settings"]`
	- `api_visibility` replaces `api_name=False` on events
	6. The 3 global components (`viewer`, `summary`, `status_bar`) get refreshed via `_global_refresh()` chained off every pipeline action button. Don't forget to add new buttons to that list.

	### Useful commands

	```bash
	# Pull the latest Space state
	cd open3dforge
	git pull

	# Make changes, syntax-check before push
	python -c "import ast; ast.parse(open('app.py').read())"

	# Push to deploy
	git add -A
	git commit -m "Milestone N: <stage>"
	git push

	# Watch build/runtime logs at:
	# https://huggingface.co/spaces/Reverb/open3dforge?logs=container
	```

	### Common HF Space build failures (we've hit these)

	| Symptom | Cause | Fix |
	|---|---|---|
	| `Cannot install gradio<X and gradio==Y` | `sdk_version` in README conflicts with requirements.txt pin | Remove version pin in requirements.txt or update README's sdk_version |
	| `Blocks.launch() got an unexpected keyword argument 'X'` | Gradio 6 removed parameter | Check Gradio 6 migration guide for replacement |
	| `When localhost is not accessible` | `demo.launch()` wrapped in `if __name__ == "__main__"` | Move to module level |
	| CUDA wheel compile failures | Mismatched torch/CUDA versions | Match TRELLIS.2 staging duplicate's exact pins |
	| OOM during model load | Multiple large models loaded at module level | Lazy-load with module-level guards inside `@spaces.GPU` |

	### Useful resources

	- Gradio 6 migration guide: https://www.gradio.app/main/guides/gradio-6-migration-guide
	- ZeroGPU docs: https://huggingface.co/docs/hub/spaces-zerogpu
	- TRELLIS.2 reference Space: https://huggingface.co/spaces/microsoft/TRELLIS.2
	- Hunyuan3D-2 reference Space: https://huggingface.co/spaces/tencent/Hunyuan3D-2
	- UniRig reference Space: https://huggingface.co/spaces/MohamedRashad/UniRig

	---

	## Constraints to Remember

	- Daily quota: 1500s (25 min) of H200 time per day. Plan asset iteration accordingly.
	- VRAM budget: ~70GB per workload. TRELLIS.2 alone is 24GB; UniRig is 8GB; SDXL inpaint is 8GB. Don't load all at once.
	- Function timeout: Default `@spaces.GPU` duration is 60s. Override with `duration=N` for longer ops (Stage 1 generation, AO bake high quality).
	- Build time: With TRELLIS.2 vendored + CUDA wheels, expect 10-15 min builds. Cache hits will be ~3 min.
	- Repo size: Will grow large with vendored models + HDRIs. Git LFS may be needed for the autotune_cache.json (~1MB) and wheel files (~100MB+). HF Spaces handles this via Xet storage automatically.
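The daily quota constraint can be enforced locally with a small tracker that checks an estimate before each GPU call. This is a hypothetical sketch, not the existing `src/quota.py`: the class name, file format, and the assumption that the quota resets on a calendar-date change are all illustrative, and the actual ZeroGPU reset schedule may differ.

```python
import json
from datetime import date
from pathlib import Path

DAILY_QUOTA_SECONDS = 1500  # 25 min of ZeroGPU time per day


class QuotaTracker:
    """Persist GPU seconds used today; the count resets when the date changes."""

    def __init__(self, path: Path, today=date.today):
        self._path = path
        self._today = today  # injectable so tests can simulate a new day

    def _load(self) -> float:
        if not self._path.exists():
            return 0.0
        data = json.loads(self._path.read_text(encoding="utf-8"))
        # A record from a previous day counts as zero usage today.
        return data["used"] if data.get("day") == self._today().isoformat() else 0.0

    def record(self, seconds: float) -> None:
        used = self._load() + seconds
        payload = {"day": self._today().isoformat(), "used": used}
        self._path.write_text(json.dumps(payload), encoding="utf-8")

    def remaining(self) -> float:
        return max(0.0, DAILY_QUOTA_SECONDS - self._load())

    def can_afford(self, estimated_seconds: float) -> bool:
        return estimated_seconds <= self.remaining()
```

Calling `can_afford()` with a preset's expected time before a generation gives the "warn before exceeding quota" behaviour without touching the GPU.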

	---

	Plan version 3.0 — May 15, 2026
	Last action completed: Milestone 1 deployed, ZeroGPU smoke test passing
	Next action: Milestone 2 — duplicate microsoft/TRELLIS.2 staging Space, merge into open3dforge