---
title: ThreeGen
emoji: 🧊
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: "5.29.0"
app_file: app.py
pinned: false
tags:
  - build-small
  - off-brand
  - tiny-titan
  - best-demo
  - track:wood
  - sponsor:openai
  - achievement:offbrand
  - achievement:fieldnotes
---

# 🧊 ThreeGen

**Describe a 3D scene, get clean Three.js code.**

Type a prompt. A 3B model emits a JSON scene graph. A deterministic compiler turns it into a live Three.js preview with copy-paste-ready code.

## What it does

ThreeGen converts natural-language prompts into interactive 3D scenes. You describe an object (a star badge, a stack of neon cubes, a glass torus knot) and get a self-contained `.html` file you can open in any browser. The preview in the app is the exact same code that ends up in the Code tab: no divergence between "preview" and "output."

## The problem and the approach

Small models can't write Three.js directly: they hallucinate method names, invent APIs, and produce broken code at high rates. ThreeGen sidesteps this by splitting the job in two:

1. **The model writes structured JSON**: a narrow, validated scene graph (objects, materials, layout, animations). This is a small, learnable format that even a 3B model handles well.
2. **A deterministic compiler writes Three.js**: `compiler.py` translates the scene graph into guaranteed-valid code, pins Three.js to r160, and handles lighting, bloom, OrbitControls, and async font loading.

The model never touches the Three.js API. The preview never breaks on a bad token.

```
prompt → Qwen2.5-Coder-3B → JSON scene graph → validate/repair → compiler → Three.js HTML
```
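
The validate/repair and compile stages can be illustrated with a minimal sketch. It is not ThreeGen's actual code: the field names (`shape`, `color`, `roughness`, `metalness`), the fallback values, and the helpers `repair` and `compile_node` are all assumptions made for illustration. It only shows why a bad model token cannot reach the generated JavaScript.

```python
import json
import re

# Hypothetical mapping from schema shape names to fixed Three.js constructors.
GEOMETRY = {
    "box": "new THREE.BoxGeometry(1, 1, 1)",
    "sphere": "new THREE.SphereGeometry(0.5, 32, 16)",
    "torus": "new THREE.TorusGeometry(0.5, 0.2, 16, 64)",
}
HEX_COLOR = re.compile(r"#[0-9a-fA-F]{6}")


def repair(node: dict) -> dict:
    """Clamp out-of-range values and replace missing or invalid fields."""

    def unit(key: str, default: float) -> float:
        # Coerce to float and clamp into [0, 1]; fall back on bad input.
        try:
            value = float(node.get(key, default))
        except (TypeError, ValueError):
            value = default
        return min(1.0, max(0.0, value))

    shape = node.get("shape")
    color = node.get("color")
    return {
        "shape": shape if shape in GEOMETRY else "box",
        "color": color if isinstance(color, str) and HEX_COLOR.fullmatch(color) else "#ffffff",
        "roughness": unit("roughness", 0.5),
        "metalness": unit("metalness", 0.0),
    }


def compile_node(node: dict, name: str = "mesh0") -> str:
    """Emit JavaScript for one mesh using only whitelisted fragments."""
    n = repair(node)
    material = (
        f"new THREE.MeshStandardMaterial({{ color: '{n['color']}', "
        f"roughness: {n['roughness']}, metalness: {n['metalness']} }})"
    )
    return (
        f"const {name} = new THREE.Mesh({GEOMETRY[n['shape']]}, {material});\n"
        f"scene.add({name});"
    )


# A deliberately malformed "model output" still compiles to safe code.
raw = '{"shape": "dodecahedron??", "color": "red; alert(1)", "roughness": 7}'
print(compile_node(json.loads(raw)))
```

Because every string that reaches the output comes from a fixed table or a validated value, the worst case for a bad generation is a default white box rather than a broken preview.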

## Features

- **Primitives**: box, sphere, cylinder, cone, torus, torus knot, icosahedron, plane, ring, capsule, and more
- **Extruded shapes**: star, heart, arrow, hexagon, shield, custom SVG paths
- **3D text**: FontLoader + TextGeometry, composited into extruded badge templates
- **Deterministic templates**: star badge and shield badge; the model picks shape and colors, Python handles layout
- **Scene-graph groups**: hierarchical `THREE.Group()` with row / column / stack / grid layout, nested groups supported
- **Materials**: glass, metal, neon glow, chrome, matte, wireframe, with roughness/metalness control
- **Live preview**: iframe with OrbitControls (drag to orbit, scroll to zoom)
- **Copy-paste code**: the Code tab is the complete HTML file; save it, double-click, done
- **Lighting + glow controls**: ambient, directional, environment map, UnrealBloomPass strength slider
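
As a rough illustration of the group layouts listed above, the sketch below computes child offsets for the four modes. The axis conventions are assumptions, not taken from the project: row runs along X, column runs downward along Y, stack runs upward along Y, and grid fills the X/Z plane. The function name `layout_positions` and its parameters are likewise hypothetical.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def layout_positions(count: int, mode: str = "row",
                     spacing: float = 1.5, columns: int = 3) -> List[Vec3]:
    """Return offsets for `count` children, centered on the group origin."""
    if mode == "row":
        raw = [(i * spacing, 0.0, 0.0) for i in range(count)]
    elif mode == "column":
        raw = [(0.0, -i * spacing, 0.0) for i in range(count)]
    elif mode == "stack":
        raw = [(0.0, i * spacing, 0.0) for i in range(count)]
    elif mode == "grid":
        raw = [((i % columns) * spacing, 0.0, (i // columns) * spacing)
               for i in range(count)]
    else:
        raise ValueError(f"unknown layout mode: {mode!r}")
    if not raw:
        return []
    # Shift every offset by the mean so the group stays centered.
    cx = sum(p[0] for p in raw) / len(raw)
    cy = sum(p[1] for p in raw) / len(raw)
    cz = sum(p[2] for p in raw) / len(raw)
    return [(x - cx, y - cy, z - cz) for x, y, z in raw]


print(layout_positions(3, "row", spacing=2.0))
```

Nested groups fall out of the same idea: each group computes offsets for its direct children only, and the parent transform of a `THREE.Group()` carries them into place.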

## Built small

Runs on **Qwen2.5-Coder-3B-Instruct**, a 3-billion-parameter model, on a single ZeroGPU T4. The model's job is intentionally narrow: emit a small JSON schema, not write a graphics library. Reliability comes from the compiler, not from model size.

## Demo

Demo video: [https://youtu.be/1NUB0DOGgcA](https://youtu.be/1NUB0DOGgcA)

Social post: [https://x.com/i/status/2066608249064030439](https://x.com/i/status/2066608249064030439)

## How it works (technical)

**Pipeline:**
1. `llm.py`: loads the model (or returns a mock for local dev), assembles a system prompt with the scene schema and ~30 few-shot examples, runs greedy generation, and extracts the first valid JSON block from the output
2. `scene.py`: Pydantic v2 models for every node type (`Obj`, `ExtrudeNode`, `Text3DNode`, `GroupNode`, `Animation`); validators clamp out-of-range values and repair missing fields; `_parse_scene_item` dispatches on `"type"`
3. `compiler.py`: walks the scene graph with a `_Ctx` context object, emits Three.js JS for each node type, and collects async font jobs for `Text3DNode`; wraps everything in an async IIFE inside a single self-contained HTML file with an importmap pinned to Three.js r160
4. `app.py`: Gradio UI with the ZeroGPU `@spaces.GPU` decorator; `generate()` builds the scene and compiles HTML; `rerender()` recompiles from cached scene JSON with new style/glow values without re-running the model
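
One way to implement the "first valid JSON block" step is sketched below using only the standard library. The helper name `first_json_object` is hypothetical, and the real `llm.py` may work differently. The sketch tries `json.JSONDecoder.raw_decode` at each `{` until one parses.

```python
import json
from typing import Optional


def first_json_object(text: str) -> Optional[dict]:
    """Return the first substring of `text` that parses as a JSON object."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one value starting at `start` and ignores
            # whatever follows it, so trailing chatter is harmless.
            value, _end = decoder.raw_decode(text, start)
            return value
        except json.JSONDecodeError:
            # Not valid JSON from this brace; try the next one.
            start = text.find("{", start + 1)
    return None


reply = 'Sure! {not json} Here is the scene:\n{"objects": [{"type": "obj"}]}\nEnjoy {"b": 2}'
print(first_json_object(reply))
```

Note that if an outer object is malformed, this approach can return a valid object nested inside it, which is one reason a schema validation pass after extraction still matters.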

**Three.js output** uses a version-pinned importmap (`r160`) so the generated code remains valid regardless of CDN updates. All add-ons (OrbitControls, EffectComposer, UnrealBloomPass, RoundedBoxGeometry, FontLoader) are loaded from the same pinned release.
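
The pinned importmap and async IIFE wrapper might look roughly like the sketch below. The CDN (jsDelivr), the page template, the camera and light settings, and the helper name `wrap_html` are assumptions for illustration; the README states only that the importmap is pinned to r160, which corresponds to npm version `0.160.0`.

```python
import json

THREE_VERSION = "0.160.0"  # npm version corresponding to release r160
# Assumption: jsDelivr as the CDN; the README does not name one.
CDN_BASE = f"https://cdn.jsdelivr.net/npm/three@{THREE_VERSION}"

PAGE = """<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<style>body { margin: 0; overflow: hidden; }</style>
<script type="importmap">
__IMPORTMAP__
</script>
</head>
<body>
<script type="module">
import * as THREE from 'three';
import { OrbitControls } from 'three/addons/controls/OrbitControls.js';

(async () => {
  const scene = new THREE.Scene();
  const camera = new THREE.PerspectiveCamera(50, innerWidth / innerHeight, 0.1, 100);
  camera.position.set(0, 1, 5);
  const renderer = new THREE.WebGLRenderer({ antialias: true });
  renderer.setSize(innerWidth, innerHeight);
  document.body.appendChild(renderer.domElement);
  const controls = new OrbitControls(camera, renderer.domElement);
  scene.add(new THREE.AmbientLight(0xffffff, 1.0));

__BODY__

  renderer.setAnimationLoop(() => {
    controls.update();
    renderer.render(scene, camera);
  });
})();
</script>
</body>
</html>
"""


def wrap_html(body_js: str) -> str:
    """Embed compiled scene code in a standalone page with a pinned importmap."""
    importmap = json.dumps(
        {"imports": {
            "three": f"{CDN_BASE}/build/three.module.js",
            "three/addons/": f"{CDN_BASE}/examples/jsm/",
        }},
        indent=2,
    )
    return PAGE.replace("__IMPORTMAP__", importmap).replace("__BODY__", body_js)


html = wrap_html("  scene.add(new THREE.Mesh(new THREE.BoxGeometry(1, 1, 1)));")
print(html[:200])
```

Mapping both `three` and `three/addons/` to the same versioned base is what keeps the core library and its add-ons on one release.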

## Future work

- **Bloc3D mesh extension**: extend the pipeline to emit geometry for actual mesh generation (the project's longer-term goal)
- **Fine-tuning**: train a 0.5B model on synthetic prompt→JSON pairs to reach the same reliability at a lower parameter count
- **More templates**: coin, ribbon, seal; parameterized layout macros

## Files

| file | role |
|---|---|
| `scene.py` | scene DSL schema, Pydantic validation, node types |
| `llm.py` | model load, few-shot prompt, JSON extraction, mock fallback |
| `compiler.py` | scene graph → standalone Three.js HTML |
| `app.py` | Gradio UI, ZeroGPU wiring, style/glow controls |

## GitHub

[https://github.com/bolajiev/ThreeGen](https://github.com/bolajiev/ThreeGen)

## Local dev

```bash
pip install -r requirements.txt

# No model download, no GPU; instant iteration:
MOCK=1 python app.py

# Real model:
python app.py
```
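
For readers wondering what a `MOCK=1` switch could look like inside the Python code, here is a minimal sketch. The canned scene, its field names, and the function `generate_scene_json` are hypothetical; the actual mock in `llm.py` may be structured differently.

```python
import json
import os
from typing import Mapping, Optional

# Hypothetical canned scene; the field names are illustrative only.
MOCK_SCENE = {"objects": [{"type": "obj", "shape": "box", "color": "#44aaff"}]}


def generate_scene_json(prompt: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return scene JSON for `prompt`, short-circuiting when MOCK=1 is set."""
    env = os.environ if env is None else env
    if env.get("MOCK") == "1":
        # No model download and no GPU: always return the same scene.
        return json.dumps(MOCK_SCENE)
    raise NotImplementedError("model loading and generation are omitted from this sketch")


print(generate_scene_json("a blue cube", env={"MOCK": "1"}))
```

Passing the environment as a parameter keeps the switch easy to exercise in tests without mutating `os.environ`.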