twarner committed on
Commit 607c84e · verified · 1 Parent(s): 1e7a96b

Upload folder using huggingface_hub

README.md ADDED
@@ -0,0 +1,145 @@
+ ---
+ license: mit
+ library_name: diffusers
+ pipeline_tag: text-to-image
+ tags:
+ - gcode
+ - cnc
+ - plotter
+ - polargraph
+ - stable-diffusion
+ - text-to-gcode
+ - diffusion
+ base_model: runwayml/stable-diffusion-v1-5
+ datasets:
+ - twarner/dcode-imagenet-sketch
+ ---
+
+ # dcode: Text-to-Gcode Diffusion Model
+
+ An end-to-end diffusion model that converts **text prompts directly into G-code** for CNC machines, plotters, and polargraph drawing robots.
+
+ ## Overview
+
+ dcode is a fine-tuned Stable Diffusion model with a custom G-code decoder head. It takes a text description (e.g., "a sketch of a horse") and outputs machine-executable G-code.
+
+ | Component | Description |
+ |-----------|-------------|
+ | Base Model | [Stable Diffusion v1.5](https://huggingface.co/runwayml/stable-diffusion-v1-5) |
+ | Decoder | 200M param transformer (12 layers, 1024 hidden, 16 heads) |
+ | Tokenizer | Custom BPE tokenizer for G-code |
+ | Training Data | [dcode-imagenet-sketch](https://huggingface.co/datasets/twarner/dcode-imagenet-sketch) |
+
+ ## Architecture
+
+ ```
+ Text Prompt
+ ↓
+ [CLIP Text Encoder] ← frozen
+ ↓
+ [UNet Diffusion] ← frozen
+ ↓
+ Latent (4×64×64)
+ ↓
+ [CNN Projector] ← trained
+ ↓
+ [Transformer Decoder] ← trained
+ ↓
+ G-code Tokens
+ ↓
+ G-code Text
+ ```
+
+ ## Usage
+
+ ### With Diffusers
+
+ ```python
+ import torch
+ from diffusers import StableDiffusionPipeline
+ from huggingface_hub import hf_hub_download
+ from transformers import PreTrainedTokenizerFast
+
+ # Load components
+ pipe = StableDiffusionPipeline.from_pretrained(
+     "runwayml/stable-diffusion-v1-5",
+     torch_dtype=torch.float16
+ ).to("cuda")
+
+ # Download decoder weights
+ weights = hf_hub_download("twarner/dcode-sd-gcode-v3", "pytorch_model.bin")
+ tokenizer_path = hf_hub_download("twarner/dcode-sd-gcode-v3", "gcode_tokenizer/tokenizer.json")
+
+ # Load custom gcode tokenizer
+ gcode_tokenizer = PreTrainedTokenizerFast(tokenizer_file=tokenizer_path)
+
+ # Generate latent from text
+ with torch.no_grad():
+     latent = pipe("a sketch of a horse", output_type="latent").images
+
+ # ... decode with GcodeDecoderV3 (see repo for full inference code)
+ ```
+
+ ### Interactive Demo
+
+ Try the model live: **[huggingface.co/spaces/twarner/dcode](https://huggingface.co/spaces/twarner/dcode)**
+
+ ## Training
+
+ - **Dataset**: 50,000 ImageNet-Sketch images → 200,000 G-code files
+ - **Hardware**: 8× NVIDIA H100 80GB
+ - **Epochs**: 50
+ - **Batch Size**: 256 effective (32 × 8 GPUs)
+ - **Learning Rate**: 1e-4 with cosine schedule
+ - **Regularization**: Label smoothing (0.1), weight decay (0.05)
+
+ ## G-code Output
+
+ The model generates G-code compatible with:
+ - Polargraph/drawbot machines
+ - Pen plotters
+ - Any G-code compatible CNC
+
+ Example output:
+ ```gcode
+ G21 ; mm
+ G90 ; absolute
+ M280 P0 S90 ; pen up
+ G28 ; home
+
+ G0 X-200.00 Y100.00 F1000
+ M280 P0 S40 ; pen down
+ G1 X-180.00 Y120.00 F500
+ G1 X-160.00 Y115.00 F500
+ ...
+ ```
+
+ ## Machine Specs
+
+ Default work area (configurable):
+ - Width: 841mm
+ - Height: 1189mm (A0 paper)
+ - Pen servo: 40° down, 90° up
+
+ ## Project
+
+ Full project documentation, hardware build guide, and source code:
+
+ **🔗 [teddywarner.org/Projects/Polargraph/#dcode](https://teddywarner.org/Projects/Polargraph/#dcode)**
+
+ **GitHub**: [github.com/Twarner491/dcode](https://github.com/Twarner491/dcode)
+
+ ## Citation
+
+ ```bibtex
+ @misc{dcode2024,
+   author = {Teddy Warner},
+   title = {dcode: Text-to-Gcode Diffusion Model},
+   year = {2024},
+   url = {https://teddywarner.org/Projects/Polargraph/#dcode}
+ }
+ ```
+
+ ## License
+
+ MIT License
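
The README's example output and default work area suggest a simple sanity check before sending generated G-code to a machine. The following is a minimal sketch, not part of the dcode repository: `parse_moves` and `out_of_bounds` are hypothetical helper names, and the code assumes the origin sits at the centre of the 841 mm × 1189 mm work area (the negative X in the example hints at this, but the README does not state it).

```python
import re

# Default work area from the README. Assumption: origin at the centre,
# so X spans +/- WIDTH_MM / 2 and Y spans +/- HEIGHT_MM / 2.
WIDTH_MM = 841.0
HEIGHT_MM = 1189.0

_MOVE = re.compile(r"^\s*G0*([01])\b", re.IGNORECASE)          # G0/G1 (also G00/G01)
_WORD = re.compile(r"([XY])\s*(-?\d+(?:\.\d+)?)", re.IGNORECASE)


def parse_moves(gcode):
    """Return (command, x, y) tuples for G0/G1 lines; text after ';' is ignored."""
    moves = []
    x = y = None
    for raw in gcode.splitlines():
        line = raw.split(";", 1)[0].strip()
        match = _MOVE.match(line)
        if not match:
            continue  # skip G21, G90, M280, blank lines, "..." and so on
        words = {axis.upper(): float(value) for axis, value in _WORD.findall(line)}
        x = words.get("X", x)  # absolute mode: a missing axis keeps its last value
        y = words.get("Y", y)
        moves.append(("G" + match.group(1), x, y))
    return moves


def out_of_bounds(moves, width=WIDTH_MM, height=HEIGHT_MM):
    """Return the moves whose target lies outside the centred work area."""
    return [
        m for m in moves
        if m[1] is not None and m[2] is not None
        and (abs(m[1]) > width / 2 or abs(m[2]) > height / 2)
    ]


if __name__ == "__main__":
    sample = "G0 X-200.00 Y100.00 F1000\nG1 X-180.00 Y120.00 F500\n"
    print(out_of_bounds(parse_moves(sample)))
```

The check only covers absolute-mode (`G90`) moves in millimetres, which is what the example output uses; a machine with a corner origin would need different bounds.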
config.json CHANGED
@@ -7,7 +7,7 @@
  "hidden_size": 1024,
  "num_layers": 12,
  "num_heads": 16,
- "vocab_size": 1865,
+ "vocab_size": 1714,
  "max_seq_len": 2048,
  "ffn_mult": 4
  }
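
As a rough cross-check of the "200M param" figure in the README against the fields visible in this hunk, the sketch below estimates the decoder's weight count. It assumes a conventional transformer decoder layer (self-attention, cross-attention, and a feed-forward block) and ignores biases, layer norms, and embeddings; the actual `GcodeDecoderV3` layout is not shown in this commit, so the layer structure and the `estimate_decoder_params` helper are assumptions.

```python
# Fields copied from the config.json hunk above (other keys are not visible here).
config = {
    "hidden_size": 1024,
    "num_layers": 12,
    "num_heads": 16,
    "vocab_size": 1714,
    "max_seq_len": 2048,
    "ffn_mult": 4,
}


def estimate_decoder_params(cfg):
    """Weight-matrix count for an assumed self-attn + cross-attn + FFN layer stack."""
    h = cfg["hidden_size"]
    attention = 4 * h * h                      # Q, K, V and output projections
    feed_forward = 2 * h * (cfg["ffn_mult"] * h)  # up- and down-projection
    per_layer = 2 * attention + feed_forward   # assumed: self- plus cross-attention
    return cfg["num_layers"] * per_layer


head_dim = config["hidden_size"] // config["num_heads"]
total = estimate_decoder_params(config)
print(f"head_dim={head_dim}, approx. decoder weights={total:,}")
```

Under these assumptions the estimate lands close to 200M, which is consistent with the README's description; dropping the cross-attention term would give a noticeably smaller number.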
gcode_tokenizer/tokenizer.json CHANGED
The diff for this file is too large to render. See raw diff
 
gcode_tokenizer/tokenizer_config.json CHANGED
@@ -35,10 +35,10 @@
  "4": {
  "content": "<newline>",
  "lstrip": false,
- "normalized": true,
+ "normalized": false,
  "rstrip": false,
  "single_word": false,
- "special": false
+ "special": true
  }
  },
  "bos_token": "<s>",
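
Marking `<newline>` as a special token has one practical consequence worth keeping in mind: Hugging Face tokenizers omit special tokens when decoding with `skip_special_tokens=True`, so line breaks would need to be restored some other way. The sketch below illustrates one approach in plain Python, working on token strings rather than the real tokenizer. `tokens_to_gcode` is a hypothetical helper; only `<s>` and `<newline>` appear in this diff, so the other control tokens listed and the assumption that token strings carry their own spacing are guesses.

```python
NEWLINE_TOKEN = "<newline>"
# Only "<s>" is confirmed by the diff above; the rest are assumed placeholders.
CONTROL_TOKENS = {"<s>", "</s>", "<pad>", "<unk>"}


def tokens_to_gcode(tokens):
    """Join token strings into G-code text, mapping <newline> to a line break."""
    parts = []
    for token in tokens:
        if token == NEWLINE_TOKEN:
            parts.append("\n")      # restore the line structure explicitly
        elif token in CONTROL_TOKENS:
            continue                # drop sequence-control tokens
        else:
            parts.append(token)     # assumed: tokens already include any spacing
    return "".join(parts)


if __name__ == "__main__":
    print(tokens_to_gcode(["<s>", "G21", "<newline>", "G90", "</s>"]))
```

With the real tokenizer, the equivalent would be to decode with special tokens kept and then post-process, but the exact decoding path used by dcode is in the repository, not in this commit.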
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:4632c26f22caad3d31d8cd7498e2dea00a6eb3bfb7973a276b0040ef7258cb81
- size 2807496475
+ oid sha256:530516a85ff658469b9f36c87261c95d57e2e9de34d13a63b9a8d298f6295f12
+ size 2806259483
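
The change in checkpoint size can be checked against the `vocab_size` change in `config.json` with a little arithmetic. The sketch below uses only numbers from this commit; the interpretation (two float32 matrices of shape vocab × hidden, for example an input embedding and an output projection) is an inference from the arithmetic, not something the commit states.

```python
# Sizes from the Git LFS pointer diff and vocab sizes from the config.json diff.
old_size, new_size = 2_807_496_475, 2_806_259_483
old_vocab, new_vocab = 1865, 1714
hidden_size = 1024
bytes_per_float32 = 4

delta_bytes = old_size - new_size            # bytes removed from the checkpoint
rows_removed = old_vocab - new_vocab         # vocabulary entries removed
bytes_per_row = delta_bytes / rows_removed   # bytes saved per removed token
matrices = bytes_per_row / (hidden_size * bytes_per_float32)

print(delta_bytes, rows_removed, bytes_per_row, matrices)
```

The shrinkage divides evenly into two hidden-size float32 rows per removed token, so the smaller file is at least consistent with the vocabulary reduction being the only shape change in the checkpoint.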