---
license: mit
library_name: diffusers
pipeline_tag: text-to-image
tags:
- gcode
- cnc
- plotter
- polargraph
- stable-diffusion
- text-to-gcode
- diffusion
base_model: runwayml/stable-diffusion-v1-5
datasets:
- twarner/dcode-imagenet-sketch
---

# dcode: Text-to-Gcode Diffusion Model

An end-to-end diffusion model that converts **text prompts directly into G-code** for CNC machines, plotters, and polargraph drawing robots.

## Overview

dcode is a fine-tuned Stable Diffusion model with a custom G-code decoder head. It takes a text description (e.g., "a sketch of a horse") and outputs machine-executable G-code.

| Component | Description |
|-----------|-------------|
| Base Model | [Stable Diffusion v1.5](https://huggingface.co/runwayml/stable-diffusion-v1-5) |
| Decoder | 200M param transformer (12 layers, 1024 hidden, 16 heads) |
| Tokenizer | Custom BPE tokenizer for G-code |
| Training Data | [dcode-imagenet-sketch](https://huggingface.co/datasets/twarner/dcode-imagenet-sketch) |

## Architecture

```
Text Prompt
    ↓
[CLIP Text Encoder] ← frozen
    ↓
[UNet Diffusion] ← frozen
    ↓
Latent (4×64×64)
    ↓
[CNN Projector] ← trained
    ↓
[Transformer Decoder] ← trained
    ↓
G-code Tokens
    ↓
G-code Text
```

## Usage

### With Diffusers

```python
import torch
from diffusers import StableDiffusionPipeline
from huggingface_hub import hf_hub_download
from transformers import PreTrainedTokenizerFast

# Load components
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16
).to("cuda")

# Download decoder weights
weights = hf_hub_download("twarner/dcode-sd-gcode-v3", "pytorch_model.bin")
tokenizer_path = hf_hub_download("twarner/dcode-sd-gcode-v3", "gcode_tokenizer/tokenizer.json")

# Load custom gcode tokenizer
gcode_tokenizer = PreTrainedTokenizerFast(tokenizer_file=tokenizer_path)

# Generate latent from text
with torch.no_grad():
    latent = pipe("a sketch of a horse", output_type="latent").images

# ... decode with GcodeDecoderV3 (see repo for full inference code)
```

### Interactive Demo

Try the model live: **[huggingface.co/spaces/twarner/dcode](https://huggingface.co/spaces/twarner/dcode)**

## Training

- **Dataset**: 50,000 ImageNet-Sketch images → 200,000 G-code files
- **Hardware**: 8× NVIDIA H100 80GB
- **Epochs**: 50
- **Batch Size**: 256 effective (32 × 8 GPUs)
- **Learning Rate**: 1e-4 with cosine schedule
- **Regularization**: Label smoothing (0.1), weight decay (0.05)

## G-code Output

The model generates G-code compatible with:

- Polargraph/drawbot machines
- Pen plotters
- Any G-code compatible CNC

Example output:

```gcode
G21 ; mm
G90 ; absolute
M280 P0 S90 ; pen up
G28 ; home
G0 X-200.00 Y100.00 F1000
M280 P0 S40 ; pen down
G1 X-180.00 Y120.00 F500
G1 X-160.00 Y115.00 F500
...
```

## Machine Specs

Default work area (configurable):

- Width: 841 mm
- Height: 1189 mm (A0 paper)
- Pen servo: 40° down, 90° up

## Project

Full project documentation, hardware build guide, and source code:

**🔗 [teddywarner.org/Projects/Polargraph/#dcode](https://teddywarner.org/Projects/Polargraph/#dcode)**

**GitHub**: [github.com/Twarner491/dcode](https://github.com/Twarner491/dcode)

## Citation

```bibtex
@misc{dcode2024,
  author = {Teddy Warner},
  title = {dcode: Text-to-Gcode Diffusion Model},
  year = {2026},
  url = {https://teddywarner.org/Projects/Polargraph/#dcode}
}
```

## License

MIT License
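
## Appendix: Checking Generated G-code

Because the model emits G-code as free-form text, it may be worth checking a program against the work area listed under Machine Specs before sending it to a machine. The following is a minimal sketch of one way to do that, not part of the dcode repository. The helper names (`gcode_moves`, `out_of_bounds`) are hypothetical, and the code assumes that the coordinate origin sits at the center of the work area, which the negative X values in the example output suggest but this document does not confirm.

```python
import re

# Default work area from the Machine Specs section (A0 paper, millimetres).
WORK_WIDTH_MM = 841.0
WORK_HEIGHT_MM = 1189.0

# Matches an axis word such as "X-200.00" or "Y115".
_AXIS_WORD = re.compile(r"([XY])(-?\d+(?:\.\d+)?)")

# The move lines from the example output shown under "G-code Output".
SAMPLE = """\
G21 ; mm
G90 ; absolute
M280 P0 S90 ; pen up
G28 ; home
G0 X-200.00 Y100.00 F1000
M280 P0 S40 ; pen down
G1 X-180.00 Y120.00 F500
G1 X-160.00 Y115.00 F500
"""


def gcode_moves(gcode):
    """Yield the (x, y) target of every G0/G1 move, in program order.

    Comments after ';' are ignored. An axis missing from a move keeps its
    previous value, as in absolute (G90) mode.
    """
    x = y = None
    for raw in gcode.splitlines():
        line = raw.split(";", 1)[0].strip()
        if not line:
            continue
        if line.split()[0].upper() not in ("G0", "G1", "G00", "G01"):
            continue
        for axis, value in _AXIS_WORD.findall(line.upper()):
            if axis == "X":
                x = float(value)
            else:
                y = float(value)
        if x is not None and y is not None:
            yield (x, y)


def out_of_bounds(gcode, width=WORK_WIDTH_MM, height=WORK_HEIGHT_MM):
    """Return the move targets that fall outside the work area.

    ASSUMPTION: the origin is at the center of the work area, so valid
    coordinates satisfy |x| <= width / 2 and |y| <= height / 2.
    """
    half_w, half_h = width / 2, height / 2
    return [(px, py) for px, py in gcode_moves(gcode)
            if abs(px) > half_w or abs(py) > half_h]


if __name__ == "__main__":
    print(list(gcode_moves(SAMPLE)))
    print(out_of_bounds(SAMPLE))
```

Calling `out_of_bounds(generated_text)` on a decoded program returns an empty list when every move stays inside the default area. If your machine places the origin elsewhere (for example at a corner or at the top center), the bounds test in `out_of_bounds` would need to change accordingly.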