| --- |
| license: c-uda |
| datasets: |
| - SadilKhan/CADCap-1M |
| language: |
| - en |
| base_model: |
| - stabilityai/stable-diffusion-3.5-medium |
| tags: |
| - text-to-image |
| - cad |
| - dreamcad |
| - fine-tuned |
| --- |
| |
| <div align="center"> |
| <img src="https://sadilkhan.github.io/dreamcad2026/static/images/logo.svg" width="400"/> |
| </div> |
|
|
| <h1 align="center"> Scaling Multi-modal CAD Generation using Differentiable Parametric Surfaces </h1> |
|
|
| <p align="center"> |
| <a href="https://arxiv.org/abs/2603.05607"><img src="https://img.shields.io/badge/arXiv-2603.05607-b31b1b.svg" /></a> |
| <a href="https://sadilkhan.github.io/dreamcad2026/"><img src="https://img.shields.io/badge/Project-Page-blue.svg" /></a> |
| <a href="https://huggingface.co/datasets/SadilKhan/CADCap-1M"><img src="https://img.shields.io/badge/Dataset-CADCap--1M-ffd21e?logo=huggingface" /></a> |
| </p> |
|
|
|
|
| ## Overview |
|
|
| This repo contains the fine-tuned **Stable Diffusion 3.5** used as the **text-to-image** component of [DreamCAD](https://arxiv.org/abs/2603.05607), a multi-modal generative framework for scalable CAD generation via differentiable parametric surfaces. |
|
|
| DreamCAD adopts a two-stage approach to text-to-CAD generation: |
|
|
| ``` |
| Text Prompt ──► [This model] SD 3.5 (fine-tuned) ──► CAD-style image ──► Image-to-CAD model ──► STEP file |
| ``` |
|
|
| > 💡 Direct text-to-CAD generation is notoriously difficult without visual grounding. This model bridges that gap by generating CAD-style images that provide the geometric and structural grounding needed for downstream image-to-CAD reconstruction. |
|
|
| ## Usage |
|
|
| ### Install dependencies |
| ```bash |
| pip install diffusers transformers accelerate |
| ``` |
|
|
|
|
| ### For Text-to-Image Generation (CAD-style) |
| ```python |
| import os |
| |
| import torch |
| from diffusers import StableDiffusion3Pipeline |
| |
| # Prefix prepended to every prompt |
| DEFAULT_TEXT = "A CAD model of " |
| |
| # Hugging Face access token, also exported for libraries that read HF_TOKEN |
| HF_TOKEN = "YOUR_TOKEN_ID" |
| os.environ["HF_TOKEN"] = HF_TOKEN |
| |
| # Load the base model first |
| pipe = StableDiffusion3Pipeline.from_pretrained( |
|     "stabilityai/stable-diffusion-3.5-medium", |
|     torch_dtype=torch.float16, |
|     # cache_dir="/path/to/cache",  # optional: custom download directory |
| ) |
| |
| # Load the DreamCAD LoRA weights on top of the base model |
| pipe.load_lora_weights( |
|     "SadilKhan/DreamCAD", |
|     weight_name="dreamcad_sd35/pytorch_lora_weights.safetensors", |
|     token=HF_TOKEN, |
| ) |
| |
| pipe = pipe.to("cuda") |
| |
| # Describe the object; the CAD prefix is added automatically |
| description = "Ergonomic office chair with curved backrest frame, adjustable armrests, and five-spoke base with casters." |
| image = pipe(DEFAULT_TEXT + description).images[0] |
| image.save("output.png") |
| ``` |
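
For reproducible outputs, the call above can be wrapped in a small helper. The sketch below is illustrative only: `build_prompt` and `generate` are hypothetical names that are not part of this repo, and the step count and guidance scale are placeholder values, not settings recommended by the DreamCAD authors. It assumes `pipe` is the pipeline loaded in the previous snippet.

```python
DEFAULT_TEXT = "A CAD model of "  # same prompt prefix as in the usage example


def build_prompt(description: str, prefix: str = DEFAULT_TEXT) -> str:
    """Prepend the CAD prefix unless the description already starts with it."""
    description = description.strip()
    if description.lower().startswith(prefix.strip().lower()):
        return description
    return prefix + description


def generate(pipe, description: str, seed: int = 0,
             steps: int = 28, guidance: float = 7.0):
    """Run one seeded generation with an already loaded StableDiffusion3Pipeline.

    `steps` and `guidance` are illustrative values (assumption), not tuned ones.
    """
    import torch  # imported lazily so build_prompt works without torch installed

    # A seeded generator makes repeated calls with the same inputs comparable
    generator = torch.Generator(device="cpu").manual_seed(seed)
    result = pipe(
        build_prompt(description),
        num_inference_steps=steps,
        guidance_scale=guidance,
        generator=generator,
    )
    return result.images[0]
```

Calling `generate(pipe, "Hex bolt with a flanged head", seed=42).save("bolt.png")` would then reuse the same prefix as the example above while fixing the random seed.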
|
|
| ## Citation |
|
|
| If you find DreamCAD useful, please cite: |
|
|
| ```bibtex |
| @article{khan2026dreamcad, |
| title={DreamCAD: Scaling Multi-modal CAD Generation using Differentiable Parametric Surfaces}, |
| author={Khan, Mohammad Sadil and Usama, Muhammad and Potamias, Rolandos Alexandros and Stricker, Didier and Afzal, Muhammad Zeshan and Deng, Jiankang and Elezi, Ismail}, |
| journal={arXiv preprint arXiv:2603.05607}, |
| year={2026} |
| } |
| ``` |
|
|
|
|
| ## License |
|
|
| This model inherits the [Stability AI Community License](https://stability.ai/license) from the base model. |
| - ✅ Free for research and non-commercial use |
| - ✅ Free for commercial use if your org has **< $1M annual revenue** |
| - ❌ Requires an [Enterprise License](https://stability.ai/enterprise) above $1M revenue |