---
license: c-uda
datasets:
- SadilKhan/CADCap-1M
language:
- en
base_model:
- stabilityai/stable-diffusion-3.5-medium
tags:
- text-to-image
- cad
- dreamcad
- fine-tuned
---
<div align="center">
<img src="https://sadilkhan.github.io/dreamcad2026/static/images/logo.svg" width="400"/>
</div>
<h1 align="center"> Scaling Multi-modal CAD Generation using Differentiable Parametric Surfaces </h1>
<p align="center">
<a href="https://arxiv.org/abs/2603.05607"><img src="https://img.shields.io/badge/arXiv-2603.05607-b31b1b.svg" /></a>
<a href="https://sadilkhan.github.io/dreamcad2026/"><img src="https://img.shields.io/badge/Project-Page-blue.svg" /></a>
<a href="https://huggingface.co/datasets/SadilKhan/CADCap-1M"><img src="https://img.shields.io/badge/Dataset-CADCap--1M-ffd21e?logo=huggingface" /></a>
</p>
## Overview
This repo contains LoRA weights that fine-tune **Stable Diffusion 3.5 Medium** for use as the **text-to-image** component of [DreamCAD](https://arxiv.org/abs/2603.05607), a multi-modal generative framework for scalable CAD generation via differentiable parametric surfaces.
DreamCAD adopts a two-stage approach to text-to-CAD generation:
```
Text Prompt ──► [This model] SD 3.5 (fine-tuned) ──► CAD-style image ──► Image-to-CAD model ──► STEP file
```
> 💡 Direct text-to-CAD generation is notoriously difficult without visual grounding. This model bridges that gap by generating CAD-style images that provide the geometric and structural grounding needed for downstream image-to-CAD reconstruction.
## Usage
### Install dependencies
```bash
pip install diffusers transformers accelerate
```
### For Text-to-Image Generation (CAD-style)
```python
import os

import torch
from diffusers import StableDiffusion3Pipeline

DEFAULT_TEXT = "A CAD model of "
HF_TOKEN = "YOUR_TOKEN_ID"  # replace with your Hugging Face access token
os.environ["HF_TOKEN"] = HF_TOKEN
# Load the base model first
pipe = StableDiffusion3Pipeline.from_pretrained(
"stabilityai/stable-diffusion-3.5-medium",
torch_dtype=torch.float16,
cache_dir="/netscratch/mokhan/.cache",  # author's cluster path; change or remove this line on your machine
)
# Load the DreamCAD LoRA weights on top of the base model
pipe.load_lora_weights(
"SadilKhan/DreamCAD",
weight_name="dreamcad_sd35/pytorch_lora_weights.safetensors",
token=HF_TOKEN,
)
pipe = pipe.to("cuda")
image = pipe(DEFAULT_TEXT + "Ergonomic office chair with curved backrest frame, adjustable armrests, and five-spoke base with casters.").images[0]
image.save("output.png")
```
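The example above builds its prompt by prepending `"A CAD model of "` to a description. As a minimal sketch of how that could be wrapped for several prompts, the snippet below adds two helpers. `build_prompt` and `generate_batch` are hypothetical names, not part of DreamCAD or diffusers, and the output directory, file naming, and per-image seeding are assumptions made for illustration. It expects the `pipe` object created in the previous block.

```python
from pathlib import Path

DEFAULT_TEXT = "A CAD model of "  # same prefix as in the usage example above


def build_prompt(description: str, prefix: str = DEFAULT_TEXT) -> str:
    """Prepend the CAD prefix unless the description already starts with it."""
    description = " ".join(description.split())  # collapse stray whitespace
    if description.lower().startswith(prefix.strip().lower()):
        return description
    return prefix + description


def generate_batch(pipe, descriptions, out_dir="outputs", seed=0):
    """Render one image per description, using a fixed seed for each image.

    Hypothetical helper: `pipe` is the StableDiffusion3Pipeline loaded above.
    """
    import torch  # imported lazily so build_prompt works without torch installed

    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    saved = []
    for i, description in enumerate(descriptions):
        # A seeded generator makes each image reproducible across runs.
        generator = torch.Generator(device="cpu").manual_seed(seed + i)
        image = pipe(build_prompt(description), generator=generator).images[0]
        target = out_path / f"cad_{i:03d}.png"
        image.save(target)
        saved.append(target)
    return saved
```

A call such as `generate_batch(pipe, ["Hex bolt with washer", "Spur gear with 20 teeth"])` would then write one PNG per description into `outputs/`.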
## Citation
If you find DreamCAD useful, please cite:
```bibtex
@article{khan2026dreamcad,
title={DreamCAD: Scaling Multi-modal CAD Generation using Differentiable Parametric Surfaces},
author={Khan, Mohammad Sadil and Usama, Muhammad and Potamias, Rolandos Alexandros and Stricker, Didier and Afzal, Muhammad Zeshan and Deng, Jiankang and Elezi, Ismail},
journal={arXiv preprint arXiv:2603.05607},
year={2026}
}
```
## License
This model inherits the [Stability AI Community License](https://stability.ai/license) from the base model.
- ✅ Free for research and non-commercial use
- ✅ Free for commercial use if your org has **< $1M annual revenue**
- ❌ Requires an [Enterprise License](https://stability.ai/enterprise) above $1M revenue