	---
	license: c-uda
	datasets:
	- SadilKhan/CADCap-1M
	language:
	- en
	base_model:
	- stabilityai/stable-diffusion-3.5-medium
	tags:
	- text-to-image
	- cad
	- dreamcad
	- fine-tuned
	---

	<div align="center">
	<img src="https://sadilkhan.github.io/dreamcad2026/static/images/logo.svg" width="400"/>
	</div>

	<h1 align="center"> Scaling Multi-modal CAD Generation using Differentiable Parametric Surfaces </h1>

	<p align="center">
	<a href="https://arxiv.org/abs/2603.05607"><img src="https://img.shields.io/badge/arXiv-2603.05607-b31b1b.svg" /></a>
	<a href="https://sadilkhan.github.io/dreamcad2026/"><img src="https://img.shields.io/badge/Project-Page-blue.svg" /></a>
	<a href="https://huggingface.co/datasets/SadilKhan/CADCap-1M"><img src="https://img.shields.io/badge/Dataset-CADCap--1M-ffd21e?logo=huggingface" /></a>
	</p>


	## Overview

	This repo contains the LoRA weights of the fine-tuned Stable Diffusion 3.5 Medium used as the text-to-image component of [DreamCAD](https://arxiv.org/abs/2603.05607), a multi-modal generative framework for scalable CAD generation via differentiable parametric surfaces.

	DreamCAD adopts a two-stage approach to text-to-CAD generation:

	```
	Text Prompt ──► [This model] SD 3.5 (fine-tuned) ──► CAD-style image ──► Image-to-CAD model ──► STEP file
	```

	> 💡 Direct text-to-CAD generation is notoriously difficult without visual grounding. This model bridges that gap by generating CAD-style images that provide the geometric and structural grounding needed for downstream image-to-CAD reconstruction.
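The two-stage flow can be sketched as a simple composition of two callables. This is only an illustration of the data flow in the diagram, not DreamCAD's actual code: `text_to_cad`, `text_to_image`, and `image_to_cad` are hypothetical names, and the assumption that the second stage returns the STEP file contents as bytes is ours.

```python
from typing import Any, Callable


def text_to_cad(
    prompt: str,
    text_to_image: Callable[[str], Any],
    image_to_cad: Callable[[Any], bytes],
) -> bytes:
    """Chain the two stages: prompt -> CAD-style image -> STEP file contents."""
    # Stage 1: the fine-tuned SD 3.5 renders a CAD-style image from text
    image = text_to_image(prompt)
    # Stage 2: the image-to-CAD model reconstructs geometry from that image
    # (hypothetical interface: assumed here to return STEP data as bytes)
    return image_to_cad(image)
```

Keeping the stages as separate callables means either one can be swapped or tested in isolation, for example by passing a stub in place of the image-to-CAD model.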

	## Usage

	### Install dependencies
	```bash
	pip install diffusers transformers accelerate
	```


	### For Text-to-Image Generation (CAD-style)
	```python
	import os

	import torch
	from diffusers import StableDiffusion3Pipeline

	# Prefix prepended to every prompt
	DEFAULT_TEXT = "A CAD model of "

	# Hugging Face access token, used to download the model weights
	HF_TOKEN = "YOUR_TOKEN_ID"
	os.environ["HF_TOKEN"] = HF_TOKEN

	# Load the base model first
	pipe = StableDiffusion3Pipeline.from_pretrained(
	    "stabilityai/stable-diffusion-3.5-medium",
	    torch_dtype=torch.float16,
	    cache_dir="/netscratch/mokhan/.cache",  # author's cache path; change or remove for your setup
	)

	# Load the DreamCAD LoRA weights on top of the base model
	pipe.load_lora_weights(
	    "SadilKhan/DreamCAD",
	    weight_name="dreamcad_sd35/pytorch_lora_weights.safetensors",
	    token=HF_TOKEN,
	)

	pipe = pipe.to("cuda")

	# Generate one CAD-style image from the prefixed description
	image = pipe(DEFAULT_TEXT + "Ergonomic office chair with curved backrest frame, adjustable armrests, and five-spoke base with casters.").images[0]
	image.save("output.png")
	```
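For generating several candidates per description, a small wrapper around the pipeline can help. The sketch below is an assumption-laden convenience layer, not part of DreamCAD: `build_prompt` and `generate_variants` are hypothetical helpers, the seed values and output file names are arbitrary, and `pipe` is expected to be the pipeline loaded in the snippet above.

```python
PROMPT_PREFIX = "A CAD model of "  # same prefix as DEFAULT_TEXT in the snippet above


def build_prompt(description: str, prefix: str = PROMPT_PREFIX) -> str:
    """Prepend the CAD prefix to a description, unless it is already there."""
    description = " ".join(description.split())  # collapse stray whitespace
    if not description:
        raise ValueError("description must not be empty")
    if description.lower().startswith(prefix.lower()):
        return description
    return prefix + description


def generate_variants(pipe, description, seeds=(0, 1, 2), out_prefix="output"):
    """Render one image per seed and save it as <out_prefix>_<seed>.png."""
    import torch  # imported lazily so build_prompt works without torch installed

    prompt = build_prompt(description)
    paths = []
    for seed in seeds:
        # A seeded generator makes each image reproducible
        generator = torch.Generator(device="cpu").manual_seed(seed)
        image = pipe(prompt, generator=generator).images[0]
        path = f"{out_prefix}_{seed}.png"
        image.save(path)
        paths.append(path)
    return paths
```

Calling `generate_variants(pipe, "hex bolt with washer")` would write `output_0.png`, `output_1.png`, and `output_2.png`, so the best-looking render can be picked by hand before passing it to the image-to-CAD stage.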

	## Citation

	If you find DreamCAD useful, please cite:

	```bibtex
	@article{khan2026dreamcad,
	title={DreamCAD: Scaling Multi-modal CAD Generation using Differentiable Parametric Surfaces},
	author={Khan, Mohammad Sadil and Usama, Muhammad and Potamias, Rolandos Alexandros and Stricker, Didier and Afzal, Muhammad Zeshan and Deng, Jiankang and Elezi, Ismail},
	journal={arXiv preprint arXiv:2603.05607},
	year={2026}
	}
	```


	## License

	This model inherits the [Stability AI Community License](https://stability.ai/license) from the base model.
	- ✅ Free for research and non-commercial use
	- ✅ Free for commercial use if your organization has less than $1M in annual revenue
	- ❌ Requires an [Enterprise License](https://stability.ai/enterprise) above $1M in annual revenue