CADEvolve: Creating Realistic CAD via Program Evolution
Abstract
CADEvolve presents an evolution-based approach using VLM-guided edits to generate complex CAD programs from simple primitives, creating a large dataset for improved Image2CAD performance.
Computer-Aided Design (CAD) delivers rapid, editable modeling for engineering and manufacturing. Recent AI progress now makes full automation feasible for various CAD tasks. However, progress is bottlenecked by data: public corpora mostly contain sketch-extrude sequences, lack complex operations, multi-operation composition, and design intent, and thus hinder effective fine-tuning. Attempts to bypass this with frozen VLMs often yield simple or invalid programs due to limited 3D grounding in current foundation models. We present CADEvolve, an evolution-based pipeline and dataset that starts from simple primitives and, via VLM-guided edits and validations, incrementally grows CAD programs toward industrial-grade complexity. The result is 8k complex parts expressed as executable CadQuery parametric generators. After multi-stage post-processing and augmentation, we obtain a unified dataset of 1.3M scripts paired with rendered geometry and exercising the full CadQuery operation set. A VLM fine-tuned on CADEvolve achieves state-of-the-art results on the Image2CAD task across the DeepCAD, Fusion 360, and MCB benchmarks.
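The propose-validate-keep loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the operation names, the `Program` class, and the functions `propose_edit`, `is_valid`, and `evolve` are all hypothetical. The VLM is replaced by a random choice, and "validation" is a toy rule rather than executing a CadQuery script.

```python
import random
from dataclasses import dataclass

# Hypothetical operation vocabulary; the real pipeline edits CadQuery code.
PRIMITIVES = ["box", "cylinder", "sphere"]
EDIT_OPS = ["fillet", "chamfer", "hole", "shell", "cut", "union"]


@dataclass(frozen=True)
class Program:
    """A CAD program reduced to an ordered tuple of operation names."""
    ops: tuple

    @property
    def complexity(self) -> int:
        return len(self.ops)


def propose_edit(parent: Program, rng: random.Random) -> Program:
    """Stand-in for the VLM-guided edit: append one random operation."""
    return Program(parent.ops + (rng.choice(EDIT_OPS),))


def is_valid(program: Program) -> bool:
    """Stand-in for validation (assumption): reject the same operation
    applied twice in a row. A real check would execute the script and
    inspect the resulting geometry."""
    return all(a != b for a, b in zip(program.ops, program.ops[1:]))


def evolve(generations: int, children_per_parent: int, seed: int = 0) -> list:
    """Grow programs from primitives, keeping only validated edits."""
    rng = random.Random(seed)
    population = [Program((p,)) for p in PRIMITIVES]
    for _ in range(generations):
        next_population = []
        for parent in population:
            children = [propose_edit(parent, rng)
                        for _ in range(children_per_parent)]
            valid = [c for c in children if is_valid(c)]
            # Keep the parent unchanged when no proposed edit validates.
            next_population.append(valid[0] if valid else parent)
        population = next_population
    return population


if __name__ == "__main__":
    for program in evolve(generations=5, children_per_parent=3, seed=1):
        print(program.complexity, " -> ".join(program.ops))
```

Each lineage starts from one primitive and gains at most one operation per generation, so complexity grows only through edits that pass the validity check. Keeping the parent when every child fails mirrors the idea that invalid programs are never admitted to the dataset.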
Community
Title: CADEvolve: Creating Realistic CAD via Program Evolution
Paper: https://arxiv.org/abs/2602.16317
Code: https://github.com/zhemdi/CADEvolve
Dataset: https://huggingface.co/datasets/kulibinai/cadevolve
Models: https://huggingface.co/kulibinai/cadevolve-rl1
TL;DR: We generate a realistic, complex CAD dataset that spans all operations by evolving programs from the simplest shapes, then expand it into a large-scale corpus for various CAD tasks. We demonstrate the dataset's advantages by training an Image2CAD model on it.
Highlights:
- Evolutionary pipeline
- Dataset of ~1.3M executable CAD programs
- SOTA Image2CAD model
Authors: Maksim Elistratov, Marina Barannikov, Gregory Ivanov, Valentin Khrulkov, Anton Konushin, Andrey Kuznetsov, Dmitrii Zhemchuzhnikov
Authors, thank you for your work.
I'm building something in this space. It would be interesting to see the same dataset rebuilt for build123d!
