910 MB
22 files
Updated 6 days ago

BenchCAD

Three-config dataset for CAD evaluation:

  • edit-bench: held-out CAD edit benchmark.
  • code_gen: 17,900 synthetic CadQuery samples (compact 12-column variant) covering 106 mechanical part families. Each row contains the ground-truth CadQuery code plus five normalized renders (four views and a 2×2 composite of them).
  • QA: CAD question-answering benchmark.

code_gen schema (12 columns)

| Column | Type | Description |
| --- | --- | --- |
| stem | string | unique sample identifier |
| family | string | mechanical part family (106 distinct) |
| variant | string | sub-variant within family |
| difficulty | string | easy / medium / hard |
| base_plane | string | initial workplane (XY / XZ / YZ) |
| standard | string | ISO/DIN standard if applicable |
| code | string | CadQuery Python source (ground truth) |
| view_0_png | image | front view (134×134 PNG) |
| view_1_png | image | right view |
| view_2_png | image | top view |
| view_3_png | image | iso view |
| composite_png | image | 2×2 composite of the four views |
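As a rough illustration of how the schema could be used, here is a minimal sketch of a row check. The column names come from the table above; the helper `check_row`, the constant names, and the exact spellings of the allowed values (`easy`, `XY`, and so on) are assumptions made for this example and are not part of the dataset.

```python
# Column names copied from the schema table. The type labels are plain
# descriptive strings, not `datasets` feature objects.
EXPECTED_COLUMNS = {
    "stem": "string",
    "family": "string",
    "variant": "string",
    "difficulty": "string",
    "base_plane": "string",
    "standard": "string",
    "code": "string",
    "view_0_png": "image",
    "view_1_png": "image",
    "view_2_png": "image",
    "view_3_png": "image",
    "composite_png": "image",
}

# Assumed spellings, inferred from the table descriptions.
ALLOWED_DIFFICULTY = {"easy", "medium", "hard"}
ALLOWED_BASE_PLANE = {"XY", "XZ", "YZ"}


def check_row(row):
    """Return a list of problems found in one row (empty if none)."""
    problems = [
        f"missing column: {name}" for name in EXPECTED_COLUMNS if name not in row
    ]
    extra = sorted(set(row) - set(EXPECTED_COLUMNS))
    problems += [f"unexpected column: {name}" for name in extra]
    if "difficulty" in row and row["difficulty"] not in ALLOWED_DIFFICULTY:
        problems.append(f"unexpected difficulty: {row['difficulty']!r}")
    if "base_plane" in row and row["base_plane"] not in ALLOWED_BASE_PLANE:
        problems.append(f"unexpected base_plane: {row['base_plane']!r}")
    return problems
```

Calling `check_row(ds[0])` on a loaded row should return an empty list if the row matches the documented schema and the assumed value spellings.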

Usage

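```python
from datasets import load_dataset

# Load the "code_gen" config; its split is also named "code_gen".
ds = load_dataset("BenchCAD/BenchCAD", "code_gen", split="code_gen")

row = ds[0]  # index once so the images are decoded a single time
print(row["family"], row["difficulty"])
row["composite_png"].show()  # image columns decode to PIL images
print(row["code"])  # ground-truth CadQuery source
```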
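Beyond inspecting a single row, a common next step is to see how samples are distributed across families and difficulty levels. The sketch below counts rows per (family, difficulty) pair. The helper `difficulty_by_family` and the sample rows (including the family names) are made up for illustration; only the `family` and `difficulty` column names come from the schema.

```python
from collections import Counter, defaultdict


def difficulty_by_family(rows):
    """Count samples per family, broken down by difficulty."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row["family"]][row["difficulty"]] += 1
    # Convert to plain dicts so the result prints and compares cleanly.
    return {family: dict(c) for family, c in counts.items()}


# Hand-made rows standing in for dataset rows (family names are hypothetical).
sample_rows = [
    {"family": "flange", "difficulty": "easy"},
    {"family": "flange", "difficulty": "hard"},
    {"family": "bracket", "difficulty": "easy"},
    {"family": "flange", "difficulty": "easy"},
]

summary = difficulty_by_family(sample_rows)
print(summary)
```

On the real data, the same function should accept any iterable of row dicts. Passing `ds.select_columns(["family", "difficulty"])` (available in recent `datasets` releases) avoids decoding the image columns while counting.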
Last updated: Jun 23