	---
	license: apache-2.0
	---

	![img1](./img/overview_paper.jpg)
	# GeoCAD-LLM: CAD Sequence Generation via Multimodal LLMs with Equivariant Geometric Features🛠️
	- [Github](https://github.com/kshsh0405/GeoCADLLM_Inference)
	- Paper (coming soon)

	## GeoCAD-LLM_4B🛠️
	- Base model: [Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507)
	- Max sequence length: 8,192
	- Epoch: 2
	- Learning rate: 1e-4
	- Batch size: 128
	- This model is specialized for text-to-CAD, but it also supports multimodal input.

	## GeoCAD-LLM Contributions🔥
	![img2](./img/model_structure.jpg)

	- State-of-the-art Performance🏆 on the [Text2CAD](https://huggingface.co/datasets/SadilKhan/Text2CAD) dataset (see the tables below).
	- Multimodal CAD Generation🌐: Supports both text-to-CAD and pc-text-to-CAD (point cloud plus text).
	- GeoCAD-LLM directly generates CAD vector sequences as natural language.
	- Novel Two-Stage Training Pipeline🧭: Stage 1 trains semantic geometry alignment, and stage 2 trains fine-grained geometry. In particular, we directly leverage E(3)-equivariant features for geometry-consistent supervision, which inherently keeps geometric features consistent regardless of input orientation.
	- Point Cloud Dropout (PCD)🧶: PCD mitigates over-reliance on geometric inputs and improves multimodal generalization, which makes it a critical training technique for multimodal CAD generation.
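
To make the orientation-consistency claim above concrete, here is a minimal sketch of what invariance and equivariance under E(3) transforms mean for a point cloud. It is not the GeoCAD-LLM feature extractor: the descriptors (pairwise distances and centroid offsets) and every function name are assumptions chosen for illustration, and only the Python standard library is used.

```python
import math
import random


def rotate_z(points, angle):
    """Rotate 3D points about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]


def translate(points, offset):
    """Shift every point by the same 3D offset."""
    dx, dy, dz = offset
    return [(x + dx, y + dy, z + dz) for x, y, z in points]


def pairwise_distances(points):
    """Distances between all point pairs: unchanged by rotation and translation."""
    n = len(points)
    return [math.dist(points[i], points[j]) for i in range(n) for j in range(i + 1, n)]


def center(points):
    """Offsets from the centroid: unchanged by translation, rotated by rotation."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    cz = sum(p[2] for p in points) / n
    return [(x - cx, y - cy, z - cz) for x, y, z in points]


def flatten(points):
    """Turn a list of 3D tuples into a flat list of floats."""
    return [coord for p in points for coord in p]


def all_close(a, b, tol=1e-9):
    """Elementwise comparison of two flat float sequences."""
    return len(a) == len(b) and all(abs(x - y) <= tol for x, y in zip(a, b))


# Hypothetical toy point cloud and an arbitrary rigid motion (illustrative values).
rng = random.Random(0)
cloud = [(rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(16)]
angle, offset = 0.7, (2.0, -1.0, 0.5)
moved = translate(rotate_z(cloud, angle), offset)

# Invariance: the descriptor ignores the rigid motion entirely.
invariant_ok = all_close(pairwise_distances(cloud), pairwise_distances(moved))
# Equivariance: transforming the input transforms the feature in the same way.
equivariant_ok = all_close(flatten(center(moved)), flatten(rotate_z(center(cloud), angle)))
# Raw coordinates have neither property, so they change under the motion.
raw_changed = not all_close(flatten(cloud), flatten(moved))
print(invariant_ok, equivariant_ok, raw_changed)
```

Features with these properties give the same supervision signal however the input shape is oriented, which is the behaviour the training pipeline relies on.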

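The Point Cloud Dropout idea can be sketched as a data-side augmentation. This is an assumption-laden illustration, not the released training code: it assumes PCD removes the whole point cloud from a training sample with some probability, and the `drop_prob` value and the `text` / `point_cloud` field names are hypothetical.

```python
import random


def point_cloud_dropout(sample, drop_prob, rng):
    """Return a copy of `sample` whose point cloud is removed with probability `drop_prob`.

    Assumption: dropping acts on the whole modality, so a dropped sample
    becomes a text-only example and the original sample is left untouched.
    """
    out = dict(sample)
    if out.get("point_cloud") is not None and rng.random() < drop_prob:
        out["point_cloud"] = None
    return out


# Hypothetical training sample with both modalities present.
sample = {
    "text": "A rectangular plate with a centered circular hole.",
    "point_cloud": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)],
}

rng = random.Random(0)
# drop_prob=0.5 is an illustrative value, not a reported hyperparameter.
augmented = [point_cloud_dropout(sample, drop_prob=0.5, rng=rng) for _ in range(1000)]
dropped = sum(1 for s in augmented if s["point_cloud"] is None)
print(f"{dropped} of {len(augmented)} samples became text-only")
```

Under this reading, the model sees the same text with and without geometry during training, so it cannot depend on the point cloud alone and text-only inference stays within the training distribution.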
	## Performance (text-to-CAD & pc-text-to-CAD)🔥
	![img3](./img/performance.jpg)

	## Qualitative Results
	Please check our paper and supplementary materials.🤗

	## Bibtex🤗
	```
	(TODO)
	```