
	---
	license: mit
	tags:
	- minizinc
	- constraint-programming
	- optimization
	- code-generation
	- lora
	- learn2zinc
	pipeline_tag: text-generation
	---

	# Learn2Zinc Models

	Learn2Zinc is a family of small language models fine-tuned with LoRA to translate natural-language optimization problems into executable [MiniZinc](https://www.minizinc.org/) code. All models were trained on the [learn2zinc_augmented](https://huggingface.co/datasets/skadio/learn2zinc_augmented) dataset using the [Unsloth](https://github.com/unslothai/unsloth) library.

	## Models

	| Model | Base Model | Parameters | Chat Format |
	|---|---|---|---|
	| [learn2zinc-GPT-oss-20B](https://huggingface.co/skadio/learn2zinc-GPT-oss-20B) | GPT-OSS-20B | 20B | Harmony (custom) |
	| [learn2zinc-Gemma-2-9B](https://huggingface.co/skadio/learn2zinc-Gemma-2-9B) | Gemma 2 9B | 9B | Standard chat template |
	| [learn2zinc-Llama-3.2-3B](https://huggingface.co/skadio/learn2zinc-Llama-3.2-3B) | Llama 3.2 3B | 3B | Standard chat template |
	| [learn2zinc-Llama-3.2-1B](https://huggingface.co/skadio/learn2zinc-Llama-3.2-1B) | Llama 3.2 1B | 1B | Standard chat template |
	| [learn2zinc-Qwen3-0.6B](https://huggingface.co/skadio/learn2zinc-Qwen3-0.6B) | Qwen3 0.6B | 0.6B | Standard chat template |

	## Shared Training Configuration

	All models share the same LoRA and training setup:

	| Hyperparameter | Value |
	|---|---|
	| Fine-tuning method | LoRA (rank 64, alpha 64) |
	| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
	| Learning rate | 2e-4 (cosine schedule, 50 warmup steps) |
	| Epochs | 3 |
	| Optimizer | AdamW 8-bit |
	| Weight decay | 0.01 |
	| Precision | bf16 |
	| Quantization during training | 4-bit |
	| Max sequence length | 4096 |
	| Training | Response-only (SFTTrainer) |
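
To make two of the table entries concrete, the sketch below computes the LoRA scaling factor and the learning rate at a given step. It assumes the conventional definitions: scaling equal to alpha divided by rank (not the rank-stabilized variant), and linear warmup from zero followed by cosine decay to zero. The total step count is not stated in this card, so `total_steps` is a hypothetical parameter, and the function names are illustrative rather than part of the training code.

```python
import math

PEAK_LR = 2e-4      # from the table
WARMUP_STEPS = 50   # from the table
LORA_RANK = 64      # from the table
LORA_ALPHA = 64     # from the table


def lora_scaling(alpha: int = LORA_ALPHA, rank: int = LORA_RANK) -> float:
    """Conventional LoRA scaling applied to the low-rank update (alpha / rank)."""
    return alpha / rank


def lr_at(step: int, total_steps: int) -> float:
    """Learning rate at `step`: linear warmup, then cosine decay to zero.

    `total_steps` is an assumption; the real value depends on batch size
    and dataset size, which this card does not state.
    """
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / max(1, total_steps - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    total = 1000  # hypothetical step count, for illustration only
    print("LoRA scaling:", lora_scaling())
    for step in (0, 25, 50, 525, 1000):
        print(step, f"{lr_at(step, total):.2e}")
```

With rank and alpha both set to 64, the scaling factor is 1, so the low-rank update is added to the frozen weights without rescaling.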

	## Evaluation

	Models were evaluated on the IndustryOR subset of the learn2zinc benchmark. Generated MiniZinc code was executed with the HiGHS solver (120 s timeout). All generations used temperature = 0 for reproducibility.
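
As a rough illustration of the execution step, the following sketch runs a generated model through the MiniZinc command line with HiGHS and a 120 s limit. It is not the project's evaluation harness, which lives in the GitHub repo and may differ. It assumes a `minizinc` binary on the `PATH` with the `highs` solver available; the helper names and the `candidate.mzn` file name are hypothetical.

```python
import subprocess
from pathlib import Path

TIMEOUT_S = 120  # from the evaluation description


def build_minizinc_command(model_path: str, timeout_s: int = TIMEOUT_S) -> list[str]:
    """Build a MiniZinc CLI call that solves `model_path` with HiGHS."""
    return [
        "minizinc",
        "--solver", "highs",
        "--time-limit", str(timeout_s * 1000),  # the CLI expects milliseconds
        model_path,
    ]


def run_generated_model(code: str, workdir: str) -> tuple[bool, str]:
    """Write generated code to disk, run it, and return (succeeded, stdout)."""
    path = Path(workdir) / "candidate.mzn"  # hypothetical file name
    path.write_text(code)
    try:
        result = subprocess.run(
            build_minizinc_command(str(path)),
            capture_output=True,
            text=True,
            timeout=TIMEOUT_S + 10,  # grace period on top of the solver limit
        )
    except (subprocess.TimeoutExpired, FileNotFoundError):
        # Treat a hung process or a missing minizinc binary as a failed run.
        return False, ""
    return result.returncode == 0, result.stdout
```

A zero exit code only shows that the model compiled and ran; the solver output still has to be parsed to recover the objective value before correctness can be checked.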

	Metrics: Execution Success Rate (code compiles and runs) and Solution Correctness (objective matches expected value within 1e-6).
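
The two metrics can be sketched as follows. This is a minimal illustration under two assumptions the card does not confirm: the 1e-6 tolerance is absolute, and both rates are computed over all benchmark problems (so a failed run also counts as incorrect). `RunResult`, `is_correct`, and `summarize` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional

TOLERANCE = 1e-6  # from the metric definition; assumed to be absolute


@dataclass
class RunResult:
    executed: bool              # code compiled and ran
    objective: Optional[float]  # objective reported by the solver, if any
    expected: float             # reference objective value


def is_correct(r: RunResult, tol: float = TOLERANCE) -> bool:
    """A run is correct if it executed and its objective matches within `tol`."""
    return (
        r.executed
        and r.objective is not None
        and abs(r.objective - r.expected) <= tol
    )


def summarize(results: list[RunResult]) -> dict[str, float]:
    """Compute both rates over all problems (assumed denominator)."""
    n = len(results)
    if n == 0:
        return {"execution_success_rate": 0.0, "solution_correctness": 0.0}
    return {
        "execution_success_rate": sum(r.executed for r in results) / n,
        "solution_correctness": sum(is_correct(r) for r in results) / n,
    }
```

For example, four problems with three successful runs, two of which match the reference objective, give an execution success rate of 0.75 and a solution correctness of 0.5.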

	For full evaluation details, see the [learn2zinc GitHub repo](https://github.com/skadio/learn2zinc).

	## Datasets

	All models were trained on datasets from the [Learn2Zinc collection](https://huggingface.co/datasets/skadio/learn2zinc-collection):

	| Dataset | Strategy | Examples |
	|---|---|---|
	| [learn2zinc-base](https://huggingface.co/datasets/skadio/learn2zinc-base) | Direct generation | 8,014 |
	| [learn2zinc-cot](https://huggingface.co/datasets/skadio/learn2zinc-cot) | Chain-of-thought + generation | 8,014 |
	| [learn2zinc-augmented](https://huggingface.co/datasets/skadio/learn2zinc-augmented) | Generation + correction | 15,649 |

	## Framework

	- [Unsloth](https://github.com/unslothai/unsloth)
	- [PEFT / LoRA](https://github.com/huggingface/peft)
	- [TRL SFTTrainer](https://github.com/huggingface/trl)

	## Citation

	```bibtex
	@misc{kadioglu2026modelingcopilotstexttomodeltranslation,
	title={Modeling Copilots for Text-to-Model Translation},
	author={Serdar Kadioglu and Karthik Uppuluri and Akash Singirikonda},
	year={2026},
	eprint={2604.12955},
	archivePrefix={arXiv},
	primaryClass={cs.AI},
	url={https://arxiv.org/abs/2604.12955},
	}
	```