---
license: mit
tags:
- minizinc
- constraint-programming
- optimization
- code-generation
- lora
- learn2zinc
pipeline_tag: text-generation
---

# Learn2Zinc Models

Learn2Zinc is a family of small language models fine-tuned with LoRA to translate natural-language optimization problems into executable [MiniZinc](https://www.minizinc.org/) code. All models were trained on the [learn2zinc_augmented](https://huggingface.co/datasets/skadio/learn2zinc_augmented) dataset using the [Unsloth](https://github.com/unslothai/unsloth) library.

## Models

| Model | Base Model | Parameters | Chat Format |
|---|---|---|---|
| [learn2zinc-GPT-oss-20B](https://huggingface.co/skadio/learn2zinc-GPT-oss-20B) | GPT-OSS-20B | 20B | Harmony (custom) |
| [learn2zinc-Gemma-2-9B](https://huggingface.co/skadio/learn2zinc-Gemma-2-9B) | Gemma 2 9B | 9B | Standard chat template |
| [learn2zinc-Llama-3.2-3B](https://huggingface.co/skadio/learn2zinc-Llama-3.2-3B) | Llama 3.2 3B | 3B | Standard chat template |
| [learn2zinc-Llama-3.2-1B](https://huggingface.co/skadio/learn2zinc-Llama-3.2-1B) | Llama 3.2 1B | 1B | Standard chat template |
| [learn2zinc-Qwen3-0.6B](https://huggingface.co/skadio/learn2zinc-Qwen3-0.6B) | Qwen3 0.6B | 0.6B | Standard chat template |

## Shared Training Configuration

All models share the same LoRA and training setup:

| Hyperparameter | Value |
|---|---|
| Fine-tuning method | LoRA (rank 64, alpha 64) |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Learning rate | 2e-4 (cosine schedule, 50 warmup steps) |
| Epochs | 3 |
| Optimizer | AdamW 8-bit |
| Weight decay | 0.01 |
| Precision | bf16 |
| Quantization during training | 4-bit |
| Max sequence length | 4096 |
| Training | Response-only (SFTTrainer) |

## Evaluation

Models were evaluated on the **IndustryOR** subset of the learn2zinc benchmark. Generated MiniZinc code was executed with the **HiGHS** solver (120 s timeout).
All generations used **temperature = 0** for reproducibility.

**Metrics:** Execution Success Rate (code compiles and runs) and Solution Correctness (objective matches expected value within 1e-6).

For full evaluation details, see the [learn2zinc GitHub repo](https://github.com/skadio/learn2zinc).

## Datasets

All models were trained on datasets from the [Learn2Zinc collection](https://huggingface.co/datasets/skadio/learn2zinc-collection):

| Dataset | Strategy | Examples |
|---|---|---|
| [learn2zinc-base](https://huggingface.co/datasets/skadio/learn2zinc-base) | Direct generation | 8,014 |
| [learn2zinc-cot](https://huggingface.co/datasets/skadio/learn2zinc-cot) | Chain-of-thought + generation | 8,014 |
| [learn2zinc-augmented](https://huggingface.co/datasets/skadio/learn2zinc-augmented) | Generation + correction | 15,649 |

## Framework

- [Unsloth](https://github.com/unslothai/unsloth)
- [PEFT / LoRA](https://github.com/huggingface/peft)
- [TRL SFTTrainer](https://github.com/huggingface/trl)

## Citation

```bibtex
@misc{kadioglu2026modelingcopilotstexttomodeltranslation,
  title={Modeling Copilots for Text-to-Model Translation},
  author={Serdar Kadioglu and Karthik Uppuluri and Akash Singirikonda},
  year={2026},
  eprint={2604.12955},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2604.12955},
}
```
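
## Example: Scoring Evaluation Runs

The two metrics named in the Evaluation section can be sketched in a few lines of Python. This is an illustration only, not the project's evaluation code: the function names, the `(executed_ok, objective, expected)` tuple layout, the absolute-difference reading of the 1e-6 tolerance, and the choice to divide both metrics by the total number of problems are all assumptions. The GitHub repo remains the reference for the actual procedure.

```python
import math

TOLERANCE = 1e-6  # tolerance stated in the Evaluation section


def objective_matches(actual, expected, tol=TOLERANCE):
    """Return True when the objective is within tol of the expected value.

    Assumption: the tolerance is an absolute difference; the model card
    does not say whether it is absolute or relative.
    """
    if actual is None:  # hypothetical convention: None means no objective was produced
        return False
    return math.isclose(actual, expected, rel_tol=0.0, abs_tol=tol)


def score(results):
    """Return (execution success rate, solution correctness) for a list of runs.

    Each run is a hypothetical (executed_ok, objective, expected) tuple.
    Assumption: both rates are computed over all problems, including failed runs.
    """
    total = len(results)
    if total == 0:
        return 0.0, 0.0
    executed = sum(1 for ok, _, _ in results if ok)
    correct = sum(
        1 for ok, obj, exp in results if ok and objective_matches(obj, exp)
    )
    return executed / total, correct / total
```

A call such as `score([(True, 10.0, 10.0), (False, None, 5.0)])` would count one executed run and one correct solution out of two problems.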