| --- |
| license: mit |
| tags: |
| - minizinc |
| - constraint-programming |
| - optimization |
| - code-generation |
| - lora |
| - learn2zinc |
| pipeline_tag: text-generation |
| --- |
| |
| # Learn2Zinc Models |
|
|
| Learn2Zinc is a a family of small language models fine-tuned with LoRA for translating natural-language optimization problems into executable [MiniZinc](https://www.minizinc.org/) code. All models were trained on the [learn2zinc_augmented](https://huggingface.co/datasets/skadio/learn2zinc_augmented) dataset using the [Unsloth](https://github.com/unslothai/unsloth) library. |
|
|
| ## Models |
|
|
| | Model | Base Model | Parameters | Chat Format | |
| |---|---|---|---| |
| | [learn2zinc-GPT-oss-20B](https://huggingface.co/skadio/learn2zinc-GPT-oss-20B) | GPT-OSS-20B | 20B | Harmony (custom) | |
| | [learn2zinc-Gemma-2-9B](https://huggingface.co/skadio/learn2zinc-Gemma-2-9B) | Gemma 2 9B | 9B | Standard chat template | |
| | [learn2zinc-Llama-3.2-3B](https://huggingface.co/skadio/learn2zinc-Llama-3.2-3B) | Llama 3.2 3B | 3B | Standard chat template | |
| | [learn2zinc-Llama-3.2-1B](https://huggingface.co/skadio/learn2zinc-Llama-3.2-1B) | Llama 3.2 1B | 1B | Standard chat template | |
| | [learn2zinc-Qwen3-0.6B](https://huggingface.co/skadio/learn2zinc-Qwen3-0.6B) | Qwen3 0.6B | 0.6B | Standard chat template | |
|
|
| ## Shared Training Configuration |
|
|
| All models share the same LoRA and training setup: |
|
|
| | Hyperparameter | Value | |
| |---|---| |
| | Fine-tuning method | LoRA (rank 64, alpha 64) | |
| | Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj | |
| | Learning rate | 2e-4 (cosine schedule, 50 warmup steps) | |
| | Epochs | 3 | |
| | Optimizer | AdamW 8-bit | |
| | Weight decay | 0.01 | |
| | Precision | bf16 | |
| | Quantization during training | 4-bit | |
| | Max sequence length | 4096 | |
| | Training | Response-only (SFTTrainer) | |
| |
| ## Evaluation |
| |
| Models were evaluated on the **IndustryOR** subset of the learn2zinc benchmark. Generated MiniZinc code was executed with the **HiGHS** solver (120 s timeout). All generations used **temperature = 0** for reproducibility. |
| |
| **Metrics:** Execution Success Rate (code compiles and runs) and Solution Correctness (objective matches expected value within 1e-6). |
| |
| For full evaluation details, see the [learn2zinc GitHub repo](https://github.com/skadio/learn2zinc). |
| |
| ## Datasets |
| |
| All models were trained on datasets from the [Learn2Zinc collection](https://huggingface.co/datasets/skadio/learn2zinc-collection): |
| |
| | Dataset | Strategy | Examples | |
| |---|---|---| |
| | [learn2zinc-base](https://huggingface.co/datasets/skadio/learn2zinc-base) | Direct generation | 8,014 | |
| | [learn2zinc-cot](https://huggingface.co/datasets/skadio/learn2zinc-cot) | Chain-of-thought + generation | 8,014 | |
| | [learn2zinc-augmented](https://huggingface.co/datasets/skadio/learn2zinc-augmented) | Generation + correction | 15,649 | |
| |
| ## Framework |
| |
| - [Unsloth](https://github.com/unslothai/unsloth) |
| - [PEFT / LoRA](https://github.com/huggingface/peft) |
| - [TRL SFTTrainer](https://github.com/huggingface/trl) |
| |
| ## Citation |
| |
| ```bibtex |
| @misc{kadioglu2026modelingcopilotstexttomodeltranslation, |
| title={Modeling Copilots for Text-to-Model Translation}, |
| author={Serdar Kadioglu and Karthik Uppuluri and Akash Singirikonda}, |
| year={2026}, |
| eprint={2604.12955}, |
| archivePrefix={arXiv}, |
| primaryClass={cs.AI}, |
| url={https://arxiv.org/abs/2604.12955}, |
| } |
| ``` |