---
license: mit
tags:
  - minizinc
  - constraint-programming
  - optimization
  - code-generation
  - lora
  - learn2zinc
pipeline_tag: text-generation
---

# Learn2Zinc Models

Learn2Zinc is a a family of small language models fine-tuned with LoRA for translating natural-language optimization problems into executable [MiniZinc](https://www.minizinc.org/) code. All models were trained on the [learn2zinc_augmented](https://huggingface.co/datasets/skadio/learn2zinc_augmented) dataset using the [Unsloth](https://github.com/unslothai/unsloth) library.

## Models

| Model | Base Model | Parameters | Chat Format |
|---|---|---|---|
| [learn2zinc-GPT-oss-20B](https://huggingface.co/skadio/learn2zinc-GPT-oss-20B) | GPT-OSS-20B | 20B | Harmony (custom) |
| [learn2zinc-Gemma-2-9B](https://huggingface.co/skadio/learn2zinc-Gemma-2-9B) | Gemma 2 9B | 9B | Standard chat template |
| [learn2zinc-Llama-3.2-3B](https://huggingface.co/skadio/learn2zinc-Llama-3.2-3B) | Llama 3.2 3B | 3B | Standard chat template |
| [learn2zinc-Llama-3.2-1B](https://huggingface.co/skadio/learn2zinc-Llama-3.2-1B) | Llama 3.2 1B | 1B | Standard chat template |
| [learn2zinc-Qwen3-0.6B](https://huggingface.co/skadio/learn2zinc-Qwen3-0.6B) | Qwen3 0.6B | 0.6B | Standard chat template |

## Shared Training Configuration

All models share the same LoRA and training setup:

| Hyperparameter | Value |
|---|---|
| Fine-tuning method | LoRA (rank 64, alpha 64) |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Learning rate | 2e-4 (cosine schedule, 50 warmup steps) |
| Epochs | 3 |
| Optimizer | AdamW 8-bit |
| Weight decay | 0.01 |
| Precision | bf16 |
| Quantization during training | 4-bit |
| Max sequence length | 4096 |
| Training | Response-only (SFTTrainer) |

## Evaluation

Models were evaluated on the **IndustryOR** subset of the learn2zinc benchmark. Generated MiniZinc code was executed with the **HiGHS** solver (120 s timeout). All generations used **temperature = 0** for reproducibility.

**Metrics:** Execution Success Rate (code compiles and runs) and Solution Correctness (objective matches expected value within 1e-6).

For full evaluation details, see the [learn2zinc GitHub repo](https://github.com/skadio/learn2zinc).

## Datasets

All models were trained on datasets from the [Learn2Zinc collection](https://huggingface.co/datasets/skadio/learn2zinc-collection):

| Dataset | Strategy | Examples |
|---|---|---|
| [learn2zinc-base](https://huggingface.co/datasets/skadio/learn2zinc-base) | Direct generation | 8,014 |
| [learn2zinc-cot](https://huggingface.co/datasets/skadio/learn2zinc-cot) | Chain-of-thought + generation | 8,014 |
| [learn2zinc-augmented](https://huggingface.co/datasets/skadio/learn2zinc-augmented) | Generation + correction | 15,649 |

## Framework

- [Unsloth](https://github.com/unslothai/unsloth)
- [PEFT / LoRA](https://github.com/huggingface/peft)
- [TRL SFTTrainer](https://github.com/huggingface/trl)

## Citation

```bibtex
@misc{kadioglu2026modelingcopilotstexttomodeltranslation,
      title={Modeling Copilots for Text-to-Model Translation}, 
      author={Serdar Kadioglu and Karthik Uppuluri and Akash Singirikonda},
      year={2026},
      eprint={2604.12955},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2604.12955}, 
}
```