---
license: mit
tags:
- minizinc
- constraint-programming
- optimization
- code-generation
- lora
- learn2zinc
pipeline_tag: text-generation
---
# Learn2Zinc Models
Learn2Zinc is a family of small language models fine-tuned with LoRA to translate natural-language optimization problems into executable [MiniZinc](https://www.minizinc.org/) code. All models were trained on the [learn2zinc_augmented](https://huggingface.co/datasets/skadio/learn2zinc_augmented) dataset using the [Unsloth](https://github.com/unslothai/unsloth) library.
## Models
| Model | Base Model | Parameters | Chat Format |
|---|---|---|---|
| [learn2zinc-GPT-oss-20B](https://huggingface.co/skadio/learn2zinc-GPT-oss-20B) | GPT-OSS-20B | 20B | Harmony (custom) |
| [learn2zinc-Gemma-2-9B](https://huggingface.co/skadio/learn2zinc-Gemma-2-9B) | Gemma 2 9B | 9B | Standard chat template |
| [learn2zinc-Llama-3.2-3B](https://huggingface.co/skadio/learn2zinc-Llama-3.2-3B) | Llama 3.2 3B | 3B | Standard chat template |
| [learn2zinc-Llama-3.2-1B](https://huggingface.co/skadio/learn2zinc-Llama-3.2-1B) | Llama 3.2 1B | 1B | Standard chat template |
| [learn2zinc-Qwen3-0.6B](https://huggingface.co/skadio/learn2zinc-Qwen3-0.6B) | Qwen3 0.6B | 0.6B | Standard chat template |
## Shared Training Configuration
All models share the same LoRA and training setup:
| Hyperparameter | Value |
|---|---|
| Fine-tuning method | LoRA (rank 64, alpha 64) |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Learning rate | 2e-4 (cosine schedule, 50 warmup steps) |
| Epochs | 3 |
| Optimizer | AdamW 8-bit |
| Weight decay | 0.01 |
| Precision | bf16 |
| Quantization during training | 4-bit |
| Max sequence length | 4096 |
| Training objective | Response-only loss (TRL SFTTrainer) |
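
To make the learning-rate row concrete, here is a minimal sketch of a cosine schedule with warmup, using the values from the table. It assumes linear warmup from zero and cosine decay to zero, which is a common default but is not confirmed by this card. `total_steps` is a hypothetical parameter that depends on dataset size and batch size. `lora_scale` shows the standard LoRA scaling factor alpha / rank, which is 1.0 for the settings above.

```python
import math

# Values taken from the hyperparameter table above.
BASE_LR = 2e-4
WARMUP_STEPS = 50
LORA_RANK = 64
LORA_ALPHA = 64


def lora_scale(alpha: int = LORA_ALPHA, rank: int = LORA_RANK) -> float:
    """Standard LoRA scaling factor (alpha / rank) applied to the low-rank update."""
    return alpha / rank


def lr_at(step: int, total_steps: int,
          base_lr: float = BASE_LR, warmup_steps: int = WARMUP_STEPS) -> float:
    """Learning rate at `step`, assuming linear warmup then cosine decay to zero."""
    if step < warmup_steps:
        # Linear warmup from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

For example, `lr_at(25, total_steps=1000)` is halfway through warmup and returns half the base rate, while `lr_at(50, total_steps=1000)` returns the full base rate.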
## Evaluation
Models were evaluated on the **IndustryOR** subset of the learn2zinc benchmark. Generated MiniZinc code was executed with the **HiGHS** solver (120 s timeout). All generations used **temperature = 0** for reproducibility.
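
As a rough illustration of the execution step, the sketch below builds a MiniZinc command line that selects HiGHS and applies the 120 s limit. It assumes the `minizinc` CLI is on the `PATH`, that HiGHS is registered under the solver id `highs`, and that `--time-limit` takes milliseconds. The helper names and the 10 s grace period are hypothetical, and the actual evaluation harness may invoke the solver differently.

```python
import subprocess
from typing import List, Optional

TIMEOUT_S = 120  # solver timeout stated above


def build_minizinc_command(model_path: str, data_path: Optional[str] = None,
                           timeout_s: int = TIMEOUT_S) -> List[str]:
    """Build the CLI call; --time-limit is given in milliseconds (assumed CLI usage)."""
    cmd = ["minizinc", "--solver", "highs",
           "--time-limit", str(timeout_s * 1000), model_path]
    if data_path is not None:
        cmd.append(data_path)
    return cmd


def run_model(model_path: str, data_path: Optional[str] = None,
              timeout_s: int = TIMEOUT_S) -> Optional[subprocess.CompletedProcess]:
    """Run the model; return None if the process exceeds the limit plus a grace period."""
    cmd = build_minizinc_command(model_path, data_path, timeout_s)
    try:
        # The 10 s grace period on top of the solver limit is an arbitrary choice.
        return subprocess.run(cmd, capture_output=True, text=True,
                              timeout=timeout_s + 10)
    except subprocess.TimeoutExpired:
        return None
```

A non-zero return code or a `None` result from `run_model` would count as an execution failure under this sketch.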
**Metrics:** Execution Success Rate (code compiles and runs) and Solution Correctness (objective matches expected value within 1e-6).
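
The two metrics can be sketched as follows. This assumes the 1e-6 tolerance is absolute and that both rates are computed over all benchmark problems. Neither detail is stated here, and the `Result` record is a hypothetical structure used only for illustration.

```python
from typing import List, NamedTuple, Optional, Tuple

TOLERANCE = 1e-6  # tolerance stated above; treated as absolute (assumption)


class Result(NamedTuple):
    executed: bool              # code compiled and ran
    objective: Optional[float]  # objective value reported by the solver, if any
    expected: float             # reference objective value


def is_correct(result: Result, tol: float = TOLERANCE) -> bool:
    """A solution is correct if it ran and its objective is within tol of the reference."""
    return (result.executed
            and result.objective is not None
            and abs(result.objective - result.expected) <= tol)


def score(results: List[Result]) -> Tuple[float, float]:
    """Return (execution success rate, solution correctness) over all problems."""
    if not results:
        return 0.0, 0.0
    n = len(results)
    executed = sum(r.executed for r in results)
    correct = sum(is_correct(r) for r in results)
    return executed / n, correct / n
```

With four problems of which three execute and two match the reference objective, `score` returns `(0.75, 0.5)`.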
For full evaluation details, see the [learn2zinc GitHub repo](https://github.com/skadio/learn2zinc).
## Datasets
All models were trained on datasets from the [Learn2Zinc collection](https://huggingface.co/datasets/skadio/learn2zinc-collection):
| Dataset | Strategy | Examples |
|---|---|---|
| [learn2zinc-base](https://huggingface.co/datasets/skadio/learn2zinc-base) | Direct generation | 8,014 |
| [learn2zinc-cot](https://huggingface.co/datasets/skadio/learn2zinc-cot) | Chain-of-thought + generation | 8,014 |
| [learn2zinc-augmented](https://huggingface.co/datasets/skadio/learn2zinc-augmented) | Generation + correction | 15,649 |
## Framework
- [Unsloth](https://github.com/unslothai/unsloth)
- [PEFT / LoRA](https://github.com/huggingface/peft)
- [TRL SFTTrainer](https://github.com/huggingface/trl)
## Citation
```bibtex
@misc{kadioglu2026modelingcopilotstexttomodeltranslation,
title={Modeling Copilots for Text-to-Model Translation},
author={Serdar Kadioglu and Karthik Uppuluri and Akash Singirikonda},
year={2026},
eprint={2604.12955},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2604.12955},
}
```