---
language:
- en
license: apache-2.0
tags:
- duoneural
- sft
- qwen
- qwen2.5-coder
base_model: Qwen/Qwen2.5-Coder-3B-Instruct
datasets:
- DuoNeural/Gemma4-E2B-SFT-SQL
---
# Qwen2.5-Coder-3B-SFT-SQL
A supervised fine-tune (SFT) released by [DuoNeural](https://huggingface.co/DuoNeural).
- **Base model:** [Qwen/Qwen2.5-Coder-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-3B-Instruct)
- **Dataset:** [DuoNeural/Gemma4-E2B-SFT-SQL](https://huggingface.co/datasets/DuoNeural/Gemma4-E2B-SFT-SQL)
- **Training:** LoRA (rank=16, α=32), 3 epochs, learning rate 2e-4, effective batch size 16
- **Training time:** 122.8 min
- **Eval:** GSM8K + ARC-Challenge via lm_eval 0.4.x
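For quick experimentation, the model's Qwen-style ChatML prompt can be built by hand. The sketch below is illustrative only: the system/user wording and the schema example are hypothetical, not the training template (which this card does not document).

```python
# Minimal sketch: build a Qwen2.5 ChatML prompt for a text-to-SQL request.
# The prompt wording here is an assumption, not the card's official template.

def build_chatml_prompt(schema: str, question: str) -> str:
    """Format a text-to-SQL request in Qwen's ChatML chat format."""
    system = "You are a helpful assistant that writes SQL."
    user = f"Schema:\n{schema}\n\nQuestion: {question}"
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"  # model continues from here
    )

prompt = build_chatml_prompt(
    "CREATE TABLE users (id INT, name TEXT);",
    "How many users are there?",
)
print(prompt)
```

In practice, `transformers` handles this formatting for you via `tokenizer.apply_chat_template` on a list of role/content messages.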
## Benchmark Results
| Model | GSM8K (flexible-extract) | ARC-C (acc_norm) | ARC-C (acc) |
|---|---|---|---|
| Baseline | 0.5807 | 0.4957 | 0.4590 |
| **Qwen2.5-Coder-3B-SFT-SQL** | **0.2760** | **0.4949** | **0.4633** |
| Δ | -0.3048 | -0.0009 | +0.0043 |
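The Δ row was presumably computed from unrounded scores; recomputing it from the rounded values shown in the table gives the same picture, with last-digit differences on the first two columns. A minimal check:

```python
# Recompute the Δ row from the rounded scores in the benchmark table.
# Full-precision eval outputs would explain the ±0.0001 differences vs the card.
baseline = {"gsm8k_flex": 0.5807, "arc_norm": 0.4957, "arc_acc": 0.4590}
sft = {"gsm8k_flex": 0.2760, "arc_norm": 0.4949, "arc_acc": 0.4633}

delta = {k: round(sft[k] - baseline[k], 4) for k in baseline}
print(delta)
```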
## About DuoNeural
DuoNeural is a post-training research lab exploring emergent behaviors in small language models.
We publish datasets, models, and [research papers](https://zenodo.org/communities/duoneural).
---
*Generated by Archon, the DuoNeural lab AI*