---
language:
- en
tags:
- duoneural
- gguf
- qwen
- sft
- structured-output
- sql
- json
- webcode
base_model: DuoNeural/Qwen2.5-Coder-3B-SFT-StructuredOutput
license: apache-2.0
---

# Qwen2.5-Coder-3B-SFT-StructuredOutput — GGUF

GGUF quantizations of [DuoNeural/Qwen2.5-Coder-3B-SFT-StructuredOutput](https://huggingface.co/DuoNeural/Qwen2.5-Coder-3B-SFT-StructuredOutput).

Multi-task SFT on a combined 12,235 examples: SQL (7,560), JSON (3,568), and WebCode (1,107).
**GSM8K flexible improves by 20.4%** relative to base Qwen2.5-Coder-3B (0.582→0.701). GSM8K strict and both ARC metrics stay within about 1% of baseline.

## Eval vs Baseline

Deltas are relative changes from the baseline score, not percentage-point differences.

| Metric | Baseline | Multitask SFT | Delta |
|--------|----------|---------------|-------|
| GSM8K flexible | 0.5823 | 0.7013 | **+20.4%** |
| GSM8K strict | 0.6937 | 0.6907 | -0.4% |
| ARC-acc | 0.4556 | 0.4522 | -0.7% |
| ARC-norm | 0.4898 | 0.4949 | +1.0% |
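
The Delta column appears to be the relative change `(SFT - baseline) / baseline`. A minimal sketch for recomputing a row from the scores in the table follows; the `rel_delta` helper is hypothetical and not part of any evaluation tooling, and results derived from the rounded table values may differ in the last digit from figures computed on unrounded scores.

```bash
# rel_delta BASELINE SCORE
# Prints the relative change from BASELINE to SCORE as a signed percentage.
# Hypothetical helper for illustration; inputs are the rounded table values.
rel_delta() {
  awk -v b="$1" -v s="$2" 'BEGIN { printf "%+.1f%%\n", (s - b) / b * 100 }'
}

rel_delta 0.4898 0.4949   # ARC-norm row, prints +1.0%
rel_delta 0.6937 0.6907   # GSM8K strict row, prints -0.4%
```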

## Available Quants

| File | Size | Use case |
|------|------|----------|
| `*-Q2_K.gguf` | ~1.5 GB | Minimum size, CPU inference |
| `*-Q3_K_M.gguf` | ~1.9 GB | Small with decent quality |
| `*-Q4_K_M.gguf` | ~2.2 GB | **Recommended** — best size/quality |
| `*-Q5_K_M.gguf` | ~2.5 GB | High quality |
| `*-Q6_K.gguf` | ~2.9 GB | Very high quality |
| `*-Q8_0.gguf` | ~3.7 GB | Near-lossless |
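
As a rough guide to choosing a file, the sketch below picks the largest quant from the table above that fits a given memory budget. The `pick_quant` helper and the 1.0 GB default headroom (a stand-in for KV cache and runtime overhead) are assumptions for illustration only; actual memory use depends on context length and backend, so treat the result as a starting point.

```bash
# Approximate file sizes in GB, copied from the table above (smallest first).
QUANTS="Q2_K:1.5 Q3_K_M:1.9 Q4_K_M:2.2 Q5_K_M:2.5 Q6_K:2.9 Q8_0:3.7"

# pick_quant BUDGET_GB [HEADROOM_GB]
# Prints the largest quant whose file size plus headroom fits within the budget.
pick_quant() {
  budget="$1"
  headroom="${2:-1.0}"   # assumed allowance, not a measured figure
  best=""
  for entry in $QUANTS; do
    name="${entry%%:*}"  # text before the colon
    size="${entry##*:}"  # text after the colon
    # awk does the floating-point comparison; exit status 0 means "fits".
    if awk -v s="$size" -v h="$headroom" -v b="$budget" 'BEGIN { exit !(s + h <= b) }'; then
      best="$name"       # the list is sorted, so the last fit is the largest
    fi
  done
  if [ -z "$best" ]; then
    echo "no quant fits in ${budget} GB" >&2
    return 1
  fi
  echo "$best"
}

pick_quant 4   # prints Q6_K (2.9 GB file + 1.0 GB headroom fits in 4 GB)
```

With plenty of memory the helper simply returns `Q8_0`; `Q4_K_M` remains the recommended default when size and quality both matter.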

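## Usage (llama.cpp)

Run a one-off prompt with `llama-cli`. `-m` selects the model file, `-p` sets the prompt, and `-n` caps generation at 256 tokens.

```bash
llama-cli -m Qwen2.5-Coder-3B-SFT-StructuredOutput-Q4_K_M.gguf \
  -p "Write a SQL query to find all users who signed up in the last 30 days" \
  -n 256
```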

---

## DuoNeural

**DuoNeural** is an open AI research lab — human + AI in collaboration.

| Platform | Link |
|----------|------|
| HuggingFace | [huggingface.co/DuoNeural](https://huggingface.co/DuoNeural) |
| Website | [duoneural.com](https://duoneural.com) |
| GitHub | [github.com/DuoNeural](https://github.com/DuoNeural) |
| X / Twitter | [@DuoNeural](https://x.com/DuoNeural) |

*Subscribe: [duoneural.beehiiv.com](https://duoneural.beehiiv.com)*