---
language:
- en
tags:
- duoneural
- gguf
- qwen
- sft
- structured-output
- sql
- json
- webcode
base_model: DuoNeural/Qwen2.5-Coder-3B-SFT-StructuredOutput
license: apache-2.0
---
# Qwen2.5-Coder-3B-SFT-StructuredOutput — GGUF
GGUF quantizations of [DuoNeural/Qwen2.5-Coder-3B-SFT-StructuredOutput](https://huggingface.co/DuoNeural/Qwen2.5-Coder-3B-SFT-StructuredOutput).
Multi-task SFT on SQL (7,560) + JSON (3,568) + WebCode (1,107) = 12,235 examples combined.
**GSM8K flexible +20.5%** relative to base Qwen2.5-Coder-3B (0.582→0.701, about +11.9 points absolute). GSM8K strict and ARC are essentially unchanged.
## Eval vs Baseline
| Metric | Baseline | Multitask SFT | Delta (relative) |
|--------|----------|---------------|-------|
| GSM8K flexible | 0.5823 | 0.7013 | **+20.5%** |
| GSM8K strict | 0.6937 | 0.6907 | -0.4% |
| ARC-acc | 0.4556 | 0.4522 | -0.7% |
| ARC-norm | 0.4898 | 0.4949 | +1.0% |
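
The Delta column is a relative change, (SFT - baseline) / baseline, not a difference in percentage points. As a minimal sketch, the values can be recomputed from the table with a small shell helper; `rel_delta` is a hypothetical name used here for illustration, and it assumes `awk` is available. Because the table scores are rounded to four decimals, the recomputed last digit can differ slightly from the reported one (GSM8K flexible comes out at +20.4% here).

```bash
# Hypothetical helper (not part of this repo): relative change between a
# baseline score ($1) and a fine-tuned score ($2), as a signed percentage.
rel_delta() {
  awk -v base="$1" -v sft="$2" \
    'BEGIN { printf "%+.1f%%\n", (sft - base) / base * 100 }'
}

rel_delta 0.5823 0.7013   # GSM8K flexible -> +20.4%
rel_delta 0.6937 0.6907   # GSM8K strict   -> -0.4%
rel_delta 0.4556 0.4522   # ARC-acc        -> -0.7%
rel_delta 0.4898 0.4949   # ARC-norm       -> +1.0%
```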
## Available Quants
| File | Size | Use case |
|------|------|----------|
| `*-Q2_K.gguf` | ~1.5 GB | Minimum size, CPU inference |
| `*-Q3_K_M.gguf` | ~1.9 GB | Small with decent quality |
| `*-Q4_K_M.gguf` | ~2.2 GB | **Recommended**: best size/quality trade-off |
| `*-Q5_K_M.gguf` | ~2.5 GB | High quality |
| `*-Q6_K.gguf` | ~2.9 GB | Very high quality |
| `*-Q8_0.gguf` | ~3.7 GB | Near-lossless |
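
One way to use this table is to pick the largest quant whose file fits a given size budget. The sketch below does that with the approximate sizes listed above; `QUANTS` and `pick_quant` are hypothetical names for illustration, not part of this repo. The budget covers the file only, so leave extra headroom for the context (KV cache) and runtime overhead.

```bash
# Approximate file sizes in GB, copied from the table above.
# Ordered largest first so the first match is the largest quant that fits.
QUANTS="Q8_0:3.7 Q6_K:2.9 Q5_K_M:2.5 Q4_K_M:2.2 Q3_K_M:1.9 Q2_K:1.5"

# Hypothetical helper: print the largest quant whose file size is at most
# the budget in GB ($1). Prints nothing and returns 1 if none fits.
pick_quant() {
  local budget="$1" entry name size
  for entry in $QUANTS; do
    name="${entry%%:*}"   # text before the colon, e.g. Q4_K_M
    size="${entry##*:}"   # text after the colon, e.g. 2.2
    if awk -v s="$size" -v b="$budget" 'BEGIN { exit !(s + 0 <= b + 0) }'; then
      printf '%s\n' "$name"
      return 0
    fi
  done
  return 1
}

pick_quant 2.6   # prints Q5_K_M
pick_quant 4     # prints Q8_0
```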
## Usage (llama.cpp)
```bash
# Single-prompt generation with the recommended Q4_K_M quant; -n caps output at 256 tokens
llama-cli -m Qwen2.5-Coder-3B-SFT-StructuredOutput-Q4_K_M.gguf \
  -p "Write a SQL query to find all users who signed up in the last 30 days" \
  -n 256
```
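The card lists three task families (SQL, JSON, WebCode), so a quick smoke test is to run one prompt per family. A minimal sketch that reuses only the flags from the command above: `run_prompt`, `try_all_tasks`, and the `LLAMA_CLI` and `MODEL` variables are hypothetical names, and the JSON and HTML prompts are made-up examples rather than samples from the training data. It assumes `llama-cli` is on `PATH` and the GGUF file is in the current directory. Nothing runs until you call `try_all_tasks`; set `MODEL` to another file from the quant table to compare quants. If your build of `llama-cli` opens an interactive session, each call waits until you exit it.

```bash
# Illustrative variables (assumptions, not part of the model repo).
LLAMA_CLI="${LLAMA_CLI:-llama-cli}"
MODEL="${MODEL:-Qwen2.5-Coder-3B-SFT-StructuredOutput-Q4_K_M.gguf}"

# Run one prompt ($1) with the same flags as the command above.
run_prompt() {
  "$LLAMA_CLI" -m "$MODEL" -p "$1" -n 256
}

# One made-up prompt per task family: SQL, JSON, web code.
try_all_tasks() {
  run_prompt "Write a SQL query to find all users who signed up in the last 30 days"
  run_prompt "Return a JSON object with fields name, email, and signup_date for a sample user"
  run_prompt "Write an HTML form with a labeled email input and a submit button"
}
```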
---
## DuoNeural
**DuoNeural** is an open AI research lab — human + AI in collaboration.
| Platform | Link |
|----------|------|
| HuggingFace | [huggingface.co/DuoNeural](https://huggingface.co/DuoNeural) |
| Website | [duoneural.com](https://duoneural.com) |
| GitHub | [github.com/DuoNeural](https://github.com/DuoNeural) |
| X / Twitter | [@DuoNeural](https://x.com/DuoNeural) |
*Subscribe: [duoneural.beehiiv.com](https://duoneural.beehiiv.com)*