Qwen2.5-Coder-3B-SFT-StructuredOutput - GGUF
GGUF quantizations of DuoNeural/Qwen2.5-Coder-3B-SFT-StructuredOutput.
Multi-task SFT on SQL (7,560), JSON (3,568), and WebCode (1,107) examples, 12,235 in total. GSM8K flexible improves by 20.5% relative to the base Qwen2.5-Coder-3B (0.582 → 0.701). ARC is stable.
Eval vs Baseline
| Metric | Baseline | Multitask SFT | Delta (relative) |
|---|---|---|---|
| GSM8K flexible | 0.5823 | 0.7013 | +20.5% |
| GSM8K strict | 0.6937 | 0.6907 | -0.4% |
| ARC-acc | 0.4556 | 0.4522 | -0.7% |
| ARC-norm | 0.4898 | 0.4949 | +1.0% |
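The Delta column is consistent with a relative change, (SFT - baseline) / baseline, rather than a difference in percentage points. The following is a minimal sketch that recomputes it from the rounded scores in the table; the helper name `relative_delta_pct` is hypothetical, and last-digit differences against the table are possible because the published scores are themselves rounded.

```python
# Scores copied from the table above: (baseline, multitask SFT).
scores = {
    "GSM8K flexible": (0.5823, 0.7013),
    "GSM8K strict": (0.6937, 0.6907),
    "ARC-acc": (0.4556, 0.4522),
    "ARC-norm": (0.4898, 0.4949),
}


def relative_delta_pct(baseline: float, tuned: float) -> float:
    """Relative change of the tuned score over the baseline, in percent."""
    return (tuned - baseline) / baseline * 100.0


# Recompute the Delta column for every metric.
deltas = {name: relative_delta_pct(b, t) for name, (b, t) in scores.items()}

for name, pct in deltas.items():
    print(f"{name:15s} {pct:+.1f}%")
```

Read this way, the GSM8K flexible gain is about 12 percentage points in absolute terms (0.5823 to 0.7013), which corresponds to roughly 20% in relative terms.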
Available Quants
| File | Size | Use case |
|---|---|---|
| *-Q2_K.gguf | ~1.5 GB | Minimum size, CPU inference |
| *-Q3_K_M.gguf | ~1.9 GB | Small with decent quality |
| *-Q4_K_M.gguf | ~2.2 GB | Recommended: best size/quality trade-off |
| *-Q5_K_M.gguf | ~2.5 GB | High quality |
| *-Q6_K.gguf | ~2.9 GB | Very high quality |
| *-Q8_0.gguf | ~3.7 GB | Near-lossless |
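One way to use the table is to pick the largest quant that fits a memory budget. The sketch below assumes the approximate file sizes listed above and reserves a fixed headroom for the KV cache and runtime overhead; the function `pick_quant` and the 1 GB default headroom are illustrative assumptions, not measured requirements.

```python
from typing import Optional

# Approximate file sizes in GB, copied from the table above.
QUANT_SIZES_GB = {
    "Q2_K": 1.5,
    "Q3_K_M": 1.9,
    "Q4_K_M": 2.2,
    "Q5_K_M": 2.5,
    "Q6_K": 2.9,
    "Q8_0": 3.7,
}


def pick_quant(available_gb: float, headroom_gb: float = 1.0) -> Optional[str]:
    """Return the largest quant whose file fits in the budget, or None.

    headroom_gb is an assumed allowance for context and runtime overhead;
    the real overhead depends on context length and backend.
    """
    budget = available_gb - headroom_gb
    fitting = [(size, name) for name, size in QUANT_SIZES_GB.items() if size <= budget]
    # max() compares by size first, so the largest fitting file wins.
    return max(fitting)[1] if fitting else None


if __name__ == "__main__":
    for ram in (2.0, 3.5, 4.0, 8.0):
        print(f"{ram:.1f} GB available -> {pick_quant(ram)}")
```

File size is only a lower bound on memory use, so treat the result as a starting point and step down one quant if loading fails.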
Usage (llama.cpp)
llama-cli -m Qwen2.5-Coder-3B-SFT-StructuredOutput-Q4_K_M.gguf \
-p "Write a SQL query to find all users who signed up in the last 30 days" \
-n 256
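The same invocation can be driven from a script. This is a minimal sketch that wraps the command above with Python's `subprocess` module, using only the `-m`, `-p`, and `-n` flags already shown; the helper names `build_llama_cli_args` and `run_prompt` are hypothetical, and it assumes `llama-cli` is on `PATH` and the GGUF file is in the working directory.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List

MODEL = "Qwen2.5-Coder-3B-SFT-StructuredOutput-Q4_K_M.gguf"


def build_llama_cli_args(model_path: str, prompt: str, max_tokens: int = 256) -> List[str]:
    """Build the argument list for the llama-cli command shown above."""
    return ["llama-cli", "-m", str(model_path), "-p", prompt, "-n", str(max_tokens)]


def run_prompt(model_path: str, prompt: str, max_tokens: int = 256) -> str:
    """Run llama-cli once and return whatever it wrote to stdout."""
    if shutil.which("llama-cli") is None:
        raise FileNotFoundError("llama-cli not found on PATH")
    if not Path(model_path).is_file():
        raise FileNotFoundError(f"model file not found: {model_path}")
    result = subprocess.run(
        build_llama_cli_args(model_path, prompt, max_tokens),
        stdin=subprocess.DEVNULL,  # assumption: closing stdin avoids waiting on interactive input
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    print(run_prompt(
        MODEL,
        "Write a SQL query to find all users who signed up in the last 30 days",
    ))
```

Passing the arguments as a list avoids shell quoting issues when the prompt itself contains quotes, which is common for SQL and JSON prompts.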
DuoNeural
DuoNeural is an open AI research lab: human + AI in collaboration.
| Platform | Link |
|---|---|
| HuggingFace | huggingface.co/DuoNeural |
| Website | duoneural.com |
| GitHub | github.com/DuoNeural |
| X / Twitter | @DuoNeural |
Subscribe: duoneural.beehiiv.com