language:
  - en
tags:
  - duoneural
  - gguf
  - qwen
  - sft
  - structured-output
  - sql
  - json
  - webcode
base_model: DuoNeural/Qwen2.5-Coder-3B-SFT-StructuredOutput
license: apache-2.0

Qwen2.5-Coder-3B-SFT-StructuredOutput — GGUF

GGUF quantizations of DuoNeural/Qwen2.5-Coder-3B-SFT-StructuredOutput.

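Multi-task SFT on 12,235 combined examples: SQL (7,560), JSON (3,568), and WebCode (1,107). GSM8K flexible improves by 20.5% relative to the base Qwen2.5-Coder-3B (0.582→0.701, about 12 points absolute), while GSM8K strict and ARC stay within about 1% of the baseline.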

Eval vs Baseline

| Metric | Baseline | Multitask SFT | Delta (relative) |
|---|---|---|---|
| GSM8K flexible | 0.5823 | 0.7013 | +20.5% |
| GSM8K strict | 0.6937 | 0.6907 | -0.4% |
| ARC-acc | 0.4556 | 0.4522 | -0.7% |
| ARC-norm | 0.4898 | 0.4949 | +1.0% |
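
The Delta column appears to be the relative change from the baseline score, that is, (new - base) / base. The sketch below recomputes it with a small helper; `rel_delta` is a hypothetical name introduced here for illustration, and values recomputed from the rounded scores can differ from the reported ones in the last digit.

```shell
# rel_delta BASE NEW: print the relative change from BASE to NEW
# as a signed percentage with one decimal place.
# (Hypothetical helper, not part of the model release.)
rel_delta() {
  awk -v base="$1" -v new="$2" \
    'BEGIN { printf "%+.1f%%\n", (new - base) / base * 100 }'
}

rel_delta 0.6937 0.6907   # GSM8K strict, prints -0.4%
rel_delta 0.4898 0.4949   # ARC-norm, prints +1.0%
```

Reading the column as relative rather than absolute matters mostly for GSM8K flexible, where the absolute gain is about 12 points.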

Available Quants

| File | Size | Use case |
|---|---|---|
| `*-Q2_K.gguf` | ~1.5 GB | Minimum size, CPU inference |
| `*-Q3_K_M.gguf` | ~1.9 GB | Small with decent quality |
| `*-Q4_K_M.gguf` | ~2.2 GB | Recommended: best size/quality balance |
| `*-Q5_K_M.gguf` | ~2.5 GB | High quality |
| `*-Q6_K.gguf` | ~2.9 GB | Very high quality |
| `*-Q8_0.gguf` | ~3.7 GB | Near-lossless |
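
As a rough way to choose among these files, the sketch below picks the largest quant whose approximate file size fits a given budget. `pick_quant` is a hypothetical helper, the sizes are the approximate figures from the table (treating 1 GB as 1000 MB), and the budget should leave headroom because the runtime also needs memory for the context and KV cache beyond the file itself.

```shell
# pick_quant BUDGET_MB: print the largest quant tag whose approximate
# file size fits within BUDGET_MB. Sizes are the rounded table values.
# (Hypothetical helper, not part of the model release.)
pick_quant() {
  budget_mb="$1"
  choice=""
  # Entries are ordered smallest to largest, so the last match wins.
  for entry in Q2_K:1500 Q3_K_M:1900 Q4_K_M:2200 Q5_K_M:2500 Q6_K:2900 Q8_0:3700; do
    tag="${entry%%:*}"
    size_mb="${entry##*:}"
    if [ "$size_mb" -le "$budget_mb" ]; then
      choice="$tag"
    fi
  done
  if [ -z "$choice" ]; then
    echo "no quant fits in ${budget_mb} MB" >&2
    return 1
  fi
  echo "$choice"
}

pick_quant 2400   # prints Q4_K_M
```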

Usage (llama.cpp)

llama-cli -m Qwen2.5-Coder-3B-SFT-StructuredOutput-Q4_K_M.gguf \
  -p "Write a SQL query to find all users who signed up in the last 30 days" \
  -n 256
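
Because the model is tuned for structured output, it may help to constrain generation to a JSON schema. The sketch below wraps the command above in a function and adds llama.cpp's `--json-schema` option, assuming a llama.cpp build recent enough to support it. The function name `run_structured`, the `LLAMA_CLI` override variable, and the schema itself are illustrative assumptions, not something shipped with the model.

```shell
# run_structured MODEL_PATH PROMPT: run one prompt with output
# constrained to a JSON schema.
# LLAMA_CLI (hypothetical override) selects the binary; defaults to llama-cli.
run_structured() {
  model_path="$1"
  prompt="$2"
  # Made-up example schema: a table name plus a list of column names.
  schema='{"type":"object","properties":{"table":{"type":"string"},"columns":{"type":"array","items":{"type":"string"}}},"required":["table","columns"]}'
  "${LLAMA_CLI:-llama-cli}" -m "$model_path" \
    -p "$prompt" \
    -n 256 \
    --temp 0 \
    --json-schema "$schema"
}

# Example call (needs llama-cli on PATH and the GGUF file on disk):
# run_structured Qwen2.5-Coder-3B-SFT-StructuredOutput-Q4_K_M.gguf \
#   "Describe a users table as JSON with its table name and column names"
```

Setting the temperature to 0 makes decoding greedy, which tends to suit schema-constrained output; raise it if more varied results are wanted.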

DuoNeural

DuoNeural is an open AI research lab — human + AI in collaboration.

| Platform | Link |
|---|---|
| HuggingFace | huggingface.co/DuoNeural |
| Website | duoneural.com |
| GitHub | github.com/DuoNeural |
| X / Twitter | @DuoNeural |

Subscribe: duoneural.beehiiv.com