DuoNeural/qwen32b-all-datasets-sft

QLoRA SFT adapter for Qwen2.5-32B-Instruct, trained on the full DuoNeural synthetic dataset collection: instruction following, structured outputs (JSON/SQL), web code generation, and domain-specific reasoning tasks.

This adapter is part of our ongoing effort to understand how synthetic post-training affects a large foundation model's reasoning and structured-output capabilities, and whether small, targeted SFT datasets can meaningfully shift performance on standard benchmarks.

Model Details

Property	Value
Base Model	Qwen/Qwen2.5-32B-Instruct
Training Method	QLoRA (4-bit base + BF16 LoRA)
Hardware	NVIDIA A100 80GB
Training Data	DuoNeural synthetic SFT collection (5 datasets)
Available Checkpoints	epoch_1, epoch_2, epoch_3 (partial — see notes)

Training Datasets

Dataset	Domain
DuoNeural LIMA Instruction	Instruction following (LIMA-derived)
DuoNeural ArchonLatentGeo	Geometric/spatial reasoning
DuoNeural JSON Structured	JSON schema generation and completion
DuoNeural SQL Expert	SQL query generation across dialects
DuoNeural WebCode	Frontend web code generation (HTML/CSS/JS)

Training Notes

Epochs 1 and 2 completed fully
Epoch 3 checkpoint saved at step ~803/1019 due to pod interruption — treat as a strong late-epoch checkpoint, not a completed epoch
Recommendation: use epoch_2/ for a clean, fully trained adapter, or epoch_3/ for the latest available weights (benchmarks comparing the two are still pending)
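
To confirm which checkpoint folders are actually present before picking one, it can help to look for PEFT's `adapter_config.json` under each folder. The following is a minimal sketch, assuming each checkpoint sits in its own top-level folder (`epoch_1/`, `epoch_2/`, `epoch_3/`) as the notes above suggest. The helper name `adapter_subfolders` and the sample file listing are hypothetical and not taken from the repository.

```python
from pathlib import PurePosixPath


def adapter_subfolders(repo_files):
    """Return sorted top-level folders that contain an adapter_config.json."""
    folders = set()
    for name in repo_files:
        path = PurePosixPath(name)
        # A PEFT adapter directory is identified by its adapter_config.json.
        if path.name == "adapter_config.json" and len(path.parts) == 2:
            folders.add(path.parts[0])
    return sorted(folders)


# Hypothetical listing for illustration only; a real listing could be obtained
# with huggingface_hub.list_repo_files("DuoNeural/qwen32b-all-datasets-sft").
example_files = [
    "README.md",
    "epoch_1/adapter_config.json",
    "epoch_1/adapter_model.safetensors",
    "epoch_2/adapter_config.json",
    "epoch_3/adapter_config.json",
]
print(adapter_subfolders(example_files))
```

The helper only inspects file names, so it works the same on a hub listing or on the output of walking a local clone.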

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

base_id    = "Qwen/Qwen2.5-32B-Instruct"
adapter_id = "DuoNeural/qwen32b-all-datasets-sft"

# Load 4-bit base (matches training setup)
bnb_cfg = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id,
    quantization_config=bnb_cfg,
    device_map="auto",
)

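# Load adapter: each epoch lives in a subfolder of the adapter repo, so select it
# with `subfolder` (appending "/epoch_2" to the repo id is not a valid hub id)
model = PeftModel.from_pretrained(base, adapter_id, subfolder="epoch_2", is_trainable=False)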

# Inference
messages = [{"role": "user", "content": "Generate a JSON schema for a product catalog."}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=512, do_sample=False)
print(tokenizer.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
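
Because the example prompt asks for a JSON schema, it is often useful to check that the decoded reply actually contains parseable JSON before passing it downstream. The sketch below is one hedged way to do that with the standard library only. The helper name `extract_json` and the sample reply are illustrative assumptions, not part of this model's tooling or real model output.

```python
import json


def extract_json(text):
    """Parse the first JSON object found in a model response, or return None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value beginning at `start` and
            # ignores whatever text follows it (prose, closing fences, etc.).
            obj, _ = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            # Not valid JSON from this brace; try the next opening brace.
            start = text.find("{", start + 1)
    return None


# Made-up reply for illustration; in the usage snippet this would be the
# string returned by tokenizer.decode(...).
sample_reply = (
    "Here is the schema:\n"
    '{"type": "object", "required": ["name"]}\n'
    "Let me know if you need more fields."
)
schema = extract_json(sample_reply)
print(schema)
```

A `None` result signals that the reply held no parseable object, which is a cheap first check before any stricter schema validation.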

VRAM requirements:

4-bit inference: ~20–22 GB (A100 40GB, RTX 3090/4090, A6000)
BF16 inference: ~65 GB (A100 80GB, H100)
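
As a rough cross-check of these figures, weight memory can be estimated as parameter count times bytes per parameter. This is a back-of-the-envelope sketch, assuming roughly 32.5B parameters for Qwen2.5-32B (the count given on the base model's card, not a figure measured for this adapter). It covers weights only and ignores activations, the KV cache, and quantization metadata.

```python
def weight_memory_gb(n_params, bits_per_param):
    """Approximate memory for model weights alone, in decimal gigabytes."""
    return n_params * bits_per_param / 8 / 1e9


# Assumption: approximate parameter count from the Qwen2.5-32B model card.
N_PARAMS = 32.5e9

bf16_gb = weight_memory_gb(N_PARAMS, 16)
nf4_gb = weight_memory_gb(N_PARAMS, 4)
print(f"BF16 weights:  {bf16_gb:.2f} GB")
print(f"4-bit weights: {nf4_gb:.2f} GB")
```

Under that assumption the BF16 estimate is in line with the ~65 GB figure, while the 4-bit estimate comes out below the quoted 20–22 GB. The difference is plausibly runtime overhead (quantization constants, any layers kept in higher precision, activations, KV cache), so actual usage will vary with context length and batch size.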

Benchmark Status

Benchmarks (GSM8K, ARC-Challenge, HellaSwag) against the Qwen2.5-32B-Instruct base are in progress. Results will be added here once complete.

If SFT improves benchmark scores, we will release quantized versions (GGUF, GPTQ, AWQ, EXL2) for broader use.

DuoNeural

DuoNeural is an open AI research lab — human + AI in collaboration.

Platform	Link
HuggingFace	huggingface.co/DuoNeural
Website	duoneural.com
GitHub	github.com/DuoNeural
X / Twitter	@DuoNeural
Email	duoneural@proton.me
Newsletter	duoneural.beehiiv.com
Support	buymeacoffee.com/duoneural

DuoNeural Research Publications

Title	DOI
Nano-CTM: Ternary Continuous Thought Machines with Thought-Space Self-Prediction for Efficient Iterative Reasoning	10.5281/zenodo.19775622
Recurrence as World Model: CTM Learns Implicit Belief States in Partially Observable Physical Environments	10.5281/zenodo.19810620
Per-Object Slot Decomposition for Scalable Neural World Modeling: When Does Attention Beat Mean-Field?	10.5281/zenodo.19846804
The Dynamical Horizon Principle: CTM Gates Converge to the Predictability Limit of Dynamical Systems	10.5281/zenodo.19952612
DHP as Universal Cognitive Constraint: Gradient Descent, Evolution, and Cellular Chemistry Converge on the Lyapunov Time	10.5281/zenodo.20080396

Open access, CC BY 4.0. Authored by Archon, Jesse Caldwell, Aura — DuoNeural.

Research Team

Jesse — Vision, hardware, direction
Archon — Lab Director, post-training, abliteration, experiments
Aura — Research AI, literature synthesis, novel proposals

Subscribe to the lab newsletter at duoneural.beehiiv.com for model drops before they go anywhere else.
