DuoNeural/qwen32b-all-datasets-sft
QLoRA SFT adapter for Qwen2.5-32B-Instruct, trained on the full DuoNeural synthetic dataset collection: instruction following, structured outputs (JSON/SQL), web code generation, and domain-specific reasoning tasks.
Part of our ongoing effort to understand how synthetic post-training affects a large foundation model's reasoning and structured output capabilities, and whether small, targeted SFT datasets can meaningfully shift performance on standard benchmarks.
Model Details
| Property | Value |
|---|---|
| Base Model | Qwen/Qwen2.5-32B-Instruct |
| Training Method | QLoRA (4-bit base + BF16 LoRA) |
| Hardware | NVIDIA A100 80GB |
| Training Data | DuoNeural synthetic SFT collection (5 datasets) |
| Available Checkpoints | epoch_1, epoch_2, epoch_3 (partial; see notes) |
Training Datasets
| Dataset | Domain |
|---|---|
| DuoNeural LIMA Instruction | Instruction following (LIMA-derived) |
| DuoNeural ArchonLatentGeo | Geometric/spatial reasoning |
| DuoNeural JSON Structured | JSON schema generation and completion |
| DuoNeural SQL Expert | SQL query generation across dialects |
| DuoNeural WebCode | Frontend web code generation (HTML/CSS/JS) |
Training Notes
- Epochs 1 and 2 completed fully
- Epoch 3 checkpoint saved at step ~803/1019 due to a pod interruption; treat it as a strong late-epoch checkpoint, not a completed epoch
- Recommendation: use `epoch_2/` for a clean, fully trained adapter, or `epoch_3/` for the best available weights
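To put the partial epoch in perspective, the step counts above can be turned into a completion fraction. The sketch below assumes that 1019 is the number of optimizer steps in one epoch and that 803 is close to the exact step at interruption (the notes give it only approximately). `pick_checkpoint` is a hypothetical helper for illustration, not part of any library.

```python
# Step counts taken from the training notes; the interruption step is approximate.
STEPS_PER_EPOCH = 1019      # assumption: 1019 steps make up one full epoch
EPOCH_3_STEPS_DONE = 803


def epoch_3_progress() -> float:
    """Fraction of epoch 3 completed before the pod interruption."""
    return EPOCH_3_STEPS_DONE / STEPS_PER_EPOCH


def pick_checkpoint(require_complete_epoch: bool = True) -> str:
    """Return the checkpoint subfolder name (hypothetical helper).

    epoch_2 is the last fully completed epoch; epoch_3 has seen more
    training steps but stopped partway through.
    """
    return "epoch_2" if require_complete_epoch else "epoch_3"


if __name__ == "__main__":
    print(f"Epoch 3 progress: {epoch_3_progress():.1%}")
    print(f"Total epochs seen by epoch_3: {2 + epoch_3_progress():.2f}")
    print(f"Default choice: {pick_checkpoint()}")
```

Under these assumptions, the `epoch_3` weights have seen a little under 2.8 epochs of data.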
Usage
```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

base_id = "Qwen/Qwen2.5-32B-Instruct"
adapter_id = "DuoNeural/qwen32b-all-datasets-sft"

# Load the base model in 4-bit NF4 (matches the training setup)
bnb_cfg = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id,
    quantization_config=bnb_cfg,
    device_map="auto",
)

# Load the adapter; choose the epoch via `subfolder`, since a Hub repo id
# cannot contain a nested path such as ".../epoch_2"
model = PeftModel.from_pretrained(
    base, adapter_id, subfolder="epoch_2", is_trainable=False
)

# Inference (greedy decoding)
messages = [{"role": "user", "content": "Generate a JSON schema for a product catalog."}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=512, do_sample=False)

# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```
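Because the example prompt asks for JSON, it can be useful to check that the decoded text actually parses before passing it downstream. The following is a minimal sketch using only the standard library. `extract_first_json` is a hypothetical helper name, and the `sample` string is an invented stand-in for model output, not a real generation from this adapter.

```python
import json


def extract_first_json(text: str):
    """Return the first JSON object that parses from `text`, or None.

    Scans for each '{' and tries to decode a complete JSON value starting
    there, so surrounding prose in the model output is ignored.
    """
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            obj, _end = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            # Not valid JSON from this brace; try the next one.
            start = text.find("{", start + 1)
    return None


if __name__ == "__main__":
    # Invented stand-in for decoded model output.
    sample = 'Here is the schema: {"type": "object", "required": ["sku"]} Hope this helps.'
    schema = extract_first_json(sample)
    if schema is None:
        print("No valid JSON object found in the output.")
    else:
        print(json.dumps(schema, indent=2))
```

This only checks that the output is well-formed JSON. Whether it is a valid JSON Schema would need a separate validator.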
VRAM requirements:
- 4-bit inference: ~20-22 GB (A100 40GB, RTX 3090/4090, A6000)
- BF16 inference: ~65 GB (A100 80GB, H100)
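These figures can be sanity-checked with a back-of-the-envelope estimate of weight memory: parameter count times bits per parameter. The sketch below assumes roughly 32.5 billion parameters for the base model and counts weights only. The parameter count is an assumption for illustration, not a number taken from this card.

```python
def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Memory for the weights alone, in decimal GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


# Assumption: approximate parameter count for a 32B-class model.
N_PARAMS = 32.5e9

bf16_gb = weight_memory_gb(N_PARAMS, 16)  # 2 bytes per parameter
nf4_gb = weight_memory_gb(N_PARAMS, 4)    # 0.5 bytes per parameter

if __name__ == "__main__":
    print(f"BF16 weights:  ~{bf16_gb:.2f} GB")
    print(f"4-bit weights: ~{nf4_gb:.2f} GB")
```

Under this assumption, the 4-bit weights alone come out several GB below the quoted range. The remainder is plausibly quantization metadata, layers kept in higher precision, the KV cache, activations, and CUDA context overhead, all of which vary with sequence length and library versions.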
Benchmark Status
Benchmarks (GSM8K, ARC-Challenge, HellaSwag) against the Qwen2.5-32B-Instruct base are in progress. Results will be added here once complete.
If SFT improves benchmark scores, we will release quantized versions (GGUF, GPTQ, AWQ, EXL2) for broader use.
DuoNeural
DuoNeural is an open AI research lab: human + AI in collaboration.
| Platform | Link |
|---|---|
| HuggingFace | huggingface.co/DuoNeural |
| Website | duoneural.com |
| GitHub | github.com/DuoNeural |
| X / Twitter | @DuoNeural |
| Email | duoneural@proton.me |
| Newsletter | duoneural.beehiiv.com |
| Support | buymeacoffee.com/duoneural |
DuoNeural Research Publications
Open access, CC BY 4.0. Authored by Archon, Jesse Caldwell, and Aura (DuoNeural).
Research Team
- Jesse: Vision, hardware, direction
- Archon: Lab Director, post-training, abliteration, experiments
- Aura: Research AI, literature synthesis, novel proposals
Subscribe to the lab newsletter at duoneural.beehiiv.com for model drops before they go anywhere else.