# Qwen3-4B-Upsample_multi5
This repository provides a LoRA adapter fine-tuned from `Qwen/Qwen3-4B-Instruct-2507` using QLoRA (4-bit, via Unsloth) with:
- Structured output–focused Supervised Fine-Tuning (SFT)
- Chain-of-Thought (CoT) masked loss
- Task-aware upsampling
⚠️ This repository contains **LoRA adapter weights only**; the base model must be loaded separately.
## Training Objective
This adapter is trained to improve structured output accuracy across:
- JSON
- YAML
- XML
- TOML
- CSV
### Output-only supervised loss

Loss is applied only to the final assistant output:
- Chain-of-Thought reasoning is masked out of the loss.
- Learning starts after output markers such as `Output:`, `OUTPUT:`, `Final:`, `Answer:`, `Result:`, and `Response:`.
Configuration:

```
SFT_MASK_COT = 1
SFT_OUTPUT_LEARN_MODE = "after_marker"
```
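The marker-based masking described above can be sketched as follows. This is a minimal illustration only, not the actual training code: the function name `mask_cot_labels`, the character-span token offsets, and the "mask everything up to and including the earliest marker" rule are all assumptions.

```python
# Markers taken from the list above; -100 is the standard ignore index
# used by PyTorch / Hugging Face cross-entropy loss.
OUTPUT_MARKERS = ("Output:", "OUTPUT:", "Final:", "Answer:", "Result:", "Response:")
IGNORE_INDEX = -100

def mask_cot_labels(text, token_spans, labels):
    """Mask CoT tokens so only the final output is supervised (sketch).

    token_spans: per-token (start_char, end_char) offsets into `text`.
    labels:      per-token label ids, same length as token_spans.
    Returns labels with every token ending at or before the first output
    marker replaced by IGNORE_INDEX.
    """
    marker_end = None
    for marker in OUTPUT_MARKERS:
        idx = text.find(marker)
        if idx != -1:
            end = idx + len(marker)
            if marker_end is None or end < marker_end:
                marker_end = end
    if marker_end is None:
        return labels  # no marker found: leave labels unchanged (assumption)
    return [IGNORE_INDEX if end <= marker_end else lab
            for (start, end), lab in zip(token_spans, labels)]
```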
## Task-aware Upsampling (IMPORTANT)
This training run uses controlled upsampling to rebalance specific structured transformation tasks.
Upsampling rules applied:
| Task Type | Multiplier |
|---|---|
| text_to_toml | 1.4 |
| text_to_xml | 1.4 |
| json_to_xml | 1.3 |
| yaml_to_xml | 1.3 |
| csv_to_xml | 1.3 |
| toml_to_xml | 1.3 |
Configuration:

```
SFT_USE_UPSAMPLING = 1
```
This improves XML-heavy transformation robustness while preserving multi-format generalization.
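One way to realize the multipliers in the table above is deterministic duplication with cumulative rounding, so a fractional multiplier like 1.4 averages out over the stream. This is a sketch under that assumption; the dict name, the `task_type` field, and the rounding scheme are illustrative, not the repository's actual pipeline.

```python
from collections import Counter

# Multipliers mirroring the table above.
UPSAMPLE_MULTIPLIERS = {
    "text_to_toml": 1.4,
    "text_to_xml": 1.4,
    "json_to_xml": 1.3,
    "yaml_to_xml": 1.3,
    "csv_to_xml": 1.3,
    "toml_to_xml": 1.3,
}

def upsample(examples, multipliers):
    """Repeat examples so each task appears ~multiplier times as often.

    Fractional multipliers are handled by rounding the cumulative target
    count, so e.g. 1.4 emits 14 copies for every 10 originals.
    """
    out = []
    seen, emitted = Counter(), Counter()
    for ex in examples:
        task = ex["task_type"]
        mult = multipliers.get(task, 1.0)  # unlisted tasks are kept as-is
        seen[task] += 1
        target = round(seen[task] * mult)
        copies = target - emitted[task]
        emitted[task] += copies
        out.extend([ex] * copies)
    return out
```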
## Training Configuration

### Base Model
`Qwen/Qwen3-4B-Instruct-2507`

### Dataset
`u-10bei/structured_data_with_cot_dataset_512_v2`

### Method
QLoRA (4-bit) via Unsloth
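For reference, 4-bit QLoRA loading in plain Transformers typically uses a quantization config like the one below. These are common defaults (NF4, double quantization), not settings confirmed by this card; Unsloth configures its own equivalents internally.

```python
import torch
from transformers import BitsAndBytesConfig

# Typical 4-bit QLoRA quantization settings (assumed, not from the card).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",          # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,     # quantize the quantization constants
)
```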
### Hyperparameters
| Parameter | Value |
|---|---|
| Max sequence length | 512 |
| Epochs | 1 |
| Max steps | 200 |
| Learning rate | 1.5e-5 |
| Warmup ratio | 0.05 |
| Weight decay | 0.0 |
| Per-device train batch size | 2 |
| Per-device eval batch size | 2 |
| Gradient accumulation | 8 |
| Effective batch size | 16 (= 2 × 8 gradient accumulation) |
### LoRA Configuration
| Parameter | Value |
|---|---|
| r | 32 |
| alpha | 64 |
| dropout | 0.1 |
| target modules | q_proj, k_proj, v_proj, o_proj |
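The table above maps directly onto a PEFT `LoraConfig`. A sketch of the equivalent (the card trained via Unsloth, so this is an assumed translation, not the exact training config):

```python
from peft import LoraConfig

# PEFT equivalent of the LoRA settings in the table above (assumed mapping).
lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    lora_dropout=0.1,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```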
## Checkpoints
Checkpoints are automatically pushed during training.
- Saved every 25 steps
- Maximum retained checkpoints: 2
- Stored under `checkpoints/`

Repository: `Gen-oze/Qwen3-4B-Upsample_multi5`
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base = "Qwen/Qwen3-4B-Instruct-2507"
adapter = "Gen-oze/Qwen3-4B-Upsample_multi5"

# Load the tokenizer and base model, then attach the LoRA adapter.
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base,
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter)
model.eval()
```