# Qwen3-4B-SFT (STEP100 Checkpoint)
This repository provides the LoRA adapter checkpoint at training step 100, fine-tuned from `Qwen/Qwen3-4B-Instruct-2507`. It is the best-performing checkpoint selected from the training run.

> ⚠️ This repository contains **LoRA adapter weights only**. The base model must be loaded separately.
## Training Overview

This adapter was trained using:

- Supervised Fine-Tuning (SFT)
- QLoRA (4-bit, Unsloth)
- Output-only loss (Chain-of-Thought masked)
- Task-aware upsampling

Checkpoint selection:

- Selected step: 100
- Training max steps: 200
- Best validation performance observed at step 100
## Training Objective

The goal is to improve structured-output accuracy across:

- JSON
- YAML
- XML
- TOML
- CSV
### Output-only Supervision

Loss is applied only to the final assistant output; intermediate reasoning (Chain-of-Thought) is masked. Learning begins after output markers such as `Output:`, `OUTPUT:`, `Final:`, `Answer:`, `Result:`, and `Response:`.

Configuration:

```
SFT_MASK_COT = 1
SFT_OUTPUT_LEARN_MODE = "after_marker"
```
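The marker-based masking described above can be sketched as follows. This is an illustrative reimplementation, not the repository's actual Unsloth/TRL code; the function name and token handling are hypothetical.

```python
# Illustrative sketch of "after_marker" output-only supervision: labels
# up to and including the output marker are set to -100 so PyTorch's
# cross-entropy ignores them; only the final answer contributes to the
# loss.

MARKERS = ("Output:", "OUTPUT:", "Final:", "Answer:", "Result:", "Response:")
IGNORE_INDEX = -100  # ignored by torch.nn.CrossEntropyLoss by default

def mask_before_marker(token_strings, token_ids):
    """Mask every label up to and including the first output marker."""
    text = ""
    cut = len(token_ids)  # if no marker appears, mask the whole sequence
    for i, tok in enumerate(token_strings):
        text += tok
        if any(marker in text for marker in MARKERS):
            cut = i + 1  # learning starts on the token after the marker
            break
    return [IGNORE_INDEX] * cut + list(token_ids[cut:])
```

With this scheme, a sequence that never emits a marker contributes no loss at all, which pushes the model toward always producing an explicit output marker.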
### Task-aware Upsampling

Upsampling was enabled to rebalance structured transformation tasks.

Applied multipliers:

| Task Type | Multiplier |
|---|---|
| text_to_toml | 1.4 |
| text_to_xml | 1.4 |
| json_to_xml | 1.3 |
| yaml_to_xml | 1.3 |
| csv_to_xml | 1.3 |
| toml_to_xml | 1.3 |

Configuration:

```
SFT_USE_UPSAMPLING = 1
```
This improves XML-related transformation robustness while maintaining multi-format generalization.
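One way to realize the multiplier table above is to duplicate examples per task type with deterministic rounding spread across the stream. This sketch is an assumption about the mechanism, and the `task_type` field name is a guess at the dataset schema:

```python
from collections import defaultdict

# Illustrative task-aware upsampling: each task type's example count is
# scaled by its multiplier (values from the table above).
MULTIPLIERS = {
    "text_to_toml": 1.4,
    "text_to_xml": 1.4,
    "json_to_xml": 1.3,
    "yaml_to_xml": 1.3,
    "csv_to_xml": 1.3,
    "toml_to_xml": 1.3,
}

def upsample(examples):
    """Duplicate examples so each task's total is ~multiplier x larger."""
    seen = defaultdict(int)  # running count per task type
    out = []
    for ex in examples:
        m = MULTIPLIERS.get(ex["task_type"], 1.0)
        i = seen[ex["task_type"]]
        # Emit round(m*(i+1)) - round(m*i) copies, so e.g. 1.4 means
        # "duplicate roughly every other example" and totals stay exact.
        copies = round(m * (i + 1)) - round(m * i)
        seen[ex["task_type"]] += 1
        out.extend([ex] * copies)
    return out
```

Tasks not listed in the table keep a multiplier of 1.0 and pass through unchanged.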
## Training Configuration

### Base Model

`Qwen/Qwen3-4B-Instruct-2507`

### Dataset

`u-10bei/structured_data_with_cot_dataset_512_v2`

### Method

QLoRA (4-bit) via Unsloth
### Hyperparameters
| Parameter | Value |
|---|---|
| Max sequence length | 512 |
| Epochs | 1 |
| Max steps | 200 |
| Learning rate | 1.5e-5 |
| Warmup ratio | 0.05 |
| Weight decay | 0.0 |
| Per-device train batch size | 2 |
| Gradient accumulation | 8 |
| Effective batch size | 16 |
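The effective batch size follows from per-device batch size × gradient accumulation (2 × 8 = 16). Assuming a standard Hugging Face `transformers` trainer setup (the actual training script is not included in this repository), the table maps roughly onto:

```python
from transformers import TrainingArguments

# Illustrative mapping of the hyperparameter table; output_dir is a
# placeholder, and the real run used Unsloth's training wrapper.
args = TrainingArguments(
    output_dir="qwen3-4b-sft",
    num_train_epochs=1,
    max_steps=200,             # max_steps caps training at 200 updates
    learning_rate=1.5e-5,
    warmup_ratio=0.05,
    weight_decay=0.0,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,  # effective batch size: 2 * 8 = 16
)
```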
### LoRA Configuration
| Parameter | Value |
|---|---|
| r | 32 |
| alpha | 64 |
| dropout | 0.1 |
| target modules | q_proj, k_proj, v_proj, o_proj |
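Expressed as a PEFT configuration, the table above corresponds roughly to the following sketch; `task_type` is an assumption for causal-LM fine-tuning:

```python
from peft import LoraConfig

# Illustrative LoRA configuration matching the table above.
lora_config = LoraConfig(
    r=32,                       # LoRA rank
    lora_alpha=64,              # scaling factor (alpha / r = 2.0)
    lora_dropout=0.1,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```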
## Repository Contents

This repository contains only:

- `adapter_config.json`
- `adapter_model.safetensors`
- `README.md`

This is the step-100 checkpoint only (the final selected model).
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base = "Qwen/Qwen3-4B-Instruct-2507"
adapter = "Gen-oze/Qwen3-4B-SFT"

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base,
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter)
model.eval()
```
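Continuing from the loading snippet above, a minimal inference example might look like this; the prompt is illustrative, and prompting with an explicit output marker mirrors the training-time supervision:

```python
# Build a chat-formatted prompt and generate a structured response.
messages = [{"role": "user", "content": "Convert to JSON: name=Alice, age=30"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    output = model.generate(inputs, max_new_tokens=256)

# Decode only the newly generated tokens.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```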