
Qwen3-4B-Upsample_multi5

This repository provides a LoRA adapter fine-tuned from:

Qwen/Qwen3-4B-Instruct-2507

using QLoRA (4-bit, Unsloth) with:

  • Structured output–focused Supervised Fine-Tuning (SFT)
  • Chain-of-Thought (CoT) masked loss
  • Task-aware upsampling

⚠️ This repository contains LoRA adapter weights only.
The base model must be loaded separately.


Training Objective

This adapter is trained to improve structured output accuracy across:

  • JSON
  • YAML
  • XML
  • TOML
  • CSV

Output-only supervised loss

Loss is applied only to the final assistant output.

  • Chain-of-Thought reasoning is masked.
  • Learning starts after output markers such as: `Output:`, `OUTPUT:`, `Final:`, `Answer:`, `Result:`, `Response:`

Configuration:

  • SFT_MASK_COT = 1
  • SFT_OUTPUT_LEARN_MODE = "after_marker"
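As a rough illustration, marker-based loss masking can be sketched like this (the function name and the character-offset format are assumptions for illustration, not the actual training code):

```python
IGNORE_INDEX = -100  # label value that cross-entropy loss skips

MARKERS = ("Output:", "OUTPUT:", "Final:", "Answer:", "Result:", "Response:")

def mask_before_marker(labels, offsets, text):
    """Mask every token that starts before the first output marker.

    `offsets` are (start, end) character spans per token, as returned by
    a tokenizer called with return_offsets_mapping=True.
    """
    positions = [text.find(m) for m in MARKERS if m in text]
    if not positions:
        return labels  # no marker found: leave labels untouched
    cut = min(positions)
    return [IGNORE_INDEX if start < cut else label
            for label, (start, _end) in zip(labels, offsets)]
```

With this scheme, only tokens at or after the marker contribute to the loss, so the Chain-of-Thought prefix is never directly supervised.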

Task-aware Upsampling (IMPORTANT)

This training run uses controlled upsampling to rebalance specific structured transformation tasks.

Upsampling rules applied:

| Task Type    | Multiplier |
|--------------|------------|
| text_to_toml | 1.4        |
| text_to_xml  | 1.4        |
| json_to_xml  | 1.3        |
| yaml_to_xml  | 1.3        |
| csv_to_xml   | 1.3        |
| toml_to_xml  | 1.3        |

Configuration:

  • SFT_USE_UPSAMPLING = 1

This improves XML-heavy transformation robustness while preserving multi-format generalization.


Training Configuration

Base Model

Qwen/Qwen3-4B-Instruct-2507

Dataset

u-10bei/structured_data_with_cot_dataset_512_v2

Method

QLoRA (4-bit) via Unsloth


Hyperparameters

| Parameter                    | Value  |
|------------------------------|--------|
| Max sequence length          | 512    |
| Epochs                       | 1      |
| Max steps                    | 200    |
| Learning rate                | 1.5e-5 |
| Warmup ratio                 | 0.05   |
| Weight decay                 | 0.0    |
| Per-device train batch size  | 2      |
| Per-device eval batch size   | 2      |
| Gradient accumulation        | 8      |
| Effective batch size         | 16     |
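The effective batch size follows directly from the per-device batch size and gradient accumulation (assuming a single device, which is not stated above):

```python
per_device_train_batch_size = 2
gradient_accumulation_steps = 8
num_devices = 1  # assumption: single-GPU training

effective_batch_size = (per_device_train_batch_size
                        * gradient_accumulation_steps
                        * num_devices)  # 2 * 8 * 1 = 16
```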

LoRA Configuration

| Parameter      | Value                          |
|----------------|--------------------------------|
| r              | 32                             |
| alpha          | 64                             |
| dropout        | 0.1                            |
| target modules | q_proj, k_proj, v_proj, o_proj |
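Expressed as a `peft` `LoraConfig`, these values would look roughly like this (the `task_type` is an assumption; the exact config used in training is not published here):

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    lora_dropout=0.1,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",  # assumption: causal language modeling
)
```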

Checkpoints

Checkpoints are automatically pushed during training.

  • Saved every 25 steps
  • Maximum retained checkpoints: 2
  • Stored under: checkpoints/

Repository: Gen-oze/Qwen3-4B-Upsample_multi5
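The retention behavior (save every 25 steps, keep at most 2) can be simulated with a small sketch; in practice this corresponds to `Trainer`-style `save_steps` / `save_total_limit` settings:

```python
def retained_checkpoints(current_step, save_steps=25, limit=2):
    """Return the checkpoint steps still on disk after `current_step`."""
    saved = list(range(save_steps, current_step + 1, save_steps))
    return saved[-limit:]  # older checkpoints are deleted
```

So at step 100, only the checkpoints from steps 75 and 100 remain.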


Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base = "Qwen/Qwen3-4B-Instruct-2507"
adapter = "Gen-oze/Qwen3-4B-Upsample_multi5"

# Load the tokenizer and the base model
tokenizer = AutoTokenizer.from_pretrained(base)

model = AutoModelForCausalLM.from_pretrained(
    base,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Attach the LoRA adapter weights on top of the base model
model = PeftModel.from_pretrained(model, adapter)
model.eval()
```
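Because the adapter is trained to emit structured output after a marker, a small hypothetical helper can strip any reasoning prefix from generated text and parse the payload (shown for JSON; the marker list mirrors the training setup):

```python
import json

MARKERS = ("Output:", "OUTPUT:", "Final:", "Answer:", "Result:", "Response:")

def extract_json(generated):
    """Parse the JSON that follows the first output marker, if present."""
    for marker in MARKERS:
        if marker in generated:
            return json.loads(generated.split(marker, 1)[1].strip())
    return json.loads(generated.strip())  # no marker: parse the whole text
```

The same pattern applies to the other supported formats by swapping in a YAML, TOML, XML, or CSV parser.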