qwen3-4b-structeval-clean-lr6e-6

This repository provides a LoRA adapter fine-tuned from Qwen/Qwen3-4B-Instruct-2507 using QLoRA (4-bit, Unsloth).

This repository contains LoRA adapter weights only. The base model must be loaded separately.

Training Objective

This adapter is trained to improve structured output accuracy (JSON / YAML / XML / TOML / CSV).

Loss is applied only to the final assistant output, while intermediate reasoning (Chain-of-Thought) is masked.

Training Configuration

Base model: Qwen/Qwen3-4B-Instruct-2507
Method: QLoRA (4-bit)
Max sequence length: 1024
Epochs: 1
Learning rate: 6e-06
LoRA: r=16, alpha=32

Dataset: Cleaned StructEval (20,000 samples)

Data Cleaning Pipeline:

CoT tags removal: <thinking>...</thinking> completely removed
Code fence removal: yaml, json, xml, toml, ````csv removed
Leading phrase removal: "Here's the output:", "Sure!", etc. removed
Trailing phrase removal: "Let me know if you need help!" etc. removed
Format validation: JSON/YAML/XML/TOML/CSV parsing validation
Deduplication: Exact duplicates removed

Format Distribution:

YAML: ~6,379 (31.9%)
JSON: ~4,706 (23.5%)
XML: ~3,312 (16.6%)
CSV: ~2,824 (14.1%)
TOML: ~2,779 (13.9%)

Source Datasets (combined from 9 HF datasets):

u-10bei/structured_data_with_cot_dataset_512_v2
u-10bei/structured_data_with_cot_dataset_512_v4
u-10bei/structured_data_with_cot_dataset_512_v5
u-10bei/structured_data_with_cot_dataset_512
u-10bei/structured_data_with_cot_dataset_v2
u-10bei/structured_data_with_cot_dataset
daichira/structured-3k-mix-sft
daichira/structured-5k-mix-sft
daichira/structured-hard-sft-4k

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base = "Qwen/Qwen3-4B-Instruct-2507"
adapter = "yuk1chan/qwen3-4b-structeval-clean-lr6e-6"

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base,
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter)

# Inference
prompt = "Generate YAML code for..."
# ... your inference code

Training Results

- Training Loss: ~1.5
- Validation Loss: ~1.6
- Training Steps: ~1,172
- Final Score: TBD (StructEval-T evaluation)

License

Apache 2.0

---
Trained on Cleaned StructEval dataset (20,000 samples)
Learning Rate: 6e-6 (conservative setting)

Downloads last month: 9

Model tree for yuk1chan/qwen3-4b-structeval-clean-lr6e-6

Base model

Qwen/Qwen3-4B-Instruct-2507

Adapter

(4899)

this model