qwen3-4b-structeval-cleaned-20k-lr8e-6-r16-a32-ep1

This repository provides a LoRA adapter fine-tuned from Qwen/Qwen3-4B-Instruct-2507 using QLoRA (4-bit, Unsloth).

This repository contains LoRA adapter weights only. The base model must be loaded separately.

Training Objective

This adapter is trained to improve structured output accuracy (JSON / YAML / XML / TOML / CSV).

Loss is applied only to the final assistant output, while intermediate reasoning (Chain-of-Thought) is masked.

Training Configuration

  • Base model: Qwen/Qwen3-4B-Instruct-2507
  • Method: QLoRA (4-bit)
  • Max sequence length: 1024
  • Epochs: 1
  • Learning rate: 8e-06
  • LoRA: r=16, alpha=32
  • Final Training Loss: ~1.26
  • Final Validation Loss: ~1.52

Dataset: Cleaned StructEval (20,000 samples)

Data Cleaning Pipeline:

  • Removed CoT tags (<thinking>...</thinking>)
  • Removed code fences (yaml, json, xml, toml, ```csv)
  • Removed leading phrases ("Here's the output:", "Sure!", etc.)
  • Removed trailing phrases ("Let me know if you need help!")
  • Format validation (JSON/YAML/XML/TOML/CSV parsing)
  • Deduplication

Format Distribution:

  • YAML: 6,379 (31.9%)
  • JSON: 4,706 (23.5%)
  • XML: 3,312 (16.6%)
  • CSV: 2,824 (14.1%)
  • TOML: 2,779 (13.9%)

Source Datasets (combined from 9 HF datasets):

  • u-10bei/structured_data_with_cot_dataset_512_v2
  • u-10bei/structured_data_with_cot_dataset_512_v4
  • u-10bei/structured_data_with_cot_dataset_512_v5
  • u-10bei/structured_data_with_cot_dataset_512
  • u-10bei/structured_data_with_cot_dataset_v2
  • u-10bei/structured_data_with_cot_dataset
  • daichira/structured-3k-mix-sft
  • daichira/structured-5k-mix-sft
  • daichira/structured-hard-sft-4k

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base = "Qwen/Qwen3-4B-Instruct-2507"
adapter = "yuk1chan/qwen3-4b-structeval-cleaned-20k-lr8e-6-r16-a32-ep1"

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base,
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter)

Sources & Terms (IMPORTANT)

Training data: Cleaned locally from 9 Hugging Face datasets:
- u-10bei/structured_data_with_cot_dataset_512_v2
- u-10bei/structured_data_with_cot_dataset_512_v4
- u-10bei/structured_data_with_cot_dataset_512_v5
- u-10bei/structured_data_with_cot_dataset_512
- u-10bei/structured_data_with_cot_dataset_v2
- u-10bei/structured_data_with_cot_dataset
- daichira/structured-3k-mix-sft
- daichira/structured-5k-mix-sft
- daichira/structured-hard-sft-4k

Original datasets are licensed under MIT License.
This cleaned dataset and adapter are used and distributed under the terms of the MIT License.
Compliance: Users must comply with the MIT license (including copyright notice) and the base model's original terms of use.
Downloads last month
3
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for yuk1chan/qwen3-4b-structeval-cleaned-20k

Adapter
(5321)
this model