qwen3-4b-structured-output-lora

This repository provides a LoRA adapter fine-tuned from Qwen/Qwen3-4B-Instruct-2507 using QLoRA (4-bit) with Unsloth.

⚠️ This repository contains LoRA adapter weights only. The base model must be loaded separately.

Training Objective

This adapter is trained to improve structured output accuracy (JSON / YAML / XML / TOML / CSV).

Loss is applied only to the final assistant output (assistant-only loss).

Chain-of-Thought masking: Enabled
Learning mode: after_marker
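
As a rough illustration of what assistant-only loss with an `after_marker` mode could look like, the sketch below masks every label up to and including an output marker, so that only the tokens after it contribute to the loss. The function names, the choice of the last marker occurrence, and the handling of examples without a marker are assumptions made for illustration, not the training script used for this adapter.

```python
from typing import List

IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss ignores by default


def find_last_subsequence(haystack: List[int], needle: List[int]) -> int:
    """Return the start index of the last occurrence of needle in haystack, or -1."""
    n = len(needle)
    if n == 0:
        return -1
    for start in range(len(haystack) - n, -1, -1):
        if haystack[start:start + n] == needle:
            return start
    return -1


def mask_labels_after_marker(input_ids: List[int], marker_ids: List[int]) -> List[int]:
    """Build labels that keep only the tokens following the marker."""
    start = find_last_subsequence(input_ids, marker_ids)
    if start == -1:
        # Assumption: an example without a marker contributes nothing to the loss.
        return [IGNORE_INDEX] * len(input_ids)
    cut = start + len(marker_ids)
    # Prompt, reasoning, and the marker itself are masked; the final output is kept.
    return [IGNORE_INDEX] * cut + list(input_ids[cut:])
```

Here `marker_ids` stands for the tokenized form of whatever output marker the dataset uses; the token ids in any call are placeholders.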

Data Preprocessing

Rule-based normalization was applied before training:

  • Extracting content after output markers
  • Removing code fences (json / yaml / xml / toml)
  • Removing leading boilerplate and trailing notes
  • Recursive JSON exact-match deduplication

Dedupe enabled: Yes
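
The normalization steps listed above can be sketched with the standard library alone. The marker string `Output:`, the fence pattern, and the reading of "recursive" as key-order-insensitive comparison of nested JSON are assumptions; the actual rules used for this adapter may differ.

```python
import json
import re
from typing import Iterable, List

# Hypothetical marker; the real output markers are not documented in this card.
OUTPUT_MARKER = "Output:"

FENCE = "`" * 3
FENCE_RE = re.compile(r"^" + FENCE + r"[a-zA-Z]*\s*\n(.*?)\n?" + FENCE + r"\s*$", re.DOTALL)


def extract_after_marker(text: str, marker: str = OUTPUT_MARKER) -> str:
    """Keep only what follows the last marker; return the text as-is if it is absent."""
    _, found, tail = text.rpartition(marker)
    return tail.strip() if found else text.strip()


def strip_code_fence(text: str) -> str:
    """Remove a single wrapping code fence (for example one tagged json or yaml)."""
    match = FENCE_RE.match(text.strip())
    return match.group(1).strip() if match else text.strip()


def canonical_json(text: str) -> str:
    """Serialize parsed JSON with sorted keys so key order and whitespace do not matter."""
    return json.dumps(json.loads(text), sort_keys=True, separators=(",", ":"))


def dedupe_json(samples: Iterable[str]) -> List[str]:
    """Drop samples whose canonical JSON form was already seen; keep first occurrences."""
    seen = set()
    kept = []
    for sample in samples:
        try:
            key = canonical_json(sample)
        except json.JSONDecodeError:
            key = sample  # non-JSON samples fall back to exact string matching
        if key not in seen:
            seen.add(key)
            kept.append(sample)
    return kept
```

Because `json.dumps(..., sort_keys=True)` sorts keys at every nesting level, two samples that differ only in key order or whitespace collapse to the same key.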

Training Configuration

  • Base model: Qwen/Qwen3-4B-Instruct-2507
  • Method: QLoRA (4-bit) + Unsloth
  • Max sequence length: 1024
  • Epochs: 1
  • Learning rate: 3e-05
  • Warmup ratio: 0.06
  • Weight decay: 0.02
  • LoRA: r=48, alpha=96, dropout=0.06
  • Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
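
For reference, the values above can be written down as plain Python data together with two quantities derived from them. The dictionary keys mirror the argument names of `peft.LoraConfig` (`r`, `lora_alpha`, `lora_dropout`, `target_modules`); the `warmup_steps` helper and its round-up behaviour are assumptions for illustration, since the total number of training steps is not stated in this card.

```python
import math

# Values copied from the configuration list above.
lora = {
    "r": 48,
    "lora_alpha": 96,
    "lora_dropout": 0.06,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
}
training = {
    "max_seq_length": 1024,
    "num_epochs": 1,
    "learning_rate": 3e-5,
    "warmup_ratio": 0.06,
    "weight_decay": 0.02,
}

# Standard LoRA scales the low-rank update by alpha / r.
scaling = lora["lora_alpha"] / lora["r"]


def warmup_steps(total_steps: int, warmup_ratio: float) -> int:
    """Number of warmup steps for a schedule of total_steps, rounded up (assumed)."""
    return math.ceil(total_steps * warmup_ratio)
```

With `alpha` set to twice the rank, the adapter update is applied with a scaling factor of 2.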

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base = "Qwen/Qwen3-4B-Instruct-2507"
adapter = "tropico0313/my-lora-test"

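# Load the tokenizer and base model first; this repository holds adapter weights only.
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base,
    torch_dtype=torch.float16,
    device_map="auto",
)
# Attach the LoRA adapter on top of the loaded base model.
model = PeftModel.from_pretrained(model, adapter)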
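
Once a completion has been generated and decoded to a string, it can be useful to check that it actually parses in the requested format. The helper below is a minimal sketch using only the standard library; `is_well_formed` is a hypothetical name, and YAML and TOML are left out because they would need PyYAML or Python 3.11's `tomllib`.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET


def is_well_formed(text: str, fmt: str) -> bool:
    """Return True if text parses as the given format: json, xml, or csv."""
    try:
        if fmt == "json":
            json.loads(text)
        elif fmt == "xml":
            ET.fromstring(text)
        elif fmt == "csv":
            rows = list(csv.reader(io.StringIO(text)))
            # Treat CSV as valid only if it is non-empty and every row has the same width.
            return bool(rows) and len({len(row) for row in rows}) == 1
        else:
            raise ValueError(f"unsupported format: {fmt}")
    except (json.JSONDecodeError, ET.ParseError, csv.Error):
        return False
    return True
```

A check like this only confirms syntactic validity; whether the output matches the requested schema or content is a separate question.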

Sources & Terms (IMPORTANT)
Training dataset: u-10bei/structured_data_with_cot_dataset_512_v2
Dataset License: MIT License.
Users must comply with the MIT License (including its copyright notice) and with the base model's original terms of use.