qwen3-4b-structured-output-lora

This repository provides a LoRA adapter fine-tuned from Qwen/Qwen3-4B-Instruct-2507 using QLoRA (4-bit, Unsloth).

This repository contains LoRA adapter weights only. The base model must be loaded separately.

Training Objective

This adapter is trained to improve structured output accuracy (JSON / YAML / XML / TOML / CSV).

Loss is applied only to the final assistant output, while intermediate reasoning (Chain-of-Thought) is masked.

Data preprocessing: Markdown fences (```json, ```yaml, etc.) and text preambles ("Here's the converted ...", etc.) are automatically stripped from assistant responses before training, ensuring the model learns to produce clean structured output without formatting artifacts.

Training Configuration

Base model: Qwen/Qwen3-4B-Instruct-2507
Method: QLoRA (4-bit)
Primary dataset: u-10bei/structured_data_with_cot_dataset_v2
Secondary dataset: daichira/structured-5k-mix-sft
Dataset mixing: both datasets concatenated after column harmonization
Max sequence length: 1024
Epochs: 2
Learning rate: 2e-05
LoRA: r=64, alpha=128
CoT masking: enabled
Data preprocessing: markdown fence stripping (enabled)

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base = "Qwen/Qwen3-4B-Instruct-2507"
adapter = "a-kuratani/qwen3-4b-structured-output-lora-v7"

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base,
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter)

Sources & Terms (IMPORTANT)

Training data:

u-10bei/structured_data_with_cot_dataset_v2
daichira/structured-5k-mix-sft

Dataset Licenses:

u-10bei/structured_data_with_cot_dataset_v2: License not explicitly specified on HuggingFace. Please verify with the dataset author.
daichira/structured-5k-mix-sft: CC-BY-4.0 (Creative Commons Attribution 4.0)

Compliance: Users must comply with each dataset's license terms (including attribution requirements for CC-BY-4.0) and the base model's original terms of use (Apache 2.0)."""

Downloads last month: -

Model tree for a-kuratani/qwen3-4b-structured-output-lora-v7

Base model

Qwen/Qwen3-4B-Instruct-2507

Adapter

(5392)

this model

a-kuratani
/

qwen3-4b-structured-output-lora-v7