qwen3-4b-structured-output-lora

This repository provides a LoRA adapter fine-tuned from Qwen/Qwen3-4B-Instruct-2507 with QLoRA (4-bit quantization) via Unsloth.

This repository contains LoRA adapter weights only. The base model must be loaded separately.

Training Objective

This adapter is trained to improve structured output accuracy (JSON / YAML / XML / TOML / CSV).

Loss is applied only to the final assistant output; intermediate reasoning (Chain-of-Thought) tokens are masked out of the loss computation.
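The masking scheme described above can be sketched as follows. This is an illustrative reconstruction, not the actual training code: the token values and the answer-start position are made up, but the mechanism (setting pre-answer positions to the ignore index -100 so they contribute nothing to the loss) is the standard approach.

```python
# Sketch of loss masking: only tokens belonging to the final assistant
# answer contribute to the loss; prompt and Chain-of-Thought tokens are
# set to the ignore index (-100), which cross-entropy loss skips.
IGNORE_INDEX = -100

def mask_labels(input_ids, answer_start):
    """Copy input_ids to labels, masking every position before answer_start."""
    return [IGNORE_INDEX if i < answer_start else tok
            for i, tok in enumerate(input_ids)]

# Toy example: positions 0-4 are prompt + reasoning, positions 5-7 are
# the final structured answer.
input_ids = [101, 102, 103, 104, 105, 201, 202, 203]
labels = mask_labels(input_ids, answer_start=5)
# labels -> [-100, -100, -100, -100, -100, 201, 202, 203]
```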

Training Configuration

  • Base model: Qwen/Qwen3-4B-Instruct-2507
  • Method: QLoRA (4-bit)
  • Max sequence length: 512
  • Epochs: 1
  • Learning rate: 1e-06
  • LoRA: r=64, alpha=128
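The hyperparameters above roughly correspond to a PEFT configuration like the sketch below. The `target_modules` list and `lora_dropout` value are assumptions (Unsloth's usual defaults for Qwen-style models), not confirmed settings from the training run.

```python
from peft import LoraConfig

# Hypothetical reconstruction of the adapter config from the list above.
# target_modules and lora_dropout are assumed, not taken from the source.
lora_config = LoraConfig(
    r=64,                      # LoRA rank (from the config list)
    lora_alpha=128,            # LoRA alpha (from the config list)
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],  # assumption
    lora_dropout=0.0,          # assumption
    bias="none",
    task_type="CAUSAL_LM",
)
```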

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base = "Qwen/Qwen3-4B-Instruct-2507"
adapter = "Chiaki111/lora_structeval_t_qwen3_4b_v13"

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base,
    torch_dtype=torch.bfloat16,    # bfloat16 for A100-class GPUs
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter)
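Since the adapter targets structured output, generated text can be validated programmatically before downstream use. A minimal sketch for the JSON case is below (the helper name and sample string are illustrative; analogous checks apply for YAML, XML, TOML, and CSV):

```python
import json

def validate_json(text):
    """Return the parsed object if text is valid JSON, else None."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None

# Example with a hand-written sample (not actual model output):
sample = '{"name": "Qwen3", "params_b": 4}'
obj = validate_json(sample)
# obj -> {'name': 'Qwen3', 'params_b': 4}
```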

Sources & Terms (IMPORTANT)

Training data: custom merged dataset (9 HF datasets)

Dataset License: This model was trained on a custom dataset created by merging and cleaning nine datasets originally published on Hugging Face.

  • Each source dataset retains its original license; the merged dataset is used only for training and is not redistributed.
  • Compliance: users must comply with the licenses of the original source datasets and with the base model's terms of use.
  • No private or sensitive information was included in the training data; all preprocessing and merging were performed locally.
