# qwen3-4b-structured-output-lora
This repository provides a LoRA adapter fine-tuned from Qwen/Qwen3-4B-Instruct-2507 using QLoRA (4-bit quantization via Unsloth).

It contains the LoRA adapter weights only; the base model must be loaded separately.
## Training Objective
This adapter is trained to improve structured-output accuracy across JSON, YAML, XML, TOML, and CSV. The loss is applied only to the final assistant output; intermediate reasoning (chain-of-thought) tokens are masked out of the loss.
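The card does not show the masking code, but this kind of selective loss is commonly implemented with Hugging Face-style labels, where positions set to `-100` are ignored by the loss. A minimal sketch under that assumption (the `answer_start` index and token ids are illustrative):

```python
# Hedged sketch: mask prompt + reasoning tokens by setting their label
# positions to -100 (the ignore index of PyTorch's CrossEntropyLoss),
# so only the final assistant output contributes to the loss.
def mask_labels(input_ids, answer_start):
    """input_ids: list of token ids for the full sequence.
    answer_start: index where the final assistant output begins.
    Returns the label list with everything before answer_start masked."""
    labels = list(input_ids)
    for i in range(answer_start):
        labels[i] = -100  # ignored by the loss
    return labels

labels = mask_labels([11, 22, 33, 44, 55], answer_start=3)
# labels[:3] are masked; labels[3:] keep their original token ids
```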
## Training Configuration
- Base model: Qwen/Qwen3-4B-Instruct-2507
- Method: QLoRA (4-bit)
- Max sequence length: 512
- Epochs: 1
- Learning rate: 1e-06
- LoRA: r=64, alpha=128
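Collected in one place, the hyperparameters above can be written as a plain config dict (key names here are illustrative, not the exact trainer API):

```python
# Hedged sketch: training hyperparameters from the list above,
# gathered into a plain dict. Key names are illustrative.
train_config = {
    "base_model": "Qwen/Qwen3-4B-Instruct-2507",
    "load_in_4bit": True,        # QLoRA 4-bit quantization
    "max_seq_length": 512,
    "num_train_epochs": 1,
    "learning_rate": 1e-6,
    "lora_r": 64,
    "lora_alpha": 128,           # alpha = 2 * r, a common choice
}
```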
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base = "Qwen/Qwen3-4B-Instruct-2507"
adapter = "Chiaki111/lora_structeval_t_qwen3_4b_v13"

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base,
    torch_dtype=torch.bfloat16,  # for A100-class GPUs
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter)
```
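Since the adapter targets structured output, a typical pattern after generation is to extract and validate the JSON in the reply. A minimal sketch of that validation step (the example reply string is illustrative, not an actual model generation):

```python
import json

def parse_json_reply(reply: str):
    """Extract and parse the outermost JSON object in a model reply.
    Returns the parsed object, or None if no valid JSON is found."""
    start = reply.find("{")
    end = reply.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        return json.loads(reply[start : end + 1])
    except json.JSONDecodeError:
        return None

# Illustrative reply (stands in for tokenizer.decode of a generation):
reply = 'Here is the result: {"name": "Alice", "age": 30}'
parsed = parse_json_reply(reply)  # dict on success, None on failure
```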
## Sources & Terms (IMPORTANT)
- Training data: a custom merged dataset built from nine datasets originally published on Hugging Face.
- Dataset licenses: each source dataset retains its original license. The merged dataset is used only for training and is not redistributed.
- Compliance: users must comply with the licenses of the original source datasets and with the base model's terms of use.
- Privacy: no private or sensitive information was included in the training data. All preprocessing and merging were performed locally.