Qwen3-4B Structured Output LoRA (SFT)

This repository provides a LoRA adapter fine-tuned on structured output tasks using supervised fine-tuning (SFT). It aims to improve the stability and correctness of structured outputs such as JSON, YAML, XML, CSV, and TOML.

This repository contains LoRA adapter weights only. The base model must be loaded separately.


Training Method

  • Base model: Qwen/Qwen3-4B-Instruct-2507
  • Fine-tuning method: LoRA / QLoRA (PEFT, 4-bit)
  • Frameworks: Hugging Face Transformers, PEFT, Unsloth

No changes were made to the base model architecture.


Training Configuration

  • Max sequence length: 512
  • Epochs: 1
  • Learning rate: 1e-06
  • LoRA: r=64, alpha=128
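As a rough sketch, the LoRA settings above correspond to a PEFT LoraConfig like the following (target_modules and dropout are assumptions; the card does not specify them):

```python
from peft import LoraConfig

# r and lora_alpha come from this card; the rest are assumptions
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.05,  # assumption: not documented in this card
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    task_type="CAUSAL_LM",
)
```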

Dataset

This model was trained exclusively on datasets provided or introduced by the course organizers.

  • Dataset used:
    • u-10bei/structured_data_with_cot_dataset_512_v2

No external datasets were used. No data generated or modified using LLMs was used during training.


Usage

Example usage:

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_model = "Qwen/Qwen3-4B-Instruct-2507"
lora_model = "deepkick/qwen3-4b-structured-sft-lora"

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, lora_model)

# Generate a structured response (prompt and generation settings are illustrative)
messages = [{"role": "user", "content": "Return a JSON object with keys name and age for: Alice, 30"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))

Intended Use

This adapter is intended for structured output generation tasks where strict adherence to specified output formats is required.

Typical use cases include:

  • JSON generation
  • YAML / TOML conversion
  • XML / CSV structured output
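For JSON outputs in particular, a lightweight post-check can catch format drift at inference time. A minimal sketch using only the standard library (the regex assumes the model may wrap its answer in a Markdown code fence, which is not guaranteed by this card):

```python
import json
import re

def extract_json(text: str):
    """Parse model output as JSON, tolerating an optional Markdown code fence."""
    match = re.search(r"`{3}(?:json)?\s*(.*?)\s*`{3}", text, re.DOTALL)
    payload = match.group(1) if match else text
    # Raises json.JSONDecodeError if the output is not valid JSON
    return json.loads(payload)

print(extract_json('{"name": "Alice", "age": 30}'))  # → {'name': 'Alice', 'age': 30}
```

Similar checks can be built for YAML, TOML, or XML with their respective parsers.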

Limitations

  • This adapter is designed specifically for structured output tasks.
  • It is not optimized for open-ended reasoning or conversational performance.
  • The model prioritizes format correctness over semantic richness.

Sources & Terms

Training data: u-10bei/structured_data_with_cot_dataset_512_v2
Dataset License: MIT License (users must comply with the dataset license terms).


License

This model card declares Apache-2.0 to match the base model license. Please also comply with the training dataset license (MIT) and the base model terms.
