Qwen3-4B Structured Output LoRA (SFT)

This repository provides a LoRA adapter fine-tuned on structured output tasks using supervised fine-tuning (SFT). It aims to improve the stability and correctness of structured outputs such as JSON, YAML, XML, CSV, and TOML.

This repository contains LoRA adapter weights only. The base model must be loaded separately.


Training Method

  • Base model: Qwen/Qwen3-4B-Instruct-2507
  • Fine-tuning method: LoRA / QLoRA (PEFT, 4-bit)
  • Frameworks: Hugging Face Transformers, PEFT, Unsloth

No changes were made to the base model architecture.


Training Configuration

  • Max sequence length: 512
  • Epochs: 1
  • Learning rate: 1e-06
  • LoRA: r=64, alpha=128
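As a rough sketch, the LoRA settings above correspond to a PEFT LoraConfig like the following (target_modules and dropout are assumptions; the card does not specify them):

```python
from peft import LoraConfig

# r and lora_alpha come from this card; the rest are assumptions
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.05,  # assumption: not documented in this card
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    task_type="CAUSAL_LM",
)
```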

Dataset

This model was trained exclusively on datasets provided or introduced by the course organizers.

  • Dataset used:
    • u-10bei/structured_data_with_cot_dataset_512_v2

No external datasets were used. No data generated or modified using LLMs was used during training.


Usage

Example usage:

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_model = "Qwen/Qwen3-4B-Instruct-2507"
lora_model = "deepkick/qwen3-4b-structured-sft-lora"

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, lora_model)

# Generate a structured response (prompt and generation settings are illustrative)
messages = [{"role": "user", "content": "Return a JSON object with keys name and age for: Alice, 30"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))

Intended Use

This adapter is intended for structured output generation tasks where strict adherence to specified output formats is required.

Typical use cases include:

  • JSON generation
  • YAML / TOML conversion
  • XML / CSV structured output
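For JSON outputs in particular, a lightweight post-check can catch format drift at inference time. A minimal sketch using only the standard library (the regex assumes the model may wrap its answer in a Markdown code fence, which is not guaranteed by this card):

```python
import json
import re

def extract_json(text: str):
    """Parse model output as JSON, tolerating an optional Markdown code fence."""
    match = re.search(r"`{3}(?:json)?\s*(.*?)\s*`{3}", text, re.DOTALL)
    payload = match.group(1) if match else text
    # Raises json.JSONDecodeError if the output is not valid JSON
    return json.loads(payload)

print(extract_json('{"name": "Alice", "age": 30}'))  # → {'name': 'Alice', 'age': 30}
```

Similar checks can be built for YAML, TOML, or XML with their respective parsers.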

Limitations

  • This adapter is designed specifically for structured output tasks.
  • It is not optimized for open-ended reasoning or conversational performance.
  • The model prioritizes format correctness over semantic richness.

Sources & Terms

Training data: u-10bei/structured_data_with_cot_dataset_512_v2
Dataset License: MIT License (users must comply with the dataset license terms).


License

This model card declares Apache-2.0 to match the base model license. Please also comply with the training dataset license (MIT) and the base model terms.
