Qwen3-4B Structured Output LoRA (SFT)
This repository provides a LoRA adapter for Qwen3-4B, trained with supervised fine-tuning (SFT) on structured output tasks to improve the stability and correctness of formats such as JSON, YAML, XML, CSV, and TOML.
This repository contains LoRA adapter weights only. The base model must be loaded separately.
Training Method
- Base model: Qwen/Qwen3-4B-Instruct-2507
- Fine-tuning method: LoRA / QLoRA (PEFT, 4-bit)
- Frameworks: Hugging Face Transformers, PEFT, Unsloth
No changes were made to the base model architecture.
Training Configuration
- Max sequence length: 512
- Epochs: 1
- Learning rate: 1e-06
- LoRA: r=64, alpha=128
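The LoRA settings above map onto a PEFT configuration roughly as follows. This is a sketch, not the actual training script: `r` and `lora_alpha` match the card, while `target_modules` and `lora_dropout` are assumed typical values and were not published.

```python
from peft import LoraConfig

# Sketch of the adapter configuration used for SFT.
# r and lora_alpha match the values stated in this card;
# lora_dropout and target_modules are assumptions.
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.05,  # assumed, not stated in the card
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)
```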
Dataset
This model was trained exclusively on datasets provided or introduced by the course organizers.
- Dataset used: u-10bei/structured_data_with_cot_dataset_512_v2
No external datasets were used. No data generated or modified using LLMs was used during training.
Usage
Example usage:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_model = "Qwen/Qwen3-4B-Instruct-2507"
lora_model = "deepkick/qwen3-4b-structured-sft-lora"

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Attach the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(model, lora_model)

# Example generation (illustrative prompt)
messages = [{"role": "user", "content": "Return this as JSON: name=Alice, age=30"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```
Intended Use
This adapter is intended for structured output generation tasks where strict adherence to specified output formats is required.
Typical use cases include:
- JSON generation
- YAML / TOML conversion
- XML / CSV structured output
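Even with a format-tuned adapter, it is good practice to validate generated output before consuming it downstream. A minimal check using only the Python standard library (the sample string below is illustrative, not actual model output):

```python
import json

def parse_json_output(text: str) -> dict:
    """Parse a JSON object from model output.

    Strips common markdown code fences before parsing, since
    chat models often wrap structured output in ```json fences.
    """
    cleaned = text.strip()
    if cleaned.startswith("```"):
        # Drop the opening fence line (e.g. "```json") and the closing fence
        cleaned = cleaned.split("\n", 1)[1]
        cleaned = cleaned.rsplit("```", 1)[0]
    return json.loads(cleaned)

# Illustrative sample, not actual model output:
sample = '```json\n{"name": "Alice", "age": 30}\n```'
record = parse_json_output(sample)
```

If `json.loads` raises, the generation can be retried or rejected rather than silently passing malformed output along.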
Limitations
- This adapter is designed specifically for structured output tasks.
- It is not optimized for open-ended reasoning or conversational performance.
- The model prioritizes format correctness over semantic richness.
Sources & Terms
Training data: u-10bei/structured_data_with_cot_dataset_512_v2
Dataset License: MIT License (users must comply with the dataset license terms).
License
This model card declares Apache-2.0 to match the base model license. Please also comply with the training dataset license (MIT) and the base model terms.