# Qwen3-4B Structured Output LoRA (Phase 1)

This repository provides a LoRA adapter fine-tuned from `unsloth/Qwen3-4B-Instruct-2507` using QLoRA with Unsloth. It is designed to improve the model's ability to generate structured outputs such as:
- JSON
- YAML
- XML
- CSV
- other machine-readable formats
## What This Repository Contains

⚠ **Important**: this repository contains **LoRA adapter weights only**. It does not include the base model. To use this adapter, you must load it on top of the original base model: `unsloth/Qwen3-4B-Instruct-2507`.
## Training Details

### Training Phase

This adapter was trained as Phase 1 using the following datasets:

- `u-10bei/structured_data_with_cot_dataset_512_v2`
- `u-10bei/structured_data_with_cot_dataset_512_v4`
- `u-10bei/structured_data_with_cot_dataset_512_v5`
Further training (Phase 2) may be performed later using additional datasets.
### Training Method

- Method: QLoRA (4-bit)
- Framework: Unsloth + PEFT
- Base model: `unsloth/Qwen3-4B-Instruct-2507`
- Maximum sequence length: 1024
- Loss applied only to the final assistant output
- Intermediate chain-of-thought reasoning is masked out of the loss
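The response-only loss masking described above can be sketched in plain Python. The function name and the `response_start` index are illustrative assumptions; the actual training uses Unsloth's own utilities:

```python
IGNORE_INDEX = -100  # PyTorch's cross-entropy loss ignores this label value

def mask_labels(input_ids, response_start):
    """Copy input_ids into labels, masking every token before the
    final assistant response so no loss is computed on it."""
    labels = list(input_ids)
    for i in range(min(response_start, len(labels))):
        labels[i] = IGNORE_INDEX
    return labels

# Example: tokens before index 6 are prompt + chain-of-thought,
# tokens from index 6 onward are the final structured answer.
labels = mask_labels([101, 7, 8, 9, 10, 11, 42, 43, 44], response_start=6)
```

Only the final answer tokens keep their original IDs in `labels`, so gradient updates never reward or penalize the intermediate reasoning.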
### Hyperparameters (Phase 1)
- LoRA rank (r): 64
- LoRA alpha: 128
- Learning rate: 1e-4
- Epochs: 1
- Batch size: 2
- Gradient accumulation: 8
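For reference, these settings imply an effective batch size of 16 and a LoRA scaling factor of 2. A minimal sketch, with the values written as they would appear in a PEFT-style config (`target_modules` is an assumption, listing the usual Qwen attention/MLP projections; the exact set used in training is not stated):

```python
# Phase 1 LoRA hyperparameters in peft.LoraConfig-style field names.
# target_modules is an assumption, not confirmed by the model card.
lora_hparams = dict(
    r=64,
    lora_alpha=128,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

effective_batch = 2 * 8  # batch size x gradient accumulation = 16
lora_scaling = lora_hparams["lora_alpha"] / lora_hparams["r"]  # alpha / r = 2.0
```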
## How to Use

Example Python code to load and use this adapter:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_model = "unsloth/Qwen3-4B-Instruct-2507"
adapter = "cinnamonrooo/qwen3-structeval-phase1"

# Load the tokenizer and base model, then attach the LoRA adapter.
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter)

prompt = "Convert the following text into JSON format:\nName: John\nAge: 25"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## License and Terms

- Training datasets: MIT License
- Base model: subject to the original model's license
- This adapter: Apache 2.0 License

Users must comply with both the dataset license and the original base model terms.
## Notes
- This adapter is optimized for structured generation tasks
- It may not improve general conversational performance
- Designed primarily for format-following and machine-readable output accuracy
## Future Plans
- Additional training with more datasets (Phase 2)
- Evaluation on structured output benchmarks
- Possible quantized release versions
If you have any questions or feedback, feel free to open an issue.
## Model Tree

Model tree for `cinnamonrooo/qwen3-structeval-phase1`:

- Upstream base model: `Qwen/Qwen3-4B-Instruct-2507`
- Fine-tuned from: `unsloth/Qwen3-4B-Instruct-2507`