Qwen3-4B Structured Output LoRA (Phase 1)

This repository provides a LoRA adapter fine-tuned from
unsloth/Qwen3-4B-Instruct-2507 using QLoRA with Unsloth.

It is designed to improve the model’s ability to generate structured outputs such as:

  • JSON
  • YAML
  • XML
  • CSV
  • other machine-readable formats

What This Repository Contains

Important

This repository contains LoRA adapter weights only.
It does not include the base model.

To use this adapter, you must load it on top of the original base model:

unsloth/Qwen3-4B-Instruct-2507

Training Details

Training Phase

This adapter was trained as Phase 1 using the following datasets:

  • u-10bei/structured_data_with_cot_dataset_512_v2
  • u-10bei/structured_data_with_cot_dataset_512_v4
  • u-10bei/structured_data_with_cot_dataset_512_v5

Further training (Phase 2) may be performed later using additional datasets.


Training Method

  • Method: QLoRA (4-bit)
  • Framework: Unsloth + PEFT
  • Base model: unsloth/Qwen3-4B-Instruct-2507
  • Maximum sequence length: 1024
  • Loss applied only to final assistant output
  • Intermediate chain-of-thought reasoning is masked out of the loss
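The loss masking described above can be sketched as follows. This is an illustrative toy example, not the actual Unsloth training code; the function name `mask_labels` and the position-based cutoff are assumptions for the sake of demonstration. The idea is that every token before the final assistant output receives the label -100, which PyTorch's cross-entropy loss ignores.

```python
IGNORE_INDEX = -100  # the default ignore_index of torch.nn.CrossEntropyLoss

def mask_labels(token_ids, final_answer_start):
    """Copy token_ids into labels, masking everything before the final answer.

    Tokens covering the prompt and the intermediate chain-of-thought get
    IGNORE_INDEX, so only the final assistant output contributes to the loss.
    """
    return [
        IGNORE_INDEX if i < final_answer_start else tok
        for i, tok in enumerate(token_ids)
    ]

# Example: a 10-token sequence whose final answer starts at position 6
labels = mask_labels(list(range(100, 110)), final_answer_start=6)
print(labels)  # [-100, -100, -100, -100, -100, -100, 106, 107, 108, 109]
```

In a real chat-formatted training pipeline the cutoff would be found from the chat template's role markers rather than a fixed index.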

Hyperparameters (Phase 1)

  • LoRA rank (r): 64
  • LoRA alpha: 128
  • Learning rate: 1e-4
  • Epochs: 1
  • Batch size: 2
  • Gradient accumulation: 8
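For reference, the hyperparameters above can be collected into a single config dict (a sketch; the exact Unsloth/PEFT call signature may differ, and the key names follow common transformers/PEFT conventions). Note that batch size 2 with gradient accumulation 8 gives an effective batch size of 16 per optimizer step, and that alpha/r = 128/64 = 2 is a common LoRA scaling choice.

```python
# Phase 1 hyperparameters as a PEFT/TrainingArguments-style dict (illustrative)
phase1 = {
    "r": 64,                            # LoRA rank
    "lora_alpha": 128,                  # LoRA scaling numerator (alpha / r = 2)
    "learning_rate": 1e-4,
    "num_train_epochs": 1,
    "per_device_train_batch_size": 2,
    "gradient_accumulation_steps": 8,
}

# Effective batch size per optimizer step:
effective_batch = (phase1["per_device_train_batch_size"]
                   * phase1["gradient_accumulation_steps"])
print(effective_batch)  # 16
```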

How to Use

Example Python code to load and use this adapter:

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_model = "unsloth/Qwen3-4B-Instruct-2507"
adapter = "cinnamonrooo/qwen3-structeval-phase1"

tokenizer = AutoTokenizer.from_pretrained(base_model)

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.float16,
    device_map="auto",
)

model = PeftModel.from_pretrained(model, adapter)

prompt = "Convert the following text into JSON format:\nName: John\nAge: 25"

# Use the model's own device so this works with device_map="auto"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
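Since this adapter targets machine-readable output, it can be useful to validate the generated text before passing it downstream. The helper below is a minimal sketch (the function name `extract_json` is not part of this repository) that pulls the first JSON object out of a generation and parses it with the standard-library `json` module:

```python
import json

def extract_json(text):
    """Return the parsed object from the first {...} span in text, or None."""
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        return json.loads(text[start:end + 1])
    except json.JSONDecodeError:
        return None

generated = 'Here is the result:\n{"Name": "John", "Age": 25}'
print(extract_json(generated))  # {'Name': 'John', 'Age': 25}
```

If parsing fails, a common pattern is to re-prompt the model with the error message appended.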

License and Terms

  • Training datasets: MIT License
  • Base model: subject to the original model's license
  • This adapter: released under the Apache License 2.0

Users must comply with both:

  1. The dataset license
  2. The original base model terms

Notes

  • This adapter is optimized for structured generation tasks
  • It may not improve general conversational performance
  • Designed primarily for format-following and machine-readable output accuracy

Future Plans

  • Additional training with more datasets (Phase 2)
  • Evaluation on structured output benchmarks
  • Possible quantized release versions

If you have any questions or feedback, feel free to open an issue.
