Qwen3-4B Structured Output LoRA (Phase 1)

This repository provides a LoRA adapter fine-tuned from
unsloth/Qwen3-4B-Instruct-2507 using QLoRA with Unsloth.

It is designed to improve the model’s ability to generate structured outputs such as:

  • JSON
  • YAML
  • XML
  • CSV
  • other machine-readable formats

What This Repository Contains

Important

This repository contains LoRA adapter weights only.
It does not include the base model.

To use this adapter, you must load it on top of the original base model:

unsloth/Qwen3-4B-Instruct-2507

Training Details

Training Phase

This adapter was trained as Phase 1 using the following datasets:

  • u-10bei/structured_data_with_cot_dataset_512_v2
  • u-10bei/structured_data_with_cot_dataset_512_v4
  • u-10bei/structured_data_with_cot_dataset_512_v5

Further training (Phase 2) may be performed later using additional datasets.


Training Method

  • Method: QLoRA (4-bit)
  • Framework: Unsloth + PEFT
  • Base model: unsloth/Qwen3-4B-Instruct-2507
  • Maximum sequence length: 1024
  • Loss applied only to final assistant output
  • Intermediate chain-of-thought reasoning is masked out of the loss
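The loss masking described above can be sketched as follows. This is an illustrative toy example, not the actual Unsloth training code; the function name `mask_labels` and the position-based cutoff are assumptions for the sake of demonstration. The idea is that every token before the final assistant output receives the label -100, which PyTorch's cross-entropy loss ignores.

```python
IGNORE_INDEX = -100  # the default ignore_index of torch.nn.CrossEntropyLoss

def mask_labels(token_ids, final_answer_start):
    """Copy token_ids into labels, masking everything before the final answer.

    Tokens covering the prompt and the intermediate chain-of-thought get
    IGNORE_INDEX, so only the final assistant output contributes to the loss.
    """
    return [
        IGNORE_INDEX if i < final_answer_start else tok
        for i, tok in enumerate(token_ids)
    ]

# Example: a 10-token sequence whose final answer starts at position 6
labels = mask_labels(list(range(100, 110)), final_answer_start=6)
print(labels)  # [-100, -100, -100, -100, -100, -100, 106, 107, 108, 109]
```

In a real chat-formatted training pipeline the cutoff would be found from the chat template's role markers rather than a fixed index.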

Hyperparameters (Phase 1)

  • LoRA rank (r): 64
  • LoRA alpha: 128
  • Learning rate: 1e-4
  • Epochs: 1
  • Batch size: 2
  • Gradient accumulation: 8
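For reference, the hyperparameters above can be collected into a single config dict (a sketch; the exact Unsloth/PEFT call signature may differ, and the key names follow common transformers/PEFT conventions). Note that batch size 2 with gradient accumulation 8 gives an effective batch size of 16 per optimizer step, and that alpha/r = 128/64 = 2 is a common LoRA scaling choice.

```python
# Phase 1 hyperparameters as a PEFT/TrainingArguments-style dict (illustrative)
phase1 = {
    "r": 64,                            # LoRA rank
    "lora_alpha": 128,                  # LoRA scaling numerator (alpha / r = 2)
    "learning_rate": 1e-4,
    "num_train_epochs": 1,
    "per_device_train_batch_size": 2,
    "gradient_accumulation_steps": 8,
}

# Effective batch size per optimizer step:
effective_batch = (phase1["per_device_train_batch_size"]
                   * phase1["gradient_accumulation_steps"])
print(effective_batch)  # 16
```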

How to Use

Example Python code to load and use this adapter:

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_model = "unsloth/Qwen3-4B-Instruct-2507"
adapter = "cinnamonrooo/qwen3-structeval-phase1"

tokenizer = AutoTokenizer.from_pretrained(base_model)

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.float16,
    device_map="auto",
)

model = PeftModel.from_pretrained(model, adapter)

prompt = "Convert the following text into JSON format:\nName: John\nAge: 25"

# Use the model's own device so this works with device_map="auto"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
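Since this adapter targets machine-readable output, it can be useful to validate the generated text before passing it downstream. The helper below is a minimal sketch (the function name `extract_json` is not part of this repository) that pulls the first JSON object out of a generation and parses it with the standard-library `json` module:

```python
import json

def extract_json(text):
    """Return the parsed object from the first {...} span in text, or None."""
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        return json.loads(text[start:end + 1])
    except json.JSONDecodeError:
        return None

generated = 'Here is the result:\n{"Name": "John", "Age": 25}'
print(extract_json(generated))  # {'Name': 'John', 'Age': 25}
```

If parsing fails, a common pattern is to re-prompt the model with the error message appended.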

License and Terms

  • Training datasets: MIT License
  • Base model: subject to the original model's license
  • This adapter: released under the Apache License 2.0

Users must comply with both:

  1. The dataset license
  2. The original base model terms

Notes

  • This adapter is optimized for structured generation tasks
  • It may not improve general conversational performance
  • Designed primarily for format-following and machine-readable output accuracy

Future Plans

  • Additional training with more datasets (Phase 2)
  • Evaluation on structured output benchmarks
  • Possible quantized release versions

If you have any questions or feedback, feel free to open an issue.
