---
base_model: unsloth/Qwen3-4B-Instruct-2507
datasets:
- u-10bei/structured_data_with_cot_dataset_512_v2
- u-10bei/structured_data_with_cot_dataset_512_v4
- u-10bei/structured_data_with_cot_dataset_512_v5
language:
- en
license: apache-2.0
library_name: peft
pipeline_tag: text-generation
tags:
- qlora
- lora
- structured-output
- phase1
---
# Qwen3-4B Structured Output LoRA (Phase 1)
This repository provides a **LoRA adapter** fine-tuned from
**unsloth/Qwen3-4B-Instruct-2507** using **QLoRA with Unsloth**.
It is designed to improve the model’s ability to generate **structured outputs** such as:
- JSON
- YAML
- XML
- CSV
- other machine-readable formats
---
## What This Repository Contains
**Important**
This repository contains **LoRA adapter weights only**.
It does **not** include the base model.
To use this adapter, you must load it on top of the original base model:
```
unsloth/Qwen3-4B-Instruct-2507
```
---
## Training Details
### Training Phase
This adapter was trained as **Phase 1** using the following datasets:
- `u-10bei/structured_data_with_cot_dataset_512_v2`
- `u-10bei/structured_data_with_cot_dataset_512_v4`
- `u-10bei/structured_data_with_cot_dataset_512_v5`
Further training (Phase 2) may be performed later using additional datasets.
---
### Training Method
- Method: **QLoRA (4-bit)**
- Framework: **Unsloth + PEFT**
- Base model: `unsloth/Qwen3-4B-Instruct-2507`
- Maximum sequence length: 1024
- Loss computed only on the final assistant output
- Intermediate chain-of-thought reasoning is masked out of the loss
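The loss-masking step above can be sketched as follows. This is a minimal illustration, not the actual training code: it assumes the Hugging Face convention that label positions set to `-100` are ignored by the cross-entropy loss, and the `response_start` boundary logic is hypothetical.

```python
# Sketch of response-only loss masking (illustrative, not the actual
# training code). In the Hugging Face convention, label positions set
# to -100 are ignored by the cross-entropy loss.
IGNORE_INDEX = -100

def mask_labels(input_ids, response_start):
    """Copy input_ids into labels, masking everything before the final
    assistant response (prompt and chain-of-thought tokens)."""
    labels = list(input_ids)
    for i in range(min(response_start, len(labels))):
        labels[i] = IGNORE_INDEX
    return labels

# Example: tokens 0-5 are prompt/CoT, tokens 6+ are the final answer
labels = mask_labels([11, 22, 33, 44, 55, 66, 77, 88], response_start=6)
print(labels)  # [-100, -100, -100, -100, -100, -100, 77, 88]
```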
---
### Hyperparameters (Phase 1)
- LoRA rank (r): 64
- LoRA alpha: 128
- Learning rate: 1e-4
- Epochs: 1
- Batch size: 2
- Gradient accumulation: 8
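Two quantities implied by these settings, worked out explicitly (the per-device effective batch size from batch size × gradient accumulation, and the standard LoRA scaling factor alpha / r):

```python
# Effective batch size per optimizer step (per device):
# micro-batch size x gradient accumulation steps
batch_size = 2
grad_accum = 8
effective_batch = batch_size * grad_accum
print(effective_batch)  # 16

# LoRA scaling factor applied to the adapter update: alpha / r
lora_alpha = 128
lora_r = 64
scaling = lora_alpha / lora_r
print(scaling)  # 2.0
```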
---
## How to Use
Example Python code to load and use this adapter:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_model = "unsloth/Qwen3-4B-Instruct-2507"
adapter = "cinnamonrooo/qwen3-structeval-phase1"

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Load the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(model, adapter)

# Format the request with the model's chat template
prompt = "Convert the following text into JSON format:\nName: John\nAge: 25"
messages = [{"role": "user", "content": prompt}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
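Since the goal is machine-readable output, it is good practice to validate generated text before using it downstream. A minimal, model-independent sketch (the sample string here stands in for a model response; it is not actual output from this adapter):

```python
import json

def extract_json(text):
    """Extract and parse the first JSON object found in generated text.
    Returns None if no valid JSON object is present."""
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end == -1 or end < start:
        return None
    try:
        return json.loads(text[start:end + 1])
    except json.JSONDecodeError:
        return None

# Stand-in for a model response (illustrative only)
response = 'Here is the result:\n{"name": "John", "age": 25}'
print(extract_json(response))  # {'name': 'John', 'age': 25}
```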
---
## License and Terms
- Training datasets: MIT License
- Base model: subject to the original model license
- This adapter: released under the **Apache 2.0** license

Users must comply with both:
1. The dataset licenses
2. The original base model terms
---
## Notes
- This adapter is optimized for **structured generation tasks**
- It may not improve general conversational performance
- Designed primarily for format-following and machine-readable output accuracy
---
### Future Plans
- Additional training with more datasets (Phase 2)
- Evaluation on structured output benchmarks
- Possible quantized release versions
---
If you have any questions or feedback, feel free to open an issue.