---
base_model: unsloth/Qwen3-4B-Instruct-2507
datasets:
- u-10bei/structured_data_with_cot_dataset_512_v2
- u-10bei/structured_data_with_cot_dataset_512_v4
- u-10bei/structured_data_with_cot_dataset_512_v5
language:
- en
license: apache-2.0
library_name: peft
pipeline_tag: text-generation
tags:
- qlora
- lora
- structured-output
- phase1
---

# Qwen3-4B Structured Output LoRA (Phase 1)

This repository provides a **LoRA adapter** fine-tuned from  
**unsloth/Qwen3-4B-Instruct-2507** using **QLoRA with Unsloth**.

It is designed to improve the model’s ability to generate **structured outputs** such as:

- JSON  
- YAML  
- XML  
- CSV  
- other machine-readable formats  

---

## What This Repository Contains

**Important:** this repository contains **LoRA adapter weights only**.  
It does **not** include the base model.

To use this adapter, you must load it on top of the original base model:

```
unsloth/Qwen3-4B-Instruct-2507
```

---

## Training Details

### Training Phase

This adapter was trained as **Phase 1** using the following datasets:

- `u-10bei/structured_data_with_cot_dataset_512_v2`
- `u-10bei/structured_data_with_cot_dataset_512_v4`
- `u-10bei/structured_data_with_cot_dataset_512_v5`

Further training (Phase 2) may be performed later using additional datasets.

---

### Training Method

- Method: **QLoRA (4-bit)**
- Framework: **Unsloth + PEFT**
- Base model: `unsloth/Qwen3-4B-Instruct-2507`
- Maximum sequence length: 1024
- Loss applied only to final assistant output  
- Intermediate Chain-of-Thought reasoning is masked
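The masking rule above can be sketched in plain Python (the token IDs and span index here are hypothetical; real training code would derive them from the tokenizer output): every token before the final assistant answer gets the label `-100`, the conventional ignore value that loss functions such as PyTorch's `CrossEntropyLoss(ignore_index=-100)` skip.

```python
IGNORE_INDEX = -100  # conventional "skip this token" label in PyTorch losses

def mask_labels(token_ids, answer_start):
    """Return labels where only the final assistant answer contributes to the loss.

    token_ids:    full sequence (prompt + CoT + final answer) as ints
    answer_start: index where the final assistant answer begins
    """
    return [
        tok if i >= answer_start else IGNORE_INDEX
        for i, tok in enumerate(token_ids)
    ]

# Toy sequence: 4 prompt/CoT tokens followed by a 3-token final answer.
labels = mask_labels([11, 12, 13, 14, 21, 22, 23], answer_start=4)
print(labels)  # [-100, -100, -100, -100, 21, 22, 23]
```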

---

### Hyperparameters (Phase 1)

- LoRA rank (r): 64  
- LoRA alpha: 128  
- Learning rate: 1e-4  
- Epochs: 1  
- Batch size: 2  
- Gradient accumulation: 8  
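As a rough sanity check on these settings: a LoRA adapter of rank `r` adds `r * (d_in + d_out)` trainable parameters per adapted weight matrix (two low-rank factors, `A` of shape `(r, d_in)` and `B` of shape `(d_out, r)`). The module shapes below are hypothetical placeholders, not the actual Qwen3-4B dimensions:

```python
def lora_param_count(r, module_shapes):
    """Trainable parameters added by LoRA.

    For each adapted (d_out, d_in) weight matrix, LoRA trains
    A: (r, d_in) and B: (d_out, r), i.e. r * (d_in + d_out) parameters.
    """
    return sum(r * (d_in + d_out) for d_out, d_in in module_shapes)

# Hypothetical target-module shapes (d_out, d_in); not the real model's dims.
shapes = [(2048, 2048), (2048, 2048)]
print(lora_param_count(64, shapes))  # 524288
```

Note also that the effective batch size is batch size × gradient accumulation = 2 × 8 = 16 sequences per optimizer step.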

---

## How to Use

Example Python code to load and use this adapter:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_model = "unsloth/Qwen3-4B-Instruct-2507"
adapter = "cinnamonrooo/qwen3-structeval-phase1"

tokenizer = AutoTokenizer.from_pretrained(base_model)

# Load the base model first, then attach the LoRA adapter on top
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.float16,
    device_map="auto",
)

model = PeftModel.from_pretrained(model, adapter)

prompt = "Convert the following text into JSON format:\nName: John\nAge: 25"

# Use model.device so this works wherever device_map placed the weights
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
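Because the base model is instruction-tuned, generation is usually more reliable when the prompt is passed through the tokenizer's chat template (`tokenizer.apply_chat_template`) rather than as raw text. Qwen-family templates are ChatML-style; the manual rendering below is an illustrative sketch only, since the authoritative special tokens come from the tokenizer itself:

```python
def render_chatml(messages):
    """Render messages in a ChatML-style layout.

    Illustrative only: in real code, prefer tokenizer.apply_chat_template
    so the special tokens exactly match the model's training format.
    """
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages]
    parts.append("<|im_start|>assistant\n")  # generation prompt for the reply
    return "".join(parts)

messages = [
    {"role": "user",
     "content": "Convert the following text into JSON format:\nName: John\nAge: 25"},
]
prompt = render_chatml(messages)
print(prompt)
```

With the real tokenizer, `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` produces the authoritative prompt string.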

---

## License and Terms

- Training datasets: MIT License  
- Base model: subject to original model license  
- This adapter is released under the **Apache 2.0 License**

Users must comply with both:

1. The dataset license  
2. The original base model terms  

---

## Notes

- This adapter is optimized for **structured generation tasks**  
- It may not improve general conversational performance  
- Designed primarily for format-following and machine-readable output accuracy  

---

### Future Plans

- Additional training with more datasets (Phase 2)
- Evaluation on structured output benchmarks
- Possible quantized release versions

---

If you have any questions or feedback, feel free to open an issue.