---
base_model: unsloth/Qwen3-4B-Instruct-2507
datasets:
- u-10bei/structured_data_with_cot_dataset_512_v2
- u-10bei/structured_data_with_cot_dataset_512_v4
- u-10bei/structured_data_with_cot_dataset_512_v5
language:
- en
license: apache-2.0
library_name: peft
pipeline_tag: text-generation
tags:
- qlora
- lora
- structured-output
- phase1
---

# Qwen3-4B Structured Output LoRA (Phase 1)

This repository provides a **LoRA adapter** fine-tuned from  
**unsloth/Qwen3-4B-Instruct-2507** using **QLoRA with Unsloth**.

It is designed to improve the model’s ability to generate **structured outputs** such as:

- JSON  
- YAML  
- XML  
- CSV  
- other machine-readable formats  

---

## What This Repository Contains

**Important:** this repository contains **LoRA adapter weights only**.  
It does **not** include the base model.

To use this adapter, you must load it on top of the original base model:

```
unsloth/Qwen3-4B-Instruct-2507
```

---

## Training Details

### Training Phase

This adapter was trained as **Phase 1** using the following datasets:

- `u-10bei/structured_data_with_cot_dataset_512_v2`
- `u-10bei/structured_data_with_cot_dataset_512_v4`
- `u-10bei/structured_data_with_cot_dataset_512_v5`

Further training (Phase 2) may be performed later using additional datasets.

---

### Training Method

- Method: **QLoRA (4-bit)**
- Framework: **Unsloth + PEFT**
- Base model: `unsloth/Qwen3-4B-Instruct-2507`
- Maximum sequence length: 1024
- Loss applied only to final assistant output  
- Intermediate Chain-of-Thought reasoning is masked
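The masking rule above can be sketched in plain Python (the token IDs and span index here are hypothetical; real training code would derive them from the tokenizer output): every token before the final assistant answer gets the label `-100`, the conventional ignore value that loss functions such as PyTorch's `CrossEntropyLoss(ignore_index=-100)` skip.

```python
IGNORE_INDEX = -100  # conventional "skip this token" label in PyTorch losses

def mask_labels(token_ids, answer_start):
    """Return labels where only the final assistant answer contributes to the loss.

    token_ids:    full sequence (prompt + CoT + final answer) as ints
    answer_start: index where the final assistant answer begins
    """
    return [
        tok if i >= answer_start else IGNORE_INDEX
        for i, tok in enumerate(token_ids)
    ]

# Toy sequence: 4 prompt/CoT tokens followed by a 3-token final answer.
labels = mask_labels([11, 12, 13, 14, 21, 22, 23], answer_start=4)
print(labels)  # [-100, -100, -100, -100, 21, 22, 23]
```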

---

### Hyperparameters (Phase 1)

- LoRA rank (r): 64  
- LoRA alpha: 128  
- Learning rate: 1e-4  
- Epochs: 1  
- Batch size: 2  
- Gradient accumulation: 8  
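As a rough sanity check on these settings: a LoRA adapter of rank `r` adds `r * (d_in + d_out)` trainable parameters per adapted weight matrix (two low-rank factors, `A` of shape `(r, d_in)` and `B` of shape `(d_out, r)`). The module shapes below are hypothetical placeholders, not the actual Qwen3-4B dimensions:

```python
def lora_param_count(r, module_shapes):
    """Trainable parameters added by LoRA.

    For each adapted (d_out, d_in) weight matrix, LoRA trains
    A: (r, d_in) and B: (d_out, r), i.e. r * (d_in + d_out) parameters.
    """
    return sum(r * (d_in + d_out) for d_out, d_in in module_shapes)

# Hypothetical target-module shapes (d_out, d_in); not the real model's dims.
shapes = [(2048, 2048), (2048, 2048)]
print(lora_param_count(64, shapes))  # 524288
```

Note also that the effective batch size is batch size × gradient accumulation = 2 × 8 = 16 sequences per optimizer step.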

---

## How to Use

Example Python code to load and use this adapter:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_model = "unsloth/Qwen3-4B-Instruct-2507"
adapter = "cinnamonrooo/qwen3-structeval-phase1"

tokenizer = AutoTokenizer.from_pretrained(base_model)

# Load the base model first, then attach the LoRA adapter on top
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.float16,
    device_map="auto",
)

model = PeftModel.from_pretrained(model, adapter)

prompt = "Convert the following text into JSON format:\nName: John\nAge: 25"

# Use model.device so this works wherever device_map placed the weights
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
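Because the base model is instruction-tuned, generation is usually more reliable when the prompt is passed through the tokenizer's chat template (`tokenizer.apply_chat_template`) rather than as raw text. Qwen-family templates are ChatML-style; the manual rendering below is an illustrative sketch only, since the authoritative special tokens come from the tokenizer itself:

```python
def render_chatml(messages):
    """Render messages in a ChatML-style layout.

    Illustrative only: in real code, prefer tokenizer.apply_chat_template
    so the special tokens exactly match the model's training format.
    """
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages]
    parts.append("<|im_start|>assistant\n")  # generation prompt for the reply
    return "".join(parts)

messages = [
    {"role": "user",
     "content": "Convert the following text into JSON format:\nName: John\nAge: 25"},
]
prompt = render_chatml(messages)
print(prompt)
```

With the real tokenizer, `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` produces the authoritative prompt string.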

---

## License and Terms

- Training datasets: MIT License  
- Base model: subject to original model license  
- This adapter is released under the **Apache 2.0 License**

Users must comply with both:

1. The dataset license  
2. The original base model terms  

---

## Notes

- This adapter is optimized for **structured generation tasks**  
- It may not improve general conversational performance  
- Designed primarily for format-following and machine-readable output accuracy  

---

### Future Plans

- Additional training with more datasets (Phase 2)
- Evaluation on structured output benchmarks
- Possible quantized release versions

---

If you have any questions or feedback, feel free to open an issue.