---
license: apache-2.0
language:
- en
library_name: peft
pipeline_tag: text-generation
tags:
- qlora
- lora
- structured-output
datasets:
- u-10bei/structured_data_with_cot_dataset_512_v2
base_model: Qwen/Qwen3-4B-Instruct-2507
---

# qwen3-4b-structeval-lora

This repository provides a **LoRA adapter** fine-tuned from **Qwen/Qwen3-4B-Instruct-2507** to improve **structured output accuracy**.

⚠️ **This repository contains LoRA adapter weights only.** The base model must be downloaded separately.

---

## Training Objective

This LoRA adapter is trained to improve the model's ability to generate **strictly structured outputs**, such as:

- JSON
- YAML
- XML
- TOML
- CSV

During training:

- **Loss is applied only to the final assistant output**
- Intermediate reasoning (Chain-of-Thought) is **masked**
- Only the content after the `Output:` marker is supervised

This design improves format correctness without exposing, or overfitting to, internal reasoning traces.

---

## Training Configuration

- Base model: Qwen/Qwen3-4B-Instruct-2507
- Training method: QLoRA (4-bit, Unsloth)
- Max sequence length: 512
- Epochs: 1
- Learning rate: 1e-6
- LoRA configuration:
  - r: 64
  - alpha: 128
- Loss type: assistant-only loss (CoT masked)

---

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_model = "Qwen/Qwen3-4B-Instruct-2507"
adapter_model = "hamini58/qwen3-4b-structeval-lora"

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter_model)
model.eval()

# Example prompt (illustrative): ask the model for strictly formatted JSON.
messages = [{"role": "user", "content": "Convert to JSON: name=Alice, age=30"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
with torch.no_grad():
    outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

## Sources & Terms (IMPORTANT)

**Training dataset:** u-10bei/structured_data_with_cot_dataset_512_v2
**Dataset license:** MIT License

**Compliance:** Users must comply with:

- The MIT License of the training dataset (including its copyright notice)
- The original license and terms of use of the base model (Qwen/Qwen3-4B-Instruct-2507, Apache License 2.0)

This repository distributes **only LoRA adapter weights** and does not redistribute the base model.
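Since the adapter's goal is strict format correctness, downstream code typically verifies generated text by parsing it rather than inspecting it by eye. A minimal sketch of such a check for JSON output (standard library only; the `validate_json` helper name is illustrative and not part of this repository):

```python
import json

def validate_json(text: str) -> bool:
    """Return True if `text` parses as JSON.

    Intended for the model's final answer, i.e. the content
    after the `Output:` marker that training supervised.
    """
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

# Well-formed JSON passes; a truncated object fails.
print(validate_json('{"name": "Alice", "age": 30}'))  # True
print(validate_json('{"name": "Alice", "age":'))      # False
```

The same pattern extends to the other target formats (e.g. `yaml.safe_load` or `tomllib.loads` in place of `json.loads`), giving a simple pass/fail accuracy metric over a batch of generations.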