---
license: apache-2.0
language:
- en
library_name: peft
pipeline_tag: text-generation
tags:
- qlora
- lora
- structured-output
datasets:
- u-10bei/structured_data_with_cot_dataset_512_v2
base_model: Qwen/Qwen3-4B-Instruct-2507
---
# qwen3-4b-structeval-lora
This repository provides a **LoRA adapter** fine-tuned from
**Qwen/Qwen3-4B-Instruct-2507** to improve **structured-output accuracy**.
⚠️ **This repository contains LoRA adapter weights only.**
The base model must be downloaded separately.
---
## Training Objective
This LoRA adapter is trained to improve the model’s ability to generate
**strictly structured outputs**, such as:
- JSON
- YAML
- XML
- TOML
- CSV
During training:
- **Loss is applied only to the final assistant output**
- Intermediate reasoning (Chain-of-Thought) is **masked**
- Only the content after the `Output:` marker is supervised
This design improves format correctness without exposing internal reasoning
traces or overfitting to them.
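The masking scheme above can be sketched as follows. This is an illustrative
helper, not the actual training code: the function name, `marker_ids` (the
tokenized `Output:` marker), and the token values are assumptions.

```python
IGNORE_INDEX = -100  # PyTorch cross-entropy skips tokens with this label


def mask_before_output(input_ids, marker_ids, ignore_index=IGNORE_INDEX):
    """Mask everything up to and including the `Output:` marker so loss is
    computed only on the final structured answer (illustrative sketch)."""
    labels = list(input_ids)
    n = len(marker_ids)
    for i in range(len(input_ids) - n + 1):
        if input_ids[i:i + n] == list(marker_ids):
            for j in range(i + n):
                labels[j] = ignore_index  # prompt + CoT + marker: no loss
            return labels
    # No marker found: mask the whole sequence so it contributes no loss
    return [ignore_index] * len(labels)
```

With this scheme, only tokens after the marker keep their original labels and
are supervised; the prompt and any Chain-of-Thought contribute nothing to the
gradient.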
---
## Training Configuration
- Base model: Qwen/Qwen3-4B-Instruct-2507
- Training method: QLoRA (4-bit, Unsloth)
- Max sequence length: 512
- Epochs: 1
- Learning rate: 1e-6
- LoRA configuration:
- r: 64
- alpha: 128
- Loss type: assistant-only loss (CoT masked)
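The LoRA settings above could be expressed with plain `peft` roughly as
follows. This is a sketch under assumptions: the `target_modules` list and
dropout are not stated in this card, and the actual training used Unsloth's
QLoRA wrapper rather than this exact configuration.

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=64,             # LoRA rank, as listed above
    lora_alpha=128,   # scaling alpha, as listed above
    task_type="CAUSAL_LM",
    # Assumed attention projections; the actual target modules are not
    # documented in this card.
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
```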
---
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_model = "Qwen/Qwen3-4B-Instruct-2507"
adapter_model = "hamini58/qwen3-4b-structeval-lora"

tokenizer = AutoTokenizer.from_pretrained(base_model)

# Load the base model, then attach the LoRA adapter on top of it
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter_model)
model.eval()

# Example: request strictly structured output
messages = [{"role": "user", "content": "Return name=Alice, age=30 as JSON."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
with torch.no_grad():
    outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```
## Sources & Terms (IMPORTANT)
**Training dataset:**
u-10bei/structured_data_with_cot_dataset_512_v2
**Dataset License:**
MIT License
**Compliance:**
Users must comply with:
- The MIT License of the training dataset (including copyright notice)
- The original license and terms of use of the base model
(Qwen/Qwen3-4B-Instruct-2507, Apache License 2.0)
This repository distributes **only LoRA adapter weights** and does not
redistribute the base model.