# qwen3-4b-structeval-lora

This repository provides a LoRA adapter fine-tuned from Qwen/Qwen3-4B-Instruct-2507 to improve structured output accuracy.

⚠️ This repository contains LoRA adapter weights only. The base model must be downloaded separately.
## Training Objective

This LoRA adapter is trained to improve the model's ability to generate strictly structured outputs, such as:
- JSON
- YAML
- XML
- TOML
- CSV
During training:
- Loss is applied only to the final assistant output
- Intermediate reasoning (Chain-of-Thought) is masked
- Only the content after the `Output:` marker is supervised
This design improves format correctness without exposing or overfitting internal reasoning traces.
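The masking scheme described above can be sketched in plain Python. This is illustrative only: the actual training code operates on tokenized batches, and the marker token ids here are made up.

```python
# Illustrative sketch of assistant-only loss masking (not the actual training
# code): every token up to and including the "Output:" marker is assigned the
# ignore index -100, so cross-entropy loss is computed only on the payload.
IGNORE_INDEX = -100  # PyTorch CrossEntropyLoss default ignore_index

def mask_labels(token_ids: list[int], marker_ids: list[int]) -> list[int]:
    """Return labels where everything up to and including the marker is masked."""
    for start in range(len(token_ids) - len(marker_ids) + 1):
        if token_ids[start:start + len(marker_ids)] == marker_ids:
            cut = start + len(marker_ids)
            return [IGNORE_INDEX] * cut + token_ids[cut:]
    return [IGNORE_INDEX] * len(token_ids)  # no marker found: supervise nothing

# Toy example with made-up token ids; [7, 8] stands for the "Output:" marker.
labels = mask_labels([1, 2, 3, 7, 8, 40, 41], marker_ids=[7, 8])
# labels == [-100, -100, -100, -100, -100, 40, 41]
```

Because masked positions carry the ignore index, the gradient comes only from the structured payload, which is what prevents the Chain-of-Thought tokens from being supervised.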
## Training Configuration
- Base model: Qwen/Qwen3-4B-Instruct-2507
- Training method: QLoRA (4-bit, Unsloth)
- Max sequence length: 512
- Epochs: 1
- Learning rate: 1e-6
- LoRA configuration:
  - r: 64
  - alpha: 128
- Loss type: assistant-only loss (CoT masked)
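For context, these hyperparameters imply an adapter scaling factor of alpha / r = 128 / 64 = 2.0. A minimal NumPy sketch of the LoRA update (shapes and initialization are illustrative, not taken from this repository):

```python
# LoRA replaces a frozen weight W with W + (alpha / r) * B @ A, where A and B
# are the low-rank adapter matrices. With r=64 and alpha=128 the update is
# scaled by 2.0. Shapes below are arbitrary placeholders.
import numpy as np

r, alpha = 64, 128
d_in, d_out = 256, 256
rng = np.random.default_rng(0)

A = rng.standard_normal((r, d_in)) * 0.01  # down-projection, small random init
B = np.zeros((d_out, r))                   # up-projection, zero-initialized
W = rng.standard_normal((d_out, d_in))     # frozen base weight

scaling = alpha / r                        # 2.0 for this configuration
delta = scaling * (B @ A)
W_adapted = W + delta  # identical to W at init, since B starts at zero
```

Zero-initializing B means the adapter starts as a no-op, so training begins from the base model's behavior.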
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_model = "Qwen/Qwen3-4B-Instruct-2507"
adapter_model = "hamini58/qwen3-4b-structeval-lora"

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter_model)
model.eval()
```
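Since supervision starts after the `Output:` marker, generated responses may emit the marker before the structured payload. A small post-processing helper for stripping it (hypothetical: `extract_output` is not part of this repository, and the example response string is made up):

```python
import json

def extract_output(response: str, marker: str = "Output:") -> str:
    """Return the text after the first marker, or the whole response if absent."""
    _, sep, tail = response.partition(marker)
    return tail.strip() if sep else response.strip()

# Illustrative response following the training-data convention.
payload = extract_output('Reasoning...\nOutput: {"name": "Ada", "id": 1}')
data = json.loads(payload)  # raises ValueError if the payload is not valid JSON
```

Parsing the extracted payload (here with `json.loads`) is a cheap way to verify format correctness at inference time.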
## Sources & Terms (IMPORTANT)

- Training dataset: u-10bei/structured_data_with_cot_dataset_512_v2
- Dataset license: MIT License
- Compliance: users must comply with:
  - The MIT License of the training dataset (including its copyright notice)
  - The original license and terms of use of the base model (Qwen/Qwen3-4B-Instruct-2507, Apache License 2.0)

This repository distributes only the LoRA adapter weights and does not redistribute the base model.