qwen3-4b-structeval-lora

This repository provides a LoRA adapter fine-tuned from
Qwen/Qwen3-4B-Instruct-2507 for improving structured output accuracy.

⚠️ This repository contains LoRA adapter weights only.
The base model must be downloaded separately.


Training Objective

This LoRA adapter is trained to improve the model’s ability to generate strictly structured outputs, such as:

  • JSON
  • YAML
  • XML
  • TOML
  • CSV

During training:

  • Loss is applied only to the final assistant output
  • Intermediate reasoning (Chain-of-Thought) is masked
  • Only the content after the Output: marker is supervised

This design improves format correctness without exposing, or overfitting to, internal reasoning traces.
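The masking scheme above can be sketched in a few lines. The token values, the marker position, and the `mask_labels` helper below are illustrative assumptions, not the actual training code:

```python
IGNORE_INDEX = -100  # tokens with this label are excluded from the loss

def mask_labels(input_ids, marker_end):
    """Copy input_ids into labels, masking everything up to and
    including the Output: marker so that only the final structured
    answer is supervised."""
    labels = list(input_ids)
    for i in range(marker_end):
        labels[i] = IGNORE_INDEX
    return labels

# Illustrative token sequence: prompt + CoT + "Output:" occupy
# positions 0..4; the structured answer is positions 5..7.
input_ids = [11, 22, 33, 44, 55, 66, 77, 88]
labels = mask_labels(input_ids, marker_end=5)
# Only positions 5..7 now contribute to the cross-entropy loss
```

The `-100` ignore index is the convention used by PyTorch's cross-entropy loss, which is why masked positions carry that value.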


Training Configuration

  • Base model: Qwen/Qwen3-4B-Instruct-2507
  • Training method: QLoRA (4-bit, Unsloth)
  • Max sequence length: 512
  • Epochs: 1
  • Learning rate: 1e-6
  • LoRA configuration:
    • r: 64
    • alpha: 128
  • Loss type: assistant-only loss (CoT masked)
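With r = 64 and alpha = 128, the LoRA update is scaled by alpha / r = 2. A minimal pure-Python sketch of how such an adapter modifies a frozen linear layer (toy dimensions, not the Unsloth implementation):

```python
def lora_forward(x, W, A, B, alpha, r):
    """y = W @ x + (alpha / r) * B @ (A @ x), where W is the frozen
    base weight and only the low-rank factors A (r x d_in) and
    B (d_out x r) are trained."""
    scale = alpha / r

    def matvec(M, v):
        return [sum(m * vv for m, vv in zip(row, v)) for row in M]

    base = matvec(W, x)
    delta = matvec(B, matvec(A, x))
    return [b + scale * d for b, d in zip(base, delta)]

# Toy example: d_in = d_out = 2, rank r = 1
W = [[1.0, 0.0], [0.0, 1.0]]   # frozen base weight (identity)
A = [[1.0, 1.0]]               # r x d_in
B = [[0.5], [0.5]]             # d_out x r
y = lora_forward([1.0, 2.0], W, A, B, alpha=128, r=64)
```

Because only A and B are stored, the adapter repository stays small relative to the 4B-parameter base model.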

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_model = "Qwen/Qwen3-4B-Instruct-2507"
adapter_model = "hamini58/qwen3-4b-structeval-lora"

# The tokenizer is unchanged by LoRA fine-tuning, so load it from the base model
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Load the base model first; the adapter weights are applied on top of it
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Attach the LoRA adapter and switch to inference mode
model = PeftModel.from_pretrained(model, adapter_model)
model.eval()
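Since only the content after the Output: marker was supervised during training, downstream code will typically want to extract and validate that portion of the generated text. A minimal sketch; the marker handling and helper names are assumptions based on the training description above, not part of this repository:

```python
import json

def extract_structured_output(generated_text, marker="Output:"):
    """Return the text after the last Output: marker, or the whole
    string if the marker is absent."""
    idx = generated_text.rfind(marker)
    payload = generated_text[idx + len(marker):] if idx != -1 else generated_text
    return payload.strip()

def validate_json(payload):
    """Parse the payload as JSON; return (ok, parsed_value_or_error)."""
    try:
        return True, json.loads(payload)
    except json.JSONDecodeError as exc:
        return False, str(exc)

# Hypothetical model output containing reasoning followed by the answer
text = 'Reasoning... Output: {"name": "qwen", "params": 4}'
ok, data = validate_json(extract_structured_output(text))
```

The same extraction step applies to the other supported formats (YAML, XML, TOML, CSV); only the validation parser changes.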

Sources & Terms (IMPORTANT)

Training dataset:
u-10bei/structured_data_with_cot_dataset_512_v2

Dataset License:
MIT License

Compliance:
Users must comply with:

  • The MIT License of the training dataset (including copyright notice)
  • The original license and terms of use of the base model
    (Qwen/Qwen3-4B-Instruct-2507, Apache License 2.0)

This repository distributes only LoRA adapter weights and does not redistribute the base model.
