qwen3-4b-structeval-lora

This repository provides a LoRA adapter fine-tuned from
Qwen/Qwen3-4B-Instruct-2507 for improving structured output accuracy.

⚠️ This repository contains LoRA adapter weights only.
The base model must be downloaded separately.


Training Objective

This LoRA adapter is trained to improve the model’s ability to generate strictly structured outputs, such as:

  • JSON
  • YAML
  • XML
  • TOML
  • CSV

During training:

  • Loss is applied only to the final assistant output
  • Intermediate reasoning (Chain-of-Thought) is masked
  • Only the content after the Output: marker is supervised

This design improves format correctness without exposing, or overfitting to, internal reasoning traces.
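The masking scheme above can be sketched in a few lines. The token values, the marker position, and the `mask_labels` helper below are illustrative assumptions, not the actual training code:

```python
IGNORE_INDEX = -100  # tokens with this label are excluded from the loss

def mask_labels(input_ids, marker_end):
    """Copy input_ids into labels, masking everything up to and
    including the Output: marker so that only the final structured
    answer is supervised."""
    labels = list(input_ids)
    for i in range(marker_end):
        labels[i] = IGNORE_INDEX
    return labels

# Illustrative token sequence: prompt + CoT + "Output:" occupy
# positions 0..4; the structured answer is positions 5..7.
input_ids = [11, 22, 33, 44, 55, 66, 77, 88]
labels = mask_labels(input_ids, marker_end=5)
# Only positions 5..7 now contribute to the cross-entropy loss
```

The `-100` ignore index is the convention used by PyTorch's cross-entropy loss, which is why masked positions carry that value.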


Training Configuration

  • Base model: Qwen/Qwen3-4B-Instruct-2507
  • Training method: QLoRA (4-bit, Unsloth)
  • Max sequence length: 512
  • Epochs: 1
  • Learning rate: 1e-6
  • LoRA configuration:
    • r: 64
    • alpha: 128
  • Loss type: assistant-only loss (CoT masked)
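With r = 64 and alpha = 128, the LoRA update is scaled by alpha / r = 2. A minimal pure-Python sketch of how such an adapter modifies a frozen linear layer (toy dimensions, not the Unsloth implementation):

```python
def lora_forward(x, W, A, B, alpha, r):
    """y = W @ x + (alpha / r) * B @ (A @ x), where W is the frozen
    base weight and only the low-rank factors A (r x d_in) and
    B (d_out x r) are trained."""
    scale = alpha / r

    def matvec(M, v):
        return [sum(m * vv for m, vv in zip(row, v)) for row in M]

    base = matvec(W, x)
    delta = matvec(B, matvec(A, x))
    return [b + scale * d for b, d in zip(base, delta)]

# Toy example: d_in = d_out = 2, rank r = 1
W = [[1.0, 0.0], [0.0, 1.0]]   # frozen base weight (identity)
A = [[1.0, 1.0]]               # r x d_in
B = [[0.5], [0.5]]             # d_out x r
y = lora_forward([1.0, 2.0], W, A, B, alpha=128, r=64)
```

Because only A and B are stored, the adapter repository stays small relative to the 4B-parameter base model.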

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_model = "Qwen/Qwen3-4B-Instruct-2507"
adapter_model = "hamini58/qwen3-4b-structeval-lora"

# The tokenizer is unchanged by LoRA fine-tuning, so load it from the base model
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Load the base model first; the adapter weights are applied on top of it
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Attach the LoRA adapter and switch to inference mode
model = PeftModel.from_pretrained(model, adapter_model)
model.eval()
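Since only the content after the Output: marker was supervised during training, downstream code will typically want to extract and validate that portion of the generated text. A minimal sketch; the marker handling and helper names are assumptions based on the training description above, not part of this repository:

```python
import json

def extract_structured_output(generated_text, marker="Output:"):
    """Return the text after the last Output: marker, or the whole
    string if the marker is absent."""
    idx = generated_text.rfind(marker)
    payload = generated_text[idx + len(marker):] if idx != -1 else generated_text
    return payload.strip()

def validate_json(payload):
    """Parse the payload as JSON; return (ok, parsed_value_or_error)."""
    try:
        return True, json.loads(payload)
    except json.JSONDecodeError as exc:
        return False, str(exc)

# Hypothetical model output containing reasoning followed by the answer
text = 'Reasoning... Output: {"name": "qwen", "params": 4}'
ok, data = validate_json(extract_structured_output(text))
```

The same extraction step applies to the other supported formats (YAML, XML, TOML, CSV); only the validation parser changes.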

Sources & Terms (IMPORTANT)

Training dataset:
u-10bei/structured_data_with_cot_dataset_512_v2

Dataset License:
MIT License

Compliance:
Users must comply with:

  • The MIT License of the training dataset (including copyright notice)
  • The original license and terms of use of the base model
    (Qwen/Qwen3-4B-Instruct-2507, Apache License 2.0)

This repository distributes only LoRA adapter weights and does not redistribute the base model.
