Qwen3-4B StructEval exp001 - structured-5k-mix-sft

This repository provides a LoRA adapter fine-tuned from Qwen/Qwen3-4B-Instruct-2507 with QLoRA (4-bit quantization, trained with Unsloth).

This repository contains LoRA adapter weights only. The base model must be loaded separately.

Training Objective

This adapter is trained to improve structured output accuracy (JSON / YAML / XML / TOML / CSV).

Loss is applied only to the final assistant output; intermediate reasoning (chain-of-thought) tokens are masked out of the loss.
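The card does not include the masking code itself; the following is a minimal pure-Python sketch of the idea (the function name and token values are illustrative, not taken from the training script). Labels for every token before the final assistant answer are set to -100, the index that PyTorch's cross-entropy loss ignores, so gradients flow only from the answer tokens.

```python
IGNORE_INDEX = -100  # label value ignored by PyTorch cross-entropy


def mask_labels(token_ids, answer_start):
    """Build training labels from input token ids.

    Everything before `answer_start` (prompt + intermediate reasoning)
    is replaced with IGNORE_INDEX; only the final assistant answer
    contributes to the loss.
    """
    return [IGNORE_INDEX] * answer_start + token_ids[answer_start:]


# Toy example: 6 prompt/reasoning tokens followed by a 3-token answer.
ids = [101, 102, 103, 104, 105, 106, 7, 8, 9]
labels = mask_labels(ids, answer_start=6)
# labels -> [-100, -100, -100, -100, -100, -100, 7, 8, 9]
```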

Training Configuration

  • Experiment ID: exp001
  • Base model: Qwen/Qwen3-4B-Instruct-2507
  • Training dataset: daichira/structured-5k-mix-sft
  • Method: QLoRA (4-bit)
  • Max sequence length: 512
  • Epochs: 1
  • Learning rate: 1e-06
  • LoRA parameters: r=64, alpha=128
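As a sketch of how these hyperparameters might map onto Unsloth's API (the target modules are an assumption; the card does not list which projections were adapted):

```python
from unsloth import FastLanguageModel

# Load the base model in 4-bit (QLoRA) with the card's max sequence length.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen3-4B-Instruct-2507",
    max_seq_length=512,
    load_in_4bit=True,
)

# Attach LoRA adapters with the card's r/alpha. target_modules is an
# assumption -- the card does not say which projections were adapted.
model = FastLanguageModel.get_peft_model(
    model,
    r=64,
    lora_alpha=128,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
```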

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base = "Qwen/Qwen3-4B-Instruct-2507"
adapter = "junfukuda/qwen3-structeval-exp002-5kmix"

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base,
    torch_dtype=torch.float16,
    device_map="auto",
)
# Attach the LoRA adapter weights on top of the base model.
model = PeftModel.from_pretrained(model, adapter)

# Example prompt (illustrative): request structured JSON output.
messages = [{"role": "user", "content": "Return {\"status\": \"ok\"} as JSON."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))

Sources & Terms (IMPORTANT)

Training data: daichira/structured-5k-mix-sft

Dataset License: The training dataset is distributed under its original license terms. Refer to the dataset repository for specific license information.

Compliance: Users must comply with both the dataset's license terms and the base model's original terms of use.

Competition Context

This model was developed as part of the StructEval competition, focusing on accurate structured output generation.
