StructEval-Oriented Qwen3-4B LoRA (KazumaInoue)
1. Model Overview
This repository provides a LoRA adapter for structured-output generation, trained on top of Qwen/Qwen3-4B-Instruct-2507.
The model was developed as part of the Matsuo Lab LLM Application Course (Final Assignment). Its primary focus is improving output stability when generating structured formats such as JSON, YAML, and XML under explicit formatting constraints.
2. Training Objective and Design Rationale
The objective of this model is to generate outputs that follow explicit structural constraints specified in the prompt.
To achieve this, the following design choices were adopted:
- Loss is applied only to assistant responses (assistant-only loss).
- Output-marker-based masking is used to avoid learning intermediate reasoning tokens.
- Priority is given to format correctness rather than creative or verbose responses.
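The combination of assistant-only loss and output-marker masking can be sketched as follows. This is a minimal illustration, not the repository's actual training code; the marker token IDs are hypothetical placeholders:

```python
IGNORE_INDEX = -100  # PyTorch CrossEntropyLoss ignores this label value

def build_labels(input_ids, marker_ids):
    """Return labels that equal input_ids only after the last occurrence
    of the output-marker token sequence, and IGNORE_INDEX everywhere else,
    so loss is computed solely on the assistant's structured answer and
    intermediate reasoning tokens are not learned."""
    labels = [IGNORE_INDEX] * len(input_ids)
    start = -1
    # find the last occurrence of the marker token sequence
    for i in range(len(input_ids) - len(marker_ids) + 1):
        if input_ids[i:i + len(marker_ids)] == marker_ids:
            start = i + len(marker_ids)
    if start != -1:
        labels[start:] = input_ids[start:]
    return labels

# build_labels([1, 2, 3, 9, 9, 4, 5], [9, 9])
# masks everything up to and including the marker [9, 9]
```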
3. Training Configuration
- Base model: Qwen/Qwen3-4B-Instruct-2507
- Training method: QLoRA (4-bit quantized base model with LoRA adapters)
- Maximum sequence length: 512
- LoRA rank (r): 64
- LoRA alpha: 128
- LoRA dropout: 0.05
- Number of epochs: 1
- Optimizer and scheduler: AdamW with cosine learning rate schedule
- Training frameworks: Unsloth, PEFT, Hugging Face Transformers
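The LoRA hyperparameters above correspond to a PEFT configuration along these lines. The target modules are an assumption based on common Qwen projection-layer names and are not confirmed by this repository:

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=64,             # LoRA rank, as listed above
    lora_alpha=128,   # LoRA alpha
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    # Assumed target modules (typical Qwen attention/MLP projections);
    # the actual training run may have targeted a different set.
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
```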
4. Training Data
The model was trained using a synthetic supervised fine-tuning dataset provided by the course organizers.
- Dataset: u-10bei/structured_data_with_cot_dataset_512_v2
- Dataset type: Structured output SFT dataset
- Content: Instruction-following samples with explicit output markers
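To illustrate what "explicit output markers" means, here is a hypothetical sample in that style. The marker string and field names are assumptions for illustration only; consult the dataset card for the actual schema:

```python
import json

# Hypothetical SFT sample with an explicit output marker; the marker
# string "### Output:" and the field names are NOT the dataset's
# confirmed schema, just an illustration of the general shape.
sample = {
    "instruction": "Return the user profile as JSON with keys name and age.",
    "response": "Reasoning about the fields...\n"
                "### Output:\n"
                '{"name": "Alice", "age": 30}',
}

# Only the text after the marker is the structured answer the model is
# trained to produce; reasoning tokens before it are masked from the loss.
answer = sample["response"].split("### Output:\n", 1)[1]
print(json.loads(answer))
```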
Dataset License and Compliance
The dataset was used strictly in accordance with the course guidelines and is intended for research and educational purposes only.
5. Usage
This repository contains LoRA adapter weights only. The base model must be obtained separately.
Example usage with Unsloth:
```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen3-4B-Instruct-2507",
    max_seq_length=512,  # matches the training sequence length
    load_in_4bit=True,
)
model.load_adapter("YOUR_HF_REPO_ID")  # replace with this adapter's repo ID
```
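Since the adapter prioritizes format correctness, generated outputs can be checked programmatically. A minimal JSON validity check (the helper function below is ours, not part of this repository):

```python
import json

def is_valid_json(text: str) -> bool:
    """Return True if the generated text parses as valid JSON,
    False otherwise; useful for scoring format correctness."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

print(is_valid_json('{"name": "Alice"}'))  # well-formed JSON
print(is_valid_json('{"name": Alice}'))    # unquoted value, invalid
```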
Model tree for Kazupal/structeval-lora-qwen3-4b
- Base model: Qwen/Qwen3-4B-Instruct-2507