qwen3-4b-sft-exp10f

LoRA adapter for structured output generation (JSON, YAML, TOML, XML, CSV) based on Qwen3-4B-Instruct

This repository provides a LoRA adapter fine-tuned from Qwen/Qwen3-4B-Instruct-2507 using QLoRA (4-bit quantization).

Note: This repository contains LoRA adapter weights only. The base model must be loaded separately.

🎯 Training Objective

This adapter is trained to improve structured output generation accuracy across multiple formats:

  • JSON - JavaScript Object Notation
  • YAML - YAML Ain't Markup Language
  • TOML - Tom's Obvious, Minimal Language
  • XML - eXtensible Markup Language
  • CSV - Comma-Separated Values

Training Strategy: Loss is applied only to the final assistant output, while intermediate reasoning (Chain-of-Thought) is masked to improve output quality without overfitting to reasoning patterns.
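The masking strategy above can be sketched with plain token IDs. The helper below is illustrative (not from the actual training code); `-100` is the `ignore_index` that PyTorch's `CrossEntropyLoss` (and hence Hugging Face trainers) skip when computing the loss:

```python
def build_labels(input_ids, completion_start):
    """Mask everything before the final assistant output so the loss
    is computed only on the structured completion tokens.

    input_ids: full token sequence (prompt + CoT + final output)
    completion_start: index where the final assistant output begins
    """
    return [
        -100 if i < completion_start else tok
        for i, tok in enumerate(input_ids)
    ]

# Example: 5 prompt/CoT tokens are masked, 3 output tokens keep their IDs
labels = build_labels([101, 7, 8, 9, 102, 42, 43, 44], completion_start=5)
print(labels)  # [-100, -100, -100, -100, -100, 42, 43, 44]
```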

πŸ“Š Performance

Evaluated on public_150.json (Matsuo Lab LLM Competition 2025):

| Format  | Success Rate | Notes                |
|---------|--------------|----------------------|
| Overall | 82.67%       | -                    |
| JSON    | 98.0%        | Strong performance   |
| YAML    | 85.7%        | Strong performance   |
| CSV     | 100.0%       | Excellent            |
| XML     | 60.0%        | Room for improvement |
| TOML    | 52.0%        | Needs improvement    |

Task-Level Analysis

  • Format conversion: Strong on JSON/YAML/CSV conversions
  • Challenges: CSV-to-JSON/XML/YAML, YAML-to-XML, Text-to-TOML
  • Strengths: CSV generation (100%), JSON/YAML parsing and conversion

βš™οΈ Training Configuration

| Parameter               | Value                       |
|-------------------------|-----------------------------|
| Base Model              | Qwen/Qwen3-4B-Instruct-2507 |
| Method                  | QLoRA (4-bit quantization)  |
| Max Sequence Length     | 512                         |
| Epochs                  | 1                           |
| Learning Rate           | 1e-06                       |
| LoRA Rank (r)           | 64                          |
| LoRA Alpha              | 128                         |
| Batch Size (per device) | 2                           |
| Gradient Accumulation   | 8 steps                     |
| Effective Batch Size    | 16                          |
| Target Modules          |                             |
| Optimizer               | AdamW                       |
| Weight Decay            | 0.05                        |
| Warmup Ratio            | 0.1                         |
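Assuming standard `peft` and `bitsandbytes` APIs were used (the model card does not state this explicitly), the hyperparameters above roughly correspond to a configuration like the following sketch. The compute dtype is an assumption, and the target modules are not listed in the card, so none are specified here:

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit quantization for QLoRA
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumption: not stated in the card
)

# LoRA hyperparameters from the table above
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    task_type="CAUSAL_LM",
    # target_modules are not specified in the card; peft's defaults apply
)
```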

πŸš€ Usage

Option 1: Transformers + PEFT (Recommended for CPU/single GPU)

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model and adapter
base_model_id = "Qwen/Qwen3-4B-Instruct-2507"
adapter_id = "poprap/qwen3-4b-sft-exp10f"

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter_id)

# Generate structured output
prompt = "Generate a JSON object representing a book with title, author, and year."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=False,  # greedy decoding; temperature is ignored when sampling is off
)
result = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(result)

Option 2: vLLM (Recommended for production/batch inference)

from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# Initialize vLLM with LoRA support; the adapter itself is attached
# per request via LoRARequest (see the generate call below)
llm = LLM(
    model="Qwen/Qwen3-4B-Instruct-2507",
    enable_lora=True,
    max_lora_rank=64,
    gpu_memory_utilization=0.9,
)

# Generate with LoRA
sampling_params = SamplingParams(
    temperature=0.0,
    max_tokens=512,
)

prompts = [
    "Generate a JSON object representing a book with title, author, and year.",
    "Convert the following to YAML: {\"name\": \"Alice\", \"age\": 30}",
]

outputs = llm.generate(
    prompts,
    sampling_params,
    lora_request=LoRARequest("structeval-adapter", 1, "poprap/qwen3-4b-sft-exp10f"),
)

for output in outputs:
    print(output.outputs[0].text)

Option 3: CLI Usage (Quick Testing)

# Install dependencies
pip install transformers peft torch

# Run inference
python -c "
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base = 'Qwen/Qwen3-4B-Instruct-2507'
adapter = 'poprap/qwen3-4b-sft-exp10f'

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.float16, device_map='auto')
model = PeftModel.from_pretrained(model, adapter)

prompt = 'Generate a JSON object representing a person with name and age.'
inputs = tokenizer(prompt, return_tensors='pt').to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
"

πŸ”§ Installation

# For Transformers + PEFT
pip install transformers peft torch accelerate bitsandbytes

# For vLLM (faster inference)
pip install vllm

# Download adapter
hf download poprap/qwen3-4b-sft-exp10f

πŸ’‘ Tips for Best Results

  1. Use greedy decoding for deterministic structured output (temperature=0.0 in vLLM; do_sample=False in Transformers)
  2. Set max_tokens appropriately based on expected output length
  3. For batch inference, use vLLM for 10-20x speedup
  4. For production, consider merging the adapter into the base model (here model is the PeftModel loaded in Option 1):
    merged_model = model.merge_and_unload()
    merged_model.save_pretrained("./merged_model")
    

πŸ† Competition Context

This model was developed for the Matsuo Lab LLM Course 2025 Final Competition (松尾研 LLM 講座 2025 ζœ€η΅‚θͺ²ι‘Œγ‚³γƒ³γƒšγƒ†γ‚£γ‚·γƒ§γƒ³).

  • Competition: Main Track - StructEval Benchmark
  • Task: Structured output generation with format conversion
  • Constraint: Qwen3-4B-Instruct-2507 base model only
  • Training Data: Official competition datasets only

πŸ“¦ Training Data & License

Training Datasets

  • u-10bei/structured_data_with_cot_dataset_512_v2
  • u-10bei/structured_data_with_cot_dataset_512_v4
  • u-10bei/structured_data_with_cot_dataset_512
  • daichira/structured-3k-mix-sft
  • daichira/structured-hard-sft-4k

License Information

  • Dataset License: Creative Commons Attribution (CC-BY-4.0)
  • Base Model License: Apache 2.0 (Qwen3-4B-Instruct)
  • Adapter License: CC-BY-4.0 (follows training data license)

Important: Users must comply with:

  1. CC-BY-4.0 attribution requirements for training data
  2. Apache 2.0 terms for the Qwen3 base model
  3. Responsible AI guidelines

πŸ› Troubleshooting

OOM (Out of Memory) Errors

  • Use 4-bit quantization (pass a BitsAndBytesConfig with load_in_4bit=True to from_pretrained)
  • Reduce max_model_len in vLLM
  • Use CPU offloading with device_map="auto"
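The first two bullets can be combined when loading the base model. A minimal sketch, assuming a bitsandbytes-compatible GPU (the current Transformers API wraps the quantization flags in a BitsAndBytesConfig):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-4B-Instruct-2507",
    quantization_config=bnb_config,
    device_map="auto",  # offloads layers to CPU when GPU memory is tight
)
```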

Slow Inference

  • Switch to vLLM for 10-20x speedup
  • Use batch inference when possible
  • Consider model merging for repeated use

Unexpected Output Format

  • Check that temperature=0.0 for deterministic output
  • Verify prompt format matches training data
  • Ensure adapter is properly loaded
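Before assuming the adapter misfired, it can help to check mechanically whether the generated text parses at all. A minimal validator using only the standard library (JSON, CSV, and XML; YAML and TOML would need extra parsers such as PyYAML or tomllib):

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

def validate_output(text, fmt):
    """Return True if `text` parses as the requested format."""
    try:
        if fmt == "json":
            json.loads(text)
        elif fmt == "csv":
            rows = list(csv.reader(io.StringIO(text)))
            if not rows:
                return False
        elif fmt == "xml":
            ET.fromstring(text)
        else:
            raise ValueError(f"unsupported format: {fmt}")
        return True
    except (json.JSONDecodeError, ET.ParseError, csv.Error):
        return False

print(validate_output('{"title": "Dune", "year": 1965}', "json"))  # True
print(validate_output('{"title": "Dune",', "json"))                # False
```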

πŸ“š Citation

@misc{qwen3-4b-sft-exp10f,
  title={qwen3-4b-sft-exp10f: LoRA Adapter for Structured Output Generation},
  author={Matsuo Lab LLM Competition 2025},
  year={2025},
  publisher={HuggingFace},
  howpublished={\url{https://huggingface.co/poprap/qwen3-4b-sft-exp10f}},
}

πŸ“§ Contact

For questions or issues, please open an issue on the GitHub repository or contact via HuggingFace discussions.
