---
license: apache-2.0
language:
- en
library_name: peft
pipeline_tag: text-generation
tags:
- qlora
- lora
- structured-output
datasets:
- u-10bei/structured_data_with_cot_dataset_512_v2
base_model: Qwen/Qwen3-4B-Instruct-2507
---

# qwen3-4b-structeval-lora

This repository provides a **LoRA adapter** fine-tuned from **Qwen/Qwen3-4B-Instruct-2507** to improve **structured output accuracy**.

⚠️ **This repository contains LoRA adapter weights only.** The base model must be downloaded separately.

---

## Training Objective

This LoRA adapter is trained to improve the model's ability to generate **strictly structured outputs**, such as:

- JSON
- YAML
- XML
- TOML
- CSV

During training:

- **Loss is applied only to the final assistant output**
- Intermediate reasoning (Chain-of-Thought) is **masked**
- Only the content after the `Output:` marker is supervised

This design improves format correctness without exposing, or overfitting to, internal reasoning traces.

---

## Training Configuration

- Base model: Qwen/Qwen3-4B-Instruct-2507
- Training method: QLoRA (4-bit, Unsloth)
- Max sequence length: 512
- Epochs: 1
- Learning rate: 1e-6
- LoRA configuration:
  - r: 64
  - alpha: 128
- Loss type: assistant-only loss (CoT masked)

---

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_model = "Qwen/Qwen3-4B-Instruct-2507"
adapter_model = "hamini58/qwen3-4b-structeval-lora"

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter_model)
model.eval()

# Example prompt (illustrative): ask the model for strictly formatted JSON.
messages = [{"role": "user", "content": "Convert to JSON: name=Alice, age=30"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
with torch.no_grad():
    outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

## Sources & Terms (IMPORTANT)

**Training dataset:** u-10bei/structured_data_with_cot_dataset_512_v2
**Dataset license:** MIT License

**Compliance:** Users must comply with:

- The MIT License of the training dataset (including its copyright notice)
- The original license and terms of use of the base model (Qwen/Qwen3-4B-Instruct-2507, Apache License 2.0)

This repository distributes **only LoRA adapter weights** and does not redistribute the base model.
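Since the adapter's goal is strict format correctness, downstream code typically verifies generated text by parsing it rather than inspecting it by eye. A minimal sketch of such a check for JSON output (standard library only; the `validate_json` helper name is illustrative and not part of this repository):

```python
import json

def validate_json(text: str) -> bool:
    """Return True if `text` parses as JSON.

    Intended for the model's final answer, i.e. the content
    after the `Output:` marker that training supervised.
    """
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

# Well-formed JSON passes; a truncated object fails.
print(validate_json('{"name": "Alice", "age": 30}'))  # True
print(validate_json('{"name": "Alice", "age":'))      # False
```

The same pattern extends to the other target formats (e.g. `yaml.safe_load` or `tomllib.loads` in place of `json.loads`), giving a simple pass/fail accuracy metric over a batch of generations.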