# Qwen3 4B Structured Output Model
This repository provides a LoRA adapter fine-tuned from Qwen/Qwen3-4B-Instruct-2507 using QLoRA (4-bit, Unsloth).
## Training Objective
This adapter is trained to improve structured-output accuracy (JSON / YAML / XML / TOML / CSV) by teaching the model to emit raw data without conversational wrappers.
## Key Features & Training Strategy
- Complexity-Aware Reasoning: The model is trained with a dynamic approach to Chain-of-Thought (CoT).
  - For simple and medium tasks, reasoning is omitted to prioritize direct, high-speed structured data generation.
  - For high-complexity tasks, the reasoning process is preserved to ensure accuracy and logical consistency during complex data transformations.
- Noise Reduction (Forbidden Tokens): Common conversational fillers (e.g., "Here is the data...") and markdown code fences (e.g., `` ```json ``) are masked during training. This forces the model to output clean, raw structured text suitable for programmatic parsing.
- Assistant-Focused Learning: The training loss is applied exclusively to the final assistant responses. User instructions and internal reasoning steps are excluded from the gradient calculation, focusing the model's capacity on producing the correct final answer.
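The masking described above (forbidden tokens plus assistant-only loss) can be sketched as a label-building step. This is a simplified illustration: real training operates on token IDs rather than strings, and the forbidden-token list and the function name `build_labels` are hypothetical, not the repository's actual values.

```python
# Simplified sketch: loss labels are set to -100 (the index Hugging Face
# loss functions ignore) for every token that is not part of a clean
# assistant response. Forbidden fillers/fences are masked as well.
IGNORE_INDEX = -100

FORBIDDEN = ("Here is the data", "```")  # illustrative, not the real list

def build_labels(tokens, assistant_mask):
    """Return per-token labels: masked everywhere except clean assistant tokens."""
    labels = []
    for tok, is_assistant in zip(tokens, assistant_mask):
        if not is_assistant or any(f in tok for f in FORBIDDEN):
            labels.append(IGNORE_INDEX)  # excluded from the gradient
        else:
            labels.append(tok)           # contributes to the training loss
    return labels
```

With this scheme, only the raw structured payload of the assistant turn ever receives gradient signal; prompts, reasoning, and fence tokens are all ignored by the loss.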
## Training Configuration
- Base model: Qwen/Qwen3-4B-Instruct-2507
- Method: QLoRA (4-bit)
- Max sequence length: 512
- Epochs: 2
- Learning rate: 7e-06
- LoRA: r=64, alpha=128
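The table above can be expressed as a `peft`/`bitsandbytes` configuration. This is a minimal sketch: the 4-bit quantization settings and `target_modules` (the usual Qwen attention/MLP projections) are assumptions, not published details of this run.

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit quantization as typically used for QLoRA (assumed settings).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA hyperparameters from the table above; target_modules is an
# assumption, not a detail published for this adapter.
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
```

Note that with r=64 and alpha=128 the adapter update is scaled by alpha/r = 2.0.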
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base = "Qwen/Qwen3-4B-Instruct-2507"
adapter = "zerg2187/lora_structeval_t_qwen3_penalty_tokens_v2_d5"

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base,
    torch_dtype=torch.float16,
    device_map="auto",
)
# Attach the LoRA adapter on top of the base model.
model = PeftModel.from_pretrained(model, adapter)
```
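Because the adapter is trained to emit raw structured text with no fillers or code fences, replies can usually be parsed directly. A minimal, hypothetical helper (`parse_structured` is our name, not part of the repository) that still strips stray fences defensively:

```python
import json

def parse_structured(reply: str):
    """Parse a model reply as JSON, tolerating stray code fences just in case."""
    text = reply.strip()
    if text.startswith("```"):
        # Drop an opening fence (with optional language tag) and the closing fence.
        if "\n" in text:
            text = text.split("\n", 1)[1]
        text = text.rsplit("```", 1)[0]
    return json.loads(text)
```

For example, `parse_structured('{"name": "Alice", "age": 30}')` returns the corresponding Python dict.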
## Sources & Terms (IMPORTANT)

- Training data: u-10bei/structured_data_with_cot_dataset_512_v5
- Dataset license: MIT. The dataset is used and distributed under the terms of the MIT License; users must retain the copyright notice required by MIT and comply with the base model's original terms of use.