---
license: apache-2.0
language:
- en
library_name: peft
pipeline_tag: text-generation
tags:
- qlora
- lora
- structured-output
datasets:
- u-10bei/structured_data_with_cot_dataset_512_v2
base_model: Qwen/Qwen3-4B-Instruct-2507
---
# qwen3-4b-structeval-lora
This repository provides a **LoRA adapter** fine-tuned from
**Qwen/Qwen3-4B-Instruct-2507** to improve **structured-output accuracy**.
⚠️ **This repository contains LoRA adapter weights only.**
The base model must be downloaded separately.
---
## Training Objective
This LoRA adapter is trained to improve the model’s ability to generate
**strictly structured outputs**, such as:
- JSON
- YAML
- XML
- TOML
- CSV
During training:
- **Loss is applied only to the final assistant output**
- Intermediate reasoning (Chain-of-Thought) is **masked**
- Only the content after the `Output:` marker is supervised
This design improves format correctness without exposing internal reasoning
traces or overfitting to them.
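The masking scheme above can be sketched as follows. This is an illustrative
helper, not the actual training code: the function name, `marker_ids` (the
tokenized `Output:` marker), and the token values are assumptions.

```python
IGNORE_INDEX = -100  # PyTorch cross-entropy skips tokens with this label


def mask_before_output(input_ids, marker_ids, ignore_index=IGNORE_INDEX):
    """Mask everything up to and including the `Output:` marker so loss is
    computed only on the final structured answer (illustrative sketch)."""
    labels = list(input_ids)
    n = len(marker_ids)
    for i in range(len(input_ids) - n + 1):
        if input_ids[i:i + n] == list(marker_ids):
            for j in range(i + n):
                labels[j] = ignore_index  # prompt + CoT + marker: no loss
            return labels
    # No marker found: mask the whole sequence so it contributes no loss
    return [ignore_index] * len(labels)
```

With this scheme, only tokens after the marker keep their original labels and
are supervised; the prompt and any Chain-of-Thought contribute nothing to the
gradient.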
---
## Training Configuration
- Base model: Qwen/Qwen3-4B-Instruct-2507
- Training method: QLoRA (4-bit, Unsloth)
- Max sequence length: 512
- Epochs: 1
- Learning rate: 1e-6
- LoRA configuration:
- r: 64
- alpha: 128
- Loss type: assistant-only loss (CoT masked)
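The LoRA settings above could be expressed with plain `peft` roughly as
follows. This is a sketch under assumptions: the `target_modules` list and
dropout are not stated in this card, and the actual training used Unsloth's
QLoRA wrapper rather than this exact configuration.

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=64,             # LoRA rank, as listed above
    lora_alpha=128,   # scaling alpha, as listed above
    task_type="CAUSAL_LM",
    # Assumed attention projections; the actual target modules are not
    # documented in this card.
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
```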
---
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_model = "Qwen/Qwen3-4B-Instruct-2507"
adapter_model = "hamini58/qwen3-4b-structeval-lora"

tokenizer = AutoTokenizer.from_pretrained(base_model)

# Load the base model, then attach the LoRA adapter on top of it
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter_model)
model.eval()

# Example: request strictly structured output
messages = [{"role": "user", "content": "Return name=Alice, age=30 as JSON."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
with torch.no_grad():
    outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```
## Sources & Terms (IMPORTANT)
**Training dataset:**
u-10bei/structured_data_with_cot_dataset_512_v2
**Dataset License:**
MIT License
**Compliance:**
Users must comply with:
- The MIT License of the training dataset (including copyright notice)
- The original license and terms of use of the base model
(Qwen/Qwen3-4B-Instruct-2507, Apache License 2.0)
This repository distributes **only LoRA adapter weights** and does not
redistribute the base model.