---
base_model: unsloth/Qwen3-4B-Instruct-2507
datasets:
- u-10bei/structured_data_with_cot_dataset_512_v2
- u-10bei/structured_data_with_cot_dataset_512_v4
- u-10bei/structured_data_with_cot_dataset_512_v5
language:
- en
license: apache-2.0
library_name: peft
pipeline_tag: text-generation
tags:
- qlora
- lora
- structured-output
- phase1
---
|
|
|
|
|
# Qwen3-4B Structured Output LoRA (Phase 1)

This repository provides a **LoRA adapter** fine-tuned from **unsloth/Qwen3-4B-Instruct-2507** using **QLoRA with Unsloth**.

It is designed to improve the model's ability to generate **structured outputs** such as:

- JSON
- YAML
- XML
- CSV
- other machine-readable formats

---
|
|
|
|
|
## What This Repository Contains

⚠ **Important**: this repository contains **LoRA adapter weights only**. It does **not** include the base model.

To use this adapter, you must load it on top of the original base model:

```
unsloth/Qwen3-4B-Instruct-2507
```

---
|
|
|
|
|
## Training Details

### Training Phase

This adapter was trained as **Phase 1** on the following datasets:

- `u-10bei/structured_data_with_cot_dataset_512_v2`
- `u-10bei/structured_data_with_cot_dataset_512_v4`
- `u-10bei/structured_data_with_cot_dataset_512_v5`

Further training (Phase 2) may be performed later with additional datasets.

---
|
|
|
|
|
### Training Method

- Method: **QLoRA (4-bit)**
- Framework: **Unsloth + PEFT**
- Base model: `unsloth/Qwen3-4B-Instruct-2507`
- Maximum sequence length: 1024
- Loss applied only to the final assistant output
- Intermediate Chain-of-Thought reasoning is masked out of the loss
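The setup behind these bullets can be sketched with Unsloth's API. This is a reconstruction, not the actual training script (which is not published): target modules are left at Unsloth's defaults, and the loss masking of Chain-of-Thought tokens would typically be done with a helper such as Unsloth's `train_on_responses_only` applied to the trainer.

```python
from unsloth import FastLanguageModel

# Load the base model in 4-bit precision for QLoRA fine-tuning.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen3-4B-Instruct-2507",
    max_seq_length=1024,
    load_in_4bit=True,
)

# Attach LoRA adapters; r and lora_alpha match the values listed
# under Hyperparameters below. Target modules use Unsloth defaults.
model = FastLanguageModel.get_peft_model(
    model,
    r=64,
    lora_alpha=128,
)
```

After this, the model would be passed to a trainer whose loss is restricted to the final assistant response, so intermediate reasoning tokens receive no gradient.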
|
|
|
|
|
--- |
|
|
|
|
|
### Hyperparameters (Phase 1)

- LoRA rank (r): 64
- LoRA alpha: 128
- Learning rate: 1e-4
- Epochs: 1
- Batch size: 2 (per device)
- Gradient accumulation: 8 (effective batch size 16)
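Expressed as a TRL `SFTConfig` fragment, these values would look roughly like this; optimizer, scheduler, and warmup choices are not documented in this card, so they are left at library defaults here:

```python
from trl import SFTConfig

# Effective batch size = 2 per device x 8 accumulation steps = 16.
training_args = SFTConfig(
    learning_rate=1e-4,
    num_train_epochs=1,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    output_dir="outputs",  # arbitrary; not from the original card
)
```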
|
|
|
|
|
--- |
|
|
|
|
|
## How to Use

Example Python code to load and use this adapter:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_model = "unsloth/Qwen3-4B-Instruct-2507"
adapter = "cinnamonrooo/qwen3-structeval-phase1"

tokenizer = AutoTokenizer.from_pretrained(base_model)

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter)

prompt = "Convert the following text into JSON format:\nName: John\nAge: 25"

# Format the prompt with the chat template expected by the instruct model.
messages = [{"role": "user", "content": prompt}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)  # use the model's device rather than hard-coding "cuda"

outputs = model.generate(inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
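For deployment without the PEFT wrapper, the adapter can be folded into the base weights using PEFT's standard merge pattern. This is a sketch; the output directory name is arbitrary:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = "unsloth/Qwen3-4B-Instruct-2507"
adapter = "cinnamonrooo/qwen3-structeval-phase1"

model = AutoModelForCausalLM.from_pretrained(base_model)
model = PeftModel.from_pretrained(model, adapter)

# Merge the LoRA deltas into the base weights and save a standalone model.
merged = model.merge_and_unload()
merged.save_pretrained("qwen3-4b-structured-merged")
AutoTokenizer.from_pretrained(base_model).save_pretrained("qwen3-4b-structured-merged")
```

The merged checkpoint can then be loaded with plain `AutoModelForCausalLM.from_pretrained` and no PEFT dependency.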
|
|
|
|
|
--- |
|
|
|
|
|
## License and Terms

- Training datasets: MIT License
- Base model: subject to the original model license
- This adapter: released under the **Apache 2.0 License**

Users must comply with both:

1. The dataset licenses
2. The original base model terms
|
|
|
|
|
--- |
|
|
|
|
|
## Notes

- This adapter is optimized for **structured generation tasks**
- It may not improve general conversational performance
- It is designed primarily for format-following and machine-readable output accuracy
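Since structured output is only useful when it parses, a minimal stdlib check on the generated text can catch format failures early. The helper below is an illustration, not part of this repository:

```python
import json

def validate_json_output(text: str):
    """Return the parsed object if `text` is valid JSON, else None."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None

# A well-formed candidate parses; a malformed one returns None.
ok = validate_json_output('{"Name": "John", "Age": 25}')    # → {'Name': 'John', 'Age': 25}
bad = validate_json_output('{"Name": "John", "Age": }')     # → None
```

A simple retry loop around generation, gated on this check, is a common way to harden structured-output pipelines.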
|
|
|
|
|
--- |
|
|
|
|
|
### Future Plans

- Additional training with more datasets (Phase 2)
- Evaluation on structured output benchmarks
- Possible quantized release versions
|
|
|
|
|
--- |
|
|
|
|
|
If you have any questions or feedback, feel free to open an issue. |
|
|
|