---
license: mit
tags:
- qwen
- fine-tuning
- LoRA
- flask
- instruction-tuning
- PEFT
- qwen3
language: en
base_model: Qwen/Qwen3-0.6B-Base
library_name: transformers
model-index:
- name: Qwen3-Flask LoRA Adapter
results: []
---
# 🧩 Qwen3-Flask LoRA Adapter (Q&A Fine-Tuning)
This repository contains the **LoRA adapter only** version of [Qwen/Qwen3-0.6B-Base](https://huggingface.co/Qwen/Qwen3-0.6B-Base), fine-tuned on a high-quality dataset derived from Flask's official documentation, source code, and changelogs.
Use this if you want a **lightweight, plug-and-play adapter** on top of the base Qwen3-0.6B model.
---
## 🎯 Objective
- Help developers understand Flask’s internals more intuitively
- Convert verbose docstrings and changelogs into **question-answer pairs**
- Enable real-world integration using Alpaca-style instruction prompts
---
## 🧠 Use Cases
- Answer questions about Flask APIs (`before_request`, `url_defaults`, etc.)
- Provide upgrade/migration insights from older Flask versions
- Summarize internal logic in conversational, Q&A format
---
## 🛠️ Adapter Details
| Setting | Value |
|----------------|----------------------|
| PEFT method | LoRA (Low-Rank Adaptation) |
| LoRA Rank | 16 |
| Alpha | 32 |
| Target Modules | `query_key_value` |
| Quantization | 4-bit NF4 (bitsandbytes) |
| Base Model | Qwen/Qwen3-0.6B-Base |
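The table above indicates the base model was loaded in 4-bit NF4 via bitsandbytes during fine-tuning. A minimal sketch of that loading configuration with `transformers` is shown below; the `bnb_4bit_use_double_quant` and compute-dtype choices are assumptions, since the card does not state them, and are not necessarily the author's exact training setup:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit NF4 quantization config matching the table above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumption: not stated in this card
    bnb_4bit_use_double_quant=True,         # assumption: common default, not stated here
)

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-0.6B-Base",
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)
```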
---
## 🧪 Dataset Overview
- Total chunks extracted from Flask: **804**
- Valid, logic-rich chunks selected: **345**
- Final Gemini-generated Q&A pairs: **1,425**
Example:
```json
{
  "instruction": "How does `url_defaults` work in Flask?",
  "input": "When used on an app, this is called for every request...",
  "output": "`url_defaults` is triggered for every request when registered on an app. When registered on a blueprint, it affects only requests handled by that blueprint..."
}
```
---
## 🧠 Prompt Format (Alpaca Style)
```text
### Instruction:
What is the difference between `url_defaults` on app vs blueprint?
### Input:
Docstring excerpt from Flask...
### Response:
<Model-generated explanation>
```
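To build these prompts programmatically, a small helper like the following (hypothetical, not part of this repo) reproduces the format shown above:

```python
def build_alpaca_prompt(instruction: str, input_text: str = "") -> str:
    """Format an instruction/input pair in the Alpaca style used for fine-tuning."""
    return (
        "### Instruction:\n"
        f"{instruction}\n"
        "### Input:\n"
        f"{input_text or 'None'}\n"  # the dataset uses the literal string "None" for empty inputs
        "### Response:"
    )

prompt = build_alpaca_prompt("What does `before_request` do in Flask?")
print(prompt)
```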
---
## 🧪 How to Use (with PEFT)
```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model and its tokenizer
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-0.6B-Base", trust_remote_code=True, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B-Base", trust_remote_code=True)

# Attach the LoRA adapter on top of the base weights
model = PeftModel.from_pretrained(base_model, "devanshdhir/qwen3-flask-lora")

# Alpaca-style prompt, matching the training format
prompt = """### Instruction:
What does `before_request` do in Flask?
### Input:
None
### Response:"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=300)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
---
## 🔁 Want the Merged Model?
Use [**devanshdhir/qwen3-flask-full**](https://huggingface.co/devanshdhir/qwen3-flask-full)
It has the LoRA adapter **merged** into the base weights for direct inference — no PEFT loading required.
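Alternatively, you can produce merged weights yourself with PEFT's `merge_and_unload`. A sketch is below; it downloads both the base model and the adapter, and the output directory name is illustrative:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-0.6B-Base", trust_remote_code=True)
model = PeftModel.from_pretrained(base, "devanshdhir/qwen3-flask-lora")

# Fold the LoRA deltas into the base weights, returning a plain transformers model
merged = model.merge_and_unload()
merged.save_pretrained("qwen3-flask-merged")  # hypothetical output path
```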
---
## ⚠️ Limitations
* Responses depend on Alpaca-style prompting
* Does not generalize well outside Flask/internal documentation
* Training data was generated with Gemini and not manually curated
---
## 🔗 Related
* 🧠 Base model: [Qwen/Qwen3-0.6B-Base](https://huggingface.co/Qwen/Qwen3-0.6B-Base)
* 🧠 Merged version: [devanshdhir/qwen3-flask-full](https://huggingface.co/devanshdhir/qwen3-flask-full)
* 🧪 PEFT Documentation: [https://github.com/huggingface/peft](https://github.com/huggingface/peft)
---
## 📎 Citation
```bibtex
@misc{qwen3flasklora2025,
title = {Qwen3-Flask LoRA Adapter},
author = {Devansh Dhir},
year = {2025},
url = {https://huggingface.co/devanshdhir/qwen3-flask-lora}
}
```