---
title: Qwen2.5 LoRA Fine-tuning
emoji: πŸ€–
colorFrom: purple
colorTo: pink
sdk: gradio
sdk_version: 4.0.0
app_file: app.py
pinned: false
license: apache-2.0
---

# πŸ€– LoRA Fine-tuning for Qwen2.5-7B-Instruct

Fine-tune Qwen2.5-7B-Instruct with LoRA on your sysadmin personality dataset, using the proper Qwen2 chat template.

## Features

- βœ… **Qwen2 Chat Template** - Proper system/user/assistant formatting
- βœ… **4-bit Quantization** - QLoRA for memory efficiency
- βœ… **PEFT Integration** - Parameter-efficient fine-tuning
- βœ… **Custom System Prompt** - Configurable personality
- βœ… **Gradio UI** - Easy web interface
- βœ… **Auto Push to Hub** - Direct upload after training

## Quick Start

1. **Upgrade to GPU**: Settings β†’ Hardware β†’ select a GPU (A10G recommended)
2. **Configure Training**: Set your dataset and parameters
3. **Set System Prompt**: Customize the AI personality
4. **Add HF Token** (optional): For private datasets or pushing models
5. **Start Training**: Click the button and monitor progress

## Default Configuration

Optimized for A10G (24GB VRAM):

- Batch size: 4
- Gradient accumulation: 4
- Max sequence length: 2048
- LoRA rank: 16
- 4-bit quantization: enabled
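The defaults above roughly correspond to a QLoRA setup with `peft` and `bitsandbytes`. This is an illustrative sketch only - the app's actual variable names, target modules, and `lora_alpha` choice are assumptions, not taken from its source:

```python
# Sketch of the defaults above as peft/transformers config objects.
# target_modules and lora_alpha are assumed values, not the app's actual ones.
from transformers import BitsAndBytesConfig
from peft import LoraConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                 # 4-bit quantization: enabled
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype="bfloat16",
)

lora_config = LoraConfig(
    r=16,                              # LoRA rank: 16
    lora_alpha=32,                     # common convention: 2x the rank
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```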

For T4 (16GB VRAM):

- Reduce batch size to 2
- Increase gradient accumulation to 8
- Reduce max sequence length to 1024
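Note that both presets keep the same effective batch size (per-device batch size Γ— accumulation steps), so the T4 settings trade training speed for memory without changing the optimizer-step batch:

```python
def effective_batch_size(per_device: int, grad_accum: int, num_gpus: int = 1) -> int:
    """Batch size seen by the optimizer at each update step."""
    return per_device * grad_accum * num_gpus

print(effective_batch_size(4, 4))  # A10G preset -> 16
print(effective_batch_size(2, 8))  # T4 preset   -> 16
```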

## Qwen2 Chat Template

The training automatically formats your data using Qwen2's chat template:

```
<|im_start|>system
You are an experienced Linux system administrator.<|im_end|>
<|im_start|>user
How do I check disk usage?<|im_end|>
<|im_start|>assistant
Use the 'df -h' command...<|im_end|>
```
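In practice the tokenizer's `apply_chat_template` handles this rendering, but the layout itself can be sketched in plain Python (a minimal illustration of the format above, not the app's code):

```python
def qwen2_chat_format(messages, add_generation_prompt=False):
    """Render messages in the <|im_start|>/<|im_end|> layout shown above."""
    text = "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    )
    if add_generation_prompt:
        # At inference time, cue the model to produce the assistant turn
        text += "<|im_start|>assistant\n"
    return text

example = qwen2_chat_format([
    {"role": "system", "content": "You are an experienced Linux system administrator."},
    {"role": "user", "content": "How do I check disk usage?"},
])
print(example)
```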

## Dataset Format

Your dataset should have Q&A pairs in one of these formats:

- `{"question": "...", "answer": "..."}`
- `{"instruction": "...", "response": "..."}`
- `{"input": "...", "output": "..."}`
- `{"text": "..."}`

The app auto-detects and converts to proper chat format.
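The detection logic might look something like this (a hypothetical sketch based on the field names listed above; the app's actual implementation may differ):

```python
# Supported (question, answer) key pairs, checked in order.
FIELD_PAIRS = [("question", "answer"), ("instruction", "response"), ("input", "output")]

def to_chat_messages(example, system_prompt):
    """Map any supported Q&A schema to chat messages; 'text' passes through as-is."""
    for q_key, a_key in FIELD_PAIRS:
        if q_key in example and a_key in example:
            return [
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": example[q_key]},
                {"role": "assistant", "content": example[a_key]},
            ]
    if "text" in example:
        return example["text"]  # already pre-formatted
    raise ValueError(f"Unrecognized example keys: {sorted(example)}")

msgs = to_chat_messages(
    {"instruction": "List files", "response": "Use ls -l"},
    "You are a sysadmin.",
)
print(msgs[1]["content"])  # -> List files
```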

## Using Your Fine-tuned Model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model on GPU and attach the LoRA adapter
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-7B-Instruct", torch_dtype="auto", device_map="auto"
)
model = PeftModel.from_pretrained(base_model, "crazycog/qwen-lora-sysadmin")
tokenizer = AutoTokenizer.from_pretrained("crazycog/qwen-lora-sysadmin")

# Generate with the Qwen2 chat template
messages = [
    {"role": "system", "content": "You are an experienced Linux system administrator."},
    {"role": "user", "content": "How do I check memory usage?"},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
# Decode only the newly generated tokens, not the echoed prompt
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

## Training Time & Cost

| GPU | Cost/Hour | Training Time (10k examples) | Total Cost |
|---|---|---|---|
| T4 (16GB) | $0.60 | ~4 hours | ~$2.40 |
| A10G (24GB) ⭐ | $1.00 | ~2 hours | ~$2.00 |
| A100 (40GB) | $3.00 | ~1 hour | ~$3.00 |

## System Prompt Examples

**Linux SysAdmin:**

```
You are an experienced Linux system administrator with deep knowledge of system operations, troubleshooting, and best practices.
```

**DevOps Engineer:**

```
You are a DevOps engineer specializing in cloud infrastructure, CI/CD, and container orchestration.
```

**Security Expert:**

```
You are a cybersecurity expert specializing in Linux hardening, network security, and threat detection.
```

## Troubleshooting

### Out of Memory

- Enable 4-bit quantization βœ“
- Reduce batch size to 2
- Increase gradient accumulation to 8
- Reduce max sequence length to 1024

### Slow Training

- Verify a GPU is enabled in the Space settings
- Upgrade to A10G or A100
- Reduce max sequence length if most examples are shorter

### Upload Failed

- Check that your HF token has write permissions
- Verify the repo name doesn't conflict with an existing repo
- Try creating the repo manually first

## License

Apache 2.0