Deploying to Hugging Face Spaces with GPU
This guide shows how to deploy the fine-tuning project to Hugging Face Spaces to leverage GPU training.
Prerequisites
- Hugging Face account with Pro license (for GPU access)
- Hugging Face CLI installed and authenticated
Setup Steps
1. Authenticate with Hugging Face
huggingface-cli login
Enter your HF token when prompted.
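If you prefer to authenticate from Python rather than the CLI, the same token works through huggingface_hub (a minimal sketch):

from huggingface_hub import login

login()  # prompts for your HF token; or pass it explicitly with login(token="hf_...")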
2. Create a New Space
Go to https://huggingface.co/spaces and click "Create new Space":
- Owner: Your username/organization
- Space name: qwen-codeforces-finetune (or your preferred name)
- License: Apache 2.0 (or your choice)
- Space SDK: Gradio
- Space hardware: GPU - T4 small (or higher for faster training)
- Important: You need HF Pro to access GPU hardware
- T4 small is sufficient for this 0.5B model
- For faster training, consider A10G or A100
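If you would rather not click through the web UI, the Space can also be created programmatically; a sketch with huggingface_hub, using the example names from this guide (the hardware value assumes you have HF Pro):

from huggingface_hub import HfApi

api = HfApi()
api.create_repo(
    repo_id="YOUR_USERNAME/qwen-codeforces-finetune",
    repo_type="space",
    space_sdk="gradio",
    space_hardware="t4-small",  # GPU tiers require HF Pro; omit to start on CPU
)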
3. Clone Your New Space
git clone https://huggingface.co/spaces/YOUR_USERNAME/qwen-codeforces-finetune
cd qwen-codeforces-finetune
4. Copy Project Files
Copy these files from your local project into the Space directory (run this from your local project, adjusting the target path if the Space was cloned elsewhere):
cp app.py requirements.txt finetune.py test_model.py README.md .gitignore ./qwen-codeforces-finetune/
5. Push to Space
git add .
git commit -m "Initial commit: Qwen fine-tuning on Codeforces CoTs"
git push
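As an alternative to git, the same files can be pushed with the Hub API; a sketch, assuming you run it from your local project directory:

from huggingface_hub import HfApi

api = HfApi()
api.upload_folder(
    folder_path=".",  # your local project directory
    repo_id="YOUR_USERNAME/qwen-codeforces-finetune",
    repo_type="space",
    allow_patterns=["app.py", "requirements.txt", "finetune.py",
                    "test_model.py", "README.md", ".gitignore"],
)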
6. Configure Space Hardware
After pushing, go to your Space settings:
- Navigate to "Settings" tab
- Under "Space hardware", select a GPU option:
- T4 small: Good for testing (16 GB VRAM)
- A10G small: Faster training (24 GB VRAM)
- A100: Fastest but more expensive (40 GB VRAM)
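Hardware can also be switched from code instead of the Settings tab; a sketch using huggingface_hub (the repo id is the example name from this guide):

from huggingface_hub import HfApi

api = HfApi()
api.request_space_hardware(
    repo_id="YOUR_USERNAME/qwen-codeforces-finetune",
    hardware="t4-small",  # or "a10g-small", "a100-large"
)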
7. Monitor Training
Once the Space builds and runs:
- Click the "Start Training" button
- Watch the real-time output in the interface
- Training will save checkpoints every 200 steps
- Final model saved to ./qwen-codeforces-cots/ (the relevant training arguments are sketched below)
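The checkpoint interval and output directory come from the training arguments in finetune.py; a simplified sketch of what that configuration looks like (the actual script may differ in its details):

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./qwen-codeforces-cots",  # where the final adapter ends up
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,
    max_steps=1000,
    save_steps=200,                       # checkpoint every 200 steps
)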
Training Time Estimates
With 1000 steps and batch size 1 (gradient accumulation 16):
- T4 small: ~2-3 hours
- A10G small: ~1-2 hours
- A100: ~30-60 minutes
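With batch size 1 and gradient accumulation of 16, each optimizer step processes 16 examples, so 1000 steps works through roughly 16,000 training examples regardless of GPU tier; the differences above come purely from hardware throughput.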
Downloading the Trained Model
After training completes on Spaces:
Option 1: Via Files Tab
- Go to your Space's "Files" tab
- Navigate to qwen-codeforces-cots/
- Download the adapter files:
  - adapter_config.json
  - adapter_model.safetensors (or .bin)
  - tokenizer_config.json
  - special_tokens_map.json
  - Other tokenizer files
Option 2: Via Git
git clone https://huggingface.co/spaces/YOUR_USERNAME/qwen-codeforces-finetune
cd qwen-codeforces-finetune
# Model will be in qwen-codeforces-cots/ directory
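Cloning pulls the whole Space; if you only want the adapter directory, a pattern-filtered download with huggingface_hub works too (a sketch):

from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="YOUR_USERNAME/qwen-codeforces-finetune",
    repo_type="space",
    allow_patterns=["qwen-codeforces-cots/*"],  # fetch only the adapter directory
)
print(local_dir)  # local path that contains qwen-codeforces-cots/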
Option 3: Upload to Model Hub
After training, you can upload the adapter to the Hugging Face Model Hub:
from huggingface_hub import HfApi
api = HfApi()
api.upload_folder(
folder_path="./qwen-codeforces-cots",
repo_id="YOUR_USERNAME/qwen-codeforces-cots-lora",
repo_type="model",
)
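Note that upload_folder does not create the target repository for you; if YOUR_USERNAME/qwen-codeforces-cots-lora does not exist yet, create it first with api.create_repo("YOUR_USERNAME/qwen-codeforces-cots-lora", exist_ok=True) or through the Hub UI.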
Then load it anywhere:
from peft import PeftModel
from transformers import AutoModelForCausalLM
base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")
model = PeftModel.from_pretrained(base_model, "YOUR_USERNAME/qwen-codeforces-cots-lora")
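To sanity-check the downloaded adapter, load the tokenizer and run a single generation (a minimal sketch; the prompt is only an illustration):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")
inputs = tokenizer("Solve this Codeforces problem: ...", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))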
Cost Considerations
With HF Pro ($9/month):
- Get 5 free GPU hours per month
- Additional GPU time is charged based on hardware tier
- T4 small: ~$0.60/hour
- A10G small: ~$3.15/hour
For 1000 steps (~2-3 hours on T4), training costs:
- Within free tier: $0
- If exceeding free hours: ~$1.20-1.80
Troubleshooting
Space Crashes or OOM
- Reduce per_device_train_batch_size in finetune.py (see the sketch after this list)
- Reduce max_seq_length to 1024 or 512
- Ensure you selected a GPU hardware option
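Assuming finetune.py uses TRL's SFTTrainer (an assumption based on the max_seq_length setting mentioned above), the memory-related knobs look roughly like this:

from trl import SFTConfig

sft_config = SFTConfig(
    output_dir="./qwen-codeforces-cots",
    per_device_train_batch_size=1,   # lower this first when you hit OOM
    gradient_accumulation_steps=16,  # raise this to keep the effective batch size
    max_seq_length=1024,             # try 1024, then 512, if memory is still tight
    gradient_checkpointing=True,     # trades compute for a large memory saving
)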
Training Not Starting
- Check Space logs in the "Logs" tab
- Verify all dependencies are in requirements.txt
- Make sure GPU hardware is selected (not CPU)
Slow Training
- Upgrade to A10G or A100 hardware
- Increase batch size if you have VRAM headroom
- Confirm 4-bit quantization is active (it should be enabled automatically when CUDA is available)
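That 4-bit path usually means the base model is loaded through bitsandbytes; a sketch of what such a configuration looks like (the exact settings in finetune.py may differ):

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-0.5B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)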
Alternative: Hugging Face AutoTrain
For a no-code option, consider using Hugging Face AutoTrain:
pip install autotrain-advanced
autotrain llm --train --model Qwen/Qwen2.5-0.5B-Instruct \
--data-path . --lr 2e-4 --batch-size 1 \
--epochs 1 --trainer sft