metadata
title: Phi-3 DPO Training on BEIR
emoji: 🚀
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
pinned: false
models:
- microsoft/Phi-3-mini-4k-instruct
Phi-3 Mini DPO Training on BEIR Dataset
This Space trains Phi-3 Mini using DPO (Direct Preference Optimization) on BEIR relevance data.
Features
- Automatic checkpoint saving to Hugging Face Hub
- Live inference during training via API
- Custom validation similar to evaluate.py
- Optimized for A10G GPU
Setup
Set these secrets in your Space:
- HF_TOKEN: Your Hugging Face write token
- HF_USERNAME: Your Hugging Face username
- WANDB_API_KEY: (Optional) For experiment tracking
The model will be saved to:
https://huggingface.co/{HF_USERNAME}/phi3-dpo-beir
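As a rough illustration of how the secrets map to the destination repository, the sketch below builds the target URL from the environment. The function name `hub_repo_url` and the `env` parameter are hypothetical and not part of this Space's code; only the secret names and the `phi3-dpo-beir` repository name come from the text above.

```python
import os

REPO_NAME = "phi3-dpo-beir"  # repository name taken from the URL above


def hub_repo_url(env=None):
    """Return the Hub URL the trained model would be saved to.

    `env` is a hypothetical parameter: it defaults to os.environ, and
    passing a plain dict makes the function easy to try out locally.
    """
    env = os.environ if env is None else env
    username = env.get("HF_USERNAME", "").strip()
    if not username:
        raise ValueError("HF_USERNAME is not set")
    if not env.get("HF_TOKEN"):
        raise ValueError("HF_TOKEN is not set")
    # WANDB_API_KEY is optional, so it is deliberately not checked here.
    return f"https://huggingface.co/{username}/{REPO_NAME}"
```

Calling `hub_repo_url()` inside the Space would fail early with a clear message if a required secret is missing, which is usually easier to diagnose than a failed upload later in training.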
Training Progress
Training starts automatically once the Space is running. Check the Space logs for progress.
API Endpoints
While training is in progress, you can test the model through these endpoints:
# Check health
curl http://localhost:5000/health
# Run inference
curl -X POST http://localhost:5000/inference \
-H "Content-Type: application/json" \
-d '{
"query": "What is machine learning?",
"document": "Machine learning is a subset of AI..."
}'
# List checkpoints
curl http://localhost:5000/checkpoints
# Force reload latest checkpoint
curl -X POST "http://localhost:5000/inference?reload=true" \
-H "Content-Type: application/json" \
-d '{"query": "...", "document": "..."}'
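The same inference call can be made from Python. The following is a minimal sketch using only the standard library; the helper names `build_inference_request` and `send` are hypothetical, and since the response format is not documented here, the body is returned as raw text rather than parsed.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:5000"  # host and port taken from the curl examples


def build_inference_request(query, document, reload=False, base_url=BASE_URL):
    """Build (but do not send) a POST request for the /inference endpoint."""
    url = f"{base_url}/inference"
    if reload:
        # Mirrors the ?reload=true form used to force a checkpoint reload.
        url += "?" + urllib.parse.urlencode({"reload": "true"})
    body = json.dumps({"query": query, "document": document}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30):
    """Send a prepared request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

For example, `print(send(build_inference_request("What is machine learning?", "Machine learning is a subset of AI...")))` would issue the same request as the second curl command. Keeping request construction separate from sending makes it possible to inspect the URL and payload without a running server.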
Files
- train.csv: Training data with prompts, chosen, and rejected responses
- val.csv: Validation data in same format
- test.csv: Test data in same format
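Before starting a run, it can help to check that a data file really contains complete preference triples. The sketch below assumes the CSV header uses the column names `prompt`, `chosen`, and `rejected`; the actual header names are not stated here, so adjust `REQUIRED_COLUMNS` to match the files. The function name `load_preference_rows` is hypothetical.

```python
import csv
import io

# Assumed header names; the real files may label these columns differently.
REQUIRED_COLUMNS = ("prompt", "chosen", "rejected")


def load_preference_rows(csv_file):
    """Read preference triples from an open CSV file object.

    Raises ValueError if a required column is missing or a field is empty.
    """
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []
    missing = [c for c in REQUIRED_COLUMNS if c not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = []
    for index, row in enumerate(reader, start=1):
        # Short rows yield None for absent fields, so normalize to "".
        record = {c: (row[c] or "").strip() for c in REQUIRED_COLUMNS}
        empty = [c for c, value in record.items() if not value]
        if empty:
            raise ValueError(f"record {index}: empty fields {empty}")
        rows.append(record)
    return rows
```

With a file on disk this would be called as `load_preference_rows(open("train.csv", newline="", encoding="utf-8"))`; an `io.StringIO` object works equally well for a quick check.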