
metadata
title: Phi-3 DPO Training on BEIR
emoji: 🚀
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
pinned: false
models:
  - microsoft/Phi-3-mini-4k-instruct

Phi-3 Mini DPO Training on BEIR Dataset

This Space trains Phi-3 Mini using DPO (Direct Preference Optimization) on BEIR relevance data.
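
For readers new to DPO, here is a minimal sketch of the standard per-pair DPO objective in plain Python. It is not taken from this Space's training script. The function name `dpo_loss` and the default `beta=0.1` are illustrative assumptions, and the inputs are the summed log-probabilities of each response under the trained policy and under a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    beta=0.1 is an illustrative default, not a value taken from this Space.
    """
    # Implicit rewards: how much more the policy likes each response
    # than the reference model does.
    chosen_reward = policy_chosen_logp - ref_chosen_logp
    rejected_reward = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_reward - rejected_reward)

    # loss = -log(sigmoid(margin)) = log(1 + exp(-margin)),
    # evaluated in a numerically stable way for either sign of margin.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

The loss equals log 2 when the policy and the reference agree, and it shrinks as the policy assigns relatively more probability to the chosen response than to the rejected one.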

Features

  • Automatic checkpoint saving to Hugging Face Hub
  • Live inference during training via API
  • Custom validation similar to evaluate.py
  • Optimized for A10G GPU

Setup

  1. Set these secrets in your Space:

    • HF_TOKEN: Your Hugging Face write token
    • HF_USERNAME: Your Hugging Face username
    • WANDB_API_KEY: (Optional) For experiment tracking
  2. The model will be saved to: https://huggingface.co/{HF_USERNAME}/phi3-dpo-beir
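
Space secrets are exposed to the running container as environment variables. A minimal sketch of how a startup check could validate them and derive the target repository URL is shown here. The helper name `target_repo_url` is hypothetical and not part of this Space's code.

```python
import os


def target_repo_url(env=None):
    """Return the Hub URL the model is saved to, per the Setup section.

    `env` defaults to os.environ; passing a dict makes the check easy to test.
    """
    env = os.environ if env is None else env
    # HF_TOKEN and HF_USERNAME are required; WANDB_API_KEY is optional.
    missing = [name for name in ("HF_TOKEN", "HF_USERNAME") if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required Space secrets: " + ", ".join(missing))
    return f"https://huggingface.co/{env['HF_USERNAME']}/phi3-dpo-beir"
```

Failing fast on a missing secret avoids discovering the problem only when the first checkpoint upload is attempted.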

Training Progress

Training starts automatically as soon as the Space is running. Check the Space logs for progress.

API Endpoints

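While training is in progress, you can test the model over HTTP. The examples use port 5000, while the Space metadata declares app_port: 7860, so adjust the host and port to match how your deployment exposes the API: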

# Check health
curl http://localhost:5000/health

# Run inference
curl -X POST http://localhost:5000/inference \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is machine learning?",
    "document": "Machine learning is a subset of AI..."
  }'

# List checkpoints
curl http://localhost:5000/checkpoints

# Force reload latest checkpoint
curl -X POST 'http://localhost:5000/inference?reload=true' \
  -H "Content-Type: application/json" \
  -d '{"query": "...", "document": "..."}'
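
The same inference call can be made from Python. The sketch below uses only the standard library and mirrors the curl examples above. The helper names `build_inference_request` and `call` are hypothetical, and decoding the response as JSON is an assumption, since the response format is not documented here.

```python
import json
import urllib.parse
import urllib.request

# Base URL copied from the curl examples; change it for your deployment.
BASE_URL = "http://localhost:5000"


def build_inference_request(query, document, reload=False, base_url=BASE_URL):
    """Build the POST /inference request without sending it."""
    url = base_url + "/inference"
    if reload:
        # Same as the "?reload=true" variant that forces a checkpoint reload.
        url += "?" + urllib.parse.urlencode({"reload": "true"})
    body = json.dumps({"query": query, "document": document}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call(request, timeout=30):
    """Send a request and decode the body, assuming the API returns JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the Space running, `call(build_inference_request("What is machine learning?", "Machine learning is a subset of AI..."))` sends the same request as the second curl example.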

Files

  • train.csv: Training data; each row has a prompt, a chosen response, and a rejected response
  • val.csv: Validation data in the same format
  • test.csv: Test data in the same format
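
Before starting a long training run, it can help to check that each CSV has the expected shape. The sketch below assumes the column names `prompt`, `chosen`, and `rejected`. Those names are a guess based on the description above, so adjust `REQUIRED_COLUMNS` to match the real headers. The function `load_preference_rows` is hypothetical.

```python
import csv
import io

# Assumed column names; the actual CSV headers may differ.
REQUIRED_COLUMNS = ("prompt", "chosen", "rejected")


def load_preference_rows(file_obj, required=REQUIRED_COLUMNS):
    """Read preference pairs from an open CSV file and validate them."""
    reader = csv.DictReader(file_obj)
    header = reader.fieldnames or []
    missing = [column for column in required if column not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")

    rows = []
    for index, row in enumerate(reader, start=1):
        # Reject rows where any required field is absent or blank.
        if any(not (row[column] or "").strip() for column in required):
            raise ValueError(f"empty required field in data row {index}")
        rows.append({column: row[column] for column in required})
    return rows


# Tiny in-memory example standing in for train.csv.
sample = io.StringIO("prompt,chosen,rejected\nIs the document relevant?,yes,no\n")
rows = load_preference_rows(sample)
```

For a real file, open it with `open("train.csv", newline="", encoding="utf-8")` and pass the file object to `load_preference_rows`.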