# pi0-FAST YouTube Cleaning Experiment

Thesis: Openly available YouTube videos can be used to fine-tune vision-language-action (VLA) models so that they produce valid robot actions.

## Key Results

| Metric | Value |
|---|---|
| Base model loss | 39.47 |
| Fine-tuned loss | 12.91 |
| Improvement | 67.3% |
| Joint limit compliance | 100% |
| Training time | 77 min (1 epoch, A100 80GB) |
| Trainable params | 13.3M / 2.9B (0.45% LoRA) |
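As a quick sanity check, the headline numbers above are mutually consistent; a minimal sketch recomputing them from the table's values:

```python
# Values copied from the Key Results table.
base_loss, finetuned_loss = 39.47, 12.91
improvement = 100 * (1 - finetuned_loss / base_loss)
print(f"Loss improvement: {improvement:.1f}%")  # 67.3%

# Rounded parameter counts give ~0.46%, matching the reported ~0.45% LoRA fraction.
trainable, total = 13.3e6, 2.9e9
print(f"Trainable fraction: {100 * trainable / total:.2f}%")
```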

## Pipeline

```
YouTube cleaning videos → HaMeR 3D hand tracking → VLM labeling →
Franka IK retargeting → LeRobot HDF5 → pi0-FAST LoRA fine-tuning
```
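The retargeting step must keep IK solutions inside the Franka Panda's joint limits, which is what the 100% compliance metric measures. A minimal sketch of such a check (limits taken from the official Panda datasheet; the actual retargeting code lives in training/ and may differ):

```python
import numpy as np

# Franka Panda joint position limits (rad), per the official datasheet.
Q_MIN = np.array([-2.8973, -1.7628, -2.8973, -3.0718, -2.8973, -0.0175, -2.8973])
Q_MAX = np.array([ 2.8973,  1.7628,  2.8973, -0.0698,  2.8973,  3.7525,  2.8973])

def clamp_to_limits(q: np.ndarray, margin: float = 0.0) -> np.ndarray:
    """Clamp a 7-DoF joint configuration into the Panda's limits."""
    return np.clip(q, Q_MIN + margin, Q_MAX - margin)

def within_limits(q: np.ndarray) -> bool:
    """True if every joint of q lies inside the Panda's limits."""
    return bool(np.all((q >= Q_MIN) & (q <= Q_MAX)))
```

Clamping raw IK output this way guarantees the compliance check passes by construction.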

## Repository Structure

```
├── EXPERIMENT_SUMMARY.txt     # Full thesis, results, deployment path
├── training/
│   ├── training_results.json  # All hyperparams, loss curve, eval metrics
│   ├── train_pi0fast.py       # Training script
│   ├── train.log              # Raw training log
│   └── adapter_config.json    # LoRA adapter config
├── evaluation/
│   ├── eval_results.json      # Per-episode evaluation metrics
│   └── eval_*.mp4             # MuJoCo rendering videos (source + Franka)
├── plots/
│   ├── loss_curve.png         # Training loss curve
│   ├── loss_curve_log.png     # Loss curve (log scale)
│   ├── lr_schedule.png        # Learning rate schedule
│   ├── base_vs_finetuned.png  # Comparison bar chart
│   └── training_speed.png     # Wall clock time vs steps
└── checkpoint_epoch1/         # LoRA adapter weights (51MB)
    ├── adapter_model.safetensors
    └── adapter_config.json
```

## What We Proved

  1. Valid actions: 100% of predicted actions fall within Franka joint limits, from YouTube-derived data alone
  2. Significant learning: 67.3% loss reduction in a single epoch
  3. Preserved base knowledge: LoRA (0.45% of params) leaves pretrained capabilities intact
  4. Minimal data pipeline: 2 YouTube videos → 361 episodes, fully automated
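The adapter in checkpoint_epoch1/ is defined by a peft-style LoRA configuration. A hypothetical sketch of such a config follows; the rank, alpha, dropout, and target modules here are illustrative guesses, not the values actually used (those are recorded in adapter_config.json):

```python
from peft import LoraConfig

# Illustrative values only -- the real settings are in
# checkpoint_epoch1/adapter_config.json.
lora_config = LoraConfig(
    r=16,                      # low-rank dimension (guess)
    lora_alpha=32,             # scaling factor (guess)
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # guess
    bias="none",
    task_type="CAUSAL_LM",     # pi0-FAST decodes actions as discrete tokens
)
```

Keeping the trainable fraction this small (~0.45% of the 2.9B base) is what lets the fine-tune learn the new task without overwriting pretrained capabilities.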

See EXPERIMENT_SUMMARY.txt for full details on the deployment path.
