EpisodeVault: open-source tool to find out why your LeRobot model regressed
Been building with LeRobot v3 and kept hitting the same wall: retrain a policy, it gets worse, no clear idea why. DVC tells you which files changed. MLflow tells you which run. Neither tells you which tasks dropped or which episodes degraded between dataset versions.
Built a small open source library to fill that gap. Four commands:
episodevault track ./my_dataset
episodevault commit -m "added kitchen episodes"
episodevault diff v1.0 v2.0
episodevault blame model_v3
The blame command traces a model version back to the exact dataset version that trained it and shows the diff. One line in your training script to enable it.
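To make the idea concrete, here is a minimal Python sketch of what an episode-level diff and a blame lookup could look like. This is not EpisodeVault's actual code or API: the function names (`fingerprint`, `diff_versions`, `blame`), the `lineage` and `versions` dictionaries, and the sample episode metadata are all hypothetical, made up purely for illustration.

```python
import hashlib
import json


def fingerprint(episode: dict) -> str:
    """Return a stable hash of one episode's metadata (hypothetical helper)."""
    payload = json.dumps(episode, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def diff_versions(old: dict, new: dict) -> dict:
    """Compare two {episode_id: metadata} mappings at the episode level."""
    old_ids, new_ids = set(old), set(new)
    changed = sorted(
        ep for ep in old_ids & new_ids
        if fingerprint(old[ep]) != fingerprint(new[ep])
    )
    return {
        "added": sorted(new_ids - old_ids),
        "removed": sorted(old_ids - new_ids),
        "changed": changed,
    }


def blame(model: str, lineage: dict, versions: dict) -> dict:
    """Find the dataset version a model was trained on and diff it against its parent."""
    version = lineage[model]
    parent = versions[version]["parent"]
    old = versions[parent]["episodes"] if parent else {}
    return {
        "dataset_version": version,
        "diff": diff_versions(old, versions[version]["episodes"]),
    }


# Made-up example data: two dataset versions and one model-to-dataset record.
versions = {
    "v1.0": {
        "parent": None,
        "episodes": {
            0: {"task": "pick cup", "length": 120},
            1: {"task": "open drawer", "length": 95},
        },
    },
    "v2.0": {
        "parent": "v1.0",
        "episodes": {
            0: {"task": "pick cup", "length": 120},
            1: {"task": "open drawer", "length": 80},
            2: {"task": "wipe counter", "length": 140},
        },
    },
}
lineage = {"model_v3": "v2.0"}

result = blame("model_v3", lineage, versions)
print(result)
```

In this toy setup, the lineage record plays the role of the one line in the training script: it stores which dataset version a model saw, so a later lookup can go from model name to dataset version to episode-level diff.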
GitHub: https://github.com/Rohan-Prabhakar/EpisodeVault
PyPI: https://pypi.org/project/episodevault/
Would genuinely like to know if this matches a pain point people are feeling. Happy to add support for other dataset structures if there are edge cases I haven't covered.