hubnemo/so101_matchbox_reward_fpv_less_bias
How to use Orellius/so101_matchbox_fpv_reward_model with LeRobot:
A reward classifier is a lightweight neural network that scores observations or trajectories for task success, providing a learned reward signal or offline evaluation when explicit rewards are unavailable.
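To make the idea concrete, here is a minimal sketch of how a classifier score can be turned into a reward signal. It is not the architecture of this model: a real reward classifier is a neural network over camera observations, whereas this stand-in uses a plain logistic model over a hypothetical feature vector. The names `success_probability`, `binary_reward`, the weights, and the 0.5 threshold are all assumptions chosen for illustration.

```python
import math
from typing import Sequence


def success_probability(
    features: Sequence[float], weights: Sequence[float], bias: float
) -> float:
    """Logistic score in [0, 1] for one observation's feature vector.

    Stand-in for a learned classifier; weights and bias are hypothetical.
    """
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    logit = sum(f * w for f, w in zip(features, weights)) + bias
    # Numerically stable sigmoid: avoid overflow for large |logit|.
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    e = math.exp(logit)
    return e / (1.0 + e)


def binary_reward(probability: float, threshold: float = 0.5) -> float:
    """Map a success probability to a sparse 0/1 reward."""
    return 1.0 if probability >= threshold else 0.0


if __name__ == "__main__":
    # Hypothetical features extracted from one observation.
    p = success_probability([0.8, -0.1], weights=[2.0, 1.0], bias=-0.5)
    print(f"p(success)={p:.3f}, reward={binary_reward(p)}")
```

The probability can be used directly as a dense reward, or thresholded as above when a sparse success signal is preferred.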
This policy has been trained and pushed to the Hub using LeRobot. See the full documentation at LeRobot Docs.
For a complete walkthrough, see the training guide. Below is the short version on how to train and run inference/eval:
python lerobot/scripts/train.py \
--dataset.repo_id=${HF_USER}/<dataset> \
--policy.type=act \
--output_dir=outputs/train/<desired_policy_repo_id> \
--job_name=lerobot_training \
--policy.device=cuda \
--policy.repo_id=${HF_USER}/<desired_policy_repo_id> \
--wandb.enable=true
Training writes checkpoints to outputs/train/<desired_policy_repo_id>/checkpoints/.
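If you want to pick up the most recent checkpoint from a script, the following sketch may help. It assumes that each checkpoint is saved in a subdirectory named after its training step (for example `020000`); that naming is an assumption here, so check your own output directory first. The helper name `latest_checkpoint` is hypothetical and not part of LeRobot.

```python
from pathlib import Path
from typing import Optional


def latest_checkpoint(checkpoints_dir: Path) -> Optional[Path]:
    """Return the step directory with the highest number, or None.

    Assumes checkpoint subdirectories have purely numeric names;
    anything else (e.g. a `last` alias) is ignored.
    """
    if not checkpoints_dir.is_dir():
        return None
    step_dirs = [
        p for p in checkpoints_dir.iterdir() if p.is_dir() and p.name.isdigit()
    ]
    return max(step_dirs, key=lambda p: int(p.name), default=None)


if __name__ == "__main__":
    # Replace the placeholder with your own policy repo id.
    ckpt = latest_checkpoint(Path("outputs/train/<desired_policy_repo_id>/checkpoints"))
    print(ckpt if ckpt is not None else "no checkpoints found")
```

Comparing by integer value rather than by string keeps the ordering correct even if step names are not zero-padded.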
python -m lerobot.record \
--robot.type=so100_follower \
--dataset.repo_id=<hf_user>/eval_<dataset> \
--policy.path=<hf_user>/<desired_policy_repo_id> \
--episodes=10
Prefix the dataset repo with eval_ and supply --policy.path pointing to a local or hub checkpoint.