qb1t/lekiwi-hilserl-reward-data
How to use qb1t/lekiwi-reward-classifier with LeRobot:
A reward classifier is a lightweight neural network that scores observations or trajectories for task success, providing a learned reward signal or offline evaluation when explicit rewards are unavailable.
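As a rough illustration of how a classifier score can become a reward signal, the sketch below maps a raw logit to a success probability and thresholds it into a binary reward. This is a generic sketch, not LeRobot's implementation: the function names and the 0.5 threshold are assumptions chosen for the example.

```python
import math


def success_probability(logit: float) -> float:
    """Map a raw classifier logit to a probability with a numerically stable sigmoid."""
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    # For negative logits, exp(logit) stays small, which avoids overflow.
    e = math.exp(logit)
    return e / (1.0 + e)


def binary_reward(logit: float, threshold: float = 0.5) -> float:
    """Return 1.0 when the predicted success probability meets the threshold, else 0.0.

    The threshold value is an assumption made for this example.
    """
    return 1.0 if success_probability(logit) >= threshold else 0.0


# Hypothetical logits for three observations: likely failure, uncertain, likely success.
logits = [-2.0, 0.0, 3.0]
rewards = [binary_reward(x) for x in logits]
```

Raising the threshold trades missed successes for fewer false positives, which usually matters more when the reward drives policy learning than when it is used for offline evaluation.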
This reward model has been trained and pushed to the Hub using LeRobot. See the full documentation at LeRobot Docs.
lerobot-train \
--dataset.repo_id=${HF_USER}/<dataset> \
--reward_model.type=reward_classifier \
--output_dir=outputs/train/<desired_reward_model_repo_id> \
--job_name=lerobot_reward_training \
--reward_model.device=cuda \
--reward_model.repo_id=${HF_USER}/<desired_reward_model_repo_id> \
--wandb.enable=true
Training writes checkpoints to outputs/train/<desired_reward_model_repo_id>/checkpoints/.
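To pick up the most recent checkpoint after training, a small helper like the one below may be useful. It assumes each checkpoint is saved in a subdirectory named after its training step (for example `005000`); that layout is an assumption, not something stated here, so check your own output directory first. `latest_checkpoint` is a hypothetical helper, not part of LeRobot.

```python
from pathlib import Path
from typing import Optional


def latest_checkpoint(checkpoints_dir: Path) -> Optional[Path]:
    """Return the step-numbered subdirectory with the largest step, or None.

    Assumes checkpoint directories have purely numeric names; anything else
    (files, or directories with non-numeric names) is ignored.
    """
    if not checkpoints_dir.is_dir():
        return None
    step_dirs = [
        p for p in checkpoints_dir.iterdir()
        if p.is_dir() and p.name.isdecimal()
    ]
    # Compare by integer value so that "1000" sorts after "500".
    return max(step_dirs, key=lambda p: int(p.name), default=None)
```

Call it with the `checkpoints/` directory under your `--output_dir`; it returns `None` when the directory is missing or contains no step-numbered subdirectories.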
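from lerobot.rewards import make_reward_model

# Load the trained reward model from the Hub.
reward_model = make_reward_model(pretrained_path="<hf_user>/<reward_model_repo_id>")
# `batch` is a batch of observations that you supply; it is not defined in this snippet.
reward = reward_model.compute_reward(batch)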