Q-Learning Agent Playing Taxi-v3
This repository contains a tabular Q-learning agent for Taxi-v3, trained locally using the setup from Unit 2 of the Hugging Face Deep RL Course.
Results
- Mean reward: 7.52 +/- 2.70
- Evaluation episodes: 100
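A mean and spread like the ones above are typically obtained by running the greedy policy for a fixed number of episodes and summarizing the per-episode returns. The sketch below is an illustration, not the exact script used here: `evaluate_greedy` is a hypothetical helper, it assumes an environment that follows the Gymnasium `reset`/`step` API, and it assumes the reported spread is a population standard deviation.

```python
import statistics


def evaluate_greedy(env, q_table, n_eval_episodes=100, max_steps=99, seeds=None):
    """Run the greedy policy and return (mean, std) of per-episode total reward.

    Hypothetical helper: `env` is assumed to follow the Gymnasium API, and
    `q_table[state]` is assumed to be a sequence of action values.
    """
    episode_rewards = []
    for episode in range(n_eval_episodes):
        # Use a per-episode seed when one is provided, for reproducibility.
        seed = seeds[episode] if seeds is not None else None
        state, _info = env.reset(seed=seed)
        total = 0.0
        for _ in range(max_steps):
            row = q_table[state]
            # Greedy action: index of the largest Q-value in this state's row.
            action = max(range(len(row)), key=lambda a: row[a])
            state, reward, terminated, truncated, _info = env.step(action)
            total += reward
            if terminated or truncated:
                break
        episode_rewards.append(total)
    # Population standard deviation (an assumption about how the spread is reported).
    return statistics.mean(episode_rewards), statistics.pstdev(episode_rewards)
```

With Gymnasium installed, `env` could be created with `gymnasium.make("Taxi-v3")` and `q_table` taken from one of the files listed in this repository.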
Hyperparameters
env_id: Taxi-v3
repo_id: jnforja/q-Taxi-v3
seed: 42
n_training_episodes: 25000
learning_rate: 0.7
n_eval_episodes: 100
max_steps: 99
gamma: 0.95
max_epsilon: 1.0
min_epsilon: 0.05
decay_rate: 0.005
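To show how these values interact, here is a minimal sketch of an exponential epsilon schedule and the tabular Q-learning update. It assumes the exponential decay form used in the course notebook; the function names `epsilon_at` and `q_update` are hypothetical, and the Q-table is a plain list of lists so the example runs without extra dependencies.

```python
import math

# Values copied from the hyperparameter list above.
MAX_EPSILON = 1.0
MIN_EPSILON = 0.05
DECAY_RATE = 0.005
LEARNING_RATE = 0.7
GAMMA = 0.95


def epsilon_at(episode):
    """Exploration rate for a given episode (assumed exponential decay)."""
    return MIN_EPSILON + (MAX_EPSILON - MIN_EPSILON) * math.exp(-DECAY_RATE * episode)


def q_update(q_table, state, action, reward, next_state):
    """Apply one Q-learning update in place.

    Q(s, a) += lr * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    td_target = reward + GAMMA * max(q_table[next_state])
    q_table[state][action] += LEARNING_RATE * (td_target - q_table[state][action])


# Taxi-v3 has 500 discrete states and 6 discrete actions.
q_table = [[0.0] * 6 for _ in range(500)]
q_update(q_table, state=0, action=1, reward=-1, next_state=2)
```

Under this schedule epsilon starts at 1.0 and is already very close to `min_epsilon` after about a thousand episodes, so most of the 25000 training episodes would run with roughly 5% random actions.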
Files
- q-learning.pkl: pickled Q-table, environment id, config, and evaluation seeds.
- q-table.npy: raw NumPy Q-table.
- results.json: evaluation output.
- replay.mp4: rendered greedy-policy episode preview.
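The pickled artifact can be inspected with the standard library. The sketch below assumes the pickle holds a dictionary; its key names are not documented here, so the example lists them rather than guessing. `load_artifact` and `top_level_keys` are hypothetical helper names. Only unpickle files from sources you trust, since `pickle.load` can execute arbitrary code, and note that unpickling a NumPy array requires NumPy to be installed.

```python
import pickle


def load_artifact(path="q-learning.pkl"):
    """Load the pickled artifact from disk (trusted files only)."""
    with open(path, "rb") as f:
        return pickle.load(f)


def top_level_keys(artifact):
    """Return the sorted top-level keys, assuming the artifact is a dict."""
    if not isinstance(artifact, dict):
        raise TypeError(f"expected a dict, got {type(artifact).__name__}")
    return sorted(artifact)
```

Calling `top_level_keys(load_artifact())` from the repository root should show which entries hold the Q-table, environment id, config, and evaluation seeds.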
Evaluation results
- mean_reward on Taxi-v3 (self-reported): 7.52 +/- 2.70