🚖 Q-Learning Agent for Taxi-v3

A tabular Q-learning agent trained on the Taxi-v3 environment from Gymnasium.

Developer

Vishand S (@Vishand03)

Frameworks

NumPy
Gymnasium

Training Details

Algorithm: Q-Learning
Timesteps / Episodes: 2,000,000
Learning rate: 0.1
Discount factor (γ): 0.99
Epsilon decay: 0.0005
Max / Min epsilon: 1.0 / 0.01
Mean reward (self-reported): 7.92 ± 2.60
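
For readers who want to see how the hyperparameters above fit together, here is a minimal sketch of one tabular Q-learning step with an epsilon-greedy policy. It is not the training script used for this model. The exponential form of the epsilon schedule and the helper names (`epsilon_at`, `choose_action`, `q_update`) are assumptions made for illustration; this card lists only the values.

```python
import numpy as np

# Hyperparameters taken from the Training Details list.
LEARNING_RATE = 0.1
GAMMA = 0.99
MAX_EPSILON, MIN_EPSILON = 1.0, 0.01
DECAY_RATE = 0.0005


def epsilon_at(episode):
    """Assumed schedule: exponential decay from MAX_EPSILON toward MIN_EPSILON."""
    return float(MIN_EPSILON + (MAX_EPSILON - MIN_EPSILON) * np.exp(-DECAY_RATE * episode))


def choose_action(q_table, state, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, otherwise act greedily."""
    if rng.random() < epsilon:
        return int(rng.integers(q_table.shape[1]))
    return int(np.argmax(q_table[state]))


def q_update(q_table, state, action, reward, next_state, terminated):
    """Apply one Q-learning update to q_table in place."""
    # Do not bootstrap from a terminal state.
    target = reward if terminated else reward + GAMMA * np.max(q_table[next_state])
    q_table[state, action] += LEARNING_RATE * (target - q_table[state, action])


# Taxi-v3 has 500 discrete states and 6 actions.
q_table = np.zeros((500, 6))
rng = np.random.default_rng(0)
action = choose_action(q_table, state=0, epsilon=epsilon_at(0), rng=rng)
q_update(q_table, state=0, action=action, reward=-1.0, next_state=1, terminated=False)
```

In a full training loop, `epsilon_at` would be evaluated once per episode and `q_update` once per environment step.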

🛠 Usage

import gymnasium as gym
import numpy as np
import pickle
from huggingface_hub import hf_hub_download

# -------------------------
# Load Q-table from Hugging Face
# -------------------------
q_table_path = hf_hub_download("Vishand03/q-Taxi-v3", "q-learning.pkl")
with open(q_table_path, "rb") as f:
    Qtable = pickle.load(f)

# -------------------------
# Create Taxi Environment
# -------------------------
env = gym.make("Taxi-v3", render_mode="human")
state, _ = env.reset()
terminated, truncated = False, False

# -------------------------
# Run one episode
# -------------------------
total_reward = 0
while not terminated and not truncated:
    # Greedy policy: pick the highest-valued action for the current state
    action = int(np.argmax(Qtable[state]))
    state, reward, terminated, truncated, _ = env.step(action)
    total_reward += reward

print(f"Episode finished with total reward: {total_reward}")
env.close()  # release the render window
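
The snippet above runs a single episode, while the mean reward listed under Training Details is an average over many. The sketch below shows one way to estimate such a figure. The helper name `evaluate_agent`, the default of 100 episodes, and the seeding scheme are assumptions, so the result will not necessarily match the reported number.

```python
import numpy as np


def evaluate_agent(env, q_table, n_episodes=100, seed=0):
    """Run greedy episodes and return (mean, std) of the episode returns."""
    returns = []
    for episode in range(n_episodes):
        # A different seed per episode gives varied but reproducible starts.
        state, _ = env.reset(seed=seed + episode)
        terminated = truncated = False
        total = 0.0
        while not (terminated or truncated):
            action = int(np.argmax(q_table[state]))
            state, reward, terminated, truncated, _ = env.step(action)
            total += reward
        returns.append(total)
    return float(np.mean(returns)), float(np.std(returns))
```

With the Q-table loaded as shown earlier, `evaluate_agent(gym.make("Taxi-v3"), Qtable)` returns the pair to compare against. Leaving out `render_mode="human"` keeps the evaluation fast.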
