---
tags:
- CliffWalking-v0
- q-learning
- reinforcement-learning
- custom-implementation
model-index:
- name: qlearning
  results:
  - task:
      type: reinforcement-learning
      name: reinforcement-learning
    dataset:
      name: CliffWalking-v0
      type: CliffWalking-v0
    metrics:
    - type: mean_reward
      value: -13.00 +/- 0.00
      name: mean_reward
      verified: false
---

# Q-Learning Agent playing CliffWalking-v0

This is a trained model of a Q-Learning agent playing **CliffWalking-v0**.
The agent was trained for 100000 episodes.

## Evaluation Results

- Mean Reward: -13.00 +/- 0.00

## Usage

```python
import gymnasium as gym
import pickle
from huggingface_hub import hf_hub_download


def load_from_hub(repo_id, filename):
    # Download the pickled model file from the Hub and deserialize it
    pickle_model = hf_hub_download(repo_id=repo_id, filename=filename)
    with open(pickle_model, 'rb') as f:
        downloaded_model_file = pickle.load(f)
    return downloaded_model_file


model_data = load_from_hub(repo_id="dllmpg/qlearning", filename="q-learning.pkl")
q_table = model_data["qtable"]
env_id = model_data["env_id"]

# Example of running the loaded agent
env = gym.make(env_id)
raw_state, info = env.reset()
state_idx = raw_state  # CliffWalking uses direct state indexing
# ... run agent using greedy_policy(q_table, state_idx) ...
```
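
The usage snippet leaves the rollout elided and refers to `greedy_policy` without defining it. The following is a minimal sketch of what such a helper and a single evaluation episode could look like, not necessarily the code used to train or evaluate this model. It assumes the Q-table is indexable as `q_table[state][action]` and that the environment follows the Gymnasium `reset`/`step` API; `run_episode` and its `max_steps` cap are hypothetical names introduced here for illustration.

```python
def greedy_policy(q_table, state_idx):
    """Return the index of the highest-valued action for the given state.

    Assumption: q_table[state_idx] is a sequence of per-action values.
    Ties resolve to the lowest action index.
    """
    row = q_table[state_idx]
    return max(range(len(row)), key=lambda a: row[a])


def run_episode(env, q_table, max_steps=200):
    """Roll out one greedy episode and return its total reward.

    max_steps is a hypothetical safety cap so that a poor Q-table
    cannot loop forever; it is not part of the original card.
    """
    state, _info = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = greedy_policy(q_table, state)
        state, reward, terminated, truncated, _info = env.step(action)
        total_reward += reward
        if terminated or truncated:
            break
    return total_reward
```

With `env` and `q_table` loaded as in the usage snippet, `run_episode(env, q_table)` returns the episode return, which can be compared against the mean reward reported in the evaluation results.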