---
tags:
- CliffWalking-v0
- q-learning
- reinforcement-learning
- custom-implementation
model-index:
- name: qlearning
  results:
  - task:
      type: reinforcement-learning
      name: reinforcement-learning
    dataset:
      name: CliffWalking-v0
      type: CliffWalking-v0
    metrics:
    - type: mean_reward
      value: -13.00 +/- 0.00
      name: mean_reward
      verified: false
---

# Q-Learning Agent playing CliffWalking-v0

This is a trained model of a Q-Learning agent playing **CliffWalking-v0**. The agent was trained for 100,000 episodes.

## Evaluation Results

- Mean Reward: -13.00 +/- 0.00
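
The figure above has the form mean +/- standard deviation of per-episode returns, so a spread of 0.00 means every evaluation episode produced the same return. A minimal sketch of how such a summary could be computed is shown here. It assumes the population standard deviation, since this card does not say which variant or how many evaluation episodes were used, and `summarize_returns` is a hypothetical helper, not part of the original training code.

```python
from statistics import fmean, pstdev


def summarize_returns(episode_returns):
    """Format per-episode returns as 'mean +/- std' with two decimals.

    Assumption: population standard deviation (pstdev); the card does not
    specify which variant was used.
    """
    mean = fmean(episode_returns)
    std = pstdev(episode_returns)
    return f"{mean:.2f} +/- {std:.2f}"
```

Passing a list in which every episode returned -13 yields a string in the same format as the reported metric.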

## Usage

```python
import gymnasium as gym
import pickle
from huggingface_hub import hf_hub_download


def load_from_hub(repo_id, filename):
    """Download a pickled model from the Hugging Face Hub and unpickle it."""
    pickle_model = hf_hub_download(repo_id=repo_id, filename=filename)
    with open(pickle_model, 'rb') as f:
        downloaded_model_file = pickle.load(f)
    return downloaded_model_file


model_data = load_from_hub(repo_id="dllmpg/qlearning", filename="q-learning.pkl")
q_table = model_data["qtable"]
env_id = model_data["env_id"]

# Example of running the loaded agent
env = gym.make(env_id)
raw_state, info = env.reset()
state_idx = raw_state  # CliffWalking uses direct state indexing
# ... run agent using greedy_policy(q_table, state_idx) ...
```
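
The snippet above leaves the rollout itself as a placeholder. A minimal sketch of what that step might look like follows, assuming the Q-table can be indexed as `q_table[state][action]` and the environment follows Gymnasium's `reset`/`step` API. `greedy_policy` is named in the comment above but not defined in this card, so the definition here, the `run_episode` helper, and its `max_steps` cap are illustrative assumptions, not the original training code.

```python
def greedy_policy(q_table, state):
    """Return the index of the highest-valued action for `state`.

    Ties go to the lowest action index. Works for nested lists and for
    2-D arrays indexed as q_table[state][action].
    """
    action_values = q_table[state]
    return max(range(len(action_values)), key=lambda a: action_values[a])


def run_episode(env, q_table, max_steps=200):
    """Roll out one greedy episode and return its total reward.

    `max_steps` is a hypothetical safety cap so a poor policy cannot
    loop forever; it is not taken from the original card.
    """
    state, info = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = greedy_policy(q_table, state)
        state, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        if terminated or truncated:
            break
    return total_reward
```

With `env` and `q_table` from the usage snippet, `run_episode(env, q_table)` would return the total reward of a single greedy episode.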