## 🎥 Demo Video

Here is a replay of the trained agent (tuned with Optuna) solving the CartPole-v1 environment.
# A2C Agent for CartPole-v1 (Tuned with Optuna)

This is an A2C agent trained on the CartPole-v1 environment using Stable-Baselines3, with hyperparameters automatically tuned by Optuna.
- **Repository:** zikangzheng/CartPole-Optuna
- **Environment:** CartPole-v1
- **RL Algorithm:** A2C
- **Framework:** stable-baselines3
- **Tuning Library:** Optuna
## Installation

```shell
pip install "stable-baselines3[extra]" huggingface_sb3 gymnasium
```
## Usage

```python
import gymnasium as gym
from huggingface_sb3 import load_from_hub
from stable_baselines3 import A2C

# Download the checkpoint from the Hub, then load it with SB3.
# (load_from_hub returns a local file path, not a model object.)
repo_id = "zikangzheng/CartPole-Optuna"
filename = "a2c_cartpole_optuna_best.zip"
checkpoint = load_from_hub(repo_id, filename)
model = A2C.load(checkpoint)

env = gym.make("CartPole-v1", render_mode="human")
obs, info = env.reset()

for _ in range(1000):
    action, _states = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        obs, info = env.reset()

env.close()
```
## 📊 Metrics

- **Mean Reward:** 489.5 ± 15.2
## ⚙️ Hyperparameters

```json
{
  "gamma": 0.99,
  "lr": 0.00068,
  "n_steps": 256,
  "max_grad_norm": 0.8,
  "net_arch": "small",
  "activation_fn": "tanh"
}
```