---
license: mit
tags:
- reinforcement-learning
- ppo
- openfront
- game-ai
---

# OpenFront RL Agent

A PPO-trained agent for [OpenFront.io](https://openfront.io), a multiplayer territory-control game.

## Training Details

- **Algorithm:** PPO (Proximal Policy Optimization)
- **Architecture:** Actor-Critic with shared backbone (512→512→256)
- **Observation dim:** 96
- **Max neighbors:** 16
- **Maps:** plains, big_plains, ocean_and_land, half_land_half_ocean (chosen at random per episode)
- **Opponents:** Easy bots (count: N/A)
- **Parallel envs:** 16
- **Learning rate:** 0.00034
- **Rollout steps:** 1024
- **Updates trained:** 660
- **Global steps:** 86,507,520
- **Best mean reward:** -0.06284408122301102

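As a quick consistency check on the figures above, the number of environment transitions implied by the listed updates, parallel environments, and rollout length can be computed directly. The sketch below does only that arithmetic. It shows that the reported global step count is exactly 8 times the product of those three values. The card does not say where that factor comes from (frame skipping or additional workers are possibilities, but that is an assumption), so treat the ratio as an observation rather than an explanation.

```python
# Values copied from the Training Details list above.
updates = 660
parallel_envs = 16
rollout_steps = 1024
reported_global_steps = 86_507_520

# Transitions collected if each update consumes one rollout from every env.
transitions = updates * parallel_envs * rollout_steps
ratio = reported_global_steps / transitions

print(f"updates x envs x rollout steps = {transitions:,}")  # 10,813,440
print(f"reported / computed = {ratio}")                      # 8.0
```
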
## Final Training Metrics

- **Mean reward:** -0.5554914677888155
- **Mean episode length:** 7626.04
- **Loss:** -0.16370002925395966

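## Usage

```python
import torch

# ActorCritic is imported from train.py, assumed to be the training script
# in the openfront-rl repository; run this from a checkout of that repo.
from train import ActorCritic

# Constructor arguments match the training configuration of this checkpoint.
model = ActorCritic(obs_dim=96, max_neighbors=16, hidden_sizes=[512, 512, 256])
# map_location="cpu" lets the checkpoint load on machines without a GPU.
model.load_state_dict(torch.load("best_model.pt", map_location="cpu", weights_only=True))
model.eval()  # switch to inference mode
```
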
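The usage snippet above relies on the repository's `ActorCritic` class, whose definition is not shown in this card. To illustrate what a shared-backbone actor-critic with these dimensions might look like, here is a minimal sketch. `SketchActorCritic` is a hypothetical stand-in, not the repository's class. The ReLU activations, the single categorical policy head with one logit per neighbor slot, and the forward signature are all assumptions, so its weights are not compatible with `best_model.pt`.

```python
import torch
import torch.nn as nn


class SketchActorCritic(nn.Module):
    """Hypothetical stand-in, NOT the repository's ActorCritic."""

    def __init__(self, obs_dim=96, max_neighbors=16, hidden_sizes=(512, 512, 256)):
        super().__init__()
        layers, in_dim = [], obs_dim
        for width in hidden_sizes:
            # Assumption: ReLU activations between linear layers.
            layers += [nn.Linear(in_dim, width), nn.ReLU()]
            in_dim = width
        self.backbone = nn.Sequential(*layers)  # shared by both heads
        # Assumption: one logit per neighbor slot; the real action space may differ.
        self.policy_head = nn.Linear(in_dim, max_neighbors)
        self.value_head = nn.Linear(in_dim, 1)

    def forward(self, obs):
        features = self.backbone(obs)
        return self.policy_head(features), self.value_head(features).squeeze(-1)


model = SketchActorCritic().eval()
obs = torch.zeros(4, 96)  # dummy batch of 4 observations
with torch.no_grad():
    logits, value = model(obs)
    action = torch.distributions.Categorical(logits=logits).sample()

print(logits.shape, value.shape, action.shape)
```

Sharing the backbone means the policy and value heads reuse the same 256-dimensional features, which keeps the parameter count down at the cost of coupling the two objectives during training.
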
## Repository

Trained with the code in [josh-freeman/openfront-rl](https://github.com/josh-freeman/openfront-rl).