JoshuaFreeman committed on
Commit 78daf4f · verified · 1 Parent(s): 939547d

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +9 -9
README.md CHANGED
@@ -14,23 +14,23 @@ PPO-trained agent for [OpenFront.io](https://openfront.io), a multiplayer territ
  ## Training Details

  - **Algorithm:** PPO (Proximal Policy Optimization)
- - **Architecture:** Actor-Critic with shared backbone (256→256→128)
+ - **Architecture:** Actor-Critic with shared backbone (512→512→256)
  - **Observation dim:** 80
  - **Max neighbors:** 16
  - **Maps:** plains, big_plains, world, giantworldmap, ocean_and_land, half_land_half_ocean (random per episode)
  - **Opponents:** 2 Easy bots
  - **Parallel envs:** 8
- - **Learning rate:** 0.0002
+ - **Learning rate:** 0.00015
  - **Rollout steps:** 512
- - **Updates trained:** 1650
- - **Global steps:** 6758400
- - **Best mean reward:** 468.54246531009676
+ - **Updates trained:** 330
+ - **Global steps:** 1351680
+ - **Best mean reward:** 591.3189961528778

  ## Final Training Metrics

- - **Mean reward:** 231.36178754091262
- - **Mean episode length:** 6722.13
- - **Loss:** 1.217943549156189
+ - **Mean reward:** 591.3189961528778
+ - **Mean episode length:** 3142.3
+ - **Loss:** 1779.034423828125

  ## Usage

@@ -38,7 +38,7 @@ PPO-trained agent for [OpenFront.io](https://openfront.io), a multiplayer territ
  from train import ActorCritic
  import torch

- model = ActorCritic(obs_dim=80, max_neighbors=16)
+ model = ActorCritic(obs_dim=80, max_neighbors=16, hidden_sizes=[512, 512, 256])
  model.load_state_dict(torch.load("best_model.pt", weights_only=True))
  model.eval()
  ```
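
The step counts on both sides of this diff appear consistent with the usual PPO bookkeeping, where global steps = updates × rollout steps × parallel envs. The sketch below checks that arithmetic against the README values. The helper name `global_steps` is hypothetical and is not taken from the repository's `train.py`; the formula itself is an assumption about how the counter is computed.

```python
def global_steps(updates: int, rollout_steps: int, parallel_envs: int) -> int:
    """Total environment steps collected across all parallel envs.

    Assumes each update consumes one rollout of `rollout_steps`
    steps from every one of the `parallel_envs` environments.
    """
    return updates * rollout_steps * parallel_envs


# Values shared by both versions of the README.
ROLLOUT_STEPS = 512
PARALLEL_ENVS = 8

# "Updates trained" before and after this commit.
previous = global_steps(1650, ROLLOUT_STEPS, PARALLEL_ENVS)
updated = global_steps(330, ROLLOUT_STEPS, PARALLEL_ENVS)

print(previous)  # 6758400, matches the removed "Global steps" line
print(updated)   # 1351680, matches the added "Global steps" line
```

Under that assumption, the new checkpoint was trained on one fifth as many environment steps as the previous one.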