JoshuaFreeman committed on
Commit 2e1dcd7 · verified · 1 Parent(s): 2b3a44d

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +7 -7
README.md CHANGED
@@ -17,20 +17,20 @@ PPO-trained agent for [OpenFront.io](https://openfront.io), a multiplayer territ
  - **Architecture:** Actor-Critic with shared backbone (512→512→256)
  - **Observation dim:** 80
  - **Max neighbors:** 16
- - **Maps:** plains, big_plains, world, giantworldmap, ocean_and_land, half_land_half_ocean (random per episode)
+ - **Maps:** plains, big_plains, world, giantworldmap, ocean_and_land, half_land_half_ocean, europe, europeclassic, northamerica, africa, asia, australia, southamerica, mediterranean, britannia, britanniaclassic, eastasia, oceania, pangaea, mena, aegean, alps, amazonriver, amazonriverwide, arctic, baikal, beringstrait, betweentwoseas, blacksea, bosphorusstraits, deglaciatedantarctica, falklandislands, faroeislands, fourislands, gatewaytotheatlantic, gulfofstlawrence, halkidiki, hawaii, iceland, italia, japan, lemnos, lisbon, manicouagan, niledelta, passage, sanfrancisco, straitofgibraltar, straitofhormuz, surrounded, thebox, theboxplus, tourney1, tourney2, tourney3, tourney4, tradersdream, twolakes, worldrotated, yenisei, achiran, mars, milkyway, montreal, newyorkcity, pluto, reglaciatedantarctica (random per episode)
  - **Opponents:** 2 Easy bots
  - **Parallel envs:** 8
  - **Learning rate:** 0.00015
  - **Rollout steps:** 512
- - **Updates trained:** 330
- - **Global steps:** 1351680
- - **Best mean reward:** 591.3189961528778
+ - **Updates trained:** 2480
+ - **Global steps:** 10158080
+ - **Best mean reward:** 1445.384720082283
 
  ## Final Training Metrics
 
- - **Mean reward:** 591.3189961528778
- - **Mean episode length:** 3142.3
- - **Loss:** 1779.034423828125
+ - **Mean reward:** 254.607934988462
+ - **Mean episode length:** 2687.8153846153846
+ - **Loss:** 227.44029235839844
 
  ## Usage
 
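
The step counts on both sides of this diff appear consistent with the common PPO accounting in which each update consumes one rollout from every parallel environment. The README does not state that formula, so the relationship below is an assumption, and `global_steps` is a hypothetical helper used only to check the arithmetic against the listed values.

```python
# Assumption (not stated in the README): one PPO update consumes
# `rollout_steps` transitions from each of `parallel_envs` environments.
PARALLEL_ENVS = 8     # "Parallel envs" from the README
ROLLOUT_STEPS = 512   # "Rollout steps" from the README


def global_steps(updates: int,
                 parallel_envs: int = PARALLEL_ENVS,
                 rollout_steps: int = ROLLOUT_STEPS) -> int:
    """Return total environment steps after `updates` PPO updates."""
    return updates * parallel_envs * rollout_steps


# "Updates trained" before and after this commit.
print(global_steps(330))   # 1351680
print(global_steps(2480))  # 10158080
```

Both results match the "Global steps" values in the removed and added lines, which suggests the three fields were regenerated together from the same training run state.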