Battleship PPO Agent: 106.9 mean shots
Best checkpoint from Mar 21 autoresearch session (50 experiments).
Key Result
| Metric | Value |
|---|---|
| eval/mean_shots | 106.9 |
| previous_best | 109.6 |
| improvement | 2.7 shots (lower is better) |
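
For reference, the improvement row can be reproduced from the other two values in the table. This is a minimal sketch; the variable names are illustrative and the relative figure is derived here, not reported by the training run.

```python
# Values copied from the Key Result table above.
mean_shots = 106.9      # eval/mean_shots for this checkpoint
previous_best = 109.6   # previous best eval/mean_shots

# Fewer shots to finish a game is better, so improvement = old - new.
improvement = previous_best - mean_shots
relative_pct = 100.0 * improvement / previous_best

print(f"improvement: {improvement:.1f} shots ({relative_pct:.2f}% fewer)")
```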
Training Config
| Parameter | Value |
|---|---|
| lr_schedule | linear |
| n_epochs | 2 |
| lr | 1e-5 |
| clip_range | 0.05 |
| ent_coef | 0.002 |
| ent_coef_final | 0.0002 |
| n_envs | 4 |
| n_steps | 128 |
| batch_size | 128 |
| gamma | 0.995 |
| reward_preset | efficiency_v2_infogain |
| seed | 51 |
| device | cuda (RTX 5070 Ti) |
| time_budget | 30 min |
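
The sketch below shows how the values in the table might interact during training. It is not the training code for this checkpoint. It assumes that `lr_schedule = linear` decays the learning rate linearly to zero over training, that `ent_coef` is annealed linearly to `ent_coef_final`, and that one rollout collects `n_envs * n_steps` samples, as is common in PPO implementations. The `PPOConfig` class and the helper functions are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PPOConfig:
    """Hypothetical container for the hyperparameters listed in the table."""
    lr: float = 1e-5
    ent_coef: float = 0.002
    ent_coef_final: float = 0.0002
    n_epochs: int = 2
    n_envs: int = 4
    n_steps: int = 128
    batch_size: int = 128


def linear_lr(cfg: PPOConfig, progress_remaining: float) -> float:
    """Learning rate, assuming a linear decay to zero.

    progress_remaining goes from 1.0 (start of training) to 0.0 (end).
    """
    return cfg.lr * progress_remaining


def linear_ent_coef(cfg: PPOConfig, progress_remaining: float) -> float:
    """Entropy coefficient, assuming linear annealing to ent_coef_final."""
    span = cfg.ent_coef - cfg.ent_coef_final
    return cfg.ent_coef_final + span * progress_remaining


def updates_per_rollout(cfg: PPOConfig) -> tuple[int, int, int]:
    """Return (rollout samples, minibatches per epoch, gradient steps per rollout)."""
    rollout = cfg.n_envs * cfg.n_steps
    minibatches = rollout // cfg.batch_size
    return rollout, minibatches, minibatches * cfg.n_epochs


cfg = PPOConfig()
rollout, minibatches, grad_steps = updates_per_rollout(cfg)
print(f"rollout={rollout} minibatches={minibatches} grad_steps={grad_steps}")
print(f"halfway: lr={linear_lr(cfg, 0.5):.2e} ent_coef={linear_ent_coef(cfg, 0.5):.5f}")
```

Under these assumptions each rollout holds 512 samples, split into 4 minibatches of 128, so 2 epochs give 8 gradient steps per rollout. Together with the small `lr` and `clip_range`, this amounts to a conservative update regime.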