Battleship PPO Agent: 106.9 mean shots

Best checkpoint from the Mar 21 autoresearch session (50 experiments).

Key Result

Metric Value
eval/mean_shots 106.9
previous_best 109.6
improvement 2.7 shots
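
As a quick check of the numbers above, the improvement can be recomputed from the two reported means. This is only a sketch: the function name `improvement` is hypothetical, and the relative figure is derived here rather than reported by the original run. Lower mean shots is better.

```python
# Values copied from the Key Result table.
previous_best = 109.6
mean_shots = 106.9


def improvement(previous: float, current: float) -> tuple[float, float]:
    """Return (absolute, relative %) reduction in mean shots."""
    absolute = previous - current
    relative_pct = 100.0 * absolute / previous
    # Round to the precision used in the table to hide float noise.
    return round(absolute, 1), round(relative_pct, 2)


absolute, relative_pct = improvement(previous_best, mean_shots)
print(f"{absolute} fewer shots ({relative_pct}% relative)")
```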

Training Config

Parameter Value
lr_schedule linear
n_epochs 2
lr 1e-5
clip_range 0.05
ent_coef 0.002
ent_coef_final 0.0002
n_envs 4
n_steps 128
batch_size 128
gamma 0.995
reward_preset efficiency_v2_infogain
seed 51
device cuda (RTX 5070 Ti)
time_budget 30min
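
The sketch below shows what these settings imply per PPO update, using only the Python standard library. It makes several assumptions that the card does not state: that `n_steps` is counted per environment, that `batch_size` is the minibatch size, that the linear learning-rate schedule decays to zero, and that `ent_coef` is interpolated linearly toward `ent_coef_final`. `TrainConfig` and the helper functions are hypothetical names, not the training code behind this checkpoint.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    # Values copied from the Training Config table.
    lr: float = 1e-5
    n_epochs: int = 2
    clip_range: float = 0.05
    ent_coef: float = 0.002
    ent_coef_final: float = 0.0002
    n_envs: int = 4
    n_steps: int = 128
    batch_size: int = 128
    gamma: float = 0.995
    seed: int = 51


def rollout_size(cfg: TrainConfig) -> int:
    # ASSUMPTION: n_steps is per environment, so one rollout
    # collects n_envs * n_steps transitions.
    return cfg.n_envs * cfg.n_steps


def gradient_steps_per_rollout(cfg: TrainConfig) -> int:
    # ASSUMPTION: batch_size is the minibatch size, and each epoch
    # passes once over the whole rollout.
    minibatches = rollout_size(cfg) // cfg.batch_size
    return cfg.n_epochs * minibatches


def linear_lr(cfg: TrainConfig, progress_remaining: float) -> float:
    # ASSUMPTION: "linear" means decay from lr to 0 as
    # progress_remaining goes from 1.0 to 0.0.
    return cfg.lr * progress_remaining


def linear_ent_coef(cfg: TrainConfig, progress_remaining: float) -> float:
    # ASSUMPTION: linear interpolation from ent_coef to ent_coef_final.
    span = cfg.ent_coef - cfg.ent_coef_final
    return cfg.ent_coef_final + span * progress_remaining


cfg = TrainConfig()
print(rollout_size(cfg))                # transitions per rollout
print(gradient_steps_per_rollout(cfg))  # minibatch updates per rollout
print(linear_lr(cfg, 0.5))              # learning rate halfway through
print(linear_ent_coef(cfg, 0.0))        # entropy coefficient at the end
```

Under these assumptions each rollout holds 512 transitions and drives 8 minibatch updates. Together with the small `lr` and `clip_range`, this points to conservative policy updates, which would fit fine-tuning from an existing checkpoint.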