diff --git "a/sf_log.txt" "b/sf_log.txt" --- "a/sf_log.txt" +++ "b/sf_log.txt" @@ -1,32 +1,39 @@ -[2023-09-26 07:47:17,521][91478] Saving configuration to ./train_atari/atari_frostbite/config.json... -[2023-09-26 07:47:17,839][91478] Rollout worker 0 uses device cpu -[2023-09-26 07:47:17,839][91478] Rollout worker 1 uses device cpu -[2023-09-26 07:47:17,840][91478] Rollout worker 2 uses device cpu -[2023-09-26 07:47:17,841][91478] Rollout worker 3 uses device cpu -[2023-09-26 07:47:17,841][91478] Rollout worker 4 uses device cpu -[2023-09-26 07:47:17,842][91478] Rollout worker 5 uses device cpu -[2023-09-26 07:47:17,842][91478] Rollout worker 6 uses device cpu -[2023-09-26 07:47:17,843][91478] Rollout worker 7 uses device cpu -[2023-09-26 07:47:17,843][91478] In synchronous mode, we only accumulate one batch. Setting num_batches_to_accumulate to 1 -[2023-09-26 07:47:17,888][91478] Using GPUs [0] for process 0 (actually maps to GPUs [0]) -[2023-09-26 07:47:17,888][91478] InferenceWorker_p0-w0: min num requests: 1 -[2023-09-26 07:47:17,891][91478] Using GPUs [1] for process 1 (actually maps to GPUs [1]) -[2023-09-26 07:47:17,892][91478] InferenceWorker_p1-w0: min num requests: 1 -[2023-09-26 07:47:17,915][91478] Starting all processes... 
-[2023-09-26 07:47:17,915][91478] Starting process learner_proc0 -[2023-09-26 07:47:19,511][91478] Starting process learner_proc1 -[2023-09-26 07:47:19,514][91993] Using GPUs [0] for process 0 (actually maps to GPUs [0]) -[2023-09-26 07:47:19,515][91993] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 -[2023-09-26 07:47:19,533][91993] Num visible devices: 1 -[2023-09-26 07:47:19,554][91993] Starting seed is not provided -[2023-09-26 07:47:19,554][91993] Using GPUs [0] for process 0 (actually maps to GPUs [0]) -[2023-09-26 07:47:19,554][91993] Initializing actor-critic model on device cuda:0 -[2023-09-26 07:47:19,554][91993] RunningMeanStd input shape: (4, 84, 84) -[2023-09-26 07:47:19,555][91993] RunningMeanStd input shape: (1,) -[2023-09-26 07:47:19,566][91993] ConvEncoder: input_channels=4 -[2023-09-26 07:47:19,728][91993] Conv encoder output size: 512 -[2023-09-26 07:47:19,730][91993] Created Actor Critic model with architecture: -[2023-09-26 07:47:19,730][91993] ActorCriticSharedWeights( +[2023-10-11 14:54:20,022][84230] Saving configuration to ./train_atari/atari_frostbite_APPO/config.json... 
+[2023-10-11 14:54:20,339][84230] Rollout worker 0 uses device cpu +[2023-10-11 14:54:20,339][84230] Rollout worker 1 uses device cpu +[2023-10-11 14:54:20,340][84230] Rollout worker 2 uses device cpu +[2023-10-11 14:54:20,341][84230] Rollout worker 3 uses device cpu +[2023-10-11 14:54:20,341][84230] Rollout worker 4 uses device cpu +[2023-10-11 14:54:20,341][84230] Rollout worker 5 uses device cpu +[2023-10-11 14:54:20,342][84230] Rollout worker 6 uses device cpu +[2023-10-11 14:54:20,342][84230] Rollout worker 7 uses device cpu +[2023-10-11 14:54:20,343][84230] Rollout worker 8 uses device cpu +[2023-10-11 14:54:20,343][84230] Rollout worker 9 uses device cpu +[2023-10-11 14:54:20,344][84230] Rollout worker 10 uses device cpu +[2023-10-11 14:54:20,344][84230] Rollout worker 11 uses device cpu +[2023-10-11 14:54:20,344][84230] Rollout worker 12 uses device cpu +[2023-10-11 14:54:20,345][84230] Rollout worker 13 uses device cpu +[2023-10-11 14:54:20,345][84230] Rollout worker 14 uses device cpu +[2023-10-11 14:54:20,345][84230] Rollout worker 15 uses device cpu +[2023-10-11 14:54:20,639][84230] Using GPUs [0] for process 0 (actually maps to GPUs [0]) +[2023-10-11 14:54:20,640][84230] InferenceWorker_p0-w0: min num requests: 2 +[2023-10-11 14:54:20,643][84230] Using GPUs [1] for process 1 (actually maps to GPUs [1]) +[2023-10-11 14:54:20,643][84230] InferenceWorker_p1-w0: min num requests: 2 +[2023-10-11 14:54:20,688][84230] Starting all processes... 
+[2023-10-11 14:54:20,688][84230] Starting process learner_proc0 +[2023-10-11 14:54:22,376][84230] Starting process learner_proc1 +[2023-10-11 14:54:22,381][84801] Using GPUs [0] for process 0 (actually maps to GPUs [0]) +[2023-10-11 14:54:22,381][84801] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 +[2023-10-11 14:54:22,400][84801] Num visible devices: 1 +[2023-10-11 14:54:22,420][84801] Setting fixed seed 1234 +[2023-10-11 14:54:22,421][84801] Using GPUs [0] for process 0 (actually maps to GPUs [0]) +[2023-10-11 14:54:22,422][84801] Initializing actor-critic model on device cuda:0 +[2023-10-11 14:54:22,422][84801] RunningMeanStd input shape: (4, 84, 84) +[2023-10-11 14:54:22,422][84801] RunningMeanStd input shape: (1,) +[2023-10-11 14:54:22,434][84801] ConvEncoder: input_channels=4 +[2023-10-11 14:54:22,587][84801] Conv encoder output size: 512 +[2023-10-11 14:54:22,589][84801] Created Actor Critic model with architecture: +[2023-10-11 14:54:22,589][84801] ActorCriticSharedWeights( (obs_normalizer): ObservationNormalizer( (running_mean_std): RunningMeanStdDictInPlace( (running_mean_std): ModuleDict( @@ -67,35 +74,41 @@ (distribution_linear): Linear(in_features=512, out_features=18, bias=True) ) ) -[2023-09-26 07:47:20,288][91993] Using optimizer -[2023-09-26 07:47:20,289][91993] No checkpoints found -[2023-09-26 07:47:20,289][91993] Did not load from checkpoint, starting from scratch! -[2023-09-26 07:47:20,289][91993] Initialized policy 0 weights for model version 0 -[2023-09-26 07:47:20,291][91993] LearnerWorker_p0 finished initialization! -[2023-09-26 07:47:20,291][91993] Using GPUs [0] for process 0 (actually maps to GPUs [0]) -[2023-09-26 07:47:21,108][91478] Starting all processes... 
-[2023-09-26 07:47:21,111][92345] Using GPUs [1] for process 1 (actually maps to GPUs [1]) -[2023-09-26 07:47:21,111][92345] Set environment var CUDA_VISIBLE_DEVICES to '1' (GPU indices [1]) for learning process 1 -[2023-09-26 07:47:21,115][91478] Starting process inference_proc0-0 -[2023-09-26 07:47:21,115][91478] Starting process inference_proc1-0 -[2023-09-26 07:47:21,116][91478] Starting process rollout_proc0 -[2023-09-26 07:47:21,130][92345] Num visible devices: 1 -[2023-09-26 07:47:21,116][91478] Starting process rollout_proc1 -[2023-09-26 07:47:21,116][91478] Starting process rollout_proc2 -[2023-09-26 07:47:21,117][91478] Starting process rollout_proc3 -[2023-09-26 07:47:21,120][91478] Starting process rollout_proc4 -[2023-09-26 07:47:21,149][92345] Starting seed is not provided -[2023-09-26 07:47:21,149][92345] Using GPUs [0] for process 1 (actually maps to GPUs [1]) -[2023-09-26 07:47:21,149][92345] Initializing actor-critic model on device cuda:0 -[2023-09-26 07:47:21,149][92345] RunningMeanStd input shape: (4, 84, 84) -[2023-09-26 07:47:21,150][92345] RunningMeanStd input shape: (1,) -[2023-09-26 07:47:21,122][91478] Starting process rollout_proc5 -[2023-09-26 07:47:21,122][91478] Starting process rollout_proc6 -[2023-09-26 07:47:21,125][91478] Starting process rollout_proc7 -[2023-09-26 07:47:21,162][92345] ConvEncoder: input_channels=4 -[2023-09-26 07:47:21,509][92345] Conv encoder output size: 512 -[2023-09-26 07:47:21,511][92345] Created Actor Critic model with architecture: -[2023-09-26 07:47:21,511][92345] ActorCriticSharedWeights( +[2023-10-11 14:54:23,165][84801] Using optimizer +[2023-10-11 14:54:23,166][84801] No checkpoints found +[2023-10-11 14:54:23,166][84801] Did not load from checkpoint, starting from scratch! +[2023-10-11 14:54:23,166][84801] Initialized policy 0 weights for model version 0 +[2023-10-11 14:54:23,168][84801] LearnerWorker_p0 finished initialization! 
+[2023-10-11 14:54:23,169][84801] Using GPUs [0] for process 0 (actually maps to GPUs [0]) +[2023-10-11 14:54:24,154][84230] Starting all processes... +[2023-10-11 14:54:24,157][85000] Using GPUs [1] for process 1 (actually maps to GPUs [1]) +[2023-10-11 14:54:24,158][85000] Set environment var CUDA_VISIBLE_DEVICES to '1' (GPU indices [1]) for learning process 1 +[2023-10-11 14:54:24,163][84230] Starting process inference_proc0-0 +[2023-10-11 14:54:24,163][84230] Starting process inference_proc1-0 +[2023-10-11 14:54:24,163][84230] Starting process rollout_proc0 +[2023-10-11 14:54:24,176][85000] Num visible devices: 1 +[2023-10-11 14:54:24,163][84230] Starting process rollout_proc1 +[2023-10-11 14:54:24,164][84230] Starting process rollout_proc2 +[2023-10-11 14:54:24,164][84230] Starting process rollout_proc3 +[2023-10-11 14:54:24,168][84230] Starting process rollout_proc4 +[2023-10-11 14:54:24,172][84230] Starting process rollout_proc5 +[2023-10-11 14:54:24,195][85000] Setting fixed seed 1234 +[2023-10-11 14:54:24,196][85000] Using GPUs [0] for process 1 (actually maps to GPUs [1]) +[2023-10-11 14:54:24,196][85000] Initializing actor-critic model on device cuda:0 +[2023-10-11 14:54:24,196][85000] RunningMeanStd input shape: (4, 84, 84) +[2023-10-11 14:54:24,197][85000] RunningMeanStd input shape: (1,) +[2023-10-11 14:54:24,174][84230] Starting process rollout_proc6 +[2023-10-11 14:54:24,175][84230] Starting process rollout_proc7 +[2023-10-11 14:54:24,176][84230] Starting process rollout_proc8 +[2023-10-11 14:54:24,176][84230] Starting process rollout_proc9 +[2023-10-11 14:54:24,183][84230] Starting process rollout_proc10 +[2023-10-11 14:54:24,209][85000] ConvEncoder: input_channels=4 +[2023-10-11 14:54:24,183][84230] Starting process rollout_proc11 +[2023-10-11 14:54:24,183][84230] Starting process rollout_proc12 +[2023-10-11 14:54:24,184][84230] Starting process rollout_proc13 +[2023-10-11 14:54:24,700][85000] Conv encoder output size: 512 +[2023-10-11 
14:54:24,703][85000] Created Actor Critic model with architecture: +[2023-10-11 14:54:24,703][85000] ActorCriticSharedWeights( (obs_normalizer): ObservationNormalizer( (running_mean_std): RunningMeanStdDictInPlace( (running_mean_std): ModuleDict( @@ -136,2106 +149,26590 @@ (distribution_linear): Linear(in_features=512, out_features=18, bias=True) ) ) -[2023-09-26 07:47:22,101][92345] Using optimizer -[2023-09-26 07:47:22,102][92345] No checkpoints found -[2023-09-26 07:47:22,102][92345] Did not load from checkpoint, starting from scratch! -[2023-09-26 07:47:22,102][92345] Initialized policy 1 weights for model version 0 -[2023-09-26 07:47:22,104][92345] LearnerWorker_p1 finished initialization! -[2023-09-26 07:47:22,104][92345] Using GPUs [0] for process 1 (actually maps to GPUs [1]) -[2023-09-26 07:47:23,038][92475] Worker 0 uses CPU cores [0, 1, 2, 3] -[2023-09-26 07:47:23,049][92474] Using GPUs [1] for process 1 (actually maps to GPUs [1]) -[2023-09-26 07:47:23,049][92474] Set environment var CUDA_VISIBLE_DEVICES to '1' (GPU indices [1]) for inference process 1 -[2023-09-26 07:47:23,067][92474] Num visible devices: 1 -[2023-09-26 07:47:23,074][92513] Worker 6 uses CPU cores [24, 25, 26, 27] -[2023-09-26 07:47:23,089][92511] Worker 1 uses CPU cores [4, 5, 6, 7] -[2023-09-26 07:47:23,128][92507] Worker 3 uses CPU cores [12, 13, 14, 15] -[2023-09-26 07:47:23,144][92509] Worker 2 uses CPU cores [8, 9, 10, 11] -[2023-09-26 07:47:23,148][92512] Worker 5 uses CPU cores [20, 21, 22, 23] -[2023-09-26 07:47:23,164][92510] Worker 4 uses CPU cores [16, 17, 18, 19] -[2023-09-26 07:47:23,190][92473] Using GPUs [0] for process 0 (actually maps to GPUs [0]) -[2023-09-26 07:47:23,191][92473] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 -[2023-09-26 07:47:23,210][92473] Num visible devices: 1 -[2023-09-26 07:47:23,264][92514] Worker 7 uses CPU cores [28, 29, 30, 31] -[2023-09-26 07:47:23,705][92474] RunningMeanStd input shape: (4, 84, 
84) -[2023-09-26 07:47:23,705][92474] RunningMeanStd input shape: (1,) -[2023-09-26 07:47:23,716][92474] ConvEncoder: input_channels=4 -[2023-09-26 07:47:23,762][91478] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan, 1: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) -[2023-09-26 07:47:23,778][92473] RunningMeanStd input shape: (4, 84, 84) -[2023-09-26 07:47:23,778][92473] RunningMeanStd input shape: (1,) -[2023-09-26 07:47:23,789][92473] ConvEncoder: input_channels=4 -[2023-09-26 07:47:23,815][92474] Conv encoder output size: 512 -[2023-09-26 07:47:23,821][91478] Inference worker 1-0 is ready! -[2023-09-26 07:47:23,883][92473] Conv encoder output size: 512 -[2023-09-26 07:47:23,889][91478] Inference worker 0-0 is ready! -[2023-09-26 07:47:23,889][91478] All inference workers are ready! Signal rollout workers to start! -[2023-09-26 07:47:24,351][92511] Decorrelating experience for 0 frames... -[2023-09-26 07:47:24,354][92514] Decorrelating experience for 0 frames... -[2023-09-26 07:47:24,358][92510] Decorrelating experience for 0 frames... -[2023-09-26 07:47:24,358][92507] Decorrelating experience for 0 frames... -[2023-09-26 07:47:24,360][92512] Decorrelating experience for 0 frames... -[2023-09-26 07:47:24,361][92475] Decorrelating experience for 0 frames... -[2023-09-26 07:47:24,393][92513] Decorrelating experience for 0 frames... -[2023-09-26 07:47:24,408][92509] Decorrelating experience for 0 frames... -[2023-09-26 07:47:28,762][91478] Fps is (10 sec: 1638.4, 60 sec: 1638.4, 300 sec: 1638.4). Total num frames: 8192. Throughput: 0: 204.8, 1: 204.8. Samples: 2048. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:47:28,763][91478] Avg episode reward: [(0, '1.800'), (1, '1.562')] -[2023-09-26 07:47:33,762][91478] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3276.8). Total num frames: 32768. Throughput: 0: 404.7, 1: 395.1. Samples: 7998. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) -[2023-09-26 07:47:33,763][91478] Avg episode reward: [(0, '1.667'), (1, '2.178')] -[2023-09-26 07:47:37,875][91478] Heartbeat connected on Batcher_0 -[2023-09-26 07:47:37,878][91478] Heartbeat connected on LearnerWorker_p0 -[2023-09-26 07:47:37,881][91478] Heartbeat connected on Batcher_1 -[2023-09-26 07:47:37,884][91478] Heartbeat connected on LearnerWorker_p1 -[2023-09-26 07:47:37,890][91478] Heartbeat connected on InferenceWorker_p0-w0 -[2023-09-26 07:47:37,894][91478] Heartbeat connected on InferenceWorker_p1-w0 -[2023-09-26 07:47:37,897][91478] Heartbeat connected on RolloutWorker_w0 -[2023-09-26 07:47:37,898][91478] Heartbeat connected on RolloutWorker_w1 -[2023-09-26 07:47:37,903][91478] Heartbeat connected on RolloutWorker_w2 -[2023-09-26 07:47:37,903][91478] Heartbeat connected on RolloutWorker_w3 -[2023-09-26 07:47:37,909][91478] Heartbeat connected on RolloutWorker_w5 -[2023-09-26 07:47:37,912][91478] Heartbeat connected on RolloutWorker_w6 -[2023-09-26 07:47:37,913][91478] Heartbeat connected on RolloutWorker_w4 -[2023-09-26 07:47:37,914][91478] Heartbeat connected on RolloutWorker_w7 -[2023-09-26 07:47:38,762][91478] Fps is (10 sec: 5734.4, 60 sec: 4369.1, 300 sec: 4369.1). Total num frames: 65536. Throughput: 0: 409.1, 1: 409.6. Samples: 12280. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 07:47:38,763][91478] Avg episode reward: [(0, '1.725'), (1, '1.872')] -[2023-09-26 07:47:41,424][92473] Updated weights for policy 0, policy_version 160 (0.0017) -[2023-09-26 07:47:41,425][92474] Updated weights for policy 1, policy_version 160 (0.0017) -[2023-09-26 07:47:43,762][91478] Fps is (10 sec: 5734.4, 60 sec: 4505.6, 300 sec: 4505.6). Total num frames: 90112. Throughput: 0: 534.7, 1: 531.3. Samples: 21320. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-09-26 07:47:43,762][91478] Avg episode reward: [(0, '1.960'), (1, '1.810')] -[2023-09-26 07:47:48,762][91478] Fps is (10 sec: 5734.6, 60 sec: 4915.3, 300 sec: 4915.3). Total num frames: 122880. Throughput: 0: 615.6, 1: 614.6. Samples: 30753. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:47:48,762][91478] Avg episode reward: [(0, '2.250'), (1, '2.260')] -[2023-09-26 07:47:53,762][91478] Fps is (10 sec: 6553.5, 60 sec: 5188.3, 300 sec: 5188.3). Total num frames: 155648. Throughput: 0: 595.1, 1: 592.6. Samples: 35631. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 07:47:53,763][91478] Avg episode reward: [(0, '2.870'), (1, '2.630')] -[2023-09-26 07:47:54,335][92474] Updated weights for policy 1, policy_version 320 (0.0019) -[2023-09-26 07:47:54,336][92473] Updated weights for policy 0, policy_version 320 (0.0016) -[2023-09-26 07:47:58,762][91478] Fps is (10 sec: 6553.4, 60 sec: 5383.3, 300 sec: 5383.3). Total num frames: 188416. Throughput: 0: 643.7, 1: 643.7. Samples: 45056. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 07:47:58,763][91478] Avg episode reward: [(0, '2.920'), (1, '2.900')] -[2023-09-26 07:48:03,762][91478] Fps is (10 sec: 6553.6, 60 sec: 5529.6, 300 sec: 5529.6). Total num frames: 221184. Throughput: 0: 681.2, 1: 680.6. Samples: 54471. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:48:03,763][91478] Avg episode reward: [(0, '3.530'), (1, '2.920')] -[2023-09-26 07:48:03,764][91993] Saving new best policy, reward=3.530! -[2023-09-26 07:48:03,765][92345] Saving new best policy, reward=2.920! -[2023-09-26 07:48:07,542][92474] Updated weights for policy 1, policy_version 480 (0.0020) -[2023-09-26 07:48:07,542][92473] Updated weights for policy 0, policy_version 480 (0.0018) -[2023-09-26 07:48:08,762][91478] Fps is (10 sec: 5734.5, 60 sec: 5461.3, 300 sec: 5461.3). Total num frames: 245760. Throughput: 0: 658.5, 1: 657.3. Samples: 59211. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 07:48:08,763][91478] Avg episode reward: [(0, '3.820'), (1, '2.790')] -[2023-09-26 07:48:08,957][91993] Saving new best policy, reward=3.820! -[2023-09-26 07:48:13,762][91478] Fps is (10 sec: 5734.6, 60 sec: 5570.6, 300 sec: 5570.6). Total num frames: 278528. Throughput: 0: 736.2, 1: 735.0. Samples: 68251. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 07:48:13,762][91478] Avg episode reward: [(0, '4.040'), (1, '3.340')] -[2023-09-26 07:48:13,767][92345] Saving new best policy, reward=3.340! -[2023-09-26 07:48:13,767][91993] Saving new best policy, reward=4.040! -[2023-09-26 07:48:18,762][91478] Fps is (10 sec: 6553.6, 60 sec: 5659.9, 300 sec: 5659.9). Total num frames: 311296. Throughput: 0: 772.9, 1: 773.5. Samples: 77585. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:48:18,763][91478] Avg episode reward: [(0, '3.910'), (1, '3.430')] -[2023-09-26 07:48:18,764][92345] Saving new best policy, reward=3.430! -[2023-09-26 07:48:20,814][92474] Updated weights for policy 1, policy_version 640 (0.0018) -[2023-09-26 07:48:20,814][92473] Updated weights for policy 0, policy_version 640 (0.0018) -[2023-09-26 07:48:23,762][91478] Fps is (10 sec: 6553.6, 60 sec: 5734.4, 300 sec: 5734.4). Total num frames: 344064. Throughput: 0: 775.3, 1: 773.8. Samples: 81991. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 07:48:23,762][91478] Avg episode reward: [(0, '4.260'), (1, '3.570')] -[2023-09-26 07:48:23,763][92345] Saving new best policy, reward=3.570! -[2023-09-26 07:48:23,763][91993] Saving new best policy, reward=4.260! -[2023-09-26 07:48:28,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 5797.4). Total num frames: 376832. Throughput: 0: 784.2, 1: 784.0. Samples: 91888. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) -[2023-09-26 07:48:28,763][91478] Avg episode reward: [(0, '4.510'), (1, '3.790')] -[2023-09-26 07:48:28,773][91993] Saving new best policy, reward=4.510! -[2023-09-26 07:48:28,774][92345] Saving new best policy, reward=3.790! -[2023-09-26 07:48:33,696][92473] Updated weights for policy 0, policy_version 800 (0.0018) -[2023-09-26 07:48:33,696][92474] Updated weights for policy 1, policy_version 800 (0.0017) -[2023-09-26 07:48:33,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 5851.4). Total num frames: 409600. Throughput: 0: 782.8, 1: 782.2. Samples: 101175. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:48:33,763][91478] Avg episode reward: [(0, '4.790'), (1, '4.210')] -[2023-09-26 07:48:33,764][91993] Saving new best policy, reward=4.790! -[2023-09-26 07:48:33,764][92345] Saving new best policy, reward=4.210! -[2023-09-26 07:48:38,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 5789.0). Total num frames: 434176. Throughput: 0: 782.0, 1: 782.2. Samples: 106021. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 07:48:38,763][91478] Avg episode reward: [(0, '5.000'), (1, '4.170')] -[2023-09-26 07:48:38,764][91993] Saving new best policy, reward=5.000! -[2023-09-26 07:48:43,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 5836.8). Total num frames: 466944. Throughput: 0: 774.8, 1: 773.8. Samples: 114741. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:48:43,763][91478] Avg episode reward: [(0, '5.120'), (1, '4.140')] -[2023-09-26 07:48:43,773][91993] Saving new best policy, reward=5.120! -[2023-09-26 07:48:47,115][92474] Updated weights for policy 1, policy_version 960 (0.0017) -[2023-09-26 07:48:47,115][92473] Updated weights for policy 0, policy_version 960 (0.0017) -[2023-09-26 07:48:48,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 5879.0). Total num frames: 499712. Throughput: 0: 776.8, 1: 776.0. Samples: 124348. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) -[2023-09-26 07:48:48,763][91478] Avg episode reward: [(0, '5.140'), (1, '4.330')] -[2023-09-26 07:48:48,764][91993] Saving new best policy, reward=5.140! -[2023-09-26 07:48:48,764][92345] Saving new best policy, reward=4.330! -[2023-09-26 07:48:53,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 5916.4). Total num frames: 532480. Throughput: 0: 775.1, 1: 775.5. Samples: 128989. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-09-26 07:48:53,763][91478] Avg episode reward: [(0, '5.090'), (1, '4.570')] -[2023-09-26 07:48:53,765][92345] Saving new best policy, reward=4.570! -[2023-09-26 07:48:58,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 5863.7). Total num frames: 557056. Throughput: 0: 779.0, 1: 779.0. Samples: 138362. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-09-26 07:48:58,763][91478] Avg episode reward: [(0, '5.220'), (1, '4.840')] -[2023-09-26 07:48:58,833][92345] Saving new best policy, reward=4.840! -[2023-09-26 07:48:58,899][91993] Saving new best policy, reward=5.220! -[2023-09-26 07:49:00,167][92473] Updated weights for policy 0, policy_version 1120 (0.0016) -[2023-09-26 07:49:00,168][92474] Updated weights for policy 1, policy_version 1120 (0.0017) -[2023-09-26 07:49:03,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 5898.2). Total num frames: 589824. Throughput: 0: 778.7, 1: 778.7. Samples: 147668. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:49:03,763][91478] Avg episode reward: [(0, '5.290'), (1, '5.120')] -[2023-09-26 07:49:03,764][91993] Saving new best policy, reward=5.290! -[2023-09-26 07:49:03,764][92345] Saving new best policy, reward=5.120! -[2023-09-26 07:49:08,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 5929.5). Total num frames: 622592. Throughput: 0: 782.4, 1: 782.9. Samples: 152428. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) -[2023-09-26 07:49:08,763][91478] Avg episode reward: [(0, '5.410'), (1, '5.360')] -[2023-09-26 07:49:08,764][91993] Saving new best policy, reward=5.410! -[2023-09-26 07:49:08,764][92345] Saving new best policy, reward=5.360! -[2023-09-26 07:49:13,185][92474] Updated weights for policy 1, policy_version 1280 (0.0017) -[2023-09-26 07:49:13,185][92473] Updated weights for policy 0, policy_version 1280 (0.0018) -[2023-09-26 07:49:13,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 5957.8). Total num frames: 655360. Throughput: 0: 776.0, 1: 777.6. Samples: 161798. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 07:49:13,762][91478] Avg episode reward: [(0, '5.470'), (1, '5.570')] -[2023-09-26 07:49:13,770][92345] Saving ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000001280_327680.pth... -[2023-09-26 07:49:13,770][91993] Saving ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000001280_327680.pth... -[2023-09-26 07:49:13,807][92345] Saving new best policy, reward=5.570! -[2023-09-26 07:49:13,808][91993] Saving new best policy, reward=5.470! -[2023-09-26 07:49:18,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 5983.7). Total num frames: 688128. Throughput: 0: 782.4, 1: 782.6. Samples: 171596. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:49:18,763][91478] Avg episode reward: [(0, '5.410'), (1, '5.360')] -[2023-09-26 07:49:23,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6007.5). Total num frames: 720896. Throughput: 0: 778.2, 1: 779.7. Samples: 176128. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) -[2023-09-26 07:49:23,763][91478] Avg episode reward: [(0, '5.600'), (1, '5.430')] -[2023-09-26 07:49:23,764][91993] Saving new best policy, reward=5.600! 
-[2023-09-26 07:49:26,168][92474] Updated weights for policy 1, policy_version 1440 (0.0019) -[2023-09-26 07:49:26,168][92473] Updated weights for policy 0, policy_version 1440 (0.0018) -[2023-09-26 07:49:28,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 5963.8). Total num frames: 745472. Throughput: 0: 786.9, 1: 787.5. Samples: 185588. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:49:28,762][91478] Avg episode reward: [(0, '5.780'), (1, '5.390')] -[2023-09-26 07:49:28,904][91993] Saving new best policy, reward=5.780! -[2023-09-26 07:49:33,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 5986.5). Total num frames: 778240. Throughput: 0: 780.0, 1: 780.9. Samples: 194588. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:49:33,762][91478] Avg episode reward: [(0, '5.940'), (1, '5.490')] -[2023-09-26 07:49:33,763][91993] Saving new best policy, reward=5.940! -[2023-09-26 07:49:38,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6007.5). Total num frames: 811008. Throughput: 0: 779.8, 1: 779.3. Samples: 199151. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 07:49:38,762][91478] Avg episode reward: [(0, '5.830'), (1, '5.420')] -[2023-09-26 07:49:39,447][92473] Updated weights for policy 0, policy_version 1600 (0.0018) -[2023-09-26 07:49:39,448][92474] Updated weights for policy 1, policy_version 1600 (0.0019) -[2023-09-26 07:49:43,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6027.0). Total num frames: 843776. Throughput: 0: 783.2, 1: 783.9. Samples: 208879. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:49:43,762][91478] Avg episode reward: [(0, '5.930'), (1, '5.400')] -[2023-09-26 07:49:48,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 5988.6). Total num frames: 868352. Throughput: 0: 778.8, 1: 778.8. Samples: 217760. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) -[2023-09-26 07:49:48,763][91478] Avg episode reward: [(0, '5.990'), (1, '5.480')] -[2023-09-26 07:49:48,773][91993] Saving new best policy, reward=5.990! -[2023-09-26 07:49:52,894][92473] Updated weights for policy 0, policy_version 1760 (0.0017) -[2023-09-26 07:49:52,894][92474] Updated weights for policy 1, policy_version 1760 (0.0017) -[2023-09-26 07:49:53,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6007.5). Total num frames: 901120. Throughput: 0: 780.0, 1: 780.0. Samples: 222626. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) -[2023-09-26 07:49:53,763][91478] Avg episode reward: [(0, '5.990'), (1, '5.540')] -[2023-09-26 07:49:58,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6025.1). Total num frames: 933888. Throughput: 0: 776.2, 1: 775.2. Samples: 231612. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:49:58,763][91478] Avg episode reward: [(0, '6.080'), (1, '5.610')] -[2023-09-26 07:49:58,773][91993] Saving new best policy, reward=6.080! -[2023-09-26 07:49:58,773][92345] Saving new best policy, reward=5.610! -[2023-09-26 07:50:03,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6041.6). Total num frames: 966656. Throughput: 0: 773.4, 1: 773.4. Samples: 241199. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:50:03,763][91478] Avg episode reward: [(0, '5.920'), (1, '5.560')] -[2023-09-26 07:50:05,969][92474] Updated weights for policy 1, policy_version 1920 (0.0015) -[2023-09-26 07:50:05,969][92473] Updated weights for policy 0, policy_version 1920 (0.0017) -[2023-09-26 07:50:08,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6057.1). Total num frames: 999424. Throughput: 0: 773.7, 1: 773.7. Samples: 245760. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:50:08,762][91478] Avg episode reward: [(0, '5.770'), (1, '5.600')] -[2023-09-26 07:50:13,762][91478] Fps is (10 sec: 6144.0, 60 sec: 6212.2, 300 sec: 6047.6). Total num frames: 1028096. Throughput: 0: 774.4, 1: 773.3. Samples: 255232. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:50:13,763][91478] Avg episode reward: [(0, '5.730'), (1, '5.600')] -[2023-09-26 07:50:18,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6038.7). Total num frames: 1056768. Throughput: 0: 776.5, 1: 775.4. Samples: 264422. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:50:18,763][91478] Avg episode reward: [(0, '5.770'), (1, '5.650')] -[2023-09-26 07:50:18,764][92345] Saving new best policy, reward=5.650! -[2023-09-26 07:50:19,017][92474] Updated weights for policy 1, policy_version 2080 (0.0017) -[2023-09-26 07:50:19,017][92473] Updated weights for policy 0, policy_version 2080 (0.0019) -[2023-09-26 07:50:23,762][91478] Fps is (10 sec: 6144.1, 60 sec: 6144.0, 300 sec: 6053.0). Total num frames: 1089536. Throughput: 0: 780.3, 1: 779.8. Samples: 269356. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-09-26 07:50:23,762][91478] Avg episode reward: [(0, '5.940'), (1, '5.800')] -[2023-09-26 07:50:23,763][92345] Saving new best policy, reward=5.800! -[2023-09-26 07:50:28,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6066.5). Total num frames: 1122304. Throughput: 0: 776.8, 1: 775.6. Samples: 278735. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:50:28,763][91478] Avg episode reward: [(0, '6.040'), (1, '5.770')] -[2023-09-26 07:50:32,162][92474] Updated weights for policy 1, policy_version 2240 (0.0017) -[2023-09-26 07:50:32,162][92473] Updated weights for policy 0, policy_version 2240 (0.0016) -[2023-09-26 07:50:33,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6079.3). Total num frames: 1155072. Throughput: 0: 780.5, 1: 779.8. 
Samples: 287973. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:50:33,763][91478] Avg episode reward: [(0, '6.260'), (1, '5.870')] -[2023-09-26 07:50:33,764][91993] Saving new best policy, reward=6.260! -[2023-09-26 07:50:33,764][92345] Saving new best policy, reward=5.870! -[2023-09-26 07:50:38,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6091.5). Total num frames: 1187840. Throughput: 0: 780.0, 1: 780.2. Samples: 292837. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) -[2023-09-26 07:50:38,763][91478] Avg episode reward: [(0, '6.300'), (1, '5.840')] -[2023-09-26 07:50:38,764][91993] Saving new best policy, reward=6.300! -[2023-09-26 07:50:43,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6062.1). Total num frames: 1212416. Throughput: 0: 783.0, 1: 782.9. Samples: 302077. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 07:50:43,763][91478] Avg episode reward: [(0, '6.270'), (1, '6.050')] -[2023-09-26 07:50:43,883][92345] Saving new best policy, reward=6.050! -[2023-09-26 07:50:45,209][92474] Updated weights for policy 1, policy_version 2400 (0.0017) -[2023-09-26 07:50:45,209][92473] Updated weights for policy 0, policy_version 2400 (0.0019) -[2023-09-26 07:50:48,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6280.6, 300 sec: 6074.1). Total num frames: 1245184. Throughput: 0: 780.4, 1: 779.8. Samples: 311408. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:50:48,762][91478] Avg episode reward: [(0, '6.320'), (1, '5.920')] -[2023-09-26 07:50:48,763][91993] Saving new best policy, reward=6.320! -[2023-09-26 07:50:53,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6085.5). Total num frames: 1277952. Throughput: 0: 783.2, 1: 781.6. Samples: 316178. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:50:53,762][91478] Avg episode reward: [(0, '6.140'), (1, '6.060')] -[2023-09-26 07:50:53,763][92345] Saving new best policy, reward=6.060! 
-[2023-09-26 07:50:58,456][92474] Updated weights for policy 1, policy_version 2560 (0.0016) -[2023-09-26 07:50:58,458][92473] Updated weights for policy 0, policy_version 2560 (0.0018) -[2023-09-26 07:50:58,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6096.4). Total num frames: 1310720. Throughput: 0: 780.8, 1: 783.0. Samples: 325602. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-09-26 07:50:58,762][91478] Avg episode reward: [(0, '6.250'), (1, '6.160')] -[2023-09-26 07:50:58,770][92345] Saving new best policy, reward=6.160! -[2023-09-26 07:51:03,762][91478] Fps is (10 sec: 6143.9, 60 sec: 6212.3, 300 sec: 6088.1). Total num frames: 1339392. Throughput: 0: 779.1, 1: 779.0. Samples: 334535. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:51:03,763][91478] Avg episode reward: [(0, '6.330'), (1, '6.260')] -[2023-09-26 07:51:03,764][91993] Saving new best policy, reward=6.330! -[2023-09-26 07:51:03,780][92345] Saving new best policy, reward=6.260! -[2023-09-26 07:51:08,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6080.3). Total num frames: 1368064. Throughput: 0: 777.8, 1: 777.4. Samples: 339340. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:51:08,763][91478] Avg episode reward: [(0, '6.490'), (1, '6.250')] -[2023-09-26 07:51:08,764][91993] Saving new best policy, reward=6.490! -[2023-09-26 07:51:11,703][92473] Updated weights for policy 0, policy_version 2720 (0.0017) -[2023-09-26 07:51:11,703][92474] Updated weights for policy 1, policy_version 2720 (0.0017) -[2023-09-26 07:51:13,762][91478] Fps is (10 sec: 6144.0, 60 sec: 6212.3, 300 sec: 6090.6). Total num frames: 1400832. Throughput: 0: 774.2, 1: 774.6. Samples: 348430. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:51:13,763][91478] Avg episode reward: [(0, '6.240'), (1, '6.230')] -[2023-09-26 07:51:13,773][91993] Saving ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000002736_700416.pth... 
-[2023-09-26 07:51:13,774][92345] Saving ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000002736_700416.pth... -[2023-09-26 07:51:18,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6100.4). Total num frames: 1433600. Throughput: 0: 777.2, 1: 778.7. Samples: 357991. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-09-26 07:51:18,763][91478] Avg episode reward: [(0, '6.330'), (1, '6.130')] -[2023-09-26 07:51:23,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6109.9). Total num frames: 1466368. Throughput: 0: 773.8, 1: 774.3. Samples: 362501. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:51:23,763][91478] Avg episode reward: [(0, '6.420'), (1, '6.130')] -[2023-09-26 07:51:24,744][92473] Updated weights for policy 0, policy_version 2880 (0.0019) -[2023-09-26 07:51:24,744][92474] Updated weights for policy 1, policy_version 2880 (0.0018) -[2023-09-26 07:51:28,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6118.9). Total num frames: 1499136. Throughput: 0: 779.9, 1: 777.3. Samples: 372151. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:51:28,763][91478] Avg episode reward: [(0, '6.130'), (1, '6.170')] -[2023-09-26 07:51:33,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6094.8). Total num frames: 1523712. Throughput: 0: 775.6, 1: 775.9. Samples: 381224. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:51:33,763][91478] Avg episode reward: [(0, '6.140'), (1, '6.160')] -[2023-09-26 07:51:37,859][92473] Updated weights for policy 0, policy_version 3040 (0.0017) -[2023-09-26 07:51:37,860][92474] Updated weights for policy 1, policy_version 3040 (0.0017) -[2023-09-26 07:51:38,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6103.8). Total num frames: 1556480. Throughput: 0: 776.5, 1: 777.0. Samples: 386084. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:51:38,763][91478] Avg episode reward: [(0, '6.250'), (1, '6.170')] -[2023-09-26 07:51:43,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6112.5). Total num frames: 1589248. Throughput: 0: 774.6, 1: 773.7. Samples: 395276. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:51:43,763][91478] Avg episode reward: [(0, '6.330'), (1, '6.080')] -[2023-09-26 07:51:48,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6120.8). Total num frames: 1622016. Throughput: 0: 780.9, 1: 780.9. Samples: 404814. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 07:51:48,763][91478] Avg episode reward: [(0, '6.580'), (1, '6.400')] -[2023-09-26 07:51:48,764][91993] Saving new best policy, reward=6.580! -[2023-09-26 07:51:48,764][92345] Saving new best policy, reward=6.400! -[2023-09-26 07:51:50,965][92474] Updated weights for policy 1, policy_version 3200 (0.0016) -[2023-09-26 07:51:50,965][92473] Updated weights for policy 0, policy_version 3200 (0.0018) -[2023-09-26 07:51:53,762][91478] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6128.8). Total num frames: 1654784. Throughput: 0: 779.6, 1: 781.7. Samples: 409600. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:51:53,762][91478] Avg episode reward: [(0, '6.450'), (1, '6.360')] -[2023-09-26 07:51:58,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6106.8). Total num frames: 1679360. Throughput: 0: 784.3, 1: 784.5. Samples: 419024. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:51:58,763][91478] Avg episode reward: [(0, '6.520'), (1, '6.390')] -[2023-09-26 07:52:03,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6212.3, 300 sec: 6114.8). Total num frames: 1712128. Throughput: 0: 777.9, 1: 778.5. Samples: 428032. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:52:03,762][91478] Avg episode reward: [(0, '6.270'), (1, '6.310')] -[2023-09-26 07:52:04,307][92474] Updated weights for policy 1, policy_version 3360 (0.0017) -[2023-09-26 07:52:04,307][92473] Updated weights for policy 0, policy_version 3360 (0.0018) -[2023-09-26 07:52:08,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6122.4). Total num frames: 1744896. Throughput: 0: 780.2, 1: 778.7. Samples: 432653. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-09-26 07:52:08,763][91478] Avg episode reward: [(0, '6.400'), (1, '6.320')] -[2023-09-26 07:52:13,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.6, 300 sec: 6129.9). Total num frames: 1777664. Throughput: 0: 777.4, 1: 777.2. Samples: 442110. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) -[2023-09-26 07:52:13,762][91478] Avg episode reward: [(0, '6.420'), (1, '6.310')] -[2023-09-26 07:52:17,605][92473] Updated weights for policy 0, policy_version 3520 (0.0017) -[2023-09-26 07:52:17,606][92474] Updated weights for policy 1, policy_version 3520 (0.0018) -[2023-09-26 07:52:18,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6109.3). Total num frames: 1802240. Throughput: 0: 775.8, 1: 775.8. Samples: 451046. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) -[2023-09-26 07:52:18,763][91478] Avg episode reward: [(0, '6.250'), (1, '6.390')] -[2023-09-26 07:52:23,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 1835008. Throughput: 0: 777.7, 1: 776.9. Samples: 456039. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) -[2023-09-26 07:52:23,762][91478] Avg episode reward: [(0, '6.460'), (1, '6.470')] -[2023-09-26 07:52:23,763][92345] Saving new best policy, reward=6.470! -[2023-09-26 07:52:28,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 1867776. Throughput: 0: 778.0, 1: 776.5. Samples: 465227. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:52:28,763][91478] Avg episode reward: [(0, '6.620'), (1, '6.530')] -[2023-09-26 07:52:28,774][91993] Saving new best policy, reward=6.620! -[2023-09-26 07:52:28,774][92345] Saving new best policy, reward=6.530! -[2023-09-26 07:52:30,555][92473] Updated weights for policy 0, policy_version 3680 (0.0017) -[2023-09-26 07:52:30,555][92474] Updated weights for policy 1, policy_version 3680 (0.0017) -[2023-09-26 07:52:33,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 1900544. Throughput: 0: 779.5, 1: 780.8. Samples: 475029. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:52:33,762][91478] Avg episode reward: [(0, '6.460'), (1, '6.530')] -[2023-09-26 07:52:38,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 1933312. Throughput: 0: 777.6, 1: 776.0. Samples: 479510. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:52:38,762][91478] Avg episode reward: [(0, '6.550'), (1, '6.390')] -[2023-09-26 07:52:43,467][92473] Updated weights for policy 0, policy_version 3840 (0.0019) -[2023-09-26 07:52:43,467][92474] Updated weights for policy 1, policy_version 3840 (0.0017) -[2023-09-26 07:52:43,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 1966080. Throughput: 0: 780.3, 1: 779.9. Samples: 489232. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:52:43,763][91478] Avg episode reward: [(0, '6.340'), (1, '6.470')] -[2023-09-26 07:52:48,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 1998848. Throughput: 0: 783.4, 1: 781.9. Samples: 498470. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 07:52:48,763][91478] Avg episode reward: [(0, '6.430'), (1, '6.550')] -[2023-09-26 07:52:48,765][92345] Saving new best policy, reward=6.550! 
-[2023-09-26 07:52:53,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 2023424. Throughput: 0: 785.0, 1: 785.1. Samples: 503311. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) -[2023-09-26 07:52:53,763][91478] Avg episode reward: [(0, '6.680'), (1, '6.670')] -[2023-09-26 07:52:53,764][91993] Saving new best policy, reward=6.680! -[2023-09-26 07:52:53,764][92345] Saving new best policy, reward=6.670! -[2023-09-26 07:52:56,777][92473] Updated weights for policy 0, policy_version 4000 (0.0016) -[2023-09-26 07:52:56,777][92474] Updated weights for policy 1, policy_version 4000 (0.0017) -[2023-09-26 07:52:58,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 2056192. Throughput: 0: 776.6, 1: 779.4. Samples: 512131. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:52:58,763][91478] Avg episode reward: [(0, '6.410'), (1, '6.610')] -[2023-09-26 07:53:03,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 2088960. Throughput: 0: 785.6, 1: 785.5. Samples: 521748. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:53:03,763][91478] Avg episode reward: [(0, '6.490'), (1, '6.520')] -[2023-09-26 07:53:08,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 2121728. Throughput: 0: 780.1, 1: 781.4. Samples: 526307. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:53:08,763][91478] Avg episode reward: [(0, '6.400'), (1, '6.330')] -[2023-09-26 07:53:09,982][92474] Updated weights for policy 1, policy_version 4160 (0.0017) -[2023-09-26 07:53:09,983][92473] Updated weights for policy 0, policy_version 4160 (0.0017) -[2023-09-26 07:53:13,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 2146304. Throughput: 0: 779.7, 1: 778.9. Samples: 535365. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:53:13,763][91478] Avg episode reward: [(0, '6.540'), (1, '6.370')] -[2023-09-26 07:53:13,776][91993] Saving ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000004192_1073152.pth... -[2023-09-26 07:53:13,776][92345] Saving ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000004192_1073152.pth... -[2023-09-26 07:53:13,808][91993] Removing ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000001280_327680.pth -[2023-09-26 07:53:13,811][92345] Removing ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000001280_327680.pth -[2023-09-26 07:53:18,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 2179072. Throughput: 0: 774.7, 1: 774.6. Samples: 544746. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:53:18,763][91478] Avg episode reward: [(0, '6.520'), (1, '6.250')] -[2023-09-26 07:53:23,427][92473] Updated weights for policy 0, policy_version 4320 (0.0017) -[2023-09-26 07:53:23,427][92474] Updated weights for policy 1, policy_version 4320 (0.0017) -[2023-09-26 07:53:23,762][91478] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 2211840. Throughput: 0: 770.1, 1: 771.4. Samples: 548879. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:53:23,762][91478] Avg episode reward: [(0, '6.470'), (1, '6.470')] -[2023-09-26 07:53:28,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 2244608. Throughput: 0: 772.4, 1: 772.2. Samples: 558738. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-09-26 07:53:28,763][91478] Avg episode reward: [(0, '6.510'), (1, '6.320')] -[2023-09-26 07:53:33,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 2277376. Throughput: 0: 773.7, 1: 774.0. Samples: 568118. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-09-26 07:53:33,762][91478] Avg episode reward: [(0, '6.700'), (1, '6.350')] -[2023-09-26 07:53:33,763][91993] Saving new best policy, reward=6.700! -[2023-09-26 07:53:36,327][92473] Updated weights for policy 0, policy_version 4480 (0.0020) -[2023-09-26 07:53:36,327][92474] Updated weights for policy 1, policy_version 4480 (0.0020) -[2023-09-26 07:53:38,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 2301952. Throughput: 0: 774.5, 1: 773.6. Samples: 572977. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-09-26 07:53:38,763][91478] Avg episode reward: [(0, '6.590'), (1, '6.220')] -[2023-09-26 07:53:43,762][91478] Fps is (10 sec: 5734.2, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 2334720. Throughput: 0: 776.1, 1: 776.4. Samples: 581992. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-09-26 07:53:43,763][91478] Avg episode reward: [(0, '6.460'), (1, '6.090')] -[2023-09-26 07:53:48,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 2367488. Throughput: 0: 778.4, 1: 778.3. Samples: 591802. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:53:48,763][91478] Avg episode reward: [(0, '6.640'), (1, '6.310')] -[2023-09-26 07:53:49,431][92473] Updated weights for policy 0, policy_version 4640 (0.0017) -[2023-09-26 07:53:49,431][92474] Updated weights for policy 1, policy_version 4640 (0.0018) -[2023-09-26 07:53:53,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 2400256. Throughput: 0: 775.4, 1: 774.6. Samples: 596057. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 07:53:53,763][91478] Avg episode reward: [(0, '6.430'), (1, '6.280')] -[2023-09-26 07:53:58,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 2424832. Throughput: 0: 773.3, 1: 774.6. Samples: 605019. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 07:53:58,763][91478] Avg episode reward: [(0, '6.420'), (1, '6.410')] -[2023-09-26 07:54:03,079][92473] Updated weights for policy 0, policy_version 4800 (0.0017) -[2023-09-26 07:54:03,080][92474] Updated weights for policy 1, policy_version 4800 (0.0017) -[2023-09-26 07:54:03,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 2457600. Throughput: 0: 773.7, 1: 774.1. Samples: 614396. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:54:03,763][91478] Avg episode reward: [(0, '6.490'), (1, '6.410')] -[2023-09-26 07:54:08,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 2490368. Throughput: 0: 779.6, 1: 778.5. Samples: 618994. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) -[2023-09-26 07:54:08,762][91478] Avg episode reward: [(0, '6.400'), (1, '6.360')] -[2023-09-26 07:54:13,762][91478] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 2523136. Throughput: 0: 776.9, 1: 778.4. Samples: 628727. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:54:13,762][91478] Avg episode reward: [(0, '6.500'), (1, '6.430')] -[2023-09-26 07:54:16,016][92473] Updated weights for policy 0, policy_version 4960 (0.0018) -[2023-09-26 07:54:16,016][92474] Updated weights for policy 1, policy_version 4960 (0.0018) -[2023-09-26 07:54:18,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 2555904. Throughput: 0: 776.3, 1: 775.8. Samples: 637959. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:54:18,763][91478] Avg episode reward: [(0, '6.350'), (1, '6.370')] -[2023-09-26 07:54:23,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 2580480. Throughput: 0: 775.4, 1: 777.3. Samples: 642848. 
Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-09-26 07:54:23,763][91478] Avg episode reward: [(0, '6.420'), (1, '6.420')] -[2023-09-26 07:54:28,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 2613248. Throughput: 0: 775.9, 1: 775.3. Samples: 651797. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) -[2023-09-26 07:54:28,763][91478] Avg episode reward: [(0, '6.430'), (1, '6.500')] -[2023-09-26 07:54:29,280][92474] Updated weights for policy 1, policy_version 5120 (0.0016) -[2023-09-26 07:54:29,281][92473] Updated weights for policy 0, policy_version 5120 (0.0015) -[2023-09-26 07:54:33,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 2646016. Throughput: 0: 772.2, 1: 772.9. Samples: 661335. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) -[2023-09-26 07:54:33,763][91478] Avg episode reward: [(0, '6.540'), (1, '6.300')] -[2023-09-26 07:54:38,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 2678784. Throughput: 0: 772.7, 1: 773.6. Samples: 665637. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) -[2023-09-26 07:54:38,763][91478] Avg episode reward: [(0, '6.320'), (1, '6.470')] -[2023-09-26 07:54:42,542][92473] Updated weights for policy 0, policy_version 5280 (0.0016) -[2023-09-26 07:54:42,543][92474] Updated weights for policy 1, policy_version 5280 (0.0015) -[2023-09-26 07:54:43,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 2703360. Throughput: 0: 778.3, 1: 776.8. Samples: 674998. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:54:43,762][91478] Avg episode reward: [(0, '6.510'), (1, '6.550')] -[2023-09-26 07:54:48,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 2736128. Throughput: 0: 775.6, 1: 774.2. Samples: 684138. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) -[2023-09-26 07:54:48,763][91478] Avg episode reward: [(0, '6.550'), (1, '6.630')] -[2023-09-26 07:54:53,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 2768896. Throughput: 0: 778.6, 1: 778.7. Samples: 689074. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:54:53,763][91478] Avg episode reward: [(0, '6.600'), (1, '6.610')] -[2023-09-26 07:54:55,545][92473] Updated weights for policy 0, policy_version 5440 (0.0017) -[2023-09-26 07:54:55,545][92474] Updated weights for policy 1, policy_version 5440 (0.0015) -[2023-09-26 07:54:58,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 2801664. Throughput: 0: 775.4, 1: 774.2. Samples: 698461. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:54:58,763][91478] Avg episode reward: [(0, '6.590'), (1, '6.400')] -[2023-09-26 07:55:03,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 2834432. Throughput: 0: 778.6, 1: 777.4. Samples: 707982. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-09-26 07:55:03,762][91478] Avg episode reward: [(0, '6.620'), (1, '6.520')] -[2023-09-26 07:55:08,714][92474] Updated weights for policy 1, policy_version 5600 (0.0017) -[2023-09-26 07:55:08,714][92473] Updated weights for policy 0, policy_version 5600 (0.0017) -[2023-09-26 07:55:08,762][91478] Fps is (10 sec: 6553.9, 60 sec: 6280.5, 300 sec: 6234.3). Total num frames: 2867200. Throughput: 0: 775.9, 1: 774.2. Samples: 712600. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 07:55:08,762][91478] Avg episode reward: [(0, '6.610'), (1, '6.440')] -[2023-09-26 07:55:13,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 2891776. Throughput: 0: 777.5, 1: 777.4. Samples: 721764. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:55:13,762][91478] Avg episode reward: [(0, '6.490'), (1, '6.400')] -[2023-09-26 07:55:13,769][92345] Saving ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000005648_1445888.pth... -[2023-09-26 07:55:13,769][91993] Saving ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000005648_1445888.pth... -[2023-09-26 07:55:13,800][92345] Removing ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000002736_700416.pth -[2023-09-26 07:55:13,805][91993] Removing ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000002736_700416.pth -[2023-09-26 07:55:18,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 2924544. Throughput: 0: 775.1, 1: 775.9. Samples: 731132. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 07:55:18,763][91478] Avg episode reward: [(0, '6.520'), (1, '6.440')] -[2023-09-26 07:55:21,922][92473] Updated weights for policy 0, policy_version 5760 (0.0017) -[2023-09-26 07:55:21,922][92474] Updated weights for policy 1, policy_version 5760 (0.0017) -[2023-09-26 07:55:23,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 2957312. Throughput: 0: 779.0, 1: 778.0. Samples: 735703. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 07:55:23,763][91478] Avg episode reward: [(0, '6.700'), (1, '6.490')] -[2023-09-26 07:55:28,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 2990080. Throughput: 0: 781.4, 1: 783.0. Samples: 745398. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) -[2023-09-26 07:55:28,763][91478] Avg episode reward: [(0, '6.570'), (1, '6.430')] -[2023-09-26 07:55:33,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 3022848. Throughput: 0: 784.7, 1: 784.8. Samples: 754763. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:55:33,762][91478] Avg episode reward: [(0, '6.530'), (1, '6.510')] -[2023-09-26 07:55:34,832][92474] Updated weights for policy 1, policy_version 5920 (0.0017) -[2023-09-26 07:55:34,832][92473] Updated weights for policy 0, policy_version 5920 (0.0019) -[2023-09-26 07:55:38,762][91478] Fps is (10 sec: 6144.0, 60 sec: 6212.3, 300 sec: 6234.3). Total num frames: 3051520. Throughput: 0: 785.3, 1: 785.7. Samples: 759769. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-09-26 07:55:38,763][91478] Avg episode reward: [(0, '6.630'), (1, '6.650')] -[2023-09-26 07:55:43,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 3080192. Throughput: 0: 782.1, 1: 781.5. Samples: 768821. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-09-26 07:55:43,763][91478] Avg episode reward: [(0, '6.420'), (1, '6.640')] -[2023-09-26 07:55:47,927][92473] Updated weights for policy 0, policy_version 6080 (0.0017) -[2023-09-26 07:55:47,934][92474] Updated weights for policy 1, policy_version 6080 (0.0017) -[2023-09-26 07:55:48,762][91478] Fps is (10 sec: 6144.1, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 3112960. Throughput: 0: 779.2, 1: 782.1. Samples: 778241. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) -[2023-09-26 07:55:48,762][91478] Avg episode reward: [(0, '6.520'), (1, '6.420')] -[2023-09-26 07:55:53,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 3145728. Throughput: 0: 781.0, 1: 782.4. Samples: 782951. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) -[2023-09-26 07:55:53,762][91478] Avg episode reward: [(0, '6.520'), (1, '6.510')] -[2023-09-26 07:55:58,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6234.3). Total num frames: 3178496. Throughput: 0: 785.8, 1: 787.8. Samples: 792576. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:55:58,763][91478] Avg episode reward: [(0, '6.350'), (1, '6.490')] -[2023-09-26 07:56:01,033][92473] Updated weights for policy 0, policy_version 6240 (0.0016) -[2023-09-26 07:56:01,034][92474] Updated weights for policy 1, policy_version 6240 (0.0017) -[2023-09-26 07:56:03,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 3211264. Throughput: 0: 785.7, 1: 784.0. Samples: 801772. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:56:03,763][91478] Avg episode reward: [(0, '6.590'), (1, '6.470')] -[2023-09-26 07:56:08,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 3235840. Throughput: 0: 789.4, 1: 788.4. Samples: 806702. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:56:08,763][91478] Avg episode reward: [(0, '6.690'), (1, '6.270')] -[2023-09-26 07:56:13,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 3268608. Throughput: 0: 783.4, 1: 783.0. Samples: 815885. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) -[2023-09-26 07:56:13,762][91478] Avg episode reward: [(0, '6.630'), (1, '6.430')] -[2023-09-26 07:56:14,132][92473] Updated weights for policy 0, policy_version 6400 (0.0016) -[2023-09-26 07:56:14,132][92474] Updated weights for policy 1, policy_version 6400 (0.0017) -[2023-09-26 07:56:18,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 3301376. Throughput: 0: 783.4, 1: 783.7. Samples: 825282. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:56:18,763][91478] Avg episode reward: [(0, '6.800'), (1, '6.490')] -[2023-09-26 07:56:18,764][91993] Saving new best policy, reward=6.800! -[2023-09-26 07:56:23,762][91478] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 3334144. Throughput: 0: 774.8, 1: 774.7. Samples: 829496. 
Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-09-26 07:56:23,763][91478] Avg episode reward: [(0, '6.730'), (1, '6.440')] -[2023-09-26 07:56:27,307][92473] Updated weights for policy 0, policy_version 6560 (0.0018) -[2023-09-26 07:56:27,308][92474] Updated weights for policy 1, policy_version 6560 (0.0016) -[2023-09-26 07:56:28,762][91478] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 3366912. Throughput: 0: 782.0, 1: 782.7. Samples: 839234. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-09-26 07:56:28,762][91478] Avg episode reward: [(0, '6.730'), (1, '6.480')] -[2023-09-26 07:56:33,762][91478] Fps is (10 sec: 6144.0, 60 sec: 6212.2, 300 sec: 6234.3). Total num frames: 3395584. Throughput: 0: 782.1, 1: 782.3. Samples: 848636. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-09-26 07:56:33,763][91478] Avg episode reward: [(0, '6.630'), (1, '6.570')] -[2023-09-26 07:56:38,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6212.3, 300 sec: 6220.4). Total num frames: 3424256. Throughput: 0: 782.1, 1: 780.6. Samples: 853273. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:56:38,763][91478] Avg episode reward: [(0, '6.740'), (1, '6.580')] -[2023-09-26 07:56:40,329][92474] Updated weights for policy 1, policy_version 6720 (0.0018) -[2023-09-26 07:56:40,329][92473] Updated weights for policy 0, policy_version 6720 (0.0018) -[2023-09-26 07:56:43,762][91478] Fps is (10 sec: 6144.0, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 3457024. Throughput: 0: 779.5, 1: 777.4. Samples: 862635. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 07:56:43,763][91478] Avg episode reward: [(0, '6.680'), (1, '6.400')] -[2023-09-26 07:56:48,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 3489792. Throughput: 0: 777.8, 1: 779.8. Samples: 871862. 
Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 07:56:48,762][91478] Avg episode reward: [(0, '6.670'), (1, '6.560')] -[2023-09-26 07:56:53,698][92473] Updated weights for policy 0, policy_version 6880 (0.0016) -[2023-09-26 07:56:53,699][92474] Updated weights for policy 1, policy_version 6880 (0.0018) -[2023-09-26 07:56:53,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 3522560. Throughput: 0: 774.8, 1: 777.3. Samples: 876548. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 07:56:53,763][91478] Avg episode reward: [(0, '6.750'), (1, '6.420')] -[2023-09-26 07:56:58,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 3547136. Throughput: 0: 777.1, 1: 777.1. Samples: 885824. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:56:58,762][91478] Avg episode reward: [(0, '6.640'), (1, '6.390')] -[2023-09-26 07:57:03,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 3579904. Throughput: 0: 776.7, 1: 776.4. Samples: 895172. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) -[2023-09-26 07:57:03,763][91478] Avg episode reward: [(0, '6.540'), (1, '6.510')] -[2023-09-26 07:57:06,653][92474] Updated weights for policy 1, policy_version 7040 (0.0017) -[2023-09-26 07:57:06,653][92473] Updated weights for policy 0, policy_version 7040 (0.0016) -[2023-09-26 07:57:08,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 3612672. Throughput: 0: 784.0, 1: 783.2. Samples: 900018. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) -[2023-09-26 07:57:08,762][91478] Avg episode reward: [(0, '6.540'), (1, '6.350')] -[2023-09-26 07:57:13,762][91478] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 3645440. Throughput: 0: 778.9, 1: 779.4. Samples: 909358. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) -[2023-09-26 07:57:13,762][91478] Avg episode reward: [(0, '6.430'), (1, '6.470')] -[2023-09-26 07:57:13,772][92345] Saving ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000007120_1822720.pth... -[2023-09-26 07:57:13,772][91993] Saving ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000007120_1822720.pth... -[2023-09-26 07:57:13,801][92345] Removing ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000004192_1073152.pth -[2023-09-26 07:57:13,814][91993] Removing ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000004192_1073152.pth -[2023-09-26 07:57:18,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 3678208. Throughput: 0: 780.1, 1: 778.3. Samples: 918761. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) -[2023-09-26 07:57:18,763][91478] Avg episode reward: [(0, '6.610'), (1, '6.550')] -[2023-09-26 07:57:19,838][92474] Updated weights for policy 1, policy_version 7200 (0.0017) -[2023-09-26 07:57:19,839][92473] Updated weights for policy 0, policy_version 7200 (0.0017) -[2023-09-26 07:57:23,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 3710976. Throughput: 0: 780.8, 1: 782.1. Samples: 923605. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) -[2023-09-26 07:57:23,763][91478] Avg episode reward: [(0, '6.570'), (1, '6.570')] -[2023-09-26 07:57:28,762][91478] Fps is (10 sec: 6144.0, 60 sec: 6212.2, 300 sec: 6234.2). Total num frames: 3739648. Throughput: 0: 781.6, 1: 782.4. Samples: 933016. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) -[2023-09-26 07:57:28,764][91478] Avg episode reward: [(0, '6.820'), (1, '6.410')] -[2023-09-26 07:57:28,775][91993] Saving new best policy, reward=6.820! 
-[2023-09-26 07:57:32,757][92474] Updated weights for policy 1, policy_version 7360 (0.0017) -[2023-09-26 07:57:32,757][92473] Updated weights for policy 0, policy_version 7360 (0.0017) -[2023-09-26 07:57:33,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6212.3, 300 sec: 6220.4). Total num frames: 3768320. Throughput: 0: 782.8, 1: 781.4. Samples: 942249. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) -[2023-09-26 07:57:33,762][91478] Avg episode reward: [(0, '6.630'), (1, '6.490')] -[2023-09-26 07:57:38,762][91478] Fps is (10 sec: 6144.0, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 3801088. Throughput: 0: 786.3, 1: 784.5. Samples: 947234. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:57:38,763][91478] Avg episode reward: [(0, '6.690'), (1, '6.470')] -[2023-09-26 07:57:43,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 3833856. Throughput: 0: 785.2, 1: 785.2. Samples: 956492. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:57:43,762][91478] Avg episode reward: [(0, '6.580'), (1, '6.510')] -[2023-09-26 07:57:45,633][92473] Updated weights for policy 0, policy_version 7520 (0.0017) -[2023-09-26 07:57:45,633][92474] Updated weights for policy 1, policy_version 7520 (0.0017) -[2023-09-26 07:57:48,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 3866624. Throughput: 0: 789.5, 1: 789.7. Samples: 966235. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:57:48,763][91478] Avg episode reward: [(0, '6.660'), (1, '6.490')] -[2023-09-26 07:57:53,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 3899392. Throughput: 0: 785.1, 1: 786.9. Samples: 970757. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-09-26 07:57:53,763][91478] Avg episode reward: [(0, '6.780'), (1, '6.410')] -[2023-09-26 07:57:58,700][92473] Updated weights for policy 0, policy_version 7680 (0.0015) -[2023-09-26 07:57:58,701][92474] Updated weights for policy 1, policy_version 7680 (0.0018) -[2023-09-26 07:57:58,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6417.0, 300 sec: 6248.1). Total num frames: 3932160. Throughput: 0: 789.3, 1: 788.7. Samples: 980368. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:57:58,763][91478] Avg episode reward: [(0, '6.730'), (1, '6.330')] -[2023-09-26 07:58:03,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 3956736. Throughput: 0: 789.4, 1: 789.6. Samples: 989817. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:58:03,763][91478] Avg episode reward: [(0, '6.660'), (1, '6.420')] -[2023-09-26 07:58:08,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 3989504. Throughput: 0: 789.6, 1: 789.0. Samples: 994642. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-09-26 07:58:08,762][91478] Avg episode reward: [(0, '6.640'), (1, '6.590')] -[2023-09-26 07:58:11,682][92473] Updated weights for policy 0, policy_version 7840 (0.0016) -[2023-09-26 07:58:11,682][92474] Updated weights for policy 1, policy_version 7840 (0.0016) -[2023-09-26 07:58:13,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 4022272. Throughput: 0: 786.9, 1: 786.5. Samples: 1003820. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-09-26 07:58:13,763][91478] Avg episode reward: [(0, '6.720'), (1, '6.500')] -[2023-09-26 07:58:18,762][91478] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 4055040. Throughput: 0: 789.2, 1: 789.9. Samples: 1013306. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-09-26 07:58:18,763][91478] Avg episode reward: [(0, '6.630'), (1, '6.340')] -[2023-09-26 07:58:23,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 4087808. Throughput: 0: 783.8, 1: 785.6. Samples: 1017854. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:58:23,763][91478] Avg episode reward: [(0, '6.830'), (1, '6.490')] -[2023-09-26 07:58:23,764][91993] Saving new best policy, reward=6.830! -[2023-09-26 07:58:25,132][92474] Updated weights for policy 1, policy_version 8000 (0.0015) -[2023-09-26 07:58:25,132][92473] Updated weights for policy 0, policy_version 8000 (0.0017) -[2023-09-26 07:58:28,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6212.3, 300 sec: 6220.4). Total num frames: 4112384. Throughput: 0: 780.8, 1: 780.8. Samples: 1026767. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:58:28,762][91478] Avg episode reward: [(0, '6.710'), (1, '6.280')] -[2023-09-26 07:58:33,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 4145152. Throughput: 0: 777.8, 1: 778.2. Samples: 1036253. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 07:58:33,763][91478] Avg episode reward: [(0, '6.600'), (1, '6.380')] -[2023-09-26 07:58:38,315][92474] Updated weights for policy 1, policy_version 8160 (0.0018) -[2023-09-26 07:58:38,316][92473] Updated weights for policy 0, policy_version 8160 (0.0019) -[2023-09-26 07:58:38,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 4177920. Throughput: 0: 777.2, 1: 776.0. Samples: 1040653. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) -[2023-09-26 07:58:38,763][91478] Avg episode reward: [(0, '6.690'), (1, '6.420')] -[2023-09-26 07:58:43,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 4210688. Throughput: 0: 778.2, 1: 777.8. Samples: 1050386. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) -[2023-09-26 07:58:43,763][91478] Avg episode reward: [(0, '6.790'), (1, '6.350')] -[2023-09-26 07:58:48,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 4243456. Throughput: 0: 775.3, 1: 775.0. Samples: 1059577. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:58:48,763][91478] Avg episode reward: [(0, '6.560'), (1, '6.450')] -[2023-09-26 07:58:51,419][92473] Updated weights for policy 0, policy_version 8320 (0.0016) -[2023-09-26 07:58:51,421][92474] Updated weights for policy 1, policy_version 8320 (0.0016) -[2023-09-26 07:58:53,762][91478] Fps is (10 sec: 5734.7, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 4268032. Throughput: 0: 775.8, 1: 775.7. Samples: 1064461. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:58:53,762][91478] Avg episode reward: [(0, '6.610'), (1, '6.640')] -[2023-09-26 07:58:58,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 4300800. Throughput: 0: 775.2, 1: 775.0. Samples: 1073579. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:58:58,763][91478] Avg episode reward: [(0, '6.520'), (1, '6.440')] -[2023-09-26 07:59:03,762][91478] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 4333568. Throughput: 0: 774.5, 1: 774.4. Samples: 1083004. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:59:03,763][91478] Avg episode reward: [(0, '6.620'), (1, '6.310')] -[2023-09-26 07:59:04,564][92473] Updated weights for policy 0, policy_version 8480 (0.0019) -[2023-09-26 07:59:04,564][92474] Updated weights for policy 1, policy_version 8480 (0.0020) -[2023-09-26 07:59:08,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 4366336. Throughput: 0: 773.9, 1: 773.8. Samples: 1087499. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) -[2023-09-26 07:59:08,763][91478] Avg episode reward: [(0, '6.620'), (1, '6.460')] -[2023-09-26 07:59:13,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 4399104. Throughput: 0: 782.3, 1: 781.4. Samples: 1097133. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) -[2023-09-26 07:59:13,763][91478] Avg episode reward: [(0, '6.890'), (1, '6.430')] -[2023-09-26 07:59:13,778][92345] Saving ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000008592_2199552.pth... -[2023-09-26 07:59:13,779][91993] Saving ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000008592_2199552.pth... -[2023-09-26 07:59:13,809][92345] Removing ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000005648_1445888.pth -[2023-09-26 07:59:13,819][91993] Removing ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000005648_1445888.pth -[2023-09-26 07:59:13,823][91993] Saving new best policy, reward=6.890! -[2023-09-26 07:59:17,709][92474] Updated weights for policy 1, policy_version 8640 (0.0017) -[2023-09-26 07:59:17,710][92473] Updated weights for policy 0, policy_version 8640 (0.0017) -[2023-09-26 07:59:18,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 4423680. Throughput: 0: 777.5, 1: 776.6. Samples: 1106186. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 07:59:18,763][91478] Avg episode reward: [(0, '6.790'), (1, '6.470')] -[2023-09-26 07:59:23,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 4456448. Throughput: 0: 781.0, 1: 780.3. Samples: 1110913. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 07:59:23,762][91478] Avg episode reward: [(0, '6.560'), (1, '6.390')] -[2023-09-26 07:59:28,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 4489216. Throughput: 0: 777.3, 1: 777.9. Samples: 1120369. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:59:28,763][91478] Avg episode reward: [(0, '6.530'), (1, '6.370')] -[2023-09-26 07:59:30,641][92474] Updated weights for policy 1, policy_version 8800 (0.0017) -[2023-09-26 07:59:30,641][92473] Updated weights for policy 0, policy_version 8800 (0.0016) -[2023-09-26 07:59:33,762][91478] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 4521984. Throughput: 0: 784.7, 1: 784.5. Samples: 1130194. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:59:33,763][91478] Avg episode reward: [(0, '6.550'), (1, '6.300')] -[2023-09-26 07:59:38,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 4554752. Throughput: 0: 779.5, 1: 780.1. Samples: 1134642. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:59:38,763][91478] Avg episode reward: [(0, '6.700'), (1, '6.440')] -[2023-09-26 07:59:43,652][92474] Updated weights for policy 1, policy_version 8960 (0.0016) -[2023-09-26 07:59:43,654][92473] Updated weights for policy 0, policy_version 8960 (0.0017) -[2023-09-26 07:59:43,762][91478] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 4587520. Throughput: 0: 785.4, 1: 786.3. Samples: 1144307. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 07:59:43,762][91478] Avg episode reward: [(0, '6.750'), (1, '6.320')] -[2023-09-26 07:59:48,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 4612096. Throughput: 0: 785.2, 1: 785.2. Samples: 1153671. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-09-26 07:59:48,762][91478] Avg episode reward: [(0, '6.760'), (1, '6.250')] -[2023-09-26 07:59:53,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 4644864. Throughput: 0: 787.8, 1: 786.8. Samples: 1158358. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-09-26 07:59:53,763][91478] Avg episode reward: [(0, '6.780'), (1, '6.210')] -[2023-09-26 07:59:56,739][92473] Updated weights for policy 0, policy_version 9120 (0.0017) -[2023-09-26 07:59:56,740][92474] Updated weights for policy 1, policy_version 9120 (0.0017) -[2023-09-26 07:59:58,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 4677632. Throughput: 0: 782.4, 1: 783.1. Samples: 1167581. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-09-26 07:59:58,763][91478] Avg episode reward: [(0, '6.510'), (1, '6.460')] -[2023-09-26 08:00:03,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 4710400. Throughput: 0: 792.3, 1: 792.6. Samples: 1177504. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 08:00:03,763][91478] Avg episode reward: [(0, '6.560'), (1, '6.360')] -[2023-09-26 08:00:08,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 4743168. Throughput: 0: 789.4, 1: 789.9. Samples: 1181983. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 08:00:08,762][91478] Avg episode reward: [(0, '6.560'), (1, '6.470')] -[2023-09-26 08:00:09,560][92474] Updated weights for policy 1, policy_version 9280 (0.0018) -[2023-09-26 08:00:09,560][92473] Updated weights for policy 0, policy_version 9280 (0.0018) -[2023-09-26 08:00:13,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 4775936. Throughput: 0: 794.4, 1: 793.5. Samples: 1191827. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:00:13,763][91478] Avg episode reward: [(0, '6.730'), (1, '6.280')] -[2023-09-26 08:00:18,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6275.9). Total num frames: 4808704. Throughput: 0: 788.1, 1: 788.5. Samples: 1201143. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:00:18,763][91478] Avg episode reward: [(0, '6.510'), (1, '6.320')] -[2023-09-26 08:00:22,515][92474] Updated weights for policy 1, policy_version 9440 (0.0017) -[2023-09-26 08:00:22,515][92473] Updated weights for policy 0, policy_version 9440 (0.0018) -[2023-09-26 08:00:23,762][91478] Fps is (10 sec: 6144.0, 60 sec: 6348.8, 300 sec: 6262.0). Total num frames: 4837376. Throughput: 0: 795.4, 1: 794.8. Samples: 1206201. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:00:23,763][91478] Avg episode reward: [(0, '6.650'), (1, '6.310')] -[2023-09-26 08:00:28,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 4866048. Throughput: 0: 789.7, 1: 788.7. Samples: 1215336. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:00:28,763][91478] Avg episode reward: [(0, '6.640'), (1, '6.570')] -[2023-09-26 08:00:33,762][91478] Fps is (10 sec: 6144.0, 60 sec: 6280.5, 300 sec: 6262.0). Total num frames: 4898816. Throughput: 0: 789.0, 1: 789.5. Samples: 1224704. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:00:33,763][91478] Avg episode reward: [(0, '6.600'), (1, '6.500')] -[2023-09-26 08:00:35,694][92474] Updated weights for policy 1, policy_version 9600 (0.0016) -[2023-09-26 08:00:35,694][92473] Updated weights for policy 0, policy_version 9600 (0.0018) -[2023-09-26 08:00:38,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 4931584. Throughput: 0: 786.1, 1: 785.7. Samples: 1229088. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) -[2023-09-26 08:00:38,763][91478] Avg episode reward: [(0, '6.640'), (1, '6.450')] -[2023-09-26 08:00:43,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 4964352. Throughput: 0: 791.1, 1: 791.0. Samples: 1238777. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 08:00:43,763][91478] Avg episode reward: [(0, '6.660'), (1, '6.390')] -[2023-09-26 08:00:48,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 4988928. Throughput: 0: 780.5, 1: 780.2. Samples: 1247736. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 08:00:48,763][91478] Avg episode reward: [(0, '6.640'), (1, '6.320')] -[2023-09-26 08:00:48,862][92473] Updated weights for policy 0, policy_version 9760 (0.0017) -[2023-09-26 08:00:48,862][92474] Updated weights for policy 1, policy_version 9760 (0.0016) -[2023-09-26 08:00:53,762][91478] Fps is (10 sec: 5734.6, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 5021696. Throughput: 0: 785.2, 1: 786.0. Samples: 1252688. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:00:53,762][91478] Avg episode reward: [(0, '6.660'), (1, '6.400')] -[2023-09-26 08:00:58,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 5054464. Throughput: 0: 774.5, 1: 776.1. Samples: 1261601. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) -[2023-09-26 08:00:58,763][91478] Avg episode reward: [(0, '6.620'), (1, '6.400')] -[2023-09-26 08:01:02,251][92473] Updated weights for policy 0, policy_version 9920 (0.0019) -[2023-09-26 08:01:02,251][92474] Updated weights for policy 1, policy_version 9920 (0.0019) -[2023-09-26 08:01:03,762][91478] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 5087232. Throughput: 0: 775.6, 1: 775.7. Samples: 1270948. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) -[2023-09-26 08:01:03,763][91478] Avg episode reward: [(0, '6.630'), (1, '6.560')] -[2023-09-26 08:01:08,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 5111808. Throughput: 0: 771.2, 1: 771.2. Samples: 1275613. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:01:08,763][91478] Avg episode reward: [(0, '6.480'), (1, '6.400')] -[2023-09-26 08:01:13,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 5144576. Throughput: 0: 773.4, 1: 773.5. Samples: 1284947. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:01:13,763][91478] Avg episode reward: [(0, '6.570'), (1, '6.390')] -[2023-09-26 08:01:13,775][92345] Saving ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000010048_2572288.pth... -[2023-09-26 08:01:13,776][91993] Saving ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000010048_2572288.pth... -[2023-09-26 08:01:13,811][91993] Removing ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000007120_1822720.pth -[2023-09-26 08:01:13,811][92345] Removing ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000007120_1822720.pth -[2023-09-26 08:01:15,516][92473] Updated weights for policy 0, policy_version 10080 (0.0018) -[2023-09-26 08:01:15,516][92474] Updated weights for policy 1, policy_version 10080 (0.0018) -[2023-09-26 08:01:18,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 5177344. Throughput: 0: 773.7, 1: 773.6. Samples: 1294332. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:01:18,762][91478] Avg episode reward: [(0, '6.650'), (1, '6.500')] -[2023-09-26 08:01:23,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6212.3, 300 sec: 6248.1). Total num frames: 5210112. Throughput: 0: 773.9, 1: 774.6. Samples: 1298772. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:01:23,763][91478] Avg episode reward: [(0, '6.680'), (1, '6.450')] -[2023-09-26 08:01:28,661][92473] Updated weights for policy 0, policy_version 10240 (0.0015) -[2023-09-26 08:01:28,662][92474] Updated weights for policy 1, policy_version 10240 (0.0017) -[2023-09-26 08:01:28,762][91478] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6262.0). 
Total num frames: 5242880. Throughput: 0: 770.7, 1: 770.4. Samples: 1308125. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 08:01:28,763][91478] Avg episode reward: [(0, '6.670'), (1, '6.580')] -[2023-09-26 08:01:33,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 5267456. Throughput: 0: 773.5, 1: 773.7. Samples: 1317360. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 08:01:33,763][91478] Avg episode reward: [(0, '6.670'), (1, '6.500')] -[2023-09-26 08:01:38,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 5300224. Throughput: 0: 772.2, 1: 771.0. Samples: 1322132. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:01:38,763][91478] Avg episode reward: [(0, '6.560'), (1, '6.480')] -[2023-09-26 08:01:41,535][92474] Updated weights for policy 1, policy_version 10400 (0.0015) -[2023-09-26 08:01:41,535][92473] Updated weights for policy 0, policy_version 10400 (0.0018) -[2023-09-26 08:01:43,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 5332992. Throughput: 0: 780.4, 1: 779.6. Samples: 1331800. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:01:43,763][91478] Avg episode reward: [(0, '6.630'), (1, '6.340')] -[2023-09-26 08:01:48,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 5365760. Throughput: 0: 782.6, 1: 783.9. Samples: 1341440. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:01:48,763][91478] Avg episode reward: [(0, '6.680'), (1, '6.390')] -[2023-09-26 08:01:53,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 5398528. Throughput: 0: 781.0, 1: 780.9. Samples: 1345901. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) -[2023-09-26 08:01:53,763][91478] Avg episode reward: [(0, '6.620'), (1, '6.240')] -[2023-09-26 08:01:54,733][92474] Updated weights for policy 1, policy_version 10560 (0.0018) -[2023-09-26 08:01:54,733][92473] Updated weights for policy 0, policy_version 10560 (0.0014) -[2023-09-26 08:01:58,762][91478] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 5431296. Throughput: 0: 780.4, 1: 780.8. Samples: 1355200. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) -[2023-09-26 08:01:58,762][91478] Avg episode reward: [(0, '6.490'), (1, '6.350')] -[2023-09-26 08:02:03,762][91478] Fps is (10 sec: 6144.1, 60 sec: 6212.3, 300 sec: 6262.0). Total num frames: 5459968. Throughput: 0: 782.3, 1: 780.2. Samples: 1364642. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:02:03,763][91478] Avg episode reward: [(0, '6.580'), (1, '6.510')] -[2023-09-26 08:02:07,656][92474] Updated weights for policy 1, policy_version 10720 (0.0017) -[2023-09-26 08:02:07,658][92473] Updated weights for policy 0, policy_version 10720 (0.0018) -[2023-09-26 08:02:08,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 5488640. Throughput: 0: 786.3, 1: 785.9. Samples: 1369521. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:02:08,763][91478] Avg episode reward: [(0, '6.440'), (1, '6.310')] -[2023-09-26 08:02:13,762][91478] Fps is (10 sec: 6143.9, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 5521408. Throughput: 0: 785.5, 1: 786.4. Samples: 1378859. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:02:13,763][91478] Avg episode reward: [(0, '6.570'), (1, '6.350')] -[2023-09-26 08:02:18,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 5554176. Throughput: 0: 790.2, 1: 791.3. Samples: 1388529. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:02:18,762][91478] Avg episode reward: [(0, '6.730'), (1, '6.540')] -[2023-09-26 08:02:20,714][92473] Updated weights for policy 0, policy_version 10880 (0.0017) -[2023-09-26 08:02:20,714][92474] Updated weights for policy 1, policy_version 10880 (0.0016) -[2023-09-26 08:02:23,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6262.0). Total num frames: 5586944. Throughput: 0: 786.3, 1: 786.3. Samples: 1392896. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:02:23,763][91478] Avg episode reward: [(0, '6.610'), (1, '6.400')] -[2023-09-26 08:02:28,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 5619712. Throughput: 0: 784.6, 1: 784.4. Samples: 1402404. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:02:28,762][91478] Avg episode reward: [(0, '6.510'), (1, '6.430')] -[2023-09-26 08:02:33,738][92474] Updated weights for policy 1, policy_version 11040 (0.0018) -[2023-09-26 08:02:33,738][92473] Updated weights for policy 0, policy_version 11040 (0.0017) -[2023-09-26 08:02:33,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6275.9). Total num frames: 5652480. Throughput: 0: 782.8, 1: 781.1. Samples: 1411813. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:02:33,762][91478] Avg episode reward: [(0, '6.560'), (1, '6.290')] -[2023-09-26 08:02:38,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 5677056. Throughput: 0: 783.8, 1: 785.3. Samples: 1416513. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:02:38,763][91478] Avg episode reward: [(0, '6.600'), (1, '6.390')] -[2023-09-26 08:02:43,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 5709824. Throughput: 0: 786.3, 1: 786.5. Samples: 1425976. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:02:43,762][91478] Avg episode reward: [(0, '6.600'), (1, '6.440')] -[2023-09-26 08:02:46,710][92474] Updated weights for policy 1, policy_version 11200 (0.0016) -[2023-09-26 08:02:46,710][92473] Updated weights for policy 0, policy_version 11200 (0.0016) -[2023-09-26 08:02:48,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 5742592. Throughput: 0: 787.9, 1: 790.0. Samples: 1435648. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-09-26 08:02:48,763][91478] Avg episode reward: [(0, '6.750'), (1, '6.600')] -[2023-09-26 08:02:53,762][91478] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 5775360. Throughput: 0: 785.6, 1: 785.4. Samples: 1440214. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-09-26 08:02:53,763][91478] Avg episode reward: [(0, '6.590'), (1, '6.450')] -[2023-09-26 08:02:58,762][91478] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 5808128. Throughput: 0: 789.3, 1: 788.4. Samples: 1449853. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) -[2023-09-26 08:02:58,762][91478] Avg episode reward: [(0, '6.460'), (1, '6.570')] -[2023-09-26 08:02:59,757][92473] Updated weights for policy 0, policy_version 11360 (0.0018) -[2023-09-26 08:02:59,757][92474] Updated weights for policy 1, policy_version 11360 (0.0017) -[2023-09-26 08:03:03,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6348.8, 300 sec: 6275.9). Total num frames: 5840896. Throughput: 0: 783.6, 1: 783.2. Samples: 1459035. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) -[2023-09-26 08:03:03,763][91478] Avg episode reward: [(0, '6.510'), (1, '6.500')] -[2023-09-26 08:03:08,762][91478] Fps is (10 sec: 5734.2, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 5865472. Throughput: 0: 785.5, 1: 786.0. Samples: 1463616. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:03:08,763][91478] Avg episode reward: [(0, '6.590'), (1, '6.380')] -[2023-09-26 08:03:13,029][92474] Updated weights for policy 1, policy_version 11520 (0.0018) -[2023-09-26 08:03:13,029][92473] Updated weights for policy 0, policy_version 11520 (0.0017) -[2023-09-26 08:03:13,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 5898240. Throughput: 0: 782.1, 1: 782.0. Samples: 1472790. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:03:13,762][91478] Avg episode reward: [(0, '6.570'), (1, '6.290')] -[2023-09-26 08:03:13,772][91993] Saving ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000011520_2949120.pth... -[2023-09-26 08:03:13,772][92345] Saving ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000011520_2949120.pth... -[2023-09-26 08:03:13,807][91993] Removing ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000008592_2199552.pth -[2023-09-26 08:03:13,814][92345] Removing ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000008592_2199552.pth -[2023-09-26 08:03:18,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 5931008. Throughput: 0: 782.8, 1: 783.0. Samples: 1482277. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:03:18,763][91478] Avg episode reward: [(0, '6.580'), (1, '6.530')] -[2023-09-26 08:03:23,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 5963776. Throughput: 0: 781.4, 1: 781.6. Samples: 1486848. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:03:23,763][91478] Avg episode reward: [(0, '6.820'), (1, '6.570')] -[2023-09-26 08:03:26,279][92474] Updated weights for policy 1, policy_version 11680 (0.0017) -[2023-09-26 08:03:26,279][92473] Updated weights for policy 0, policy_version 11680 (0.0015) -[2023-09-26 08:03:28,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). 
Total num frames: 5988352. Throughput: 0: 778.1, 1: 779.4. Samples: 1496063. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:03:28,763][91478] Avg episode reward: [(0, '6.640'), (1, '6.390')] -[2023-09-26 08:03:33,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 6021120. Throughput: 0: 773.7, 1: 773.7. Samples: 1505281. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:03:33,763][91478] Avg episode reward: [(0, '6.630'), (1, '6.470')] -[2023-09-26 08:03:38,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 6053888. Throughput: 0: 776.6, 1: 776.9. Samples: 1510121. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:03:38,762][91478] Avg episode reward: [(0, '6.720'), (1, '6.520')] -[2023-09-26 08:03:39,443][92474] Updated weights for policy 1, policy_version 11840 (0.0016) -[2023-09-26 08:03:39,444][92473] Updated weights for policy 0, policy_version 11840 (0.0017) -[2023-09-26 08:03:43,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 6086656. Throughput: 0: 774.1, 1: 776.1. Samples: 1519616. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) -[2023-09-26 08:03:43,763][91478] Avg episode reward: [(0, '6.820'), (1, '6.550')] -[2023-09-26 08:03:48,762][91478] Fps is (10 sec: 6553.3, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 6119424. Throughput: 0: 778.9, 1: 777.7. Samples: 1529081. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) -[2023-09-26 08:03:48,763][91478] Avg episode reward: [(0, '6.660'), (1, '6.350')] -[2023-09-26 08:03:52,261][92474] Updated weights for policy 1, policy_version 12000 (0.0017) -[2023-09-26 08:03:52,262][92473] Updated weights for policy 0, policy_version 12000 (0.0015) -[2023-09-26 08:03:53,762][91478] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 6152192. Throughput: 0: 780.8, 1: 782.2. Samples: 1533952. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:03:53,762][91478] Avg episode reward: [(0, '6.650'), (1, '6.390')] -[2023-09-26 08:03:58,762][91478] Fps is (10 sec: 5734.7, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 6176768. Throughput: 0: 782.8, 1: 785.6. Samples: 1543369. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:03:58,762][91478] Avg episode reward: [(0, '6.500'), (1, '6.580')] -[2023-09-26 08:04:03,762][91478] Fps is (10 sec: 5734.2, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 6209536. Throughput: 0: 778.2, 1: 779.7. Samples: 1552384. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) -[2023-09-26 08:04:03,763][91478] Avg episode reward: [(0, '6.770'), (1, '6.510')] -[2023-09-26 08:04:05,587][92473] Updated weights for policy 0, policy_version 12160 (0.0014) -[2023-09-26 08:04:05,588][92474] Updated weights for policy 1, policy_version 12160 (0.0018) -[2023-09-26 08:04:08,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 6242304. Throughput: 0: 780.3, 1: 778.8. Samples: 1557011. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) -[2023-09-26 08:04:08,763][91478] Avg episode reward: [(0, '6.470'), (1, '6.440')] -[2023-09-26 08:04:13,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 6275072. Throughput: 0: 784.8, 1: 783.1. Samples: 1566617. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 08:04:13,763][91478] Avg episode reward: [(0, '6.510'), (1, '6.560')] -[2023-09-26 08:04:18,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 6299648. Throughput: 0: 780.2, 1: 779.0. Samples: 1575443. 
Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 08:04:18,763][91478] Avg episode reward: [(0, '6.580'), (1, '6.410')] -[2023-09-26 08:04:18,906][92474] Updated weights for policy 1, policy_version 12320 (0.0018) -[2023-09-26 08:04:18,906][92473] Updated weights for policy 0, policy_version 12320 (0.0014) -[2023-09-26 08:04:23,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 6332416. Throughput: 0: 779.2, 1: 778.9. Samples: 1580237. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-09-26 08:04:23,763][91478] Avg episode reward: [(0, '6.650'), (1, '6.410')] -[2023-09-26 08:04:28,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 6365184. Throughput: 0: 773.8, 1: 773.7. Samples: 1589252. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-09-26 08:04:28,763][91478] Avg episode reward: [(0, '6.640'), (1, '6.530')] -[2023-09-26 08:04:32,081][92474] Updated weights for policy 1, policy_version 12480 (0.0017) -[2023-09-26 08:04:32,081][92473] Updated weights for policy 0, policy_version 12480 (0.0016) -[2023-09-26 08:04:33,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 6397952. Throughput: 0: 775.9, 1: 776.3. Samples: 1598929. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) -[2023-09-26 08:04:33,763][91478] Avg episode reward: [(0, '6.660'), (1, '6.450')] -[2023-09-26 08:04:38,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 6430720. Throughput: 0: 773.7, 1: 773.7. Samples: 1603584. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) -[2023-09-26 08:04:38,763][91478] Avg episode reward: [(0, '6.680'), (1, '6.470')] -[2023-09-26 08:04:43,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 6455296. Throughput: 0: 774.2, 1: 771.1. Samples: 1612905. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:04:43,763][91478] Avg episode reward: [(0, '6.540'), (1, '6.570')] -[2023-09-26 08:04:45,270][92473] Updated weights for policy 0, policy_version 12640 (0.0016) -[2023-09-26 08:04:45,270][92474] Updated weights for policy 1, policy_version 12640 (0.0018) -[2023-09-26 08:04:48,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6144.1, 300 sec: 6248.1). Total num frames: 6488064. Throughput: 0: 773.8, 1: 773.7. Samples: 1622020. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:04:48,762][91478] Avg episode reward: [(0, '6.670'), (1, '6.400')] -[2023-09-26 08:04:53,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 6520832. Throughput: 0: 773.6, 1: 775.1. Samples: 1626704. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 08:04:53,763][91478] Avg episode reward: [(0, '6.790'), (1, '6.630')] -[2023-09-26 08:04:58,647][92474] Updated weights for policy 1, policy_version 12800 (0.0017) -[2023-09-26 08:04:58,647][92473] Updated weights for policy 0, policy_version 12800 (0.0017) -[2023-09-26 08:04:58,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 6553600. Throughput: 0: 770.3, 1: 770.2. Samples: 1635943. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 08:04:58,763][91478] Avg episode reward: [(0, '6.680'), (1, '6.510')] -[2023-09-26 08:05:03,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 6578176. Throughput: 0: 774.0, 1: 774.5. Samples: 1645126. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) -[2023-09-26 08:05:03,763][91478] Avg episode reward: [(0, '6.620'), (1, '6.450')] -[2023-09-26 08:05:08,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 6610944. Throughput: 0: 775.1, 1: 775.6. Samples: 1650016. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) -[2023-09-26 08:05:08,762][91478] Avg episode reward: [(0, '6.660'), (1, '6.550')] -[2023-09-26 08:05:11,865][92473] Updated weights for policy 0, policy_version 12960 (0.0016) -[2023-09-26 08:05:11,865][92474] Updated weights for policy 1, policy_version 12960 (0.0017) -[2023-09-26 08:05:13,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 6643712. Throughput: 0: 774.0, 1: 773.7. Samples: 1658900. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:05:13,762][91478] Avg episode reward: [(0, '6.540'), (1, '6.640')] -[2023-09-26 08:05:13,773][92345] Saving ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000012976_3321856.pth... -[2023-09-26 08:05:13,774][91993] Saving ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000012976_3321856.pth... -[2023-09-26 08:05:13,812][92345] Removing ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000010048_2572288.pth -[2023-09-26 08:05:13,816][91993] Removing ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000010048_2572288.pth -[2023-09-26 08:05:18,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6234.3). Total num frames: 6676480. Throughput: 0: 773.7, 1: 774.4. Samples: 1668591. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:05:18,763][91478] Avg episode reward: [(0, '6.860'), (1, '6.560')] -[2023-09-26 08:05:23,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 6709248. Throughput: 0: 773.8, 1: 773.7. Samples: 1673220. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 08:05:23,763][91478] Avg episode reward: [(0, '6.650'), (1, '6.610')] -[2023-09-26 08:05:24,879][92474] Updated weights for policy 1, policy_version 13120 (0.0016) -[2023-09-26 08:05:24,879][92473] Updated weights for policy 0, policy_version 13120 (0.0017) -[2023-09-26 08:05:28,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). 
Total num frames: 6742016. Throughput: 0: 776.7, 1: 776.7. Samples: 1682808. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 08:05:28,763][91478] Avg episode reward: [(0, '6.620'), (1, '6.560')] -[2023-09-26 08:05:33,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 6766592. Throughput: 0: 778.0, 1: 776.9. Samples: 1691991. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 08:05:33,762][91478] Avg episode reward: [(0, '6.630'), (1, '6.610')] -[2023-09-26 08:05:37,836][92474] Updated weights for policy 1, policy_version 13280 (0.0017) -[2023-09-26 08:05:37,836][92473] Updated weights for policy 0, policy_version 13280 (0.0016) -[2023-09-26 08:05:38,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 6799360. Throughput: 0: 780.6, 1: 778.9. Samples: 1696882. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:05:38,763][91478] Avg episode reward: [(0, '6.700'), (1, '6.570')] -[2023-09-26 08:05:43,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 6832128. Throughput: 0: 780.7, 1: 780.9. Samples: 1706218. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:05:43,762][91478] Avg episode reward: [(0, '6.810'), (1, '6.540')] -[2023-09-26 08:05:48,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 6864896. Throughput: 0: 785.9, 1: 783.6. Samples: 1715756. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-09-26 08:05:48,763][91478] Avg episode reward: [(0, '6.670'), (1, '6.590')] -[2023-09-26 08:05:51,050][92473] Updated weights for policy 0, policy_version 13440 (0.0014) -[2023-09-26 08:05:51,050][92474] Updated weights for policy 1, policy_version 13440 (0.0017) -[2023-09-26 08:05:53,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 6897664. Throughput: 0: 780.6, 1: 781.7. Samples: 1720317. 
Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-09-26 08:05:53,762][91478] Avg episode reward: [(0, '6.540'), (1, '6.500')] -[2023-09-26 08:05:58,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 6922240. Throughput: 0: 786.6, 1: 785.3. Samples: 1729635. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) -[2023-09-26 08:05:58,763][91478] Avg episode reward: [(0, '6.700'), (1, '6.430')] -[2023-09-26 08:06:03,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 6955008. Throughput: 0: 779.2, 1: 779.9. Samples: 1738753. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) -[2023-09-26 08:06:03,763][91478] Avg episode reward: [(0, '6.810'), (1, '6.200')] -[2023-09-26 08:06:04,217][92473] Updated weights for policy 0, policy_version 13600 (0.0017) -[2023-09-26 08:06:04,217][92474] Updated weights for policy 1, policy_version 13600 (0.0017) -[2023-09-26 08:06:08,762][91478] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 6987776. Throughput: 0: 782.3, 1: 780.9. Samples: 1743564. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:06:08,762][91478] Avg episode reward: [(0, '6.840'), (1, '6.330')] -[2023-09-26 08:06:13,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 7020544. Throughput: 0: 780.1, 1: 781.8. Samples: 1753092. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:06:13,763][91478] Avg episode reward: [(0, '6.840'), (1, '6.290')] -[2023-09-26 08:06:17,140][92474] Updated weights for policy 1, policy_version 13760 (0.0016) -[2023-09-26 08:06:17,140][92473] Updated weights for policy 0, policy_version 13760 (0.0017) -[2023-09-26 08:06:18,762][91478] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 7053312. Throughput: 0: 785.3, 1: 785.6. Samples: 1762682. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:06:18,763][91478] Avg episode reward: [(0, '6.720'), (1, '6.400')] -[2023-09-26 08:06:23,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 7086080. Throughput: 0: 783.0, 1: 783.9. Samples: 1767392. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:06:23,763][91478] Avg episode reward: [(0, '6.720'), (1, '6.480')] -[2023-09-26 08:06:28,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 7110656. Throughput: 0: 780.7, 1: 781.1. Samples: 1776498. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:06:28,763][91478] Avg episode reward: [(0, '6.740'), (1, '6.370')] -[2023-09-26 08:06:30,414][92473] Updated weights for policy 0, policy_version 13920 (0.0018) -[2023-09-26 08:06:30,414][92474] Updated weights for policy 1, policy_version 13920 (0.0016) -[2023-09-26 08:06:33,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 7143424. Throughput: 0: 777.4, 1: 780.4. Samples: 1785856. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:06:33,763][91478] Avg episode reward: [(0, '6.600'), (1, '6.590')] -[2023-09-26 08:06:38,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 7176192. Throughput: 0: 775.4, 1: 774.0. Samples: 1790043. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) -[2023-09-26 08:06:38,763][91478] Avg episode reward: [(0, '6.760'), (1, '6.450')] -[2023-09-26 08:06:43,762][91478] Fps is (10 sec: 6144.0, 60 sec: 6212.2, 300 sec: 6234.3). Total num frames: 7204864. Throughput: 0: 777.0, 1: 777.4. Samples: 1799586. 
Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) -[2023-09-26 08:06:43,763][91478] Avg episode reward: [(0, '6.500'), (1, '6.480')] -[2023-09-26 08:06:43,765][92474] Updated weights for policy 1, policy_version 14080 (0.0018) -[2023-09-26 08:06:43,765][92473] Updated weights for policy 0, policy_version 14080 (0.0017) -[2023-09-26 08:06:48,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 7233536. Throughput: 0: 775.1, 1: 773.8. Samples: 1808455. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:06:48,763][91478] Avg episode reward: [(0, '6.450'), (1, '6.500')] -[2023-09-26 08:06:53,762][91478] Fps is (10 sec: 6144.1, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 7266304. Throughput: 0: 775.1, 1: 774.5. Samples: 1813297. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:06:53,762][91478] Avg episode reward: [(0, '6.480'), (1, '6.510')] -[2023-09-26 08:06:56,980][92474] Updated weights for policy 1, policy_version 14240 (0.0018) -[2023-09-26 08:06:56,980][92473] Updated weights for policy 0, policy_version 14240 (0.0018) -[2023-09-26 08:06:58,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6234.3). Total num frames: 7299072. Throughput: 0: 773.6, 1: 773.7. Samples: 1822720. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 08:06:58,762][91478] Avg episode reward: [(0, '6.800'), (1, '6.570')] -[2023-09-26 08:07:03,762][91478] Fps is (10 sec: 6143.9, 60 sec: 6212.3, 300 sec: 6234.3). Total num frames: 7327744. Throughput: 0: 766.0, 1: 765.7. Samples: 1831610. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 08:07:03,763][91478] Avg episode reward: [(0, '6.470'), (1, '6.610')] -[2023-09-26 08:07:08,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 7356416. Throughput: 0: 769.0, 1: 768.2. Samples: 1836562. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 08:07:08,762][91478] Avg episode reward: [(0, '6.570'), (1, '6.460')] -[2023-09-26 08:07:10,356][92474] Updated weights for policy 1, policy_version 14400 (0.0019) -[2023-09-26 08:07:10,356][92473] Updated weights for policy 0, policy_version 14400 (0.0019) -[2023-09-26 08:07:13,762][91478] Fps is (10 sec: 6144.1, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 7389184. Throughput: 0: 767.2, 1: 767.5. Samples: 1845560. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:07:13,762][91478] Avg episode reward: [(0, '6.650'), (1, '6.580')] -[2023-09-26 08:07:13,772][91993] Saving ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000014432_3694592.pth... -[2023-09-26 08:07:13,772][92345] Saving ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000014432_3694592.pth... -[2023-09-26 08:07:13,802][92345] Removing ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000011520_2949120.pth -[2023-09-26 08:07:13,810][91993] Removing ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000011520_2949120.pth -[2023-09-26 08:07:18,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 7421952. Throughput: 0: 769.1, 1: 768.0. Samples: 1855027. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:07:18,763][91478] Avg episode reward: [(0, '6.750'), (1, '6.490')] -[2023-09-26 08:07:23,477][92474] Updated weights for policy 1, policy_version 14560 (0.0019) -[2023-09-26 08:07:23,477][92473] Updated weights for policy 0, policy_version 14560 (0.0017) -[2023-09-26 08:07:23,762][91478] Fps is (10 sec: 6553.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 7454720. Throughput: 0: 772.0, 1: 773.4. Samples: 1859588. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:07:23,763][91478] Avg episode reward: [(0, '6.680'), (1, '6.670')] -[2023-09-26 08:07:28,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). 
Total num frames: 7487488. Throughput: 0: 773.7, 1: 773.4. Samples: 1869203. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:07:28,763][91478] Avg episode reward: [(0, '6.600'), (1, '6.780')] -[2023-09-26 08:07:28,774][92345] Saving new best policy, reward=6.780! -[2023-09-26 08:07:33,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 7512064. Throughput: 0: 778.5, 1: 777.7. Samples: 1878487. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:07:33,763][91478] Avg episode reward: [(0, '6.850'), (1, '6.540')] -[2023-09-26 08:07:36,426][92473] Updated weights for policy 0, policy_version 14720 (0.0017) -[2023-09-26 08:07:36,427][92474] Updated weights for policy 1, policy_version 14720 (0.0019) -[2023-09-26 08:07:38,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 7544832. Throughput: 0: 779.5, 1: 780.2. Samples: 1883486. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:07:38,763][91478] Avg episode reward: [(0, '6.970'), (1, '6.510')] -[2023-09-26 08:07:38,764][91993] Saving new best policy, reward=6.970! -[2023-09-26 08:07:43,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6212.3, 300 sec: 6220.4). Total num frames: 7577600. Throughput: 0: 778.0, 1: 776.5. Samples: 1892673. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:07:43,763][91478] Avg episode reward: [(0, '6.820'), (1, '6.850')] -[2023-09-26 08:07:43,775][92345] Saving new best policy, reward=6.850! -[2023-09-26 08:07:48,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 7610368. Throughput: 0: 786.0, 1: 786.8. Samples: 1902387. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:07:48,762][91478] Avg episode reward: [(0, '6.680'), (1, '6.590')] -[2023-09-26 08:07:49,566][92474] Updated weights for policy 1, policy_version 14880 (0.0014) -[2023-09-26 08:07:49,566][92473] Updated weights for policy 0, policy_version 14880 (0.0019) -[2023-09-26 08:07:53,762][91478] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 7643136. Throughput: 0: 778.7, 1: 780.0. Samples: 1906703. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:07:53,762][91478] Avg episode reward: [(0, '6.540'), (1, '6.570')] -[2023-09-26 08:07:58,762][91478] Fps is (10 sec: 6143.9, 60 sec: 6212.3, 300 sec: 6206.5). Total num frames: 7671808. Throughput: 0: 784.1, 1: 785.5. Samples: 1916194. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-09-26 08:07:58,763][91478] Avg episode reward: [(0, '6.760'), (1, '6.790')] -[2023-09-26 08:08:02,795][92474] Updated weights for policy 1, policy_version 15040 (0.0019) -[2023-09-26 08:08:02,795][92473] Updated weights for policy 0, policy_version 15040 (0.0018) -[2023-09-26 08:08:03,762][91478] Fps is (10 sec: 5734.2, 60 sec: 6212.3, 300 sec: 6220.4). Total num frames: 7700480. Throughput: 0: 780.0, 1: 779.7. Samples: 1925217. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-09-26 08:08:03,763][91478] Avg episode reward: [(0, '6.880'), (1, '6.610')] -[2023-09-26 08:08:08,762][91478] Fps is (10 sec: 6144.0, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 7733248. Throughput: 0: 780.8, 1: 779.5. Samples: 1929805. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:08:08,763][91478] Avg episode reward: [(0, '6.690'), (1, '6.470')] -[2023-09-26 08:08:13,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 7766016. Throughput: 0: 779.5, 1: 776.1. Samples: 1939208. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:08:13,763][91478] Avg episode reward: [(0, '6.580'), (1, '6.640')] -[2023-09-26 08:08:16,131][92474] Updated weights for policy 1, policy_version 15200 (0.0017) -[2023-09-26 08:08:16,131][92473] Updated weights for policy 0, policy_version 15200 (0.0018) -[2023-09-26 08:08:18,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 7798784. Throughput: 0: 778.3, 1: 779.0. Samples: 1948565. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) -[2023-09-26 08:08:18,762][91478] Avg episode reward: [(0, '6.530'), (1, '6.790')] -[2023-09-26 08:08:23,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 7823360. Throughput: 0: 778.5, 1: 778.2. Samples: 1953538. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) -[2023-09-26 08:08:23,763][91478] Avg episode reward: [(0, '6.600'), (1, '6.510')] -[2023-09-26 08:08:28,762][91478] Fps is (10 sec: 5734.2, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 7856128. Throughput: 0: 778.8, 1: 779.0. Samples: 1962777. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) -[2023-09-26 08:08:28,763][91478] Avg episode reward: [(0, '6.730'), (1, '6.620')] -[2023-09-26 08:08:29,004][92473] Updated weights for policy 0, policy_version 15360 (0.0016) -[2023-09-26 08:08:29,004][92474] Updated weights for policy 1, policy_version 15360 (0.0015) -[2023-09-26 08:08:33,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 7888896. Throughput: 0: 775.7, 1: 776.3. Samples: 1972224. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:08:33,763][91478] Avg episode reward: [(0, '6.730'), (1, '6.610')] -[2023-09-26 08:08:38,762][91478] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 7921664. Throughput: 0: 777.2, 1: 776.3. Samples: 1976612. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:08:38,762][91478] Avg episode reward: [(0, '6.620'), (1, '6.700')] -[2023-09-26 08:08:42,197][92474] Updated weights for policy 1, policy_version 15520 (0.0016) -[2023-09-26 08:08:42,197][92473] Updated weights for policy 0, policy_version 15520 (0.0015) -[2023-09-26 08:08:43,762][91478] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 7954432. Throughput: 0: 782.0, 1: 778.2. Samples: 1986405. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 08:08:43,764][91478] Avg episode reward: [(0, '6.750'), (1, '6.740')] -[2023-09-26 08:08:48,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 7987200. Throughput: 0: 781.3, 1: 781.4. Samples: 1995539. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 08:08:48,763][91478] Avg episode reward: [(0, '6.750'), (1, '6.900')] -[2023-09-26 08:08:48,764][92345] Saving new best policy, reward=6.900! -[2023-09-26 08:08:53,762][91478] Fps is (10 sec: 5734.6, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 8011776. Throughput: 0: 783.4, 1: 784.2. Samples: 2000349. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 08:08:53,763][91478] Avg episode reward: [(0, '6.760'), (1, '6.830')] -[2023-09-26 08:08:55,390][92473] Updated weights for policy 0, policy_version 15680 (0.0018) -[2023-09-26 08:08:55,390][92474] Updated weights for policy 1, policy_version 15680 (0.0018) -[2023-09-26 08:08:58,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6212.2, 300 sec: 6220.4). Total num frames: 8044544. Throughput: 0: 778.1, 1: 781.8. Samples: 2009402. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-09-26 08:08:58,763][91478] Avg episode reward: [(0, '6.750'), (1, '6.930')] -[2023-09-26 08:08:58,775][92345] Saving new best policy, reward=6.930! -[2023-09-26 08:09:03,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 8077312. Throughput: 0: 780.6, 1: 782.6. 
Samples: 2018907. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-09-26 08:09:03,763][91478] Avg episode reward: [(0, '6.740'), (1, '6.690')] -[2023-09-26 08:09:08,544][92473] Updated weights for policy 0, policy_version 15840 (0.0018) -[2023-09-26 08:09:08,545][92474] Updated weights for policy 1, policy_version 15840 (0.0018) -[2023-09-26 08:09:08,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 8110080. Throughput: 0: 775.8, 1: 777.3. Samples: 2023428. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:09:08,763][91478] Avg episode reward: [(0, '6.840'), (1, '6.850')] -[2023-09-26 08:09:13,762][91478] Fps is (10 sec: 6143.8, 60 sec: 6212.3, 300 sec: 6234.2). Total num frames: 8138752. Throughput: 0: 780.5, 1: 780.4. Samples: 2033017. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:09:13,763][91478] Avg episode reward: [(0, '6.830'), (1, '7.180')] -[2023-09-26 08:09:13,778][92345] Saving ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000015904_4071424.pth... -[2023-09-26 08:09:13,809][92345] Removing ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000012976_3321856.pth -[2023-09-26 08:09:13,812][92345] Saving new best policy, reward=7.180! -[2023-09-26 08:09:13,813][91993] Saving ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000015904_4071424.pth... -[2023-09-26 08:09:13,842][91993] Removing ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000012976_3321856.pth -[2023-09-26 08:09:18,762][91478] Fps is (10 sec: 5734.6, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 8167424. Throughput: 0: 775.2, 1: 773.9. Samples: 2041936. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) -[2023-09-26 08:09:18,762][91478] Avg episode reward: [(0, '6.730'), (1, '6.740')] -[2023-09-26 08:09:21,724][92473] Updated weights for policy 0, policy_version 16000 (0.0019) -[2023-09-26 08:09:21,724][92474] Updated weights for policy 1, policy_version 16000 (0.0018) -[2023-09-26 08:09:23,762][91478] Fps is (10 sec: 6144.1, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 8200192. Throughput: 0: 779.1, 1: 778.8. Samples: 2046719. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) -[2023-09-26 08:09:23,763][91478] Avg episode reward: [(0, '6.520'), (1, '6.590')] -[2023-09-26 08:09:28,762][91478] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 8232960. Throughput: 0: 773.7, 1: 777.1. Samples: 2056192. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) -[2023-09-26 08:09:28,763][91478] Avg episode reward: [(0, '6.560'), (1, '6.570')] -[2023-09-26 08:09:33,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 8265728. Throughput: 0: 778.4, 1: 778.3. Samples: 2065589. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:09:33,763][91478] Avg episode reward: [(0, '6.440'), (1, '6.740')] -[2023-09-26 08:09:34,835][92473] Updated weights for policy 0, policy_version 16160 (0.0017) -[2023-09-26 08:09:34,836][92474] Updated weights for policy 1, policy_version 16160 (0.0019) -[2023-09-26 08:09:38,762][91478] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8298496. Throughput: 0: 779.5, 1: 778.5. Samples: 2070457. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:09:38,762][91478] Avg episode reward: [(0, '6.580'), (1, '6.500')] -[2023-09-26 08:09:43,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 8323072. Throughput: 0: 780.6, 1: 780.7. Samples: 2079661. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:09:43,763][91478] Avg episode reward: [(0, '6.730'), (1, '6.510')] -[2023-09-26 08:09:47,769][92474] Updated weights for policy 1, policy_version 16320 (0.0014) -[2023-09-26 08:09:47,770][92473] Updated weights for policy 0, policy_version 16320 (0.0015) -[2023-09-26 08:09:48,762][91478] Fps is (10 sec: 5734.2, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 8355840. Throughput: 0: 780.9, 1: 779.1. Samples: 2089110. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:09:48,763][91478] Avg episode reward: [(0, '6.660'), (1, '6.690')] -[2023-09-26 08:09:53,762][91478] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 8388608. Throughput: 0: 783.7, 1: 782.5. Samples: 2093909. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:09:53,762][91478] Avg episode reward: [(0, '6.480'), (1, '6.720')] -[2023-09-26 08:09:58,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8421376. Throughput: 0: 780.2, 1: 781.6. Samples: 2103300. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:09:58,763][91478] Avg episode reward: [(0, '6.460'), (1, '6.670')] -[2023-09-26 08:10:00,828][92473] Updated weights for policy 0, policy_version 16480 (0.0016) -[2023-09-26 08:10:00,828][92474] Updated weights for policy 1, policy_version 16480 (0.0017) -[2023-09-26 08:10:03,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8454144. Throughput: 0: 785.6, 1: 786.7. Samples: 2112692. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:10:03,762][91478] Avg episode reward: [(0, '6.620'), (1, '6.820')] -[2023-09-26 08:10:08,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 8478720. Throughput: 0: 787.0, 1: 786.3. Samples: 2117519. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:10:08,763][91478] Avg episode reward: [(0, '6.630'), (1, '6.630')] -[2023-09-26 08:10:13,762][91478] Fps is (10 sec: 5734.2, 60 sec: 6212.3, 300 sec: 6220.4). Total num frames: 8511488. Throughput: 0: 784.8, 1: 783.4. Samples: 2126760. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:10:13,763][91478] Avg episode reward: [(0, '6.610'), (1, '6.830')] -[2023-09-26 08:10:13,970][92474] Updated weights for policy 1, policy_version 16640 (0.0019) -[2023-09-26 08:10:13,970][92473] Updated weights for policy 0, policy_version 16640 (0.0019) -[2023-09-26 08:10:18,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 8544256. Throughput: 0: 783.0, 1: 783.8. Samples: 2136097. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:10:18,763][91478] Avg episode reward: [(0, '6.520'), (1, '6.420')] -[2023-09-26 08:10:23,762][91478] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 8577024. Throughput: 0: 779.7, 1: 780.0. Samples: 2140640. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:10:23,762][91478] Avg episode reward: [(0, '6.460'), (1, '6.440')] -[2023-09-26 08:10:27,053][92473] Updated weights for policy 0, policy_version 16800 (0.0016) -[2023-09-26 08:10:27,053][92474] Updated weights for policy 1, policy_version 16800 (0.0016) -[2023-09-26 08:10:28,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8609792. Throughput: 0: 785.4, 1: 786.5. Samples: 2150400. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:10:28,763][91478] Avg episode reward: [(0, '6.570'), (1, '6.650')] -[2023-09-26 08:10:33,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8642560. Throughput: 0: 784.2, 1: 785.1. Samples: 2159726. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 08:10:33,763][91478] Avg episode reward: [(0, '6.580'), (1, '6.710')] -[2023-09-26 08:10:38,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 8667136. Throughput: 0: 784.3, 1: 782.1. Samples: 2164397. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 08:10:38,763][91478] Avg episode reward: [(0, '6.650'), (1, '6.480')] -[2023-09-26 08:10:40,213][92473] Updated weights for policy 0, policy_version 16960 (0.0016) -[2023-09-26 08:10:40,214][92474] Updated weights for policy 1, policy_version 16960 (0.0016) -[2023-09-26 08:10:43,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 8699904. Throughput: 0: 782.3, 1: 780.1. Samples: 2173605. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 08:10:43,763][91478] Avg episode reward: [(0, '6.620'), (1, '6.520')] -[2023-09-26 08:10:48,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 8732672. Throughput: 0: 783.0, 1: 783.2. Samples: 2183171. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:10:48,762][91478] Avg episode reward: [(0, '6.540'), (1, '6.460')] -[2023-09-26 08:10:53,183][92474] Updated weights for policy 1, policy_version 17120 (0.0017) -[2023-09-26 08:10:53,183][92473] Updated weights for policy 0, policy_version 17120 (0.0017) -[2023-09-26 08:10:53,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8765440. Throughput: 0: 779.7, 1: 780.5. Samples: 2187729. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:10:53,763][91478] Avg episode reward: [(0, '6.640'), (1, '6.180')] -[2023-09-26 08:10:58,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8798208. Throughput: 0: 785.3, 1: 786.8. Samples: 2197504. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:10:58,762][91478] Avg episode reward: [(0, '6.710'), (1, '6.040')] -[2023-09-26 08:11:03,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8830976. Throughput: 0: 788.8, 1: 787.8. Samples: 2207045. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:11:03,763][91478] Avg episode reward: [(0, '6.760'), (1, '6.190')] -[2023-09-26 08:11:06,239][92473] Updated weights for policy 0, policy_version 17280 (0.0017) -[2023-09-26 08:11:06,239][92474] Updated weights for policy 1, policy_version 17280 (0.0016) -[2023-09-26 08:11:08,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 8855552. Throughput: 0: 788.2, 1: 790.8. Samples: 2211694. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:11:08,763][91478] Avg episode reward: [(0, '6.760'), (1, '6.290')] -[2023-09-26 08:11:13,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 8888320. Throughput: 0: 783.9, 1: 782.5. Samples: 2220890. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:11:13,762][91478] Avg episode reward: [(0, '6.590'), (1, '6.640')] -[2023-09-26 08:11:13,901][92345] Saving ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000017376_4448256.pth... -[2023-09-26 08:11:13,927][92345] Removing ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000014432_3694592.pth -[2023-09-26 08:11:13,930][91993] Saving ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000017376_4448256.pth... -[2023-09-26 08:11:13,963][91993] Removing ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000014432_3694592.pth -[2023-09-26 08:11:18,762][91478] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 8921088. Throughput: 0: 784.2, 1: 784.1. Samples: 2230298. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:11:18,762][91478] Avg episode reward: [(0, '6.650'), (1, '6.540')] -[2023-09-26 08:11:19,135][92474] Updated weights for policy 1, policy_version 17440 (0.0018) -[2023-09-26 08:11:19,135][92473] Updated weights for policy 0, policy_version 17440 (0.0015) -[2023-09-26 08:11:23,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8953856. Throughput: 0: 784.6, 1: 786.1. Samples: 2235076. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:11:23,763][91478] Avg episode reward: [(0, '6.550'), (1, '6.320')] -[2023-09-26 08:11:28,762][91478] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8986624. Throughput: 0: 789.4, 1: 790.3. Samples: 2244689. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-09-26 08:11:28,763][91478] Avg episode reward: [(0, '6.650'), (1, '6.470')] -[2023-09-26 08:11:32,096][92474] Updated weights for policy 1, policy_version 17600 (0.0016) -[2023-09-26 08:11:32,096][92473] Updated weights for policy 0, policy_version 17600 (0.0016) -[2023-09-26 08:11:33,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 9019392. Throughput: 0: 789.8, 1: 788.9. Samples: 2254215. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-09-26 08:11:33,762][91478] Avg episode reward: [(0, '6.600'), (1, '6.610')] -[2023-09-26 08:11:38,762][91478] Fps is (10 sec: 6553.8, 60 sec: 6417.1, 300 sec: 6262.0). Total num frames: 9052160. Throughput: 0: 790.6, 1: 789.1. Samples: 2258817. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 08:11:38,762][91478] Avg episode reward: [(0, '6.700'), (1, '6.580')] -[2023-09-26 08:11:43,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 9076736. Throughput: 0: 782.5, 1: 781.0. Samples: 2267859. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 08:11:43,762][91478] Avg episode reward: [(0, '6.670'), (1, '6.480')] -[2023-09-26 08:11:45,432][92474] Updated weights for policy 1, policy_version 17760 (0.0018) -[2023-09-26 08:11:45,432][92473] Updated weights for policy 0, policy_version 17760 (0.0019) -[2023-09-26 08:11:48,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 9109504. Throughput: 0: 780.7, 1: 781.6. Samples: 2277348. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 08:11:48,763][91478] Avg episode reward: [(0, '6.780'), (1, '6.520')] -[2023-09-26 08:11:53,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 9142272. Throughput: 0: 779.5, 1: 777.1. Samples: 2281740. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-09-26 08:11:53,762][91478] Avg episode reward: [(0, '6.590'), (1, '6.750')] -[2023-09-26 08:11:58,558][92474] Updated weights for policy 1, policy_version 17920 (0.0014) -[2023-09-26 08:11:58,559][92473] Updated weights for policy 0, policy_version 17920 (0.0018) -[2023-09-26 08:11:58,762][91478] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6262.0). Total num frames: 9175040. Throughput: 0: 785.6, 1: 784.2. Samples: 2291531. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-09-26 08:11:58,762][91478] Avg episode reward: [(0, '6.760'), (1, '6.470')] -[2023-09-26 08:12:03,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 9207808. Throughput: 0: 782.7, 1: 782.5. Samples: 2300732. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 08:12:03,763][91478] Avg episode reward: [(0, '6.700'), (1, '6.460')] -[2023-09-26 08:12:08,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 9232384. Throughput: 0: 783.3, 1: 784.4. Samples: 2305623. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 08:12:08,763][91478] Avg episode reward: [(0, '6.710'), (1, '6.720')] -[2023-09-26 08:12:11,504][92474] Updated weights for policy 1, policy_version 18080 (0.0017) -[2023-09-26 08:12:11,504][92473] Updated weights for policy 0, policy_version 18080 (0.0017) -[2023-09-26 08:12:13,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 9265152. Throughput: 0: 780.5, 1: 780.4. Samples: 2314929. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 08:12:13,763][91478] Avg episode reward: [(0, '6.590'), (1, '6.700')] -[2023-09-26 08:12:18,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 9297920. Throughput: 0: 780.2, 1: 780.2. Samples: 2324433. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 08:12:18,763][91478] Avg episode reward: [(0, '6.700'), (1, '6.660')] -[2023-09-26 08:12:23,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 9330688. Throughput: 0: 778.1, 1: 779.6. Samples: 2328914. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 08:12:23,763][91478] Avg episode reward: [(0, '6.790'), (1, '6.790')] -[2023-09-26 08:12:24,548][92474] Updated weights for policy 1, policy_version 18240 (0.0015) -[2023-09-26 08:12:24,548][92473] Updated weights for policy 0, policy_version 18240 (0.0015) -[2023-09-26 08:12:28,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 9363456. Throughput: 0: 785.4, 1: 787.9. Samples: 2338656. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 08:12:28,762][91478] Avg episode reward: [(0, '6.690'), (1, '6.620')] -[2023-09-26 08:12:33,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 9388032. Throughput: 0: 780.0, 1: 779.1. Samples: 2347510. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-09-26 08:12:33,763][91478] Avg episode reward: [(0, '6.590'), (1, '6.630')] -[2023-09-26 08:12:37,845][92474] Updated weights for policy 1, policy_version 18400 (0.0018) -[2023-09-26 08:12:37,846][92473] Updated weights for policy 0, policy_version 18400 (0.0019) -[2023-09-26 08:12:38,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 9420800. Throughput: 0: 784.3, 1: 784.0. Samples: 2352314. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-09-26 08:12:38,763][91478] Avg episode reward: [(0, '6.570'), (1, '6.780')] -[2023-09-26 08:12:43,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 9453568. Throughput: 0: 777.4, 1: 778.8. Samples: 2361561. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) -[2023-09-26 08:12:43,762][91478] Avg episode reward: [(0, '6.540'), (1, '6.750')] -[2023-09-26 08:12:48,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 9486336. Throughput: 0: 783.0, 1: 782.1. Samples: 2371161. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) -[2023-09-26 08:12:48,762][91478] Avg episode reward: [(0, '6.850'), (1, '6.700')] -[2023-09-26 08:12:51,079][92474] Updated weights for policy 1, policy_version 18560 (0.0019) -[2023-09-26 08:12:51,080][92473] Updated weights for policy 0, policy_version 18560 (0.0019) -[2023-09-26 08:12:53,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6262.0). Total num frames: 9519104. Throughput: 0: 778.0, 1: 778.2. Samples: 2375652. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) -[2023-09-26 08:12:53,762][91478] Avg episode reward: [(0, '6.830'), (1, '6.800')] -[2023-09-26 08:12:58,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 9543680. Throughput: 0: 774.8, 1: 775.5. Samples: 2384691. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 08:12:58,763][91478] Avg episode reward: [(0, '6.510'), (1, '6.620')] -[2023-09-26 08:13:03,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 9576448. Throughput: 0: 775.8, 1: 775.4. Samples: 2394239. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 08:13:03,763][91478] Avg episode reward: [(0, '6.510'), (1, '6.770')] -[2023-09-26 08:13:04,055][92474] Updated weights for policy 1, policy_version 18720 (0.0015) -[2023-09-26 08:13:04,056][92473] Updated weights for policy 0, policy_version 18720 (0.0017) -[2023-09-26 08:13:08,762][91478] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 9609216. Throughput: 0: 781.0, 1: 780.8. Samples: 2399196. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 08:13:08,762][91478] Avg episode reward: [(0, '6.530'), (1, '6.750')] -[2023-09-26 08:13:13,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 9641984. Throughput: 0: 777.1, 1: 775.0. Samples: 2408504. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 08:13:13,763][91478] Avg episode reward: [(0, '6.510'), (1, '6.220')] -[2023-09-26 08:13:13,776][91993] Saving ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000018832_4820992.pth... -[2023-09-26 08:13:13,776][92345] Saving ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000018832_4820992.pth... -[2023-09-26 08:13:13,811][91993] Removing ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000015904_4071424.pth -[2023-09-26 08:13:13,812][92345] Removing ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000015904_4071424.pth -[2023-09-26 08:13:17,045][92474] Updated weights for policy 1, policy_version 18880 (0.0019) -[2023-09-26 08:13:17,045][92473] Updated weights for policy 0, policy_version 18880 (0.0019) -[2023-09-26 08:13:18,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 9674752. 
Throughput: 0: 784.6, 1: 784.8. Samples: 2418132. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 08:13:18,762][91478] Avg episode reward: [(0, '6.760'), (1, '6.640')] -[2023-09-26 08:13:23,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 9707520. Throughput: 0: 782.3, 1: 783.7. Samples: 2422785. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:13:23,763][91478] Avg episode reward: [(0, '6.610'), (1, '6.500')] -[2023-09-26 08:13:28,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 9732096. Throughput: 0: 782.0, 1: 782.2. Samples: 2431950. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:13:28,763][91478] Avg episode reward: [(0, '6.620'), (1, '6.470')] -[2023-09-26 08:13:30,193][92473] Updated weights for policy 0, policy_version 19040 (0.0018) -[2023-09-26 08:13:30,193][92474] Updated weights for policy 1, policy_version 19040 (0.0016) -[2023-09-26 08:13:33,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 9764864. Throughput: 0: 778.7, 1: 779.3. Samples: 2441270. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:13:33,762][91478] Avg episode reward: [(0, '6.740'), (1, '6.790')] -[2023-09-26 08:13:38,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 9797632. Throughput: 0: 782.7, 1: 781.9. Samples: 2446060. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 08:13:38,763][91478] Avg episode reward: [(0, '6.720'), (1, '6.740')] -[2023-09-26 08:13:43,274][92474] Updated weights for policy 1, policy_version 19200 (0.0017) -[2023-09-26 08:13:43,274][92473] Updated weights for policy 0, policy_version 19200 (0.0016) -[2023-09-26 08:13:43,762][91478] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 9830400. Throughput: 0: 787.0, 1: 787.7. Samples: 2455552. 
Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 08:13:43,763][91478] Avg episode reward: [(0, '6.670'), (1, '6.690')] -[2023-09-26 08:13:48,762][91478] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 9863168. Throughput: 0: 786.4, 1: 786.3. Samples: 2465008. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 08:13:48,762][91478] Avg episode reward: [(0, '6.740'), (1, '6.570')] -[2023-09-26 08:13:53,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 9895936. Throughput: 0: 784.7, 1: 786.2. Samples: 2469884. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-09-26 08:13:53,763][91478] Avg episode reward: [(0, '6.780'), (1, '6.480')] -[2023-09-26 08:13:56,342][92473] Updated weights for policy 0, policy_version 19360 (0.0017) -[2023-09-26 08:13:56,342][92474] Updated weights for policy 1, policy_version 19360 (0.0016) -[2023-09-26 08:13:58,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 9920512. Throughput: 0: 784.5, 1: 783.3. Samples: 2479057. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-09-26 08:13:58,763][91478] Avg episode reward: [(0, '6.500'), (1, '6.280')] -[2023-09-26 08:14:03,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 9953280. Throughput: 0: 781.6, 1: 781.2. Samples: 2488458. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-09-26 08:14:03,763][91478] Avg episode reward: [(0, '6.570'), (1, '6.100')] -[2023-09-26 08:14:08,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6262.0). Total num frames: 9986048. Throughput: 0: 785.3, 1: 783.9. Samples: 2493399. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 08:14:08,762][91478] Avg episode reward: [(0, '6.720'), (1, '6.550')] -[2023-09-26 08:14:09,217][92474] Updated weights for policy 1, policy_version 19520 (0.0016) -[2023-09-26 08:14:09,219][92473] Updated weights for policy 0, policy_version 19520 (0.0016) -[2023-09-26 08:14:13,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 10018816. Throughput: 0: 786.5, 1: 786.4. Samples: 2502732. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 08:14:13,763][91478] Avg episode reward: [(0, '6.940'), (1, '6.390')] -[2023-09-26 08:14:18,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 10051584. Throughput: 0: 794.8, 1: 793.9. Samples: 2512759. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 08:14:18,762][91478] Avg episode reward: [(0, '6.600'), (1, '6.620')] -[2023-09-26 08:14:21,998][92474] Updated weights for policy 1, policy_version 19680 (0.0017) -[2023-09-26 08:14:21,998][92473] Updated weights for policy 0, policy_version 19680 (0.0015) -[2023-09-26 08:14:23,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 10084352. Throughput: 0: 790.2, 1: 790.3. Samples: 2517186. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:14:23,763][91478] Avg episode reward: [(0, '6.560'), (1, '6.870')] -[2023-09-26 08:14:28,762][91478] Fps is (10 sec: 6553.4, 60 sec: 6417.1, 300 sec: 6275.9). Total num frames: 10117120. Throughput: 0: 794.8, 1: 793.1. Samples: 2527007. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:14:28,763][91478] Avg episode reward: [(0, '6.690'), (1, '6.780')] -[2023-09-26 08:14:33,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 10141696. Throughput: 0: 789.8, 1: 789.4. Samples: 2536071. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:14:33,762][91478] Avg episode reward: [(0, '6.710'), (1, '6.930')] -[2023-09-26 08:14:35,068][92474] Updated weights for policy 1, policy_version 19840 (0.0016) -[2023-09-26 08:14:35,068][92473] Updated weights for policy 0, policy_version 19840 (0.0017) -[2023-09-26 08:14:38,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 10174464. Throughput: 0: 791.9, 1: 790.4. Samples: 2541085. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 08:14:38,762][91478] Avg episode reward: [(0, '6.610'), (1, '6.800')] -[2023-09-26 08:14:43,762][91478] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 10207232. Throughput: 0: 786.3, 1: 787.2. Samples: 2549864. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 08:14:43,763][91478] Avg episode reward: [(0, '6.570'), (1, '6.840')] -[2023-09-26 08:14:48,226][92474] Updated weights for policy 1, policy_version 20000 (0.0018) -[2023-09-26 08:14:48,226][92473] Updated weights for policy 0, policy_version 20000 (0.0017) -[2023-09-26 08:14:48,762][91478] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 10240000. Throughput: 0: 791.8, 1: 792.0. Samples: 2559726. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 08:14:48,763][91478] Avg episode reward: [(0, '6.950'), (1, '6.710')] -[2023-09-26 08:14:53,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 10272768. Throughput: 0: 787.2, 1: 787.2. Samples: 2564248. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:14:53,763][91478] Avg episode reward: [(0, '6.850'), (1, '6.600')] -[2023-09-26 08:14:58,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6275.9). Total num frames: 10305536. Throughput: 0: 791.3, 1: 791.9. Samples: 2573978. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:14:58,763][91478] Avg episode reward: [(0, '6.740'), (1, '6.740')] -[2023-09-26 08:15:01,182][92474] Updated weights for policy 1, policy_version 20160 (0.0017) -[2023-09-26 08:15:01,182][92473] Updated weights for policy 0, policy_version 20160 (0.0018) -[2023-09-26 08:15:03,762][91478] Fps is (10 sec: 6143.9, 60 sec: 6348.8, 300 sec: 6289.8). Total num frames: 10334208. Throughput: 0: 782.6, 1: 784.1. Samples: 2583262. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:15:03,763][91478] Avg episode reward: [(0, '6.740'), (1, '6.710')] -[2023-09-26 08:15:08,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 10362880. Throughput: 0: 784.6, 1: 783.4. Samples: 2587749. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:15:08,763][91478] Avg episode reward: [(0, '6.610'), (1, '6.820')] -[2023-09-26 08:15:13,762][91478] Fps is (10 sec: 6144.0, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 10395648. Throughput: 0: 777.8, 1: 778.4. Samples: 2597037. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:15:13,763][91478] Avg episode reward: [(0, '6.710'), (1, '6.690')] -[2023-09-26 08:15:13,773][91993] Saving ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000020304_5197824.pth... -[2023-09-26 08:15:13,774][92345] Saving ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000020304_5197824.pth... -[2023-09-26 08:15:13,812][92345] Removing ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000017376_4448256.pth -[2023-09-26 08:15:13,814][91993] Removing ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000017376_4448256.pth -[2023-09-26 08:15:14,371][92474] Updated weights for policy 1, policy_version 20320 (0.0016) -[2023-09-26 08:15:14,372][92473] Updated weights for policy 0, policy_version 20320 (0.0017) -[2023-09-26 08:15:18,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). 
Total num frames: 10428416. Throughput: 0: 786.2, 1: 787.4. Samples: 2606887. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) -[2023-09-26 08:15:18,762][91478] Avg episode reward: [(0, '6.740'), (1, '6.870')] -[2023-09-26 08:15:23,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 10461184. Throughput: 0: 779.5, 1: 780.0. Samples: 2611261. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) -[2023-09-26 08:15:23,763][91478] Avg episode reward: [(0, '6.660'), (1, '6.670')] -[2023-09-26 08:15:27,236][92473] Updated weights for policy 0, policy_version 20480 (0.0017) -[2023-09-26 08:15:27,236][92474] Updated weights for policy 1, policy_version 20480 (0.0018) -[2023-09-26 08:15:28,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 10493952. Throughput: 0: 792.4, 1: 792.0. Samples: 2621162. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) -[2023-09-26 08:15:28,763][91478] Avg episode reward: [(0, '6.770'), (1, '6.850')] -[2023-09-26 08:15:33,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6417.0, 300 sec: 6303.7). Total num frames: 10526720. Throughput: 0: 788.6, 1: 787.8. Samples: 2630664. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:15:33,763][91478] Avg episode reward: [(0, '6.630'), (1, '6.900')] -[2023-09-26 08:15:38,762][91478] Fps is (10 sec: 6144.0, 60 sec: 6348.8, 300 sec: 6289.8). Total num frames: 10555392. Throughput: 0: 792.0, 1: 792.5. Samples: 2635551. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:15:38,763][91478] Avg episode reward: [(0, '6.710'), (1, '6.970')] -[2023-09-26 08:15:40,094][92473] Updated weights for policy 0, policy_version 20640 (0.0015) -[2023-09-26 08:15:40,094][92474] Updated weights for policy 1, policy_version 20640 (0.0017) -[2023-09-26 08:15:43,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 10584064. Throughput: 0: 786.1, 1: 785.6. Samples: 2644705. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:15:43,763][91478] Avg episode reward: [(0, '6.710'), (1, '6.940')] -[2023-09-26 08:15:48,762][91478] Fps is (10 sec: 6144.1, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 10616832. Throughput: 0: 788.1, 1: 788.1. Samples: 2654190. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:15:48,762][91478] Avg episode reward: [(0, '6.670'), (1, '6.810')] -[2023-09-26 08:15:53,312][92474] Updated weights for policy 1, policy_version 20800 (0.0016) -[2023-09-26 08:15:53,313][92473] Updated weights for policy 0, policy_version 20800 (0.0017) -[2023-09-26 08:15:53,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 10649600. Throughput: 0: 786.5, 1: 787.4. Samples: 2658574. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:15:53,763][91478] Avg episode reward: [(0, '6.460'), (1, '7.050')] -[2023-09-26 08:15:58,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 10682368. Throughput: 0: 791.8, 1: 792.4. Samples: 2668324. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:15:58,762][91478] Avg episode reward: [(0, '6.750'), (1, '6.760')] -[2023-09-26 08:16:03,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6212.3, 300 sec: 6275.9). Total num frames: 10706944. Throughput: 0: 782.8, 1: 782.7. Samples: 2677336. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 08:16:03,763][91478] Avg episode reward: [(0, '6.650'), (1, '6.710')] -[2023-09-26 08:16:06,578][92473] Updated weights for policy 0, policy_version 20960 (0.0015) -[2023-09-26 08:16:06,578][92474] Updated weights for policy 1, policy_version 20960 (0.0017) -[2023-09-26 08:16:08,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 10739712. Throughput: 0: 786.8, 1: 786.8. Samples: 2682075. 
Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 08:16:08,762][91478] Avg episode reward: [(0, '6.690'), (1, '6.780')] -[2023-09-26 08:16:13,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 10772480. Throughput: 0: 776.1, 1: 777.6. Samples: 2691078. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 08:16:13,763][91478] Avg episode reward: [(0, '6.830'), (1, '6.660')] -[2023-09-26 08:16:18,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 10805248. Throughput: 0: 781.9, 1: 783.0. Samples: 2701088. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:16:18,763][91478] Avg episode reward: [(0, '6.750'), (1, '6.950')] -[2023-09-26 08:16:19,475][92473] Updated weights for policy 0, policy_version 21120 (0.0019) -[2023-09-26 08:16:19,476][92474] Updated weights for policy 1, policy_version 21120 (0.0018) -[2023-09-26 08:16:23,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 10838016. Throughput: 0: 778.3, 1: 778.1. Samples: 2705590. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:16:23,763][91478] Avg episode reward: [(0, '6.750'), (1, '6.770')] -[2023-09-26 08:16:28,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 10870784. Throughput: 0: 781.7, 1: 782.6. Samples: 2715098. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:16:28,763][91478] Avg episode reward: [(0, '6.750'), (1, '6.910')] -[2023-09-26 08:16:32,591][92474] Updated weights for policy 1, policy_version 21280 (0.0017) -[2023-09-26 08:16:32,592][92473] Updated weights for policy 0, policy_version 21280 (0.0017) -[2023-09-26 08:16:33,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 10895360. Throughput: 0: 779.6, 1: 778.3. Samples: 2724292. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:16:33,763][91478] Avg episode reward: [(0, '6.860'), (1, '6.740')] -[2023-09-26 08:16:38,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6212.3, 300 sec: 6275.9). Total num frames: 10928128. Throughput: 0: 783.3, 1: 784.5. Samples: 2729128. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:16:38,763][91478] Avg episode reward: [(0, '6.720'), (1, '6.570')] -[2023-09-26 08:16:43,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 10960896. Throughput: 0: 776.0, 1: 776.4. Samples: 2738182. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:16:43,762][91478] Avg episode reward: [(0, '6.720'), (1, '6.910')] -[2023-09-26 08:16:46,059][92474] Updated weights for policy 1, policy_version 21440 (0.0019) -[2023-09-26 08:16:46,059][92473] Updated weights for policy 0, policy_version 21440 (0.0015) -[2023-09-26 08:16:48,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 10993664. Throughput: 0: 778.5, 1: 776.2. Samples: 2747295. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:16:48,763][91478] Avg episode reward: [(0, '6.680'), (1, '6.720')] -[2023-09-26 08:16:53,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 11018240. Throughput: 0: 774.4, 1: 776.7. Samples: 2751871. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:16:53,763][91478] Avg episode reward: [(0, '6.680'), (1, '6.810')] -[2023-09-26 08:16:58,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 11051008. Throughput: 0: 782.1, 1: 780.5. Samples: 2761394. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:16:58,762][91478] Avg episode reward: [(0, '6.820'), (1, '6.800')] -[2023-09-26 08:16:59,071][92474] Updated weights for policy 1, policy_version 21600 (0.0016) -[2023-09-26 08:16:59,071][92473] Updated weights for policy 0, policy_version 21600 (0.0017) -[2023-09-26 08:17:03,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 11083776. Throughput: 0: 775.6, 1: 776.7. Samples: 2770944. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-09-26 08:17:03,762][91478] Avg episode reward: [(0, '6.640'), (1, '6.670')] -[2023-09-26 08:17:08,762][91478] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 11116544. Throughput: 0: 777.7, 1: 777.2. Samples: 2775559. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-09-26 08:17:08,763][91478] Avg episode reward: [(0, '6.590'), (1, '6.870')] -[2023-09-26 08:17:12,033][92473] Updated weights for policy 0, policy_version 21760 (0.0018) -[2023-09-26 08:17:12,033][92474] Updated weights for policy 1, policy_version 21760 (0.0017) -[2023-09-26 08:17:13,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 11149312. Throughput: 0: 779.6, 1: 780.0. Samples: 2785280. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-09-26 08:17:13,763][91478] Avg episode reward: [(0, '6.700'), (1, '6.810')] -[2023-09-26 08:17:13,773][91993] Saving ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000021776_5574656.pth... -[2023-09-26 08:17:13,773][92345] Saving ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000021776_5574656.pth... -[2023-09-26 08:17:13,802][91993] Removing ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000018832_4820992.pth -[2023-09-26 08:17:13,814][92345] Removing ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000018832_4820992.pth -[2023-09-26 08:17:18,762][91478] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6275.9). 
Total num frames: 11182080. Throughput: 0: 782.4, 1: 782.6. Samples: 2794718. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-09-26 08:17:18,762][91478] Avg episode reward: [(0, '6.740'), (1, '7.080')] -[2023-09-26 08:17:23,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 11214848. Throughput: 0: 782.9, 1: 782.5. Samples: 2799571. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-09-26 08:17:23,763][91478] Avg episode reward: [(0, '6.640'), (1, '6.910')] -[2023-09-26 08:17:25,116][92473] Updated weights for policy 0, policy_version 21920 (0.0017) -[2023-09-26 08:17:25,116][92474] Updated weights for policy 1, policy_version 21920 (0.0018) -[2023-09-26 08:17:28,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6275.9). Total num frames: 11239424. Throughput: 0: 783.9, 1: 782.9. Samples: 2808688. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-09-26 08:17:28,763][91478] Avg episode reward: [(0, '6.620'), (1, '6.730')] -[2023-09-26 08:17:33,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 11272192. Throughput: 0: 784.6, 1: 787.7. Samples: 2818047. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:17:33,762][91478] Avg episode reward: [(0, '6.650'), (1, '6.640')] -[2023-09-26 08:17:38,330][92473] Updated weights for policy 0, policy_version 22080 (0.0016) -[2023-09-26 08:17:38,331][92474] Updated weights for policy 1, policy_version 22080 (0.0017) -[2023-09-26 08:17:38,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 11304960. Throughput: 0: 786.4, 1: 784.8. Samples: 2822573. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:17:38,763][91478] Avg episode reward: [(0, '6.840'), (1, '6.650')] -[2023-09-26 08:17:43,762][91478] Fps is (10 sec: 6143.8, 60 sec: 6212.2, 300 sec: 6262.0). Total num frames: 11333632. Throughput: 0: 781.2, 1: 780.5. Samples: 2831674. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:17:43,763][91478] Avg episode reward: [(0, '6.860'), (1, '6.720')] -[2023-09-26 08:17:48,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 11362304. Throughput: 0: 773.8, 1: 773.7. Samples: 2840582. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:17:48,763][91478] Avg episode reward: [(0, '6.640'), (1, '6.540')] -[2023-09-26 08:17:51,835][92474] Updated weights for policy 1, policy_version 22240 (0.0017) -[2023-09-26 08:17:51,835][92473] Updated weights for policy 0, policy_version 22240 (0.0017) -[2023-09-26 08:17:53,762][91478] Fps is (10 sec: 6144.2, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 11395072. Throughput: 0: 775.5, 1: 775.4. Samples: 2845349. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 08:17:53,762][91478] Avg episode reward: [(0, '6.750'), (1, '7.010')] -[2023-09-26 08:17:58,762][91478] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 11427840. Throughput: 0: 773.7, 1: 773.7. Samples: 2854912. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 08:17:58,762][91478] Avg episode reward: [(0, '6.640'), (1, '6.860')] -[2023-09-26 08:18:03,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 11460608. Throughput: 0: 771.3, 1: 772.0. Samples: 2864166. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 08:18:03,762][91478] Avg episode reward: [(0, '6.850'), (1, '6.810')] -[2023-09-26 08:18:04,920][92474] Updated weights for policy 1, policy_version 22400 (0.0017) -[2023-09-26 08:18:04,922][92473] Updated weights for policy 0, policy_version 22400 (0.0015) -[2023-09-26 08:18:08,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 11485184. Throughput: 0: 771.1, 1: 770.1. Samples: 2868923. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:18:08,763][91478] Avg episode reward: [(0, '6.710'), (1, '6.950')] -[2023-09-26 08:18:13,762][91478] Fps is (10 sec: 5734.2, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 11517952. Throughput: 0: 773.7, 1: 773.5. Samples: 2878311. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:18:13,763][91478] Avg episode reward: [(0, '6.780'), (1, '6.880')] -[2023-09-26 08:18:17,782][92473] Updated weights for policy 0, policy_version 22560 (0.0016) -[2023-09-26 08:18:17,782][92474] Updated weights for policy 1, policy_version 22560 (0.0015) -[2023-09-26 08:18:18,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 11550720. Throughput: 0: 775.9, 1: 774.4. Samples: 2887812. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:18:18,763][91478] Avg episode reward: [(0, '6.660'), (1, '6.600')] -[2023-09-26 08:18:23,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6275.9). Total num frames: 11583488. Throughput: 0: 779.1, 1: 778.6. Samples: 2892667. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) -[2023-09-26 08:18:23,763][91478] Avg episode reward: [(0, '6.640'), (1, '6.650')] -[2023-09-26 08:18:28,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 11616256. Throughput: 0: 780.4, 1: 782.8. Samples: 2902018. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) -[2023-09-26 08:18:28,763][91478] Avg episode reward: [(0, '6.650'), (1, '6.690')] -[2023-09-26 08:18:30,884][92474] Updated weights for policy 1, policy_version 22720 (0.0015) -[2023-09-26 08:18:30,884][92473] Updated weights for policy 0, policy_version 22720 (0.0015) -[2023-09-26 08:18:33,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 11649024. Throughput: 0: 785.2, 1: 784.7. Samples: 2911226. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) -[2023-09-26 08:18:33,763][91478] Avg episode reward: [(0, '6.770'), (1, '6.600')] -[2023-09-26 08:18:38,762][91478] Fps is (10 sec: 6144.1, 60 sec: 6212.3, 300 sec: 6262.0). Total num frames: 11677696. Throughput: 0: 786.2, 1: 786.3. Samples: 2916111. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) -[2023-09-26 08:18:38,763][91478] Avg episode reward: [(0, '6.880'), (1, '7.000')] -[2023-09-26 08:18:43,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6212.3, 300 sec: 6248.1). Total num frames: 11706368. Throughput: 0: 785.9, 1: 785.0. Samples: 2925601. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) -[2023-09-26 08:18:43,763][91478] Avg episode reward: [(0, '6.820'), (1, '6.980')] -[2023-09-26 08:18:43,941][92473] Updated weights for policy 0, policy_version 22880 (0.0017) -[2023-09-26 08:18:43,941][92474] Updated weights for policy 1, policy_version 22880 (0.0017) -[2023-09-26 08:18:48,762][91478] Fps is (10 sec: 6144.0, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 11739136. Throughput: 0: 784.3, 1: 783.3. Samples: 2934708. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) -[2023-09-26 08:18:48,763][91478] Avg episode reward: [(0, '6.660'), (1, '6.900')] -[2023-09-26 08:18:53,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 11771904. Throughput: 0: 780.2, 1: 780.7. Samples: 2939165. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:18:53,763][91478] Avg episode reward: [(0, '6.350'), (1, '6.880')] -[2023-09-26 08:18:57,210][92474] Updated weights for policy 1, policy_version 23040 (0.0017) -[2023-09-26 08:18:57,210][92473] Updated weights for policy 0, policy_version 23040 (0.0017) -[2023-09-26 08:18:58,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 11804672. Throughput: 0: 784.2, 1: 785.6. Samples: 2948951. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:18:58,762][91478] Avg episode reward: [(0, '6.270'), (1, '6.950')] -[2023-09-26 08:19:03,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 11829248. Throughput: 0: 775.9, 1: 775.5. Samples: 2957622. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:19:03,763][91478] Avg episode reward: [(0, '6.340'), (1, '6.830')] -[2023-09-26 08:19:08,762][91478] Fps is (10 sec: 5734.2, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 11862016. Throughput: 0: 776.0, 1: 775.0. Samples: 2962462. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:19:08,763][91478] Avg episode reward: [(0, '6.650'), (1, '6.690')] -[2023-09-26 08:19:10,497][92473] Updated weights for policy 0, policy_version 23200 (0.0017) -[2023-09-26 08:19:10,497][92474] Updated weights for policy 1, policy_version 23200 (0.0016) -[2023-09-26 08:19:13,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 11894784. Throughput: 0: 774.3, 1: 773.7. Samples: 2971681. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:19:13,762][91478] Avg episode reward: [(0, '6.740'), (1, '6.830')] -[2023-09-26 08:19:13,773][91993] Saving ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000023232_5947392.pth... -[2023-09-26 08:19:13,774][92345] Saving ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000023232_5947392.pth... -[2023-09-26 08:19:13,810][92345] Removing ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000020304_5197824.pth -[2023-09-26 08:19:13,813][91993] Removing ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000020304_5197824.pth -[2023-09-26 08:19:18,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 11927552. Throughput: 0: 779.0, 1: 779.9. Samples: 2981377. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:19:18,763][91478] Avg episode reward: [(0, '6.730'), (1, '6.910')] -[2023-09-26 08:19:23,573][92473] Updated weights for policy 0, policy_version 23360 (0.0017) -[2023-09-26 08:19:23,575][92474] Updated weights for policy 1, policy_version 23360 (0.0015) -[2023-09-26 08:19:23,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 11960320. Throughput: 0: 775.6, 1: 777.1. Samples: 2985984. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:19:23,763][91478] Avg episode reward: [(0, '6.790'), (1, '6.830')] -[2023-09-26 08:19:28,762][91478] Fps is (10 sec: 6144.0, 60 sec: 6212.3, 300 sec: 6262.0). Total num frames: 11988992. Throughput: 0: 777.9, 1: 777.6. Samples: 2995598. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:19:28,763][91478] Avg episode reward: [(0, '6.730'), (1, '7.050')] -[2023-09-26 08:19:33,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 12017664. Throughput: 0: 774.5, 1: 775.5. Samples: 3004455. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:19:33,763][91478] Avg episode reward: [(0, '6.610'), (1, '6.750')] -[2023-09-26 08:19:36,716][92473] Updated weights for policy 0, policy_version 23520 (0.0017) -[2023-09-26 08:19:36,716][92474] Updated weights for policy 1, policy_version 23520 (0.0018) -[2023-09-26 08:19:38,762][91478] Fps is (10 sec: 6144.0, 60 sec: 6212.3, 300 sec: 6248.1). Total num frames: 12050432. Throughput: 0: 780.6, 1: 780.4. Samples: 3009407. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:19:38,763][91478] Avg episode reward: [(0, '6.940'), (1, '7.050')] -[2023-09-26 08:19:43,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 12083200. Throughput: 0: 777.6, 1: 776.1. Samples: 3018865. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:19:43,763][91478] Avg episode reward: [(0, '6.840'), (1, '6.920')] -[2023-09-26 08:19:48,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 12115968. Throughput: 0: 786.8, 1: 787.1. Samples: 3028448. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:19:48,763][91478] Avg episode reward: [(0, '6.730'), (1, '6.800')] -[2023-09-26 08:19:49,660][92474] Updated weights for policy 1, policy_version 23680 (0.0018) -[2023-09-26 08:19:49,661][92473] Updated weights for policy 0, policy_version 23680 (0.0017) -[2023-09-26 08:19:53,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 12148736. Throughput: 0: 783.8, 1: 785.7. Samples: 3033088. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:19:53,763][91478] Avg episode reward: [(0, '6.620'), (1, '7.070')] -[2023-09-26 08:19:58,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6234.3). Total num frames: 12173312. Throughput: 0: 786.5, 1: 785.4. Samples: 3042418. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:19:58,763][91478] Avg episode reward: [(0, '6.850'), (1, '6.850')] -[2023-09-26 08:20:02,762][92474] Updated weights for policy 1, policy_version 23840 (0.0017) -[2023-09-26 08:20:02,762][92473] Updated weights for policy 0, policy_version 23840 (0.0017) -[2023-09-26 08:20:03,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 12206080. Throughput: 0: 781.9, 1: 780.4. Samples: 3051680. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:20:03,763][91478] Avg episode reward: [(0, '6.730'), (1, '6.830')] -[2023-09-26 08:20:08,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 12238848. Throughput: 0: 785.8, 1: 784.3. Samples: 3056637. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:20:08,763][91478] Avg episode reward: [(0, '6.820'), (1, '6.960')] -[2023-09-26 08:20:13,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 12271616. Throughput: 0: 782.0, 1: 781.7. Samples: 3065965. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:20:13,762][91478] Avg episode reward: [(0, '6.710'), (1, '7.000')] -[2023-09-26 08:20:15,699][92474] Updated weights for policy 1, policy_version 24000 (0.0018) -[2023-09-26 08:20:15,699][92473] Updated weights for policy 0, policy_version 24000 (0.0016) -[2023-09-26 08:20:18,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 12304384. Throughput: 0: 789.7, 1: 788.5. Samples: 3075476. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 08:20:18,763][91478] Avg episode reward: [(0, '6.620'), (1, '6.800')] -[2023-09-26 08:20:23,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 12337152. Throughput: 0: 785.8, 1: 787.2. Samples: 3080193. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 08:20:23,763][91478] Avg episode reward: [(0, '6.640'), (1, '6.810')] -[2023-09-26 08:20:28,744][92474] Updated weights for policy 1, policy_version 24160 (0.0018) -[2023-09-26 08:20:28,744][92473] Updated weights for policy 0, policy_version 24160 (0.0018) -[2023-09-26 08:20:28,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6348.8, 300 sec: 6248.1). Total num frames: 12369920. Throughput: 0: 787.0, 1: 786.5. Samples: 3089673. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 08:20:28,763][91478] Avg episode reward: [(0, '6.740'), (1, '6.860')] -[2023-09-26 08:20:33,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6234.2). Total num frames: 12394496. Throughput: 0: 785.0, 1: 785.4. Samples: 3099115. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:20:33,763][91478] Avg episode reward: [(0, '6.950'), (1, '6.870')] -[2023-09-26 08:20:38,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 12427264. Throughput: 0: 786.4, 1: 784.4. Samples: 3103774. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:20:38,763][91478] Avg episode reward: [(0, '6.720'), (1, '6.760')] -[2023-09-26 08:20:41,791][92473] Updated weights for policy 0, policy_version 24320 (0.0017) -[2023-09-26 08:20:41,791][92474] Updated weights for policy 1, policy_version 24320 (0.0017) -[2023-09-26 08:20:43,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 12460032. Throughput: 0: 784.9, 1: 785.2. Samples: 3113074. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:20:43,763][91478] Avg episode reward: [(0, '6.730'), (1, '6.920')] -[2023-09-26 08:20:48,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 12492800. Throughput: 0: 790.4, 1: 791.2. Samples: 3122852. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:20:48,763][91478] Avg episode reward: [(0, '6.620'), (1, '7.060')] -[2023-09-26 08:20:53,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 12525568. Throughput: 0: 784.5, 1: 785.8. Samples: 3127300. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) -[2023-09-26 08:20:53,762][91478] Avg episode reward: [(0, '6.850'), (1, '6.950')] -[2023-09-26 08:20:54,751][92474] Updated weights for policy 1, policy_version 24480 (0.0017) -[2023-09-26 08:20:54,752][92473] Updated weights for policy 0, policy_version 24480 (0.0016) -[2023-09-26 08:20:58,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6275.9). Total num frames: 12558336. Throughput: 0: 788.9, 1: 788.9. Samples: 3136966. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) -[2023-09-26 08:20:58,763][91478] Avg episode reward: [(0, '6.700'), (1, '6.820')] -[2023-09-26 08:21:03,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 12582912. Throughput: 0: 785.3, 1: 785.3. Samples: 3146150. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) -[2023-09-26 08:21:03,763][91478] Avg episode reward: [(0, '6.580'), (1, '6.960')] -[2023-09-26 08:21:07,768][92473] Updated weights for policy 0, policy_version 24640 (0.0020) -[2023-09-26 08:21:07,768][92474] Updated weights for policy 1, policy_version 24640 (0.0019) -[2023-09-26 08:21:08,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 12615680. Throughput: 0: 788.7, 1: 787.5. Samples: 3151123. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:21:08,763][91478] Avg episode reward: [(0, '6.390'), (1, '6.990')] -[2023-09-26 08:21:13,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 12648448. Throughput: 0: 784.2, 1: 784.6. Samples: 3160269. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:21:13,763][91478] Avg episode reward: [(0, '6.510'), (1, '7.100')] -[2023-09-26 08:21:13,772][91993] Saving ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000024704_6324224.pth... -[2023-09-26 08:21:13,772][92345] Saving ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000024704_6324224.pth... -[2023-09-26 08:21:13,806][91993] Removing ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000021776_5574656.pth -[2023-09-26 08:21:13,808][92345] Removing ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000021776_5574656.pth -[2023-09-26 08:21:18,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 12681216. Throughput: 0: 787.5, 1: 787.4. Samples: 3169984. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:21:18,762][91478] Avg episode reward: [(0, '6.700'), (1, '6.950')] -[2023-09-26 08:21:20,832][92473] Updated weights for policy 0, policy_version 24800 (0.0016) -[2023-09-26 08:21:20,832][92474] Updated weights for policy 1, policy_version 24800 (0.0017) -[2023-09-26 08:21:23,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 12713984. Throughput: 0: 784.7, 1: 785.9. Samples: 3174449. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:21:23,763][91478] Avg episode reward: [(0, '6.670'), (1, '6.800')] -[2023-09-26 08:21:28,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 12738560. Throughput: 0: 786.8, 1: 786.7. Samples: 3183882. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:21:28,763][91478] Avg episode reward: [(0, '6.600'), (1, '6.880')] -[2023-09-26 08:21:33,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 12771328. Throughput: 0: 777.3, 1: 777.3. Samples: 3192811. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:21:33,763][91478] Avg episode reward: [(0, '6.650'), (1, '7.040')] -[2023-09-26 08:21:34,396][92474] Updated weights for policy 1, policy_version 24960 (0.0018) -[2023-09-26 08:21:34,396][92473] Updated weights for policy 0, policy_version 24960 (0.0015) -[2023-09-26 08:21:38,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 12804096. Throughput: 0: 778.4, 1: 776.9. Samples: 3197291. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:21:38,763][91478] Avg episode reward: [(0, '6.730'), (1, '6.840')] -[2023-09-26 08:21:43,762][91478] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 12836864. Throughput: 0: 775.8, 1: 776.2. Samples: 3206805. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:21:43,762][91478] Avg episode reward: [(0, '6.580'), (1, '6.940')] -[2023-09-26 08:21:47,420][92473] Updated weights for policy 0, policy_version 25120 (0.0014) -[2023-09-26 08:21:47,420][92474] Updated weights for policy 1, policy_version 25120 (0.0016) -[2023-09-26 08:21:48,762][91478] Fps is (10 sec: 6144.1, 60 sec: 6212.3, 300 sec: 6262.0). Total num frames: 12865536. Throughput: 0: 778.3, 1: 777.4. Samples: 3216154. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:21:48,763][91478] Avg episode reward: [(0, '6.650'), (1, '6.980')] -[2023-09-26 08:21:53,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 12894208. Throughput: 0: 773.3, 1: 773.9. Samples: 3220747. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:21:53,762][91478] Avg episode reward: [(0, '6.570'), (1, '7.010')] -[2023-09-26 08:21:58,762][91478] Fps is (10 sec: 6144.1, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 12926976. Throughput: 0: 772.6, 1: 773.1. Samples: 3229828. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:21:58,762][91478] Avg episode reward: [(0, '6.740'), (1, '7.020')] -[2023-09-26 08:22:00,653][92473] Updated weights for policy 0, policy_version 25280 (0.0017) -[2023-09-26 08:22:00,654][92474] Updated weights for policy 1, policy_version 25280 (0.0018) -[2023-09-26 08:22:03,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 12959744. Throughput: 0: 771.5, 1: 770.4. Samples: 3239366. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:22:03,763][91478] Avg episode reward: [(0, '6.720'), (1, '7.050')] -[2023-09-26 08:22:08,762][91478] Fps is (10 sec: 6143.9, 60 sec: 6212.3, 300 sec: 6234.3). Total num frames: 12988416. Throughput: 0: 772.2, 1: 771.0. Samples: 3243895. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:22:08,763][91478] Avg episode reward: [(0, '6.720'), (1, '7.070')] -[2023-09-26 08:22:13,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 13017088. Throughput: 0: 769.5, 1: 770.3. Samples: 3253170. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:22:13,762][91478] Avg episode reward: [(0, '6.540'), (1, '7.090')] -[2023-09-26 08:22:13,962][92474] Updated weights for policy 1, policy_version 25440 (0.0017) -[2023-09-26 08:22:13,962][92473] Updated weights for policy 0, policy_version 25440 (0.0016) -[2023-09-26 08:22:18,762][91478] Fps is (10 sec: 6144.1, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 13049856. Throughput: 0: 773.7, 1: 774.2. Samples: 3262465. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:22:18,762][91478] Avg episode reward: [(0, '6.720'), (1, '7.120')] -[2023-09-26 08:22:23,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 13082624. Throughput: 0: 773.6, 1: 773.9. Samples: 3266926. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:22:23,763][91478] Avg episode reward: [(0, '6.600'), (1, '7.090')] -[2023-09-26 08:22:27,281][92473] Updated weights for policy 0, policy_version 25600 (0.0017) -[2023-09-26 08:22:27,281][92474] Updated weights for policy 1, policy_version 25600 (0.0018) -[2023-09-26 08:22:28,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 13115392. Throughput: 0: 773.6, 1: 775.6. Samples: 3276515. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:22:28,762][91478] Avg episode reward: [(0, '6.730'), (1, '7.080')] -[2023-09-26 08:22:33,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 13139968. Throughput: 0: 769.6, 1: 770.8. Samples: 3285469. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) -[2023-09-26 08:22:33,763][91478] Avg episode reward: [(0, '6.700'), (1, '6.800')] -[2023-09-26 08:22:38,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6234.3). Total num frames: 13172736. Throughput: 0: 774.6, 1: 773.8. Samples: 3290421. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) -[2023-09-26 08:22:38,763][91478] Avg episode reward: [(0, '6.700'), (1, '6.740')] -[2023-09-26 08:22:40,439][92474] Updated weights for policy 1, policy_version 25760 (0.0018) -[2023-09-26 08:22:40,440][92473] Updated weights for policy 0, policy_version 25760 (0.0016) -[2023-09-26 08:22:43,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 13205504. Throughput: 0: 774.2, 1: 773.8. Samples: 3299488. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) -[2023-09-26 08:22:43,763][91478] Avg episode reward: [(0, '6.670'), (1, '6.650')] -[2023-09-26 08:22:48,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6212.3, 300 sec: 6248.1). Total num frames: 13238272. Throughput: 0: 776.4, 1: 777.0. Samples: 3309269. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) -[2023-09-26 08:22:48,763][91478] Avg episode reward: [(0, '6.560'), (1, '6.730')] -[2023-09-26 08:22:53,762][91478] Fps is (10 sec: 6144.0, 60 sec: 6212.2, 300 sec: 6234.2). Total num frames: 13266944. Throughput: 0: 774.2, 1: 776.3. Samples: 3313664. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-09-26 08:22:53,763][91478] Avg episode reward: [(0, '6.940'), (1, '6.680')] -[2023-09-26 08:22:53,780][92473] Updated weights for policy 0, policy_version 25920 (0.0017) -[2023-09-26 08:22:53,780][92474] Updated weights for policy 1, policy_version 25920 (0.0018) -[2023-09-26 08:22:58,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 13295616. Throughput: 0: 771.8, 1: 771.0. Samples: 3322596. 
Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-09-26 08:22:58,763][91478] Avg episode reward: [(0, '6.860'), (1, '6.610')] -[2023-09-26 08:23:03,762][91478] Fps is (10 sec: 6143.9, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 13328384. Throughput: 0: 773.4, 1: 771.8. Samples: 3332001. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-09-26 08:23:03,763][91478] Avg episode reward: [(0, '6.620'), (1, '6.880')] -[2023-09-26 08:23:07,137][92474] Updated weights for policy 1, policy_version 26080 (0.0018) -[2023-09-26 08:23:07,137][92473] Updated weights for policy 0, policy_version 26080 (0.0018) -[2023-09-26 08:23:08,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6212.3, 300 sec: 6248.1). Total num frames: 13361152. Throughput: 0: 769.2, 1: 770.3. Samples: 3336205. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 08:23:08,763][91478] Avg episode reward: [(0, '6.760'), (1, '7.150')] -[2023-09-26 08:23:13,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 13385728. Throughput: 0: 769.9, 1: 769.1. Samples: 3345770. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 08:23:13,763][91478] Avg episode reward: [(0, '6.710'), (1, '7.030')] -[2023-09-26 08:23:13,776][91993] Saving ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000026160_6696960.pth... -[2023-09-26 08:23:13,803][91993] Removing ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000023232_5947392.pth -[2023-09-26 08:23:13,827][92345] Saving ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000026160_6696960.pth... -[2023-09-26 08:23:13,854][92345] Removing ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000023232_5947392.pth -[2023-09-26 08:23:18,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 13418496. Throughput: 0: 771.0, 1: 771.7. Samples: 3354890. 
Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 08:23:18,763][91478] Avg episode reward: [(0, '6.710'), (1, '6.840')] -[2023-09-26 08:23:20,282][92473] Updated weights for policy 0, policy_version 26240 (0.0017) -[2023-09-26 08:23:20,283][92474] Updated weights for policy 1, policy_version 26240 (0.0016) -[2023-09-26 08:23:23,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 13451264. Throughput: 0: 771.5, 1: 771.3. Samples: 3359847. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 08:23:23,763][91478] Avg episode reward: [(0, '6.590'), (1, '6.750')] -[2023-09-26 08:23:28,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 13484032. Throughput: 0: 771.2, 1: 772.6. Samples: 3368958. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:23:28,763][91478] Avg episode reward: [(0, '6.650'), (1, '6.930')] -[2023-09-26 08:23:33,622][92473] Updated weights for policy 0, policy_version 26400 (0.0017) -[2023-09-26 08:23:33,622][92474] Updated weights for policy 1, policy_version 26400 (0.0017) -[2023-09-26 08:23:33,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6234.3). Total num frames: 13516800. Throughput: 0: 766.0, 1: 766.7. Samples: 3378241. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:23:33,763][91478] Avg episode reward: [(0, '6.770'), (1, '6.780')] -[2023-09-26 08:23:38,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 13541376. Throughput: 0: 771.5, 1: 771.2. Samples: 3383086. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:23:38,763][91478] Avg episode reward: [(0, '6.870'), (1, '6.820')] -[2023-09-26 08:23:43,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 13574144. Throughput: 0: 771.6, 1: 771.9. Samples: 3392054. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:23:43,763][91478] Avg episode reward: [(0, '6.660'), (1, '7.130')] -[2023-09-26 08:23:46,866][92474] Updated weights for policy 1, policy_version 26560 (0.0018) -[2023-09-26 08:23:46,866][92473] Updated weights for policy 0, policy_version 26560 (0.0017) -[2023-09-26 08:23:48,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 13606912. Throughput: 0: 773.1, 1: 773.3. Samples: 3401590. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:23:48,763][91478] Avg episode reward: [(0, '6.710'), (1, '7.020')] -[2023-09-26 08:23:53,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6212.3, 300 sec: 6220.4). Total num frames: 13639680. Throughput: 0: 775.4, 1: 774.1. Samples: 3405929. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:23:53,763][91478] Avg episode reward: [(0, '6.600'), (1, '6.950')] -[2023-09-26 08:23:58,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 13672448. Throughput: 0: 779.1, 1: 777.7. Samples: 3415826. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:23:58,762][91478] Avg episode reward: [(0, '6.810'), (1, '7.020')] -[2023-09-26 08:23:59,804][92473] Updated weights for policy 0, policy_version 26720 (0.0017) -[2023-09-26 08:23:59,806][92474] Updated weights for policy 1, policy_version 26720 (0.0017) -[2023-09-26 08:24:03,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 13705216. Throughput: 0: 781.4, 1: 781.0. Samples: 3425198. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 08:24:03,763][91478] Avg episode reward: [(0, '6.700'), (1, '7.070')] -[2023-09-26 08:24:08,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 13737984. Throughput: 0: 781.4, 1: 780.3. Samples: 3430123. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 08:24:08,763][91478] Avg episode reward: [(0, '6.780'), (1, '7.010')] -[2023-09-26 08:24:12,607][92474] Updated weights for policy 1, policy_version 26880 (0.0019) -[2023-09-26 08:24:12,607][92473] Updated weights for policy 0, policy_version 26880 (0.0020) -[2023-09-26 08:24:13,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 13762560. Throughput: 0: 786.0, 1: 784.7. Samples: 3439641. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 08:24:13,763][91478] Avg episode reward: [(0, '6.780'), (1, '6.810')] -[2023-09-26 08:24:18,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 13795328. Throughput: 0: 783.8, 1: 784.9. Samples: 3448832. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 08:24:18,763][91478] Avg episode reward: [(0, '6.560'), (1, '6.980')] -[2023-09-26 08:24:23,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6234.3). Total num frames: 13828096. Throughput: 0: 782.5, 1: 780.3. Samples: 3453414. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:24:23,763][91478] Avg episode reward: [(0, '6.590'), (1, '6.940')] -[2023-09-26 08:24:26,066][92474] Updated weights for policy 1, policy_version 27040 (0.0017) -[2023-09-26 08:24:26,066][92473] Updated weights for policy 0, policy_version 27040 (0.0017) -[2023-09-26 08:24:28,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 13860864. Throughput: 0: 784.3, 1: 784.0. Samples: 3462628. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:24:28,763][91478] Avg episode reward: [(0, '6.630'), (1, '6.820')] -[2023-09-26 08:24:33,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 13885440. Throughput: 0: 778.0, 1: 777.8. Samples: 3471604. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:24:33,763][91478] Avg episode reward: [(0, '6.580'), (1, '6.900')] -[2023-09-26 08:24:38,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 13918208. Throughput: 0: 783.5, 1: 783.2. Samples: 3476434. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:24:38,763][91478] Avg episode reward: [(0, '6.710'), (1, '6.850')] -[2023-09-26 08:24:39,302][92474] Updated weights for policy 1, policy_version 27200 (0.0016) -[2023-09-26 08:24:39,302][92473] Updated weights for policy 0, policy_version 27200 (0.0017) -[2023-09-26 08:24:43,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 13950976. Throughput: 0: 775.7, 1: 777.0. Samples: 3485696. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:24:43,762][91478] Avg episode reward: [(0, '6.780'), (1, '6.760')] -[2023-09-26 08:24:48,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 13983744. Throughput: 0: 778.6, 1: 778.2. Samples: 3495251. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:24:48,763][91478] Avg episode reward: [(0, '6.750'), (1, '6.870')] -[2023-09-26 08:24:52,389][92474] Updated weights for policy 1, policy_version 27360 (0.0020) -[2023-09-26 08:24:52,389][92473] Updated weights for policy 0, policy_version 27360 (0.0015) -[2023-09-26 08:24:53,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 14016512. Throughput: 0: 775.4, 1: 778.0. Samples: 3500026. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:24:53,763][91478] Avg episode reward: [(0, '6.740'), (1, '7.170')] -[2023-09-26 08:24:58,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 14041088. Throughput: 0: 769.9, 1: 771.1. Samples: 3508988. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:24:58,763][91478] Avg episode reward: [(0, '6.660'), (1, '7.170')] -[2023-09-26 08:25:03,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 14073856. Throughput: 0: 773.8, 1: 773.7. Samples: 3518468. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 08:25:03,763][91478] Avg episode reward: [(0, '6.580'), (1, '6.680')] -[2023-09-26 08:25:05,529][92473] Updated weights for policy 0, policy_version 27520 (0.0016) -[2023-09-26 08:25:05,529][92474] Updated weights for policy 1, policy_version 27520 (0.0015) -[2023-09-26 08:25:08,762][91478] Fps is (10 sec: 6553.8, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 14106624. Throughput: 0: 773.8, 1: 775.3. Samples: 3523122. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 08:25:08,762][91478] Avg episode reward: [(0, '6.810'), (1, '6.810')] -[2023-09-26 08:25:13,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 14139392. Throughput: 0: 778.8, 1: 780.4. Samples: 3532795. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 08:25:13,763][91478] Avg episode reward: [(0, '6.750'), (1, '6.840')] -[2023-09-26 08:25:13,773][92345] Saving ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000027616_7069696.pth... -[2023-09-26 08:25:13,773][91993] Saving ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000027616_7069696.pth... -[2023-09-26 08:25:13,808][91993] Removing ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000024704_6324224.pth -[2023-09-26 08:25:13,809][92345] Removing ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000024704_6324224.pth -[2023-09-26 08:25:18,526][92474] Updated weights for policy 1, policy_version 27680 (0.0015) -[2023-09-26 08:25:18,527][92473] Updated weights for policy 0, policy_version 27680 (0.0017) -[2023-09-26 08:25:18,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). 
Total num frames: 14172160. Throughput: 0: 784.4, 1: 784.2. Samples: 3542189. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) -[2023-09-26 08:25:18,763][91478] Avg episode reward: [(0, '6.720'), (1, '6.590')] -[2023-09-26 08:25:23,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 14204928. Throughput: 0: 784.7, 1: 785.9. Samples: 3547113. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) -[2023-09-26 08:25:23,763][91478] Avg episode reward: [(0, '6.600'), (1, '6.840')] -[2023-09-26 08:25:28,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 14229504. Throughput: 0: 783.7, 1: 782.1. Samples: 3556157. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) -[2023-09-26 08:25:28,762][91478] Avg episode reward: [(0, '6.650'), (1, '6.390')] -[2023-09-26 08:25:31,666][92474] Updated weights for policy 1, policy_version 27840 (0.0017) -[2023-09-26 08:25:31,667][92473] Updated weights for policy 0, policy_version 27840 (0.0016) -[2023-09-26 08:25:33,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 14262272. Throughput: 0: 780.5, 1: 782.1. Samples: 3565569. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) -[2023-09-26 08:25:33,763][91478] Avg episode reward: [(0, '6.750'), (1, '6.700')] -[2023-09-26 08:25:38,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 14295040. Throughput: 0: 783.6, 1: 782.2. Samples: 3570486. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:25:38,762][91478] Avg episode reward: [(0, '6.760'), (1, '6.970')] -[2023-09-26 08:25:43,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 14327808. Throughput: 0: 790.1, 1: 788.7. Samples: 3580035. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:25:43,762][91478] Avg episode reward: [(0, '6.510'), (1, '7.030')] -[2023-09-26 08:25:44,398][92473] Updated weights for policy 0, policy_version 28000 (0.0014) -[2023-09-26 08:25:44,400][92474] Updated weights for policy 1, policy_version 28000 (0.0017) -[2023-09-26 08:25:48,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 14360576. Throughput: 0: 794.2, 1: 792.6. Samples: 3589876. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:25:48,763][91478] Avg episode reward: [(0, '6.360'), (1, '6.860')] -[2023-09-26 08:25:53,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 14393344. Throughput: 0: 790.2, 1: 790.8. Samples: 3594263. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:25:53,763][91478] Avg episode reward: [(0, '6.500'), (1, '6.980')] -[2023-09-26 08:25:57,373][92474] Updated weights for policy 1, policy_version 28160 (0.0015) -[2023-09-26 08:25:57,373][92473] Updated weights for policy 0, policy_version 28160 (0.0018) -[2023-09-26 08:25:58,762][91478] Fps is (10 sec: 6144.0, 60 sec: 6348.8, 300 sec: 6234.3). Total num frames: 14422016. Throughput: 0: 791.7, 1: 791.1. Samples: 3604022. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-09-26 08:25:58,763][91478] Avg episode reward: [(0, '6.570'), (1, '7.010')] -[2023-09-26 08:26:03,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 14450688. Throughput: 0: 783.0, 1: 784.2. Samples: 3612711. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-09-26 08:26:03,763][91478] Avg episode reward: [(0, '6.740'), (1, '6.910')] -[2023-09-26 08:26:08,762][91478] Fps is (10 sec: 6144.1, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 14483456. Throughput: 0: 783.6, 1: 782.6. Samples: 3617595. 
Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-09-26 08:26:08,762][91478] Avg episode reward: [(0, '6.830'), (1, '6.870')] -[2023-09-26 08:26:10,700][92474] Updated weights for policy 1, policy_version 28320 (0.0017) -[2023-09-26 08:26:10,701][92473] Updated weights for policy 0, policy_version 28320 (0.0017) -[2023-09-26 08:26:13,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 14516224. Throughput: 0: 786.4, 1: 788.1. Samples: 3627008. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-09-26 08:26:13,763][91478] Avg episode reward: [(0, '6.700'), (1, '7.070')] -[2023-09-26 08:26:18,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 14548992. Throughput: 0: 787.0, 1: 786.0. Samples: 3636353. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-09-26 08:26:18,763][91478] Avg episode reward: [(0, '6.580'), (1, '7.020')] -[2023-09-26 08:26:23,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 14573568. Throughput: 0: 784.4, 1: 784.5. Samples: 3641087. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-09-26 08:26:23,763][91478] Avg episode reward: [(0, '6.650'), (1, '7.000')] -[2023-09-26 08:26:23,839][92473] Updated weights for policy 0, policy_version 28480 (0.0017) -[2023-09-26 08:26:23,839][92474] Updated weights for policy 1, policy_version 28480 (0.0017) -[2023-09-26 08:26:28,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 14606336. Throughput: 0: 779.5, 1: 779.8. Samples: 3650200. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-09-26 08:26:28,763][91478] Avg episode reward: [(0, '6.540'), (1, '6.660')] -[2023-09-26 08:26:33,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 14639104. Throughput: 0: 775.8, 1: 777.5. Samples: 3659776. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) -[2023-09-26 08:26:33,763][91478] Avg episode reward: [(0, '6.790'), (1, '6.680')] -[2023-09-26 08:26:37,055][92473] Updated weights for policy 0, policy_version 28640 (0.0017) -[2023-09-26 08:26:37,055][92474] Updated weights for policy 1, policy_version 28640 (0.0017) -[2023-09-26 08:26:38,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 14671872. Throughput: 0: 775.1, 1: 774.1. Samples: 3663979. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) -[2023-09-26 08:26:38,763][91478] Avg episode reward: [(0, '6.660'), (1, '6.750')] -[2023-09-26 08:26:43,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6234.3). Total num frames: 14704640. Throughput: 0: 776.7, 1: 775.9. Samples: 3673890. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) -[2023-09-26 08:26:43,763][91478] Avg episode reward: [(0, '6.670'), (1, '6.760')] -[2023-09-26 08:26:48,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 14737408. Throughput: 0: 785.3, 1: 783.4. Samples: 3683301. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) -[2023-09-26 08:26:48,763][91478] Avg episode reward: [(0, '6.520'), (1, '6.980')] -[2023-09-26 08:26:50,075][92473] Updated weights for policy 0, policy_version 28800 (0.0018) -[2023-09-26 08:26:50,075][92474] Updated weights for policy 1, policy_version 28800 (0.0015) -[2023-09-26 08:26:53,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 14761984. Throughput: 0: 780.4, 1: 781.2. Samples: 3687868. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) -[2023-09-26 08:26:53,762][91478] Avg episode reward: [(0, '6.710'), (1, '6.730')] -[2023-09-26 08:26:58,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6212.3, 300 sec: 6220.4). Total num frames: 14794752. Throughput: 0: 778.7, 1: 777.1. Samples: 3697020. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 08:26:58,763][91478] Avg episode reward: [(0, '6.710'), (1, '6.750')] -[2023-09-26 08:27:03,289][92473] Updated weights for policy 0, policy_version 28960 (0.0018) -[2023-09-26 08:27:03,289][92474] Updated weights for policy 1, policy_version 28960 (0.0018) -[2023-09-26 08:27:03,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6234.2). Total num frames: 14827520. Throughput: 0: 782.0, 1: 779.5. Samples: 3706621. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 08:27:03,763][91478] Avg episode reward: [(0, '6.630'), (1, '7.000')] -[2023-09-26 08:27:08,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 14860288. Throughput: 0: 775.8, 1: 777.3. Samples: 3710976. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 08:27:08,762][91478] Avg episode reward: [(0, '6.720'), (1, '6.890')] -[2023-09-26 08:27:13,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 14893056. Throughput: 0: 782.6, 1: 782.1. Samples: 3720612. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:27:13,763][91478] Avg episode reward: [(0, '6.680'), (1, '6.900')] -[2023-09-26 08:27:13,775][92345] Saving ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000029088_7446528.pth... -[2023-09-26 08:27:13,776][91993] Saving ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000029088_7446528.pth... -[2023-09-26 08:27:13,809][91993] Removing ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000026160_6696960.pth -[2023-09-26 08:27:13,816][92345] Removing ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000026160_6696960.pth -[2023-09-26 08:27:16,465][92473] Updated weights for policy 0, policy_version 29120 (0.0017) -[2023-09-26 08:27:16,465][92474] Updated weights for policy 1, policy_version 29120 (0.0017) -[2023-09-26 08:27:18,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6220.4). 
Total num frames: 14917632. Throughput: 0: 775.3, 1: 774.0. Samples: 3729498. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:27:18,763][91478] Avg episode reward: [(0, '6.790'), (1, '6.910')] -[2023-09-26 08:27:23,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 14950400. Throughput: 0: 779.8, 1: 780.5. Samples: 3734194. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:27:23,763][91478] Avg episode reward: [(0, '6.680'), (1, '6.900')] -[2023-09-26 08:27:28,762][91478] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 14983168. Throughput: 0: 775.4, 1: 775.7. Samples: 3743691. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:27:28,762][91478] Avg episode reward: [(0, '6.710'), (1, '6.890')] -[2023-09-26 08:27:29,805][92474] Updated weights for policy 1, policy_version 29280 (0.0017) -[2023-09-26 08:27:29,805][92473] Updated weights for policy 0, policy_version 29280 (0.0017) -[2023-09-26 08:27:33,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 15015936. Throughput: 0: 771.5, 1: 772.9. Samples: 3752802. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:27:33,763][91478] Avg episode reward: [(0, '6.620'), (1, '6.750')] -[2023-09-26 08:27:38,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 15040512. Throughput: 0: 774.6, 1: 773.7. Samples: 3757541. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:27:38,762][91478] Avg episode reward: [(0, '6.730'), (1, '6.880')] -[2023-09-26 08:27:43,009][92473] Updated weights for policy 0, policy_version 29440 (0.0017) -[2023-09-26 08:27:43,010][92474] Updated weights for policy 1, policy_version 29440 (0.0018) -[2023-09-26 08:27:43,762][91478] Fps is (10 sec: 5734.6, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 15073280. Throughput: 0: 772.8, 1: 773.5. Samples: 3766603. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:27:43,762][91478] Avg episode reward: [(0, '6.840'), (1, '6.990')] -[2023-09-26 08:27:48,762][91478] Fps is (10 sec: 6553.4, 60 sec: 6144.0, 300 sec: 6234.2). Total num frames: 15106048. Throughput: 0: 773.1, 1: 775.2. Samples: 3776295. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:27:48,763][91478] Avg episode reward: [(0, '6.820'), (1, '6.840')] -[2023-09-26 08:27:53,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 15138816. Throughput: 0: 774.7, 1: 773.8. Samples: 3780657. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:27:53,763][91478] Avg episode reward: [(0, '6.610'), (1, '6.940')] -[2023-09-26 08:27:56,026][92474] Updated weights for policy 1, policy_version 29600 (0.0018) -[2023-09-26 08:27:56,027][92473] Updated weights for policy 0, policy_version 29600 (0.0017) -[2023-09-26 08:27:58,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 15171584. Throughput: 0: 774.6, 1: 775.8. Samples: 3790376. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:27:58,763][91478] Avg episode reward: [(0, '6.710'), (1, '7.000')] -[2023-09-26 08:28:03,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 15196160. Throughput: 0: 775.7, 1: 775.4. Samples: 3799298. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:28:03,762][91478] Avg episode reward: [(0, '6.680'), (1, '6.870')] -[2023-09-26 08:28:08,762][91478] Fps is (10 sec: 5734.6, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 15228928. Throughput: 0: 779.0, 1: 777.1. Samples: 3804221. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:28:08,762][91478] Avg episode reward: [(0, '6.810'), (1, '6.980')] -[2023-09-26 08:28:09,167][92473] Updated weights for policy 0, policy_version 29760 (0.0016) -[2023-09-26 08:28:09,168][92474] Updated weights for policy 1, policy_version 29760 (0.0017) -[2023-09-26 08:28:13,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 15261696. Throughput: 0: 776.5, 1: 776.3. Samples: 3813567. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) -[2023-09-26 08:28:13,763][91478] Avg episode reward: [(0, '6.700'), (1, '7.140')] -[2023-09-26 08:28:18,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 15294464. Throughput: 0: 785.1, 1: 784.1. Samples: 3823415. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) -[2023-09-26 08:28:18,762][91478] Avg episode reward: [(0, '6.550'), (1, '6.800')] -[2023-09-26 08:28:22,071][92474] Updated weights for policy 1, policy_version 29920 (0.0015) -[2023-09-26 08:28:22,072][92473] Updated weights for policy 0, policy_version 29920 (0.0016) -[2023-09-26 08:28:23,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 15327232. Throughput: 0: 780.3, 1: 780.6. Samples: 3827778. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) -[2023-09-26 08:28:23,762][91478] Avg episode reward: [(0, '6.380'), (1, '6.750')] -[2023-09-26 08:28:28,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 15360000. Throughput: 0: 791.5, 1: 791.5. Samples: 3837837. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) -[2023-09-26 08:28:28,763][91478] Avg episode reward: [(0, '6.180'), (1, '6.730')] -[2023-09-26 08:28:33,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 15392768. Throughput: 0: 787.3, 1: 787.7. Samples: 3847169. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 08:28:33,762][91478] Avg episode reward: [(0, '6.230'), (1, '6.850')] -[2023-09-26 08:28:34,956][92474] Updated weights for policy 1, policy_version 30080 (0.0015) -[2023-09-26 08:28:34,956][92473] Updated weights for policy 0, policy_version 30080 (0.0014) -[2023-09-26 08:28:38,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 15417344. Throughput: 0: 791.5, 1: 792.6. Samples: 3851941. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 08:28:38,763][91478] Avg episode reward: [(0, '6.520'), (1, '6.880')] -[2023-09-26 08:28:43,762][91478] Fps is (10 sec: 5734.2, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 15450112. Throughput: 0: 786.0, 1: 785.5. Samples: 3861093. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 08:28:43,763][91478] Avg episode reward: [(0, '6.650'), (1, '6.840')] -[2023-09-26 08:28:48,097][92474] Updated weights for policy 1, policy_version 30240 (0.0020) -[2023-09-26 08:28:48,097][92473] Updated weights for policy 0, policy_version 30240 (0.0020) -[2023-09-26 08:28:48,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 15482880. Throughput: 0: 792.8, 1: 794.1. Samples: 3870710. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 08:28:48,762][91478] Avg episode reward: [(0, '6.710'), (1, '6.920')] -[2023-09-26 08:28:53,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 15515648. Throughput: 0: 787.2, 1: 788.2. Samples: 3875113. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-09-26 08:28:53,763][91478] Avg episode reward: [(0, '6.710'), (1, '7.020')] -[2023-09-26 08:28:58,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 15548416. Throughput: 0: 793.2, 1: 793.5. Samples: 3884966. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-09-26 08:28:58,762][91478] Avg episode reward: [(0, '6.590'), (1, '6.910')] -[2023-09-26 08:29:01,027][92474] Updated weights for policy 1, policy_version 30400 (0.0014) -[2023-09-26 08:29:01,028][92473] Updated weights for policy 0, policy_version 30400 (0.0017) -[2023-09-26 08:29:03,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6248.1). Total num frames: 15581184. Throughput: 0: 787.0, 1: 788.0. Samples: 3894291. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-09-26 08:29:03,762][91478] Avg episode reward: [(0, '6.630'), (1, '6.930')] -[2023-09-26 08:29:08,762][91478] Fps is (10 sec: 6143.9, 60 sec: 6348.8, 300 sec: 6262.0). Total num frames: 15609856. Throughput: 0: 793.9, 1: 794.4. Samples: 3899253. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-09-26 08:29:08,763][91478] Avg episode reward: [(0, '6.850'), (1, '6.730')] -[2023-09-26 08:29:13,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 15638528. Throughput: 0: 783.9, 1: 783.3. Samples: 3908361. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-09-26 08:29:13,763][91478] Avg episode reward: [(0, '6.850'), (1, '6.790')] -[2023-09-26 08:29:13,773][92345] Saving ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000030544_7819264.pth... -[2023-09-26 08:29:13,773][91993] Saving ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000030544_7819264.pth... -[2023-09-26 08:29:13,807][92345] Removing ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000027616_7069696.pth -[2023-09-26 08:29:13,809][91993] Removing ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000027616_7069696.pth -[2023-09-26 08:29:14,033][92474] Updated weights for policy 1, policy_version 30560 (0.0015) -[2023-09-26 08:29:14,033][92473] Updated weights for policy 0, policy_version 30560 (0.0017) -[2023-09-26 08:29:18,762][91478] Fps is (10 sec: 6144.1, 60 sec: 6280.5, 300 sec: 6248.1). 
Total num frames: 15671296. Throughput: 0: 784.6, 1: 785.6. Samples: 3917825. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:29:18,762][91478] Avg episode reward: [(0, '6.630'), (1, '6.880')] -[2023-09-26 08:29:23,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 15704064. Throughput: 0: 784.3, 1: 782.5. Samples: 3922447. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:29:23,763][91478] Avg episode reward: [(0, '6.740'), (1, '6.830')] -[2023-09-26 08:29:27,083][92473] Updated weights for policy 0, policy_version 30720 (0.0015) -[2023-09-26 08:29:27,083][92474] Updated weights for policy 1, policy_version 30720 (0.0018) -[2023-09-26 08:29:28,762][91478] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 15736832. Throughput: 0: 789.1, 1: 789.8. Samples: 3932141. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:29:28,763][91478] Avg episode reward: [(0, '6.730'), (1, '6.780')] -[2023-09-26 08:29:33,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 15769600. Throughput: 0: 786.7, 1: 786.0. Samples: 3941481. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:29:33,763][91478] Avg episode reward: [(0, '6.620'), (1, '6.750')] -[2023-09-26 08:29:38,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6275.9). Total num frames: 15802368. Throughput: 0: 792.4, 1: 793.2. Samples: 3946464. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:29:38,763][91478] Avg episode reward: [(0, '6.620'), (1, '6.940')] -[2023-09-26 08:29:39,936][92473] Updated weights for policy 0, policy_version 30880 (0.0017) -[2023-09-26 08:29:39,936][92474] Updated weights for policy 1, policy_version 30880 (0.0017) -[2023-09-26 08:29:43,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6275.9). Total num frames: 15835136. Throughput: 0: 788.9, 1: 788.2. Samples: 3955939. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:29:43,763][91478] Avg episode reward: [(0, '6.720'), (1, '6.940')] -[2023-09-26 08:29:48,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 15859712. Throughput: 0: 790.5, 1: 790.0. Samples: 3965415. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:29:48,763][91478] Avg episode reward: [(0, '6.840'), (1, '7.030')] -[2023-09-26 08:29:52,698][92474] Updated weights for policy 1, policy_version 31040 (0.0018) -[2023-09-26 08:29:52,699][92473] Updated weights for policy 0, policy_version 31040 (0.0016) -[2023-09-26 08:29:53,762][91478] Fps is (10 sec: 5734.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 15892480. Throughput: 0: 790.6, 1: 789.9. Samples: 3970374. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:29:53,762][91478] Avg episode reward: [(0, '6.830'), (1, '6.900')] -[2023-09-26 08:29:58,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 15925248. Throughput: 0: 790.9, 1: 790.8. Samples: 3979538. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:29:58,762][91478] Avg episode reward: [(0, '6.840'), (1, '6.740')] -[2023-09-26 08:30:03,762][91478] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 15958016. Throughput: 0: 796.0, 1: 795.1. Samples: 3989425. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:30:03,763][91478] Avg episode reward: [(0, '6.720'), (1, '6.780')] -[2023-09-26 08:30:05,747][92474] Updated weights for policy 1, policy_version 31200 (0.0017) -[2023-09-26 08:30:05,747][92473] Updated weights for policy 0, policy_version 31200 (0.0019) -[2023-09-26 08:30:08,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6348.8, 300 sec: 6275.9). Total num frames: 15990784. Throughput: 0: 792.9, 1: 792.7. Samples: 3993798. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:30:08,763][91478] Avg episode reward: [(0, '6.850'), (1, '6.810')] -[2023-09-26 08:30:13,762][91478] Fps is (10 sec: 6144.1, 60 sec: 6348.8, 300 sec: 6262.0). Total num frames: 16019456. Throughput: 0: 790.6, 1: 788.9. Samples: 4003220. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:30:13,763][91478] Avg episode reward: [(0, '6.720'), (1, '6.880')] -[2023-09-26 08:30:18,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 16048128. Throughput: 0: 789.1, 1: 788.3. Samples: 4012463. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-09-26 08:30:18,763][91478] Avg episode reward: [(0, '6.700'), (1, '6.780')] -[2023-09-26 08:30:18,938][92473] Updated weights for policy 0, policy_version 31360 (0.0014) -[2023-09-26 08:30:18,939][92474] Updated weights for policy 1, policy_version 31360 (0.0018) -[2023-09-26 08:30:23,762][91478] Fps is (10 sec: 6144.0, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16080896. Throughput: 0: 785.6, 1: 787.1. Samples: 4017238. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-09-26 08:30:23,763][91478] Avg episode reward: [(0, '6.690'), (1, '7.070')] -[2023-09-26 08:30:28,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16113664. Throughput: 0: 781.6, 1: 782.6. Samples: 4026331. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-09-26 08:30:28,763][91478] Avg episode reward: [(0, '6.660'), (1, '6.890')] -[2023-09-26 08:30:32,258][92474] Updated weights for policy 1, policy_version 31520 (0.0017) -[2023-09-26 08:30:32,258][92473] Updated weights for policy 0, policy_version 31520 (0.0017) -[2023-09-26 08:30:33,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 16146432. Throughput: 0: 781.1, 1: 781.2. Samples: 4035716. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-09-26 08:30:33,762][91478] Avg episode reward: [(0, '6.570'), (1, '6.950')] -[2023-09-26 08:30:38,762][91478] Fps is (10 sec: 6144.0, 60 sec: 6212.3, 300 sec: 6262.0). Total num frames: 16175104. Throughput: 0: 780.5, 1: 778.4. Samples: 4040525. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:30:38,763][91478] Avg episode reward: [(0, '6.680'), (1, '6.740')] -[2023-09-26 08:30:43,762][91478] Fps is (10 sec: 5734.2, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 16203776. Throughput: 0: 779.3, 1: 779.2. Samples: 4049669. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:30:43,763][91478] Avg episode reward: [(0, '6.820'), (1, '6.860')] -[2023-09-26 08:30:45,379][92474] Updated weights for policy 1, policy_version 31680 (0.0017) -[2023-09-26 08:30:45,379][92473] Updated weights for policy 0, policy_version 31680 (0.0018) -[2023-09-26 08:30:48,762][91478] Fps is (10 sec: 6144.1, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 16236544. Throughput: 0: 774.1, 1: 775.0. Samples: 4059136. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:30:48,762][91478] Avg episode reward: [(0, '6.610'), (1, '6.830')] -[2023-09-26 08:30:53,762][91478] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6262.0). Total num frames: 16269312. Throughput: 0: 776.4, 1: 776.5. Samples: 4063679. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:30:53,762][91478] Avg episode reward: [(0, '6.730'), (1, '6.800')] -[2023-09-26 08:30:58,350][92474] Updated weights for policy 1, policy_version 31840 (0.0016) -[2023-09-26 08:30:58,350][92473] Updated weights for policy 0, policy_version 31840 (0.0018) -[2023-09-26 08:30:58,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16302080. Throughput: 0: 779.5, 1: 781.3. Samples: 4073455. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:30:58,762][91478] Avg episode reward: [(0, '6.630'), (1, '6.760')] -[2023-09-26 08:31:03,762][91478] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16334848. Throughput: 0: 782.4, 1: 783.0. Samples: 4082903. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:31:03,763][91478] Avg episode reward: [(0, '6.960'), (1, '6.790')] -[2023-09-26 08:31:08,762][91478] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16367616. Throughput: 0: 784.1, 1: 780.4. Samples: 4087642. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:31:08,763][91478] Avg episode reward: [(0, '6.740'), (1, '6.980')] -[2023-09-26 08:31:11,427][92474] Updated weights for policy 1, policy_version 32000 (0.0016) -[2023-09-26 08:31:11,427][92473] Updated weights for policy 0, policy_version 32000 (0.0018) -[2023-09-26 08:31:13,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6212.3, 300 sec: 6248.1). Total num frames: 16392192. Throughput: 0: 783.6, 1: 783.3. Samples: 4096844. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:31:13,763][91478] Avg episode reward: [(0, '6.710'), (1, '6.900')] -[2023-09-26 08:31:13,956][91993] Saving ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000032032_8200192.pth... -[2023-09-26 08:31:13,976][92345] Saving ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000032032_8200192.pth... -[2023-09-26 08:31:13,989][91993] Removing ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000029088_7446528.pth -[2023-09-26 08:31:14,004][92345] Removing ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000029088_7446528.pth -[2023-09-26 08:31:18,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16424960. Throughput: 0: 784.9, 1: 784.8. Samples: 4106354. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:31:18,763][91478] Avg episode reward: [(0, '6.720'), (1, '6.900')] -[2023-09-26 08:31:23,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16457728. Throughput: 0: 784.3, 1: 786.6. Samples: 4111215. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:31:23,763][91478] Avg episode reward: [(0, '6.800'), (1, '7.350')] -[2023-09-26 08:31:23,764][92345] Saving new best policy, reward=7.350! -[2023-09-26 08:31:24,242][92474] Updated weights for policy 1, policy_version 32160 (0.0016) -[2023-09-26 08:31:24,242][92473] Updated weights for policy 0, policy_version 32160 (0.0019) -[2023-09-26 08:31:28,762][91478] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 16490496. Throughput: 0: 787.5, 1: 788.7. Samples: 4120597. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:31:28,762][91478] Avg episode reward: [(0, '6.690'), (1, '6.980')] -[2023-09-26 08:31:33,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16523264. Throughput: 0: 784.3, 1: 783.0. Samples: 4129665. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:31:33,763][91478] Avg episode reward: [(0, '6.590'), (1, '7.250')] -[2023-09-26 08:31:37,528][92473] Updated weights for policy 0, policy_version 32320 (0.0016) -[2023-09-26 08:31:37,528][92474] Updated weights for policy 1, policy_version 32320 (0.0016) -[2023-09-26 08:31:38,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6212.3, 300 sec: 6248.1). Total num frames: 16547840. Throughput: 0: 787.1, 1: 786.8. Samples: 4134502. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:31:38,763][91478] Avg episode reward: [(0, '6.620'), (1, '6.960')] -[2023-09-26 08:31:43,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 16580608. Throughput: 0: 784.5, 1: 784.5. Samples: 4144060. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:31:43,763][91478] Avg episode reward: [(0, '6.650'), (1, '6.870')] -[2023-09-26 08:31:48,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16613376. Throughput: 0: 782.2, 1: 783.2. Samples: 4153346. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:31:48,762][91478] Avg episode reward: [(0, '6.540'), (1, '6.850')] -[2023-09-26 08:31:50,534][92473] Updated weights for policy 0, policy_version 32480 (0.0017) -[2023-09-26 08:31:50,534][92474] Updated weights for policy 1, policy_version 32480 (0.0017) -[2023-09-26 08:31:53,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16646144. Throughput: 0: 781.8, 1: 783.5. Samples: 4158082. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:31:53,763][91478] Avg episode reward: [(0, '6.740'), (1, '7.590')] -[2023-09-26 08:31:53,764][92345] Saving new best policy, reward=7.590! -[2023-09-26 08:31:58,762][91478] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16678912. Throughput: 0: 786.7, 1: 787.7. Samples: 4167690. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:31:58,763][91478] Avg episode reward: [(0, '6.750'), (1, '7.650')] -[2023-09-26 08:31:58,773][92345] Saving new best policy, reward=7.650! -[2023-09-26 08:32:03,442][92473] Updated weights for policy 0, policy_version 32640 (0.0018) -[2023-09-26 08:32:03,442][92474] Updated weights for policy 1, policy_version 32640 (0.0017) -[2023-09-26 08:32:03,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 16711680. Throughput: 0: 789.6, 1: 788.1. Samples: 4177351. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:32:03,762][91478] Avg episode reward: [(0, '6.820'), (1, '7.640')] -[2023-09-26 08:32:08,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 16744448. 
Throughput: 0: 786.0, 1: 786.5. Samples: 4181979. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:32:08,762][91478] Avg episode reward: [(0, '6.600'), (1, '7.550')] -[2023-09-26 08:32:13,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16769024. Throughput: 0: 788.4, 1: 787.1. Samples: 4191495. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:32:13,763][91478] Avg episode reward: [(0, '6.560'), (1, '7.040')] -[2023-09-26 08:32:16,510][92474] Updated weights for policy 1, policy_version 32800 (0.0017) -[2023-09-26 08:32:16,510][92473] Updated weights for policy 0, policy_version 32800 (0.0016) -[2023-09-26 08:32:18,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16801792. Throughput: 0: 788.2, 1: 787.9. Samples: 4200586. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:32:18,763][91478] Avg episode reward: [(0, '6.600'), (1, '6.820')] -[2023-09-26 08:32:23,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16834560. Throughput: 0: 787.0, 1: 786.5. Samples: 4205311. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:32:23,763][91478] Avg episode reward: [(0, '6.820'), (1, '6.900')] -[2023-09-26 08:32:28,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16867328. Throughput: 0: 785.6, 1: 784.3. Samples: 4214707. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:32:28,763][91478] Avg episode reward: [(0, '6.690'), (1, '6.960')] -[2023-09-26 08:32:29,780][92474] Updated weights for policy 1, policy_version 32960 (0.0018) -[2023-09-26 08:32:29,780][92473] Updated weights for policy 0, policy_version 32960 (0.0019) -[2023-09-26 08:32:33,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6303.7). Total num frames: 16900096. Throughput: 0: 784.9, 1: 784.0. Samples: 4223947. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:32:33,763][91478] Avg episode reward: [(0, '6.730'), (1, '6.860')] -[2023-09-26 08:32:38,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 16924672. Throughput: 0: 784.4, 1: 783.2. Samples: 4228625. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:32:38,762][91478] Avg episode reward: [(0, '6.700'), (1, '6.980')] -[2023-09-26 08:32:42,981][92473] Updated weights for policy 0, policy_version 33120 (0.0017) -[2023-09-26 08:32:42,981][92474] Updated weights for policy 1, policy_version 33120 (0.0017) -[2023-09-26 08:32:43,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 16957440. Throughput: 0: 778.7, 1: 777.4. Samples: 4237717. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:32:43,762][91478] Avg episode reward: [(0, '6.840'), (1, '6.960')] -[2023-09-26 08:32:48,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16990208. Throughput: 0: 774.6, 1: 777.4. Samples: 4247189. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:32:48,762][91478] Avg episode reward: [(0, '6.710'), (1, '6.820')] -[2023-09-26 08:32:53,762][91478] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 17022976. Throughput: 0: 773.6, 1: 774.5. Samples: 4251646. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) -[2023-09-26 08:32:53,763][91478] Avg episode reward: [(0, '6.680'), (1, '7.230')] -[2023-09-26 08:32:56,335][92473] Updated weights for policy 0, policy_version 33280 (0.0016) -[2023-09-26 08:32:56,336][92474] Updated weights for policy 1, policy_version 33280 (0.0018) -[2023-09-26 08:32:58,762][91478] Fps is (10 sec: 5734.2, 60 sec: 6144.0, 300 sec: 6275.9). Total num frames: 17047552. Throughput: 0: 769.8, 1: 769.2. Samples: 4260750. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) -[2023-09-26 08:32:58,763][91478] Avg episode reward: [(0, '6.680'), (1, '6.840')] -[2023-09-26 08:33:03,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6275.9). Total num frames: 17080320. Throughput: 0: 770.2, 1: 772.9. Samples: 4270027. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) -[2023-09-26 08:33:03,762][91478] Avg episode reward: [(0, '6.700'), (1, '7.040')] -[2023-09-26 08:33:08,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6275.9). Total num frames: 17113088. Throughput: 0: 767.4, 1: 768.4. Samples: 4274422. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) -[2023-09-26 08:33:08,762][91478] Avg episode reward: [(0, '6.590'), (1, '7.140')] -[2023-09-26 08:33:09,713][92473] Updated weights for policy 0, policy_version 33440 (0.0018) -[2023-09-26 08:33:09,713][92474] Updated weights for policy 1, policy_version 33440 (0.0017) -[2023-09-26 08:33:13,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 17145856. Throughput: 0: 770.2, 1: 771.4. Samples: 4284077. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:33:13,763][91478] Avg episode reward: [(0, '6.620'), (1, '6.850')] -[2023-09-26 08:33:13,774][91993] Saving ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000033488_8572928.pth... -[2023-09-26 08:33:13,775][92345] Saving ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000033488_8572928.pth... -[2023-09-26 08:33:13,812][91993] Removing ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000030544_7819264.pth -[2023-09-26 08:33:13,812][92345] Removing ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000030544_7819264.pth -[2023-09-26 08:33:18,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 17170432. Throughput: 0: 769.3, 1: 768.6. Samples: 4293153. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:33:18,763][91478] Avg episode reward: [(0, '6.600'), (1, '6.890')] -[2023-09-26 08:33:22,867][92473] Updated weights for policy 0, policy_version 33600 (0.0017) -[2023-09-26 08:33:22,868][92474] Updated weights for policy 1, policy_version 33600 (0.0019) -[2023-09-26 08:33:23,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 17203200. Throughput: 0: 768.5, 1: 768.7. Samples: 4297799. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:33:23,763][91478] Avg episode reward: [(0, '6.800'), (1, '7.170')] -[2023-09-26 08:33:28,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 17235968. Throughput: 0: 770.9, 1: 770.7. Samples: 4307091. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:33:28,763][91478] Avg episode reward: [(0, '6.670'), (1, '6.920')] -[2023-09-26 08:33:33,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6275.9). Total num frames: 17268736. Throughput: 0: 768.4, 1: 768.4. Samples: 4316344. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:33:33,763][91478] Avg episode reward: [(0, '6.590'), (1, '7.030')] -[2023-09-26 08:33:36,175][92473] Updated weights for policy 0, policy_version 33760 (0.0018) -[2023-09-26 08:33:36,175][92474] Updated weights for policy 1, policy_version 33760 (0.0015) -[2023-09-26 08:33:38,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 17293312. Throughput: 0: 772.4, 1: 771.2. Samples: 4321109. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 08:33:38,763][91478] Avg episode reward: [(0, '6.590'), (1, '7.150')] -[2023-09-26 08:33:43,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 17326080. Throughput: 0: 772.2, 1: 772.7. Samples: 4330271. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 08:33:43,763][91478] Avg episode reward: [(0, '6.820'), (1, '6.660')] -[2023-09-26 08:33:48,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 17358848. Throughput: 0: 775.0, 1: 773.7. Samples: 4339717. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 08:33:48,763][91478] Avg episode reward: [(0, '6.770'), (1, '6.810')] -[2023-09-26 08:33:49,204][92473] Updated weights for policy 0, policy_version 33920 (0.0016) -[2023-09-26 08:33:49,204][92474] Updated weights for policy 1, policy_version 33920 (0.0017) -[2023-09-26 08:33:53,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 17391616. Throughput: 0: 778.6, 1: 779.0. Samples: 4344516. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 08:33:53,763][91478] Avg episode reward: [(0, '6.640'), (1, '6.910')] -[2023-09-26 08:33:58,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 17424384. Throughput: 0: 777.2, 1: 777.7. Samples: 4354048. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:33:58,763][91478] Avg episode reward: [(0, '6.750'), (1, '6.960')] -[2023-09-26 08:34:02,327][92474] Updated weights for policy 1, policy_version 34080 (0.0019) -[2023-09-26 08:34:02,327][92473] Updated weights for policy 0, policy_version 34080 (0.0020) -[2023-09-26 08:34:03,762][91478] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6262.0). Total num frames: 17457152. Throughput: 0: 778.7, 1: 778.4. Samples: 4363224. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:34:03,762][91478] Avg episode reward: [(0, '6.870'), (1, '6.920')] -[2023-09-26 08:34:08,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 17489920. Throughput: 0: 781.6, 1: 782.7. Samples: 4368193. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:34:08,763][91478] Avg episode reward: [(0, '6.870'), (1, '6.810')] -[2023-09-26 08:34:13,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 17514496. Throughput: 0: 781.8, 1: 781.8. Samples: 4377450. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:34:13,763][91478] Avg episode reward: [(0, '6.870'), (1, '6.930')] -[2023-09-26 08:34:15,354][92474] Updated weights for policy 1, policy_version 34240 (0.0016) -[2023-09-26 08:34:15,354][92473] Updated weights for policy 0, policy_version 34240 (0.0016) -[2023-09-26 08:34:18,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 17547264. Throughput: 0: 782.9, 1: 783.2. Samples: 4386816. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:34:18,763][91478] Avg episode reward: [(0, '6.640'), (1, '6.960')] -[2023-09-26 08:34:23,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 17580032. Throughput: 0: 778.9, 1: 778.7. Samples: 4391202. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 08:34:23,763][91478] Avg episode reward: [(0, '6.740'), (1, '7.150')] -[2023-09-26 08:34:28,554][92473] Updated weights for policy 0, policy_version 34400 (0.0016) -[2023-09-26 08:34:28,554][92474] Updated weights for policy 1, policy_version 34400 (0.0016) -[2023-09-26 08:34:28,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 17612800. Throughput: 0: 784.3, 1: 784.1. Samples: 4400850. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 08:34:28,763][91478] Avg episode reward: [(0, '6.840'), (1, '7.090')] -[2023-09-26 08:34:33,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 17637376. Throughput: 0: 782.2, 1: 780.5. Samples: 4410040. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 08:34:33,763][91478] Avg episode reward: [(0, '6.670'), (1, '7.010')] -[2023-09-26 08:34:38,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 17670144. Throughput: 0: 782.3, 1: 782.1. Samples: 4414914. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-09-26 08:34:38,763][91478] Avg episode reward: [(0, '6.540'), (1, '7.340')] -[2023-09-26 08:34:41,610][92473] Updated weights for policy 0, policy_version 34560 (0.0016) -[2023-09-26 08:34:41,610][92474] Updated weights for policy 1, policy_version 34560 (0.0016) -[2023-09-26 08:34:43,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 17702912. Throughput: 0: 778.7, 1: 778.4. Samples: 4424117. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-09-26 08:34:43,763][91478] Avg episode reward: [(0, '6.650'), (1, '7.030')] -[2023-09-26 08:34:48,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 17735680. Throughput: 0: 783.7, 1: 784.6. Samples: 4433798. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-09-26 08:34:48,763][91478] Avg episode reward: [(0, '6.620'), (1, '7.270')] -[2023-09-26 08:34:53,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 17768448. Throughput: 0: 776.3, 1: 776.5. Samples: 4438071. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-09-26 08:34:53,763][91478] Avg episode reward: [(0, '6.570'), (1, '6.980')] -[2023-09-26 08:34:54,930][92474] Updated weights for policy 1, policy_version 34720 (0.0017) -[2023-09-26 08:34:54,930][92473] Updated weights for policy 0, policy_version 34720 (0.0016) -[2023-09-26 08:34:58,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 17793024. Throughput: 0: 777.0, 1: 778.1. Samples: 4447427. 
Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-09-26 08:34:58,763][91478] Avg episode reward: [(0, '6.710'), (1, '6.650')] -[2023-09-26 08:35:03,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 17825792. Throughput: 0: 775.6, 1: 774.2. Samples: 4456557. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) -[2023-09-26 08:35:03,762][91478] Avg episode reward: [(0, '6.630'), (1, '6.820')] -[2023-09-26 08:35:08,132][92473] Updated weights for policy 0, policy_version 34880 (0.0017) -[2023-09-26 08:35:08,132][92474] Updated weights for policy 1, policy_version 34880 (0.0018) -[2023-09-26 08:35:08,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6234.2). Total num frames: 17858560. Throughput: 0: 780.6, 1: 778.6. Samples: 4461363. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:35:08,763][91478] Avg episode reward: [(0, '6.690'), (1, '6.760')] -[2023-09-26 08:35:13,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 17891328. Throughput: 0: 776.0, 1: 777.8. Samples: 4470772. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:35:13,763][91478] Avg episode reward: [(0, '6.700'), (1, '6.890')] -[2023-09-26 08:35:13,776][91993] Saving ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000034944_8945664.pth... -[2023-09-26 08:35:13,776][92345] Saving ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000034944_8945664.pth... -[2023-09-26 08:35:13,812][92345] Removing ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000032032_8200192.pth -[2023-09-26 08:35:13,813][91993] Removing ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000032032_8200192.pth -[2023-09-26 08:35:18,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 17924096. Throughput: 0: 777.6, 1: 777.3. Samples: 4480011. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:35:18,763][91478] Avg episode reward: [(0, '6.710'), (1, '6.950')] -[2023-09-26 08:35:21,188][92473] Updated weights for policy 0, policy_version 35040 (0.0016) -[2023-09-26 08:35:21,189][92474] Updated weights for policy 1, policy_version 35040 (0.0015) -[2023-09-26 08:35:23,762][91478] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 17956864. Throughput: 0: 777.6, 1: 777.4. Samples: 4484887. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:35:23,763][91478] Avg episode reward: [(0, '6.670'), (1, '7.090')] -[2023-09-26 08:35:28,762][91478] Fps is (10 sec: 5734.6, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 17981440. Throughput: 0: 776.4, 1: 775.5. Samples: 4493953. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:35:28,762][91478] Avg episode reward: [(0, '6.740'), (1, '6.800')] -[2023-09-26 08:35:33,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6280.6, 300 sec: 6234.3). Total num frames: 18014208. Throughput: 0: 774.5, 1: 774.6. Samples: 4503507. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:35:33,762][91478] Avg episode reward: [(0, '6.940'), (1, '7.040')] -[2023-09-26 08:35:34,468][92473] Updated weights for policy 0, policy_version 35200 (0.0016) -[2023-09-26 08:35:34,469][92474] Updated weights for policy 1, policy_version 35200 (0.0017) -[2023-09-26 08:35:38,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 18046976. Throughput: 0: 774.9, 1: 774.5. Samples: 4507794. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:35:38,762][91478] Avg episode reward: [(0, '6.510'), (1, '7.070')] -[2023-09-26 08:35:43,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 18079744. Throughput: 0: 779.2, 1: 778.8. Samples: 4517538. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:35:43,762][91478] Avg episode reward: [(0, '6.520'), (1, '6.770')] -[2023-09-26 08:35:47,382][92474] Updated weights for policy 1, policy_version 35360 (0.0015) -[2023-09-26 08:35:47,383][92473] Updated weights for policy 0, policy_version 35360 (0.0017) -[2023-09-26 08:35:48,762][91478] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 18112512. Throughput: 0: 782.6, 1: 782.8. Samples: 4527002. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:35:48,763][91478] Avg episode reward: [(0, '6.450'), (1, '6.760')] -[2023-09-26 08:35:53,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 18137088. Throughput: 0: 778.2, 1: 781.4. Samples: 4531544. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-09-26 08:35:53,763][91478] Avg episode reward: [(0, '6.670'), (1, '6.850')] -[2023-09-26 08:35:58,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 18169856. Throughput: 0: 779.9, 1: 778.5. Samples: 4540901. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-09-26 08:35:58,762][91478] Avg episode reward: [(0, '6.820'), (1, '7.010')] -[2023-09-26 08:36:00,505][92474] Updated weights for policy 1, policy_version 35520 (0.0016) -[2023-09-26 08:36:00,505][92473] Updated weights for policy 0, policy_version 35520 (0.0016) -[2023-09-26 08:36:03,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 18202624. Throughput: 0: 783.9, 1: 785.2. Samples: 4550620. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-09-26 08:36:03,762][91478] Avg episode reward: [(0, '6.650'), (1, '7.160')] -[2023-09-26 08:36:08,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 18235392. Throughput: 0: 780.6, 1: 780.6. Samples: 4555140. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-09-26 08:36:08,763][91478] Avg episode reward: [(0, '6.740'), (1, '6.920')] -[2023-09-26 08:36:13,368][92474] Updated weights for policy 1, policy_version 35680 (0.0015) -[2023-09-26 08:36:13,369][92473] Updated weights for policy 0, policy_version 35680 (0.0014) -[2023-09-26 08:36:13,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 18268160. Throughput: 0: 788.7, 1: 787.8. Samples: 4564894. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) -[2023-09-26 08:36:13,763][91478] Avg episode reward: [(0, '6.610'), (1, '6.970')] -[2023-09-26 08:36:18,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 18300928. Throughput: 0: 786.8, 1: 786.6. Samples: 4574310. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) -[2023-09-26 08:36:18,763][91478] Avg episode reward: [(0, '6.800'), (1, '7.090')] -[2023-09-26 08:36:23,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 18325504. Throughput: 0: 792.7, 1: 792.6. Samples: 4579132. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) -[2023-09-26 08:36:23,763][91478] Avg episode reward: [(0, '6.710'), (1, '7.090')] -[2023-09-26 08:36:26,387][92473] Updated weights for policy 0, policy_version 35840 (0.0017) -[2023-09-26 08:36:26,388][92474] Updated weights for policy 1, policy_version 35840 (0.0016) -[2023-09-26 08:36:28,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 18358272. Throughput: 0: 788.0, 1: 787.2. Samples: 4588422. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) -[2023-09-26 08:36:28,763][91478] Avg episode reward: [(0, '6.680'), (1, '6.960')] -[2023-09-26 08:36:33,762][91478] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 18391040. Throughput: 0: 787.4, 1: 787.1. Samples: 4597858. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) -[2023-09-26 08:36:33,762][91478] Avg episode reward: [(0, '6.520'), (1, '6.920')] -[2023-09-26 08:36:38,762][91478] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 18423808. Throughput: 0: 787.7, 1: 787.3. Samples: 4602419. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 08:36:38,762][91478] Avg episode reward: [(0, '6.600'), (1, '6.840')] -[2023-09-26 08:36:39,735][92474] Updated weights for policy 1, policy_version 36000 (0.0017) -[2023-09-26 08:36:39,735][92473] Updated weights for policy 0, policy_version 36000 (0.0018) -[2023-09-26 08:36:43,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 18456576. Throughput: 0: 786.9, 1: 787.4. Samples: 4611743. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 08:36:43,762][91478] Avg episode reward: [(0, '6.630'), (1, '6.930')] -[2023-09-26 08:36:48,762][91478] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 18489344. Throughput: 0: 785.2, 1: 783.8. Samples: 4621226. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 08:36:48,763][91478] Avg episode reward: [(0, '6.760'), (1, '6.840')] -[2023-09-26 08:36:52,493][92474] Updated weights for policy 1, policy_version 36160 (0.0018) -[2023-09-26 08:36:52,493][92473] Updated weights for policy 0, policy_version 36160 (0.0017) -[2023-09-26 08:36:53,762][91478] Fps is (10 sec: 5734.2, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 18513920. Throughput: 0: 790.6, 1: 790.2. Samples: 4626276. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 08:36:53,763][91478] Avg episode reward: [(0, '6.820'), (1, '7.070')] -[2023-09-26 08:36:58,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 18546688. Throughput: 0: 782.9, 1: 784.0. Samples: 4635402. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) -[2023-09-26 08:36:58,763][91478] Avg episode reward: [(0, '6.710'), (1, '6.880')] -[2023-09-26 08:37:03,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 18579456. Throughput: 0: 783.4, 1: 784.6. Samples: 4644868. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:37:03,763][91478] Avg episode reward: [(0, '6.580'), (1, '6.860')] -[2023-09-26 08:37:05,445][92474] Updated weights for policy 1, policy_version 36320 (0.0020) -[2023-09-26 08:37:05,445][92473] Updated weights for policy 0, policy_version 36320 (0.0020) -[2023-09-26 08:37:08,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 18612224. Throughput: 0: 784.5, 1: 784.8. Samples: 4649750. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:37:08,762][91478] Avg episode reward: [(0, '6.830'), (1, '6.920')] -[2023-09-26 08:37:13,762][91478] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 18644992. Throughput: 0: 785.6, 1: 786.6. Samples: 4659170. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:37:13,762][91478] Avg episode reward: [(0, '6.810'), (1, '7.110')] -[2023-09-26 08:37:13,770][91993] Saving ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000036416_9322496.pth... -[2023-09-26 08:37:13,771][92345] Saving ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000036416_9322496.pth... -[2023-09-26 08:37:13,804][91993] Removing ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000033488_8572928.pth -[2023-09-26 08:37:13,811][92345] Removing ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000033488_8572928.pth -[2023-09-26 08:37:18,603][92474] Updated weights for policy 1, policy_version 36480 (0.0014) -[2023-09-26 08:37:18,605][92473] Updated weights for policy 0, policy_version 36480 (0.0014) -[2023-09-26 08:37:18,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). 
Total num frames: 18677760. Throughput: 0: 783.9, 1: 783.9. Samples: 4668412. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:37:18,763][91478] Avg episode reward: [(0, '6.580'), (1, '6.950')] -[2023-09-26 08:37:23,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 18702336. Throughput: 0: 787.3, 1: 786.0. Samples: 4673219. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:37:23,763][91478] Avg episode reward: [(0, '6.650'), (1, '7.050')] -[2023-09-26 08:37:28,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 18735104. Throughput: 0: 786.1, 1: 786.2. Samples: 4682497. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:37:28,762][91478] Avg episode reward: [(0, '6.590'), (1, '7.180')] -[2023-09-26 08:37:31,578][92473] Updated weights for policy 0, policy_version 36640 (0.0017) -[2023-09-26 08:37:31,579][92474] Updated weights for policy 1, policy_version 36640 (0.0020) -[2023-09-26 08:37:33,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 18767872. Throughput: 0: 785.0, 1: 787.1. Samples: 4691972. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:37:33,763][91478] Avg episode reward: [(0, '6.820'), (1, '7.130')] -[2023-09-26 08:37:38,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 18800640. Throughput: 0: 779.5, 1: 779.7. Samples: 4696439. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:37:38,763][91478] Avg episode reward: [(0, '6.710'), (1, '6.860')] -[2023-09-26 08:37:43,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 18833408. Throughput: 0: 786.8, 1: 787.5. Samples: 4706245. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:37:43,763][91478] Avg episode reward: [(0, '6.630'), (1, '7.090')] -[2023-09-26 08:37:44,711][92474] Updated weights for policy 1, policy_version 36800 (0.0018) -[2023-09-26 08:37:44,711][92473] Updated weights for policy 0, policy_version 36800 (0.0018) -[2023-09-26 08:37:48,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 18866176. Throughput: 0: 786.9, 1: 785.2. Samples: 4715611. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:37:48,763][91478] Avg episode reward: [(0, '6.620'), (1, '7.290')] -[2023-09-26 08:37:53,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 18890752. Throughput: 0: 784.4, 1: 781.4. Samples: 4720214. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:37:53,763][91478] Avg episode reward: [(0, '6.360'), (1, '7.150')] -[2023-09-26 08:37:57,925][92473] Updated weights for policy 0, policy_version 36960 (0.0015) -[2023-09-26 08:37:57,925][92474] Updated weights for policy 1, policy_version 36960 (0.0016) -[2023-09-26 08:37:58,762][91478] Fps is (10 sec: 5734.6, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 18923520. Throughput: 0: 778.8, 1: 778.0. Samples: 4729226. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:37:58,762][91478] Avg episode reward: [(0, '6.360'), (1, '6.780')] -[2023-09-26 08:38:03,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 18956288. Throughput: 0: 781.5, 1: 779.4. Samples: 4738654. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:38:03,763][91478] Avg episode reward: [(0, '6.480'), (1, '6.940')] -[2023-09-26 08:38:08,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 18989056. Throughput: 0: 776.4, 1: 778.0. Samples: 4743168. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:38:08,763][91478] Avg episode reward: [(0, '6.310'), (1, '6.860')] -[2023-09-26 08:38:11,167][92473] Updated weights for policy 0, policy_version 37120 (0.0017) -[2023-09-26 08:38:11,168][92474] Updated weights for policy 1, policy_version 37120 (0.0019) -[2023-09-26 08:38:13,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 19021824. Throughput: 0: 779.9, 1: 779.8. Samples: 4752686. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 08:38:13,763][91478] Avg episode reward: [(0, '6.620'), (1, '6.720')] -[2023-09-26 08:38:18,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 19046400. Throughput: 0: 775.7, 1: 774.6. Samples: 4761737. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 08:38:18,763][91478] Avg episode reward: [(0, '6.740'), (1, '6.980')] -[2023-09-26 08:38:23,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19079168. Throughput: 0: 779.5, 1: 779.2. Samples: 4766582. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 08:38:23,763][91478] Avg episode reward: [(0, '6.730'), (1, '7.130')] -[2023-09-26 08:38:24,509][92474] Updated weights for policy 1, policy_version 37280 (0.0016) -[2023-09-26 08:38:24,509][92473] Updated weights for policy 0, policy_version 37280 (0.0016) -[2023-09-26 08:38:28,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19111936. Throughput: 0: 774.2, 1: 774.5. Samples: 4775936. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 08:38:28,762][91478] Avg episode reward: [(0, '6.600'), (1, '6.790')] -[2023-09-26 08:38:33,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 19144704. Throughput: 0: 772.6, 1: 774.0. Samples: 4785210. 
Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) -[2023-09-26 08:38:33,762][91478] Avg episode reward: [(0, '6.500'), (1, '6.760')] -[2023-09-26 08:38:37,704][92473] Updated weights for policy 0, policy_version 37440 (0.0017) -[2023-09-26 08:38:37,704][92474] Updated weights for policy 1, policy_version 37440 (0.0016) -[2023-09-26 08:38:38,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 19169280. Throughput: 0: 773.1, 1: 774.7. Samples: 4789866. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:38:38,762][91478] Avg episode reward: [(0, '6.640'), (1, '7.040')] -[2023-09-26 08:38:43,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 19202048. Throughput: 0: 773.7, 1: 773.5. Samples: 4798851. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:38:43,763][91478] Avg episode reward: [(0, '6.560'), (1, '6.960')] -[2023-09-26 08:38:48,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 19234816. Throughput: 0: 776.0, 1: 776.5. Samples: 4808518. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:38:48,763][91478] Avg episode reward: [(0, '6.700'), (1, '6.920')] -[2023-09-26 08:38:50,780][92473] Updated weights for policy 0, policy_version 37600 (0.0017) -[2023-09-26 08:38:50,781][92474] Updated weights for policy 1, policy_version 37600 (0.0018) -[2023-09-26 08:38:53,762][91478] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 19267584. Throughput: 0: 775.7, 1: 774.3. Samples: 4812921. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:38:53,762][91478] Avg episode reward: [(0, '6.700'), (1, '6.930')] -[2023-09-26 08:38:58,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19300352. Throughput: 0: 775.6, 1: 774.8. Samples: 4822456. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:38:58,763][91478] Avg episode reward: [(0, '6.720'), (1, '7.130')] -[2023-09-26 08:39:03,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 19324928. Throughput: 0: 777.2, 1: 776.5. Samples: 4831656. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:39:03,763][91478] Avg episode reward: [(0, '6.720'), (1, '6.910')] -[2023-09-26 08:39:03,899][92473] Updated weights for policy 0, policy_version 37760 (0.0015) -[2023-09-26 08:39:03,899][92474] Updated weights for policy 1, policy_version 37760 (0.0018) -[2023-09-26 08:39:08,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 19357696. Throughput: 0: 776.0, 1: 776.4. Samples: 4836443. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:39:08,762][91478] Avg episode reward: [(0, '6.720'), (1, '6.710')] -[2023-09-26 08:39:13,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 19390464. Throughput: 0: 773.8, 1: 773.7. Samples: 4845573. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:39:13,762][91478] Avg episode reward: [(0, '6.610'), (1, '7.030')] -[2023-09-26 08:39:13,772][91993] Saving ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000037872_9695232.pth... -[2023-09-26 08:39:13,772][92345] Saving ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000037872_9695232.pth... -[2023-09-26 08:39:13,820][91993] Removing ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000034944_8945664.pth -[2023-09-26 08:39:13,820][92345] Removing ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000034944_8945664.pth -[2023-09-26 08:39:17,282][92473] Updated weights for policy 0, policy_version 37920 (0.0018) -[2023-09-26 08:39:17,282][92474] Updated weights for policy 1, policy_version 37920 (0.0016) -[2023-09-26 08:39:18,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). 
Total num frames: 19423232. Throughput: 0: 772.6, 1: 774.1. Samples: 4854809. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:39:18,763][91478] Avg episode reward: [(0, '6.740'), (1, '6.960')] -[2023-09-26 08:39:23,762][91478] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 19447808. Throughput: 0: 774.0, 1: 774.8. Samples: 4859561. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:39:23,763][91478] Avg episode reward: [(0, '6.740'), (1, '7.080')] -[2023-09-26 08:39:28,762][91478] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 19480576. Throughput: 0: 775.8, 1: 776.7. Samples: 4868712. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-09-26 08:39:28,762][91478] Avg episode reward: [(0, '6.850'), (1, '6.580')] -[2023-09-26 08:39:30,438][92474] Updated weights for policy 1, policy_version 38080 (0.0015) -[2023-09-26 08:39:30,438][92473] Updated weights for policy 0, policy_version 38080 (0.0017) -[2023-09-26 08:39:33,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 19513344. Throughput: 0: 774.2, 1: 777.2. Samples: 4878331. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-09-26 08:39:33,763][91478] Avg episode reward: [(0, '6.750'), (1, '6.530')] -[2023-09-26 08:39:38,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19546112. Throughput: 0: 778.3, 1: 778.2. Samples: 4882963. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-09-26 08:39:38,762][91478] Avg episode reward: [(0, '6.640'), (1, '6.530')] -[2023-09-26 08:39:43,422][92474] Updated weights for policy 1, policy_version 38240 (0.0016) -[2023-09-26 08:39:43,423][92473] Updated weights for policy 0, policy_version 38240 (0.0017) -[2023-09-26 08:39:43,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19578880. Throughput: 0: 778.3, 1: 780.3. Samples: 4892594. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-09-26 08:39:43,764][91478] Avg episode reward: [(0, '6.860'), (1, '6.770')] -[2023-09-26 08:39:48,762][91478] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19611648. Throughput: 0: 777.6, 1: 777.8. Samples: 4901646. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) -[2023-09-26 08:39:48,763][91478] Avg episode reward: [(0, '6.720'), (1, '6.920')] -[2023-09-26 08:39:53,762][91478] Fps is (10 sec: 5734.6, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 19636224. Throughput: 0: 778.8, 1: 778.4. Samples: 4906517. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:39:53,763][91478] Avg episode reward: [(0, '6.650'), (1, '6.730')] -[2023-09-26 08:39:56,387][92473] Updated weights for policy 0, policy_version 38400 (0.0013) -[2023-09-26 08:39:56,388][92474] Updated weights for policy 1, policy_version 38400 (0.0017) -[2023-09-26 08:39:58,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 19668992. Throughput: 0: 784.4, 1: 783.2. Samples: 4916113. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:39:58,763][91478] Avg episode reward: [(0, '6.540'), (1, '6.640')] -[2023-09-26 08:40:03,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19701760. Throughput: 0: 787.6, 1: 785.2. Samples: 4925586. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:40:03,763][91478] Avg episode reward: [(0, '6.510'), (1, '6.940')] -[2023-09-26 08:40:08,762][91478] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19734528. Throughput: 0: 787.4, 1: 787.5. Samples: 4930434. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:40:08,762][91478] Avg episode reward: [(0, '6.660'), (1, '6.880')] -[2023-09-26 08:40:09,292][92473] Updated weights for policy 0, policy_version 38560 (0.0017) -[2023-09-26 08:40:09,292][92474] Updated weights for policy 1, policy_version 38560 (0.0015) -[2023-09-26 08:40:13,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19767296. Throughput: 0: 789.3, 1: 789.9. Samples: 4939777. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) -[2023-09-26 08:40:13,763][91478] Avg episode reward: [(0, '6.570'), (1, '6.520')] -[2023-09-26 08:40:18,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19800064. Throughput: 0: 787.7, 1: 786.3. Samples: 4949158. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-09-26 08:40:18,763][91478] Avg episode reward: [(0, '6.540'), (1, '6.360')] -[2023-09-26 08:40:22,482][92474] Updated weights for policy 1, policy_version 38720 (0.0018) -[2023-09-26 08:40:22,482][92473] Updated weights for policy 0, policy_version 38720 (0.0017) -[2023-09-26 08:40:23,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6417.1, 300 sec: 6275.9). Total num frames: 19832832. Throughput: 0: 789.7, 1: 788.2. Samples: 4953967. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-09-26 08:40:23,763][91478] Avg episode reward: [(0, '6.670'), (1, '6.300')] -[2023-09-26 08:40:28,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19857408. Throughput: 0: 786.5, 1: 785.4. Samples: 4963326. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-09-26 08:40:28,763][91478] Avg episode reward: [(0, '6.690'), (1, '6.760')] -[2023-09-26 08:40:33,762][91478] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19890176. Throughput: 0: 787.0, 1: 788.6. Samples: 4972549. 
Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-09-26 08:40:33,763][91478] Avg episode reward: [(0, '6.700'), (1, '6.920')] -[2023-09-26 08:40:35,454][92473] Updated weights for policy 0, policy_version 38880 (0.0017) -[2023-09-26 08:40:35,454][92474] Updated weights for policy 1, policy_version 38880 (0.0015) -[2023-09-26 08:40:38,762][91478] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19922944. Throughput: 0: 787.6, 1: 788.2. Samples: 4977431. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) -[2023-09-26 08:40:38,763][91478] Avg episode reward: [(0, '6.490'), (1, '7.140')] -[2023-09-26 08:40:43,762][91478] Fps is (10 sec: 6553.5, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 19955712. Throughput: 0: 786.7, 1: 787.0. Samples: 4986932. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 08:40:43,763][91478] Avg episode reward: [(0, '6.680'), (1, '6.970')] -[2023-09-26 08:40:48,240][92474] Updated weights for policy 1, policy_version 39040 (0.0017) -[2023-09-26 08:40:48,240][92473] Updated weights for policy 0, policy_version 39040 (0.0018) -[2023-09-26 08:40:48,762][91478] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 19988480. Throughput: 0: 791.9, 1: 791.1. Samples: 4996820. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) -[2023-09-26 08:40:48,762][91478] Avg episode reward: [(0, '6.490'), (1, '6.960')] -[2023-09-26 08:40:52,116][92513] Stopping RolloutWorker_w6... -[2023-09-26 08:40:52,116][92345] Saving ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000039088_10006528.pth... -[2023-09-26 08:40:52,116][92512] Stopping RolloutWorker_w5... -[2023-09-26 08:40:52,116][92507] Stopping RolloutWorker_w3... -[2023-09-26 08:40:52,116][92509] Stopping RolloutWorker_w2... -[2023-09-26 08:40:52,116][92511] Stopping RolloutWorker_w1... -[2023-09-26 08:40:52,116][92510] Stopping RolloutWorker_w4... -[2023-09-26 08:40:52,117][92514] Stopping RolloutWorker_w7... 
-[2023-09-26 08:40:52,116][91478] Component RolloutWorker_w6 stopped!
-[2023-09-26 08:40:52,117][92512] Loop rollout_proc5_evt_loop terminating...
-[2023-09-26 08:40:52,117][92513] Loop rollout_proc6_evt_loop terminating...
-[2023-09-26 08:40:52,117][92507] Loop rollout_proc3_evt_loop terminating...
-[2023-09-26 08:40:52,117][92509] Loop rollout_proc2_evt_loop terminating...
-[2023-09-26 08:40:52,117][92475] Stopping RolloutWorker_w0...
-[2023-09-26 08:40:52,117][92514] Loop rollout_proc7_evt_loop terminating...
-[2023-09-26 08:40:52,117][92511] Loop rollout_proc1_evt_loop terminating...
-[2023-09-26 08:40:52,117][92510] Loop rollout_proc4_evt_loop terminating...
-[2023-09-26 08:40:52,118][91478] Component RolloutWorker_w5 stopped!
-[2023-09-26 08:40:52,118][92475] Loop rollout_proc0_evt_loop terminating...
-[2023-09-26 08:40:52,118][91478] Component RolloutWorker_w4 stopped!
-[2023-09-26 08:40:52,119][91478] Component RolloutWorker_w3 stopped!
-[2023-09-26 08:40:52,119][91478] Component RolloutWorker_w2 stopped!
-[2023-09-26 08:40:52,120][91478] Component Batcher_1 stopped!
-[2023-09-26 08:40:52,120][91993] Stopping Batcher_0...
-[2023-09-26 08:40:52,120][91478] Component RolloutWorker_w1 stopped!
-[2023-09-26 08:40:52,121][91993] Loop batcher_evt_loop terminating...
-[2023-09-26 08:40:52,121][91478] Component RolloutWorker_w7 stopped!
-[2023-09-26 08:40:52,122][91478] Component RolloutWorker_w0 stopped!
-[2023-09-26 08:40:52,122][91478] Component Batcher_0 stopped!
-[2023-09-26 08:40:52,136][91993] Saving ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000039088_10006528.pth...
-[2023-09-26 08:40:52,116][92345] Stopping Batcher_1...
-[2023-09-26 08:40:52,146][92345] Loop batcher_evt_loop terminating...
-[2023-09-26 08:40:52,147][92345] Removing ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000036416_9322496.pth
-[2023-09-26 08:40:52,152][92345] Saving ./train_atari/atari_frostbite/checkpoint_p1/checkpoint_000039088_10006528.pth...
-[2023-09-26 08:40:52,165][91993] Removing ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000036416_9322496.pth
-[2023-09-26 08:40:52,169][91993] Saving ./train_atari/atari_frostbite/checkpoint_p0/checkpoint_000039088_10006528.pth...
-[2023-09-26 08:40:52,174][92473] Weights refcount: 2 0
-[2023-09-26 08:40:52,175][92473] Stopping InferenceWorker_p0-w0...
-[2023-09-26 08:40:52,176][92473] Loop inference_proc0-0_evt_loop terminating...
-[2023-09-26 08:40:52,176][91478] Component InferenceWorker_p0-w0 stopped!
-[2023-09-26 08:40:52,184][92474] Weights refcount: 2 0
-[2023-09-26 08:40:52,187][92474] Stopping InferenceWorker_p1-w0...
-[2023-09-26 08:40:52,187][92474] Loop inference_proc1-0_evt_loop terminating...
-[2023-09-26 08:40:52,187][91478] Component InferenceWorker_p1-w0 stopped!
-[2023-09-26 08:40:52,198][92345] Stopping LearnerWorker_p1...
-[2023-09-26 08:40:52,198][92345] Loop learner_proc1_evt_loop terminating...
-[2023-09-26 08:40:52,199][91478] Component LearnerWorker_p1 stopped!
-[2023-09-26 08:40:52,205][91993] Stopping LearnerWorker_p0...
-[2023-09-26 08:40:52,205][91993] Loop learner_proc0_evt_loop terminating...
-[2023-09-26 08:40:52,205][91478] Component LearnerWorker_p0 stopped!
-[2023-09-26 08:40:52,206][91478] Waiting for process learner_proc0 to stop...
-[2023-09-26 08:40:52,978][91478] Waiting for process learner_proc1 to stop...
-[2023-09-26 08:40:53,007][91478] Waiting for process inference_proc0-0 to join...
-[2023-09-26 08:40:53,007][91478] Waiting for process inference_proc1-0 to join...
-[2023-09-26 08:40:53,008][91478] Waiting for process rollout_proc0 to join...
-[2023-09-26 08:40:53,009][91478] Waiting for process rollout_proc1 to join...
-[2023-09-26 08:40:53,009][91478] Waiting for process rollout_proc2 to join...
-[2023-09-26 08:40:53,010][91478] Waiting for process rollout_proc3 to join...
-[2023-09-26 08:40:53,011][91478] Waiting for process rollout_proc4 to join...
-[2023-09-26 08:40:53,011][91478] Waiting for process rollout_proc5 to join... -[2023-09-26 08:40:53,012][91478] Waiting for process rollout_proc6 to join... -[2023-09-26 08:40:53,012][91478] Waiting for process rollout_proc7 to join... -[2023-09-26 08:40:53,013][91478] Batcher 0 profile tree view: -batching: 20.8274, releasing_batches: 1.7384 -[2023-09-26 08:40:53,014][91478] Batcher 1 profile tree view: -batching: 20.7750, releasing_batches: 1.7423 -[2023-09-26 08:40:53,014][91478] InferenceWorker_p0-w0 profile tree view: -wait_policy: 0.0052 - wait_policy_total: 664.1598 -update_model: 37.4093 - weight_update: 0.0018 -one_step: 0.0011 - handle_policy_step: 2301.0509 - deserialize: 67.6480, stack: 16.2028, obs_to_device_normalize: 557.7310, forward: 1108.6167, send_messages: 92.7013 - prepare_outputs: 308.4975 - to_cpu: 156.5519 -[2023-09-26 08:40:53,015][91478] InferenceWorker_p1-w0 profile tree view: -wait_policy: 0.0051 - wait_policy_total: 675.5771 -update_model: 37.4117 - weight_update: 0.0017 -one_step: 0.0012 - handle_policy_step: 2289.0075 - deserialize: 68.8435, stack: 16.3902, obs_to_device_normalize: 557.2466, forward: 1101.6084, send_messages: 95.4714 - prepare_outputs: 305.1509 - to_cpu: 153.7165 -[2023-09-26 08:40:53,015][91478] Learner 0 profile tree view: -misc: 0.0168, prepare_batch: 32.3151 -train: 457.6851 - epoch_init: 0.1070, minibatch_init: 3.1684, losses_postprocess: 62.9331, kl_divergence: 5.4435, after_optimizer: 21.9874 - calculate_losses: 44.8992 - losses_init: 0.0966, forward_head: 14.2799, bptt_initial: 0.4291, bptt: 0.4542, tail: 10.2721, advantages_returns: 3.0540, losses: 12.7400 - update: 315.0401 - clip: 164.6867 -[2023-09-26 08:40:53,016][91478] Learner 1 profile tree view: -misc: 0.0161, prepare_batch: 32.5272 -train: 456.0765 - epoch_init: 0.1004, minibatch_init: 3.0887, losses_postprocess: 62.7012, kl_divergence: 5.4166, after_optimizer: 22.2459 - calculate_losses: 43.9644 - losses_init: 0.1022, forward_head: 13.3938, 
bptt_initial: 0.4418, bptt: 0.4710, tail: 10.3155, advantages_returns: 3.0700, losses: 12.5869 - update: 314.4898 - clip: 163.4965 -[2023-09-26 08:40:53,017][91478] RolloutWorker_w0 profile tree view: -wait_for_trajectories: 0.4031, enqueue_policy_requests: 42.5075, env_step: 1194.7815, overhead: 29.3365, complete_rollouts: 1.0681 -save_policy_outputs: 54.2119 - split_output_tensors: 18.5711 -[2023-09-26 08:40:53,017][91478] RolloutWorker_w7 profile tree view: -wait_for_trajectories: 0.4047, enqueue_policy_requests: 42.0556, env_step: 1242.2116, overhead: 29.2284, complete_rollouts: 1.0608 -save_policy_outputs: 53.4701 - split_output_tensors: 18.2535 -[2023-09-26 08:40:53,018][91478] Loop Runner_EvtLoop terminating... -[2023-09-26 08:40:53,018][91478] Runner profile tree view: -main_loop: 3215.1037 -[2023-09-26 08:40:53,018][91478] Collected {0: 10006528, 1: 10006528}, FPS: 6224.7 +[2023-10-11 14:54:25,310][85000] Using optimizer +[2023-10-11 14:54:25,310][85000] No checkpoints found +[2023-10-11 14:54:25,311][85000] Did not load from checkpoint, starting from scratch! +[2023-10-11 14:54:25,311][85000] Initialized policy 1 weights for model version 0 +[2023-10-11 14:54:25,312][85000] LearnerWorker_p1 finished initialization! 
+[2023-10-11 14:54:25,312][85000] Using GPUs [0] for process 1 (actually maps to GPUs [1]) +[2023-10-11 14:54:26,406][84230] Starting process rollout_proc14 +[2023-10-11 14:54:26,412][85175] Using GPUs [1] for process 1 (actually maps to GPUs [1]) +[2023-10-11 14:54:26,412][85175] Set environment var CUDA_VISIBLE_DEVICES to '1' (GPU indices [1]) for inference process 1 +[2023-10-11 14:54:26,430][85175] Num visible devices: 1 +[2023-10-11 14:54:26,435][84230] Starting process rollout_proc15 +[2023-10-11 14:54:26,443][85214] Worker 6 uses CPU cores [12, 13] +[2023-10-11 14:54:26,447][85215] Worker 4 uses CPU cores [8, 9] +[2023-10-11 14:54:26,463][85212] Worker 3 uses CPU cores [6, 7] +[2023-10-11 14:54:26,661][85176] Using GPUs [0] for process 0 (actually maps to GPUs [0]) +[2023-10-11 14:54:26,662][85176] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 +[2023-10-11 14:54:26,683][85176] Num visible devices: 1 +[2023-10-11 14:54:26,707][85213] Worker 1 uses CPU cores [2, 3] +[2023-10-11 14:54:26,743][85210] Worker 2 uses CPU cores [4, 5] +[2023-10-11 14:54:26,745][85223] Worker 13 uses CPU cores [26, 27] +[2023-10-11 14:54:26,808][85217] Worker 8 uses CPU cores [16, 17] +[2023-10-11 14:54:26,870][85218] Worker 9 uses CPU cores [18, 19] +[2023-10-11 14:54:26,935][85209] Worker 0 uses CPU cores [0, 1] +[2023-10-11 14:54:26,971][85219] Worker 10 uses CPU cores [20, 21] +[2023-10-11 14:54:26,982][85222] Worker 12 uses CPU cores [24, 25] +[2023-10-11 14:54:27,003][85221] Worker 11 uses CPU cores [22, 23] +[2023-10-11 14:54:27,121][85216] Worker 5 uses CPU cores [10, 11] +[2023-10-11 14:54:27,125][85220] Worker 7 uses CPU cores [14, 15] +[2023-10-11 14:54:27,322][85175] RunningMeanStd input shape: (4, 84, 84) +[2023-10-11 14:54:27,323][85175] RunningMeanStd input shape: (1,) +[2023-10-11 14:54:27,334][85175] ConvEncoder: input_channels=4 +[2023-10-11 14:54:27,350][85176] RunningMeanStd input shape: (4, 84, 84) +[2023-10-11 
14:54:27,351][85176] RunningMeanStd input shape: (1,) +[2023-10-11 14:54:27,369][85176] ConvEncoder: input_channels=4 +[2023-10-11 14:54:27,438][85175] Conv encoder output size: 512 +[2023-10-11 14:54:27,497][85176] Conv encoder output size: 512 +[2023-10-11 14:54:28,411][85870] Worker 14 uses CPU cores [28, 29] +[2023-10-11 14:54:28,412][84230] Inference worker 1-0 is ready! +[2023-10-11 14:54:28,413][84230] Inference worker 0-0 is ready! +[2023-10-11 14:54:28,413][84230] All inference workers are ready! Signal rollout workers to start! +[2023-10-11 14:54:28,414][85221] EnvRunner 11-0 uses policy 1 +[2023-10-11 14:54:28,414][85210] EnvRunner 2-0 uses policy 0 +[2023-10-11 14:54:28,415][85212] EnvRunner 3-0 uses policy 1 +[2023-10-11 14:54:28,415][85219] EnvRunner 10-0 uses policy 0 +[2023-10-11 14:54:28,415][85220] EnvRunner 7-0 uses policy 1 +[2023-10-11 14:54:28,414][84230] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan, 1: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) +[2023-10-11 14:54:28,415][85214] EnvRunner 6-0 uses policy 0 +[2023-10-11 14:54:28,415][85213] EnvRunner 1-0 uses policy 1 +[2023-10-11 14:54:28,415][85218] EnvRunner 9-0 uses policy 1 +[2023-10-11 14:54:28,415][85209] EnvRunner 0-0 uses policy 0 +[2023-10-11 14:54:28,415][85217] EnvRunner 8-0 uses policy 0 +[2023-10-11 14:54:28,415][85223] EnvRunner 13-0 uses policy 1 +[2023-10-11 14:54:28,415][85216] EnvRunner 5-0 uses policy 1 +[2023-10-11 14:54:28,415][85215] EnvRunner 4-0 uses policy 0 +[2023-10-11 14:54:28,415][85222] EnvRunner 12-0 uses policy 0 +[2023-10-11 14:54:28,415][85902] Worker 15 uses CPU cores [30, 31] +[2023-10-11 14:54:28,536][85902] EnvRunner 15-0 uses policy 1 +[2023-10-11 14:54:28,612][85870] EnvRunner 14-0 uses policy 0 +[2023-10-11 14:54:30,628][84230] Heartbeat connected on Batcher_0 +[2023-10-11 14:54:30,630][84230] Heartbeat connected on LearnerWorker_p0 +[2023-10-11 14:54:30,633][84230] Heartbeat connected 
on Batcher_1 +[2023-10-11 14:54:30,636][84230] Heartbeat connected on LearnerWorker_p1 +[2023-10-11 14:54:30,644][84230] Heartbeat connected on InferenceWorker_p0-w0 +[2023-10-11 14:54:30,646][84230] Heartbeat connected on InferenceWorker_p1-w0 +[2023-10-11 14:54:30,652][84230] Heartbeat connected on RolloutWorker_w2 +[2023-10-11 14:54:30,653][84230] Heartbeat connected on RolloutWorker_w0 +[2023-10-11 14:54:30,655][84230] Heartbeat connected on RolloutWorker_w1 +[2023-10-11 14:54:30,658][84230] Heartbeat connected on RolloutWorker_w4 +[2023-10-11 14:54:30,659][84230] Heartbeat connected on RolloutWorker_w3 +[2023-10-11 14:54:30,660][84230] Heartbeat connected on RolloutWorker_w5 +[2023-10-11 14:54:30,663][84230] Heartbeat connected on RolloutWorker_w6 +[2023-10-11 14:54:30,665][84230] Heartbeat connected on RolloutWorker_w7 +[2023-10-11 14:54:30,668][84230] Heartbeat connected on RolloutWorker_w8 +[2023-10-11 14:54:30,673][84230] Heartbeat connected on RolloutWorker_w9 +[2023-10-11 14:54:30,678][84230] Heartbeat connected on RolloutWorker_w10 +[2023-10-11 14:54:30,681][84230] Heartbeat connected on RolloutWorker_w11 +[2023-10-11 14:54:30,684][84230] Heartbeat connected on RolloutWorker_w12 +[2023-10-11 14:54:30,686][84230] Heartbeat connected on RolloutWorker_w13 +[2023-10-11 14:54:30,689][84230] Heartbeat connected on RolloutWorker_w14 +[2023-10-11 14:54:30,693][84230] Heartbeat connected on RolloutWorker_w15 +[2023-10-11 14:54:31,063][84230] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 447.1, 1: 285.5. Samples: 1940. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) +[2023-10-11 14:54:31,064][84230] Avg episode reward: [(0, '1.154'), (1, '1.062')] +[2023-10-11 14:54:36,062][84230] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 891.8, 1: 857.8. Samples: 13380. 
Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) +[2023-10-11 14:54:36,063][84230] Avg episode reward: [(0, '1.820'), (1, '1.780')] +[2023-10-11 14:54:38,559][85176] Updated weights for policy 0, policy_version 10 (0.0008) +[2023-10-11 14:54:38,723][85175] Updated weights for policy 1, policy_version 10 (0.0007) +[2023-10-11 14:54:38,931][85176] Updated weights for policy 0, policy_version 20 (0.0009) +[2023-10-11 14:54:39,079][85175] Updated weights for policy 1, policy_version 20 (0.0008) +[2023-10-11 14:54:39,308][85176] Updated weights for policy 0, policy_version 30 (0.0009) +[2023-10-11 14:54:39,441][85175] Updated weights for policy 1, policy_version 30 (0.0009) +[2023-10-11 14:54:41,063][84230] Fps is (10 sec: 6553.8, 60 sec: 5181.5, 300 sec: 5181.5). Total num frames: 65536. Throughput: 0: 1180.9, 1: 1179.2. Samples: 29850. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-11 14:54:41,063][84230] Avg episode reward: [(0, '1.970'), (1, '1.810')] +[2023-10-11 14:54:41,912][85175] Updated weights for policy 1, policy_version 40 (0.0009) +[2023-10-11 14:54:42,059][85176] Updated weights for policy 0, policy_version 40 (0.0007) +[2023-10-11 14:54:42,286][85175] Updated weights for policy 1, policy_version 50 (0.0008) +[2023-10-11 14:54:42,423][85176] Updated weights for policy 0, policy_version 50 (0.0009) +[2023-10-11 14:54:42,645][85175] Updated weights for policy 1, policy_version 60 (0.0008) +[2023-10-11 14:54:42,799][85176] Updated weights for policy 0, policy_version 60 (0.0009) +[2023-10-11 14:54:46,063][84230] Fps is (10 sec: 13106.8, 60 sec: 7426.9, 300 sec: 7426.9). Total num frames: 131072. Throughput: 0: 1423.7, 1: 1436.0. Samples: 50468. 
Policy #0 lag: (min: 33.0, avg: 33.0, max: 33.0) +[2023-10-11 14:54:46,064][84230] Avg episode reward: [(0, '1.950'), (1, '1.670')] +[2023-10-11 14:54:46,134][85175] Updated weights for policy 1, policy_version 70 (0.0008) +[2023-10-11 14:54:46,496][85175] Updated weights for policy 1, policy_version 80 (0.0009) +[2023-10-11 14:54:46,586][85176] Updated weights for policy 0, policy_version 70 (0.0008) +[2023-10-11 14:54:46,869][85175] Updated weights for policy 1, policy_version 90 (0.0009) +[2023-10-11 14:54:46,960][85176] Updated weights for policy 0, policy_version 80 (0.0008) +[2023-10-11 14:54:47,339][85176] Updated weights for policy 0, policy_version 90 (0.0008) +[2023-10-11 14:54:50,722][85175] Updated weights for policy 1, policy_version 100 (0.0009) +[2023-10-11 14:54:50,933][85176] Updated weights for policy 0, policy_version 100 (0.0008) +[2023-10-11 14:54:51,062][84230] Fps is (10 sec: 13107.3, 60 sec: 8681.0, 300 sec: 8681.0). Total num frames: 196608. Throughput: 0: 1307.3, 1: 1317.9. Samples: 59456. Policy #0 lag: (min: 31.0, avg: 47.0, max: 63.0) +[2023-10-11 14:54:51,063][84230] Avg episode reward: [(0, '1.920'), (1, '2.350')] +[2023-10-11 14:54:51,089][85175] Updated weights for policy 1, policy_version 110 (0.0007) +[2023-10-11 14:54:51,305][85176] Updated weights for policy 0, policy_version 110 (0.0007) +[2023-10-11 14:54:51,448][85175] Updated weights for policy 1, policy_version 120 (0.0007) +[2023-10-11 14:54:51,749][85176] Updated weights for policy 0, policy_version 122 (0.0009) +[2023-10-11 14:54:55,445][85175] Updated weights for policy 1, policy_version 130 (0.0007) +[2023-10-11 14:54:55,818][85175] Updated weights for policy 1, policy_version 140 (0.0008) +[2023-10-11 14:54:55,938][85176] Updated weights for policy 0, policy_version 132 (0.0008) +[2023-10-11 14:54:56,062][84230] Fps is (10 sec: 13107.4, 60 sec: 9481.5, 300 sec: 9481.5). Total num frames: 262144. Throughput: 0: 1439.8, 1: 1448.4. Samples: 79852. 
Policy #0 lag: (min: 22.0, avg: 35.7, max: 54.0) +[2023-10-11 14:54:56,063][84230] Avg episode reward: [(0, '2.400'), (1, '2.210')] +[2023-10-11 14:54:56,191][85175] Updated weights for policy 1, policy_version 150 (0.0008) +[2023-10-11 14:54:56,315][85176] Updated weights for policy 0, policy_version 142 (0.0009) +[2023-10-11 14:54:56,560][85000] Saving new best policy, reward=2.210! +[2023-10-11 14:54:56,562][85175] Updated weights for policy 1, policy_version 160 (0.0008) +[2023-10-11 14:54:56,687][85176] Updated weights for policy 0, policy_version 152 (0.0008) +[2023-10-11 14:54:56,977][84801] Saving new best policy, reward=2.400! +[2023-10-11 14:55:00,629][85176] Updated weights for policy 0, policy_version 162 (0.0008) +[2023-10-11 14:55:00,863][85175] Updated weights for policy 1, policy_version 170 (0.0008) +[2023-10-11 14:55:01,000][85176] Updated weights for policy 0, policy_version 172 (0.0009) +[2023-10-11 14:55:01,063][84230] Fps is (10 sec: 13106.9, 60 sec: 10036.7, 300 sec: 10036.7). Total num frames: 327680. Throughput: 0: 1530.8, 1: 1537.5. Samples: 100174. Policy #0 lag: (min: 31.0, avg: 35.4, max: 63.0) +[2023-10-11 14:55:01,064][84230] Avg episode reward: [(0, '2.780'), (1, '2.420')] +[2023-10-11 14:55:01,226][85175] Updated weights for policy 1, policy_version 180 (0.0008) +[2023-10-11 14:55:01,367][85176] Updated weights for policy 0, policy_version 182 (0.0008) +[2023-10-11 14:55:01,588][85175] Updated weights for policy 1, policy_version 190 (0.0007) +[2023-10-11 14:55:01,656][85000] Saving new best policy, reward=2.420! +[2023-10-11 14:55:01,737][84801] Saving new best policy, reward=2.780! 
+[2023-10-11 14:55:01,738][85176] Updated weights for policy 0, policy_version 192 (0.0007) +[2023-10-11 14:55:05,499][85175] Updated weights for policy 1, policy_version 200 (0.0008) +[2023-10-11 14:55:05,861][85175] Updated weights for policy 1, policy_version 210 (0.0009) +[2023-10-11 14:55:05,919][85176] Updated weights for policy 0, policy_version 202 (0.0007) +[2023-10-11 14:55:06,063][84230] Fps is (10 sec: 13107.2, 60 sec: 10444.5, 300 sec: 10444.5). Total num frames: 393216. Throughput: 0: 1448.3, 1: 1457.1. Samples: 109382. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 14:55:06,064][84230] Avg episode reward: [(0, '2.600'), (1, '3.100')] +[2023-10-11 14:55:06,233][85175] Updated weights for policy 1, policy_version 220 (0.0008) +[2023-10-11 14:55:06,288][85176] Updated weights for policy 0, policy_version 212 (0.0008) +[2023-10-11 14:55:06,372][85000] Saving new best policy, reward=3.100! +[2023-10-11 14:55:06,664][85176] Updated weights for policy 0, policy_version 222 (0.0007) +[2023-10-11 14:55:10,371][85175] Updated weights for policy 1, policy_version 230 (0.0009) +[2023-10-11 14:55:10,732][85175] Updated weights for policy 1, policy_version 240 (0.0009) +[2023-10-11 14:55:10,903][85176] Updated weights for policy 0, policy_version 232 (0.0009) +[2023-10-11 14:55:11,063][84230] Fps is (10 sec: 13107.3, 60 sec: 10756.7, 300 sec: 10756.7). Total num frames: 458752. Throughput: 0: 1516.6, 1: 1523.7. Samples: 129664. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) +[2023-10-11 14:55:11,064][84230] Avg episode reward: [(0, '3.380'), (1, '3.090')] +[2023-10-11 14:55:11,102][85175] Updated weights for policy 1, policy_version 250 (0.0009) +[2023-10-11 14:55:11,269][85176] Updated weights for policy 0, policy_version 242 (0.0007) +[2023-10-11 14:55:11,654][85176] Updated weights for policy 0, policy_version 252 (0.0010) +[2023-10-11 14:55:11,792][84801] Saving new best policy, reward=3.380! 
+[2023-10-11 14:55:15,099][85175] Updated weights for policy 1, policy_version 260 (0.0007) +[2023-10-11 14:55:15,463][85175] Updated weights for policy 1, policy_version 270 (0.0009) +[2023-10-11 14:55:15,806][85176] Updated weights for policy 0, policy_version 262 (0.0009) +[2023-10-11 14:55:15,839][85175] Updated weights for policy 1, policy_version 280 (0.0008) +[2023-10-11 14:55:16,062][84230] Fps is (10 sec: 13107.3, 60 sec: 11003.4, 300 sec: 11003.4). Total num frames: 524288. Throughput: 0: 1639.2, 1: 1644.5. Samples: 149704. Policy #0 lag: (min: 4.0, avg: 6.8, max: 36.0) +[2023-10-11 14:55:16,063][84230] Avg episode reward: [(0, '2.770'), (1, '3.230')] +[2023-10-11 14:55:16,131][85000] Saving new best policy, reward=3.230! +[2023-10-11 14:55:16,175][85176] Updated weights for policy 0, policy_version 272 (0.0009) +[2023-10-11 14:55:16,552][85176] Updated weights for policy 0, policy_version 282 (0.0008) +[2023-10-11 14:55:19,949][85175] Updated weights for policy 1, policy_version 290 (0.0009) +[2023-10-11 14:55:20,351][85175] Updated weights for policy 1, policy_version 300 (0.0010) +[2023-10-11 14:55:20,713][85175] Updated weights for policy 1, policy_version 310 (0.0009) +[2023-10-11 14:55:20,727][85176] Updated weights for policy 0, policy_version 292 (0.0009) +[2023-10-11 14:55:21,063][84230] Fps is (10 sec: 13107.0, 60 sec: 11203.1, 300 sec: 11203.1). Total num frames: 589824. Throughput: 0: 1614.8, 1: 1632.6. Samples: 159514. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 14:55:21,064][84230] Avg episode reward: [(0, '3.300'), (1, '3.480')] +[2023-10-11 14:55:21,072][85175] Updated weights for policy 1, policy_version 320 (0.0008) +[2023-10-11 14:55:21,072][85000] Saving new best policy, reward=3.480! 
+[2023-10-11 14:55:21,119][85176] Updated weights for policy 0, policy_version 302 (0.0008) +[2023-10-11 14:55:21,486][85176] Updated weights for policy 0, policy_version 312 (0.0007) +[2023-10-11 14:55:25,196][85175] Updated weights for policy 1, policy_version 330 (0.0008) +[2023-10-11 14:55:25,514][85176] Updated weights for policy 0, policy_version 322 (0.0008) +[2023-10-11 14:55:25,565][85175] Updated weights for policy 1, policy_version 340 (0.0008) +[2023-10-11 14:55:25,884][85176] Updated weights for policy 0, policy_version 332 (0.0007) +[2023-10-11 14:55:25,926][85175] Updated weights for policy 1, policy_version 350 (0.0008) +[2023-10-11 14:55:26,062][84230] Fps is (10 sec: 16384.1, 60 sec: 11936.7, 300 sec: 11936.7). Total num frames: 688128. Throughput: 0: 1659.2, 1: 1675.5. Samples: 179914. Policy #0 lag: (min: 31.0, avg: 32.7, max: 60.0) +[2023-10-11 14:55:26,063][84230] Avg episode reward: [(0, '3.540'), (1, '3.810')] +[2023-10-11 14:55:26,064][85000] Saving new best policy, reward=3.810! +[2023-10-11 14:55:26,250][85176] Updated weights for policy 0, policy_version 342 (0.0008) +[2023-10-11 14:55:26,622][84801] Saving new best policy, reward=3.540! +[2023-10-11 14:55:26,627][85176] Updated weights for policy 0, policy_version 352 (0.0009) +[2023-10-11 14:55:30,169][85175] Updated weights for policy 1, policy_version 360 (0.0008) +[2023-10-11 14:55:30,532][85175] Updated weights for policy 1, policy_version 370 (0.0009) +[2023-10-11 14:55:30,713][85176] Updated weights for policy 0, policy_version 362 (0.0009) +[2023-10-11 14:55:30,895][85175] Updated weights for policy 1, policy_version 380 (0.0008) +[2023-10-11 14:55:31,063][84230] Fps is (10 sec: 16384.4, 60 sec: 12561.1, 300 sec: 12030.1). Total num frames: 753664. Throughput: 0: 1651.3, 1: 1657.3. Samples: 199356. 
Policy #0 lag: (min: 26.0, avg: 27.4, max: 48.0) +[2023-10-11 14:55:31,063][84230] Avg episode reward: [(0, '3.930'), (1, '3.620')] +[2023-10-11 14:55:31,084][85176] Updated weights for policy 0, policy_version 372 (0.0009) +[2023-10-11 14:55:31,460][85176] Updated weights for policy 0, policy_version 382 (0.0008) +[2023-10-11 14:55:31,535][84801] Saving new best policy, reward=3.930! +[2023-10-11 14:55:34,924][85175] Updated weights for policy 1, policy_version 390 (0.0007) +[2023-10-11 14:55:35,288][85175] Updated weights for policy 1, policy_version 400 (0.0007) +[2023-10-11 14:55:35,396][85176] Updated weights for policy 0, policy_version 392 (0.0008) +[2023-10-11 14:55:35,651][85175] Updated weights for policy 1, policy_version 410 (0.0009) +[2023-10-11 14:55:35,761][85176] Updated weights for policy 0, policy_version 402 (0.0009) +[2023-10-11 14:55:36,063][84230] Fps is (10 sec: 13106.7, 60 sec: 13653.2, 300 sec: 12109.7). Total num frames: 819200. Throughput: 0: 1658.3, 1: 1674.9. Samples: 209450. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 14:55:36,064][84230] Avg episode reward: [(0, '4.310'), (1, '4.090')] +[2023-10-11 14:55:36,066][85000] Saving new best policy, reward=4.090! +[2023-10-11 14:55:36,129][85176] Updated weights for policy 0, policy_version 412 (0.0009) +[2023-10-11 14:55:36,274][84801] Saving new best policy, reward=4.310! +[2023-10-11 14:55:39,734][85175] Updated weights for policy 1, policy_version 420 (0.0008) +[2023-10-11 14:55:40,109][85175] Updated weights for policy 1, policy_version 430 (0.0007) +[2023-10-11 14:55:40,471][85175] Updated weights for policy 1, policy_version 440 (0.0009) +[2023-10-11 14:55:40,551][85176] Updated weights for policy 0, policy_version 422 (0.0009) +[2023-10-11 14:55:40,920][85176] Updated weights for policy 0, policy_version 432 (0.0007) +[2023-10-11 14:55:41,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 12178.4). Total num frames: 884736. 
Throughput: 0: 1658.1, 1: 1677.2. Samples: 229942. Policy #0 lag: (min: 22.0, avg: 24.2, max: 54.0) +[2023-10-11 14:55:41,063][84230] Avg episode reward: [(0, '4.020'), (1, '3.960')] +[2023-10-11 14:55:41,284][85176] Updated weights for policy 0, policy_version 442 (0.0007) +[2023-10-11 14:55:44,544][85175] Updated weights for policy 1, policy_version 450 (0.0008) +[2023-10-11 14:55:44,918][85175] Updated weights for policy 1, policy_version 460 (0.0011) +[2023-10-11 14:55:45,279][85175] Updated weights for policy 1, policy_version 470 (0.0008) +[2023-10-11 14:55:45,289][85176] Updated weights for policy 0, policy_version 452 (0.0008) +[2023-10-11 14:55:45,649][85175] Updated weights for policy 1, policy_version 480 (0.0007) +[2023-10-11 14:55:45,670][85176] Updated weights for policy 0, policy_version 462 (0.0007) +[2023-10-11 14:55:46,040][85176] Updated weights for policy 0, policy_version 472 (0.0009) +[2023-10-11 14:55:46,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 12238.2). Total num frames: 950272. Throughput: 0: 1650.0, 1: 1659.7. Samples: 249110. Policy #0 lag: (min: 31.0, avg: 41.8, max: 63.0) +[2023-10-11 14:55:46,064][84230] Avg episode reward: [(0, '4.320'), (1, '4.010')] +[2023-10-11 14:55:46,336][84801] Saving new best policy, reward=4.320! +[2023-10-11 14:55:49,637][85175] Updated weights for policy 1, policy_version 490 (0.0009) +[2023-10-11 14:55:50,006][85175] Updated weights for policy 1, policy_version 500 (0.0007) +[2023-10-11 14:55:50,114][85176] Updated weights for policy 0, policy_version 482 (0.0009) +[2023-10-11 14:55:50,387][85175] Updated weights for policy 1, policy_version 510 (0.0007) +[2023-10-11 14:55:50,478][85176] Updated weights for policy 0, policy_version 492 (0.0007) +[2023-10-11 14:55:50,858][85176] Updated weights for policy 0, policy_version 502 (0.0008) +[2023-10-11 14:55:51,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 12290.8). Total num frames: 1015808. 
Throughput: 0: 1656.6, 1: 1681.1. Samples: 259578. Policy #0 lag: (min: 1.0, avg: 10.4, max: 33.0) +[2023-10-11 14:55:51,063][84230] Avg episode reward: [(0, '4.200'), (1, '4.460')] +[2023-10-11 14:55:51,064][85000] Saving new best policy, reward=4.460! +[2023-10-11 14:55:51,230][85176] Updated weights for policy 0, policy_version 512 (0.0008) +[2023-10-11 14:55:54,506][85175] Updated weights for policy 1, policy_version 520 (0.0007) +[2023-10-11 14:55:54,877][85175] Updated weights for policy 1, policy_version 530 (0.0008) +[2023-10-11 14:55:55,242][85175] Updated weights for policy 1, policy_version 540 (0.0009) +[2023-10-11 14:55:55,389][85176] Updated weights for policy 0, policy_version 522 (0.0007) +[2023-10-11 14:55:55,770][85176] Updated weights for policy 0, policy_version 532 (0.0009) +[2023-10-11 14:55:56,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 12337.3). Total num frames: 1081344. Throughput: 0: 1660.7, 1: 1677.7. Samples: 279892. Policy #0 lag: (min: 26.0, avg: 29.0, max: 58.0) +[2023-10-11 14:55:56,063][84230] Avg episode reward: [(0, '4.540'), (1, '4.400')] +[2023-10-11 14:55:56,135][85176] Updated weights for policy 0, policy_version 542 (0.0009) +[2023-10-11 14:55:56,212][84801] Saving new best policy, reward=4.540! +[2023-10-11 14:55:59,311][85175] Updated weights for policy 1, policy_version 550 (0.0008) +[2023-10-11 14:55:59,682][85175] Updated weights for policy 1, policy_version 560 (0.0010) +[2023-10-11 14:56:00,042][85175] Updated weights for policy 1, policy_version 570 (0.0009) +[2023-10-11 14:56:00,274][85176] Updated weights for policy 0, policy_version 552 (0.0009) +[2023-10-11 14:56:00,644][85176] Updated weights for policy 0, policy_version 562 (0.0010) +[2023-10-11 14:56:01,020][85176] Updated weights for policy 0, policy_version 572 (0.0010) +[2023-10-11 14:56:01,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 12378.9). Total num frames: 1146880. Throughput: 0: 1647.2, 1: 1671.3. Samples: 299038. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 14:56:01,064][84230] Avg episode reward: [(0, '4.640'), (1, '4.450')] +[2023-10-11 14:56:01,162][84801] Saving new best policy, reward=4.640! +[2023-10-11 14:56:04,086][85175] Updated weights for policy 1, policy_version 580 (0.0008) +[2023-10-11 14:56:04,442][85175] Updated weights for policy 1, policy_version 590 (0.0010) +[2023-10-11 14:56:04,810][85175] Updated weights for policy 1, policy_version 600 (0.0011) +[2023-10-11 14:56:05,197][85176] Updated weights for policy 0, policy_version 582 (0.0010) +[2023-10-11 14:56:05,567][85176] Updated weights for policy 0, policy_version 592 (0.0007) +[2023-10-11 14:56:05,946][85176] Updated weights for policy 0, policy_version 602 (0.0009) +[2023-10-11 14:56:06,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 12416.2). Total num frames: 1212416. Throughput: 0: 1657.5, 1: 1682.1. Samples: 309794. Policy #0 lag: (min: 31.0, avg: 40.0, max: 63.0) +[2023-10-11 14:56:06,063][84230] Avg episode reward: [(0, '4.720'), (1, '5.090')] +[2023-10-11 14:56:06,064][85000] Saving new best policy, reward=5.090! +[2023-10-11 14:56:06,166][84801] Saving new best policy, reward=4.720! +[2023-10-11 14:56:09,007][85175] Updated weights for policy 1, policy_version 610 (0.0008) +[2023-10-11 14:56:09,397][85175] Updated weights for policy 1, policy_version 620 (0.0009) +[2023-10-11 14:56:09,760][85175] Updated weights for policy 1, policy_version 630 (0.0007) +[2023-10-11 14:56:10,114][85176] Updated weights for policy 0, policy_version 612 (0.0010) +[2023-10-11 14:56:10,131][85175] Updated weights for policy 1, policy_version 640 (0.0009) +[2023-10-11 14:56:10,499][85176] Updated weights for policy 0, policy_version 622 (0.0008) +[2023-10-11 14:56:10,884][85176] Updated weights for policy 0, policy_version 632 (0.0009) +[2023-10-11 14:56:11,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 12449.8). Total num frames: 1277952. 
Throughput: 0: 1656.9, 1: 1670.2. Samples: 329634. Policy #0 lag: (min: 17.0, avg: 22.3, max: 49.0) +[2023-10-11 14:56:11,064][84230] Avg episode reward: [(0, '4.610'), (1, '4.720')] +[2023-10-11 14:56:14,083][85175] Updated weights for policy 1, policy_version 650 (0.0008) +[2023-10-11 14:56:14,461][85175] Updated weights for policy 1, policy_version 660 (0.0009) +[2023-10-11 14:56:14,845][85175] Updated weights for policy 1, policy_version 670 (0.0010) +[2023-10-11 14:56:15,038][85176] Updated weights for policy 0, policy_version 642 (0.0008) +[2023-10-11 14:56:15,407][85176] Updated weights for policy 0, policy_version 652 (0.0008) +[2023-10-11 14:56:15,775][85176] Updated weights for policy 0, policy_version 662 (0.0009) +[2023-10-11 14:56:16,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 12480.4). Total num frames: 1343488. Throughput: 0: 1651.4, 1: 1671.1. Samples: 348868. Policy #0 lag: (min: 28.0, avg: 30.2, max: 60.0) +[2023-10-11 14:56:16,064][84230] Avg episode reward: [(0, '5.080'), (1, '4.730')] +[2023-10-11 14:56:16,078][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000000672_688128.pth... +[2023-10-11 14:56:16,151][85176] Updated weights for policy 0, policy_version 672 (0.0008) +[2023-10-11 14:56:16,151][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000000672_688128.pth... +[2023-10-11 14:56:16,192][84801] Saving new best policy, reward=5.080! 
+[2023-10-11 14:56:18,874][85175] Updated weights for policy 1, policy_version 680 (0.0009) +[2023-10-11 14:56:19,244][85175] Updated weights for policy 1, policy_version 690 (0.0009) +[2023-10-11 14:56:19,614][85175] Updated weights for policy 1, policy_version 700 (0.0011) +[2023-10-11 14:56:20,174][85176] Updated weights for policy 0, policy_version 682 (0.0010) +[2023-10-11 14:56:20,543][85176] Updated weights for policy 0, policy_version 692 (0.0008) +[2023-10-11 14:56:20,915][85176] Updated weights for policy 0, policy_version 702 (0.0007) +[2023-10-11 14:56:21,063][84230] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 12799.1). Total num frames: 1441792. Throughput: 0: 1658.4, 1: 1684.8. Samples: 359892. Policy #0 lag: (min: 15.0, avg: 22.9, max: 47.0) +[2023-10-11 14:56:21,063][84230] Avg episode reward: [(0, '5.120'), (1, '5.080')] +[2023-10-11 14:56:21,065][84801] Saving new best policy, reward=5.120! +[2023-10-11 14:56:23,753][85175] Updated weights for policy 1, policy_version 710 (0.0008) +[2023-10-11 14:56:24,116][85175] Updated weights for policy 1, policy_version 720 (0.0007) +[2023-10-11 14:56:24,482][85175] Updated weights for policy 1, policy_version 730 (0.0008) +[2023-10-11 14:56:25,067][85176] Updated weights for policy 0, policy_version 712 (0.0008) +[2023-10-11 14:56:25,430][85176] Updated weights for policy 0, policy_version 722 (0.0009) +[2023-10-11 14:56:25,800][85176] Updated weights for policy 0, policy_version 732 (0.0009) +[2023-10-11 14:56:26,062][84230] Fps is (10 sec: 16384.5, 60 sec: 13653.3, 300 sec: 12812.2). Total num frames: 1507328. Throughput: 0: 1659.4, 1: 1661.3. Samples: 379372. Policy #0 lag: (min: 4.0, avg: 11.7, max: 36.0) +[2023-10-11 14:56:26,063][84230] Avg episode reward: [(0, '5.140'), (1, '5.440')] +[2023-10-11 14:56:26,064][85000] Saving new best policy, reward=5.440! +[2023-10-11 14:56:26,064][84801] Saving new best policy, reward=5.140! 
+[2023-10-11 14:56:28,430][85175] Updated weights for policy 1, policy_version 740 (0.0009) +[2023-10-11 14:56:28,806][85175] Updated weights for policy 1, policy_version 750 (0.0008) +[2023-10-11 14:56:29,173][85175] Updated weights for policy 1, policy_version 760 (0.0008) +[2023-10-11 14:56:30,016][85176] Updated weights for policy 0, policy_version 742 (0.0008) +[2023-10-11 14:56:30,389][85176] Updated weights for policy 0, policy_version 752 (0.0008) +[2023-10-11 14:56:30,760][85176] Updated weights for policy 0, policy_version 762 (0.0010) +[2023-10-11 14:56:31,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 12824.2). Total num frames: 1572864. Throughput: 0: 1648.4, 1: 1681.5. Samples: 398952. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 14:56:31,063][84230] Avg episode reward: [(0, '5.200'), (1, '4.980')] +[2023-10-11 14:56:31,071][84801] Saving new best policy, reward=5.200! +[2023-10-11 14:56:33,095][85175] Updated weights for policy 1, policy_version 770 (0.0009) +[2023-10-11 14:56:33,458][85175] Updated weights for policy 1, policy_version 780 (0.0008) +[2023-10-11 14:56:33,833][85175] Updated weights for policy 1, policy_version 790 (0.0007) +[2023-10-11 14:56:34,192][85175] Updated weights for policy 1, policy_version 800 (0.0009) +[2023-10-11 14:56:35,039][85176] Updated weights for policy 0, policy_version 772 (0.0009) +[2023-10-11 14:56:35,406][85176] Updated weights for policy 0, policy_version 782 (0.0010) +[2023-10-11 14:56:35,786][85176] Updated weights for policy 0, policy_version 792 (0.0008) +[2023-10-11 14:56:36,062][84230] Fps is (10 sec: 9830.4, 60 sec: 13107.3, 300 sec: 12578.6). Total num frames: 1605632. Throughput: 0: 1658.5, 1: 1679.5. Samples: 409790. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 14:56:36,063][84230] Avg episode reward: [(0, '5.180'), (1, '5.260')] +[2023-10-11 14:56:38,233][85175] Updated weights for policy 1, policy_version 810 (0.0009) +[2023-10-11 14:56:38,595][85175] Updated weights for policy 1, policy_version 820 (0.0009) +[2023-10-11 14:56:38,959][85175] Updated weights for policy 1, policy_version 830 (0.0010) +[2023-10-11 14:56:39,800][85176] Updated weights for policy 0, policy_version 802 (0.0009) +[2023-10-11 14:56:40,174][85176] Updated weights for policy 0, policy_version 812 (0.0007) +[2023-10-11 14:56:40,536][85176] Updated weights for policy 0, policy_version 822 (0.0007) +[2023-10-11 14:56:40,913][85176] Updated weights for policy 0, policy_version 832 (0.0009) +[2023-10-11 14:56:41,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 12845.5). Total num frames: 1703936. Throughput: 0: 1652.9, 1: 1673.2. Samples: 429570. Policy #0 lag: (min: 12.0, avg: 18.8, max: 44.0) +[2023-10-11 14:56:41,064][84230] Avg episode reward: [(0, '5.190'), (1, '5.130')] +[2023-10-11 14:56:43,086][85175] Updated weights for policy 1, policy_version 840 (0.0007) +[2023-10-11 14:56:43,452][85175] Updated weights for policy 1, policy_version 850 (0.0007) +[2023-10-11 14:56:43,814][85175] Updated weights for policy 1, policy_version 860 (0.0008) +[2023-10-11 14:56:45,072][85176] Updated weights for policy 0, policy_version 842 (0.0009) +[2023-10-11 14:56:45,449][85176] Updated weights for policy 0, policy_version 852 (0.0008) +[2023-10-11 14:56:45,827][85176] Updated weights for policy 0, policy_version 862 (0.0007) +[2023-10-11 14:56:46,063][84230] Fps is (10 sec: 16383.7, 60 sec: 13653.4, 300 sec: 12855.0). Total num frames: 1769472. Throughput: 0: 1645.3, 1: 1693.3. Samples: 449278. 
Policy #0 lag: (min: 8.0, avg: 33.6, max: 40.0) +[2023-10-11 14:56:46,063][84230] Avg episode reward: [(0, '5.410'), (1, '5.190')] +[2023-10-11 14:56:46,073][84801] Saving new best policy, reward=5.410! +[2023-10-11 14:56:47,983][85175] Updated weights for policy 1, policy_version 870 (0.0009) +[2023-10-11 14:56:48,352][85175] Updated weights for policy 1, policy_version 880 (0.0010) +[2023-10-11 14:56:48,726][85175] Updated weights for policy 1, policy_version 890 (0.0010) +[2023-10-11 14:56:49,787][85176] Updated weights for policy 0, policy_version 872 (0.0008) +[2023-10-11 14:56:50,161][85176] Updated weights for policy 0, policy_version 882 (0.0008) +[2023-10-11 14:56:50,537][85176] Updated weights for policy 0, policy_version 892 (0.0009) +[2023-10-11 14:56:51,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 12863.9). Total num frames: 1835008. Throughput: 0: 1655.5, 1: 1676.4. Samples: 459730. Policy #0 lag: (min: 16.0, avg: 36.9, max: 48.0) +[2023-10-11 14:56:51,063][84230] Avg episode reward: [(0, '5.480'), (1, '5.500')] +[2023-10-11 14:56:51,064][84801] Saving new best policy, reward=5.480! +[2023-10-11 14:56:51,064][85000] Saving new best policy, reward=5.500! +[2023-10-11 14:56:52,610][85175] Updated weights for policy 1, policy_version 900 (0.0008) +[2023-10-11 14:56:52,980][85175] Updated weights for policy 1, policy_version 910 (0.0010) +[2023-10-11 14:56:53,351][85175] Updated weights for policy 1, policy_version 920 (0.0010) +[2023-10-11 14:56:54,759][85176] Updated weights for policy 0, policy_version 902 (0.0010) +[2023-10-11 14:56:55,129][85176] Updated weights for policy 0, policy_version 912 (0.0010) +[2023-10-11 14:56:55,504][85176] Updated weights for policy 0, policy_version 922 (0.0009) +[2023-10-11 14:56:56,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 12872.1). Total num frames: 1900544. Throughput: 0: 1655.2, 1: 1681.7. Samples: 479792. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) +[2023-10-11 14:56:56,063][84230] Avg episode reward: [(0, '5.610'), (1, '5.320')] +[2023-10-11 14:56:56,064][84801] Saving new best policy, reward=5.610! +[2023-10-11 14:56:57,539][85175] Updated weights for policy 1, policy_version 930 (0.0007) +[2023-10-11 14:56:57,936][85175] Updated weights for policy 1, policy_version 940 (0.0009) +[2023-10-11 14:56:58,311][85175] Updated weights for policy 1, policy_version 950 (0.0007) +[2023-10-11 14:56:58,679][85175] Updated weights for policy 1, policy_version 960 (0.0009) +[2023-10-11 14:56:59,615][85176] Updated weights for policy 0, policy_version 932 (0.0009) +[2023-10-11 14:56:59,981][85176] Updated weights for policy 0, policy_version 942 (0.0009) +[2023-10-11 14:57:00,362][85176] Updated weights for policy 0, policy_version 952 (0.0010) +[2023-10-11 14:57:01,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 12879.8). Total num frames: 1966080. Throughput: 0: 1645.5, 1: 1700.0. Samples: 499416. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 14:57:01,064][84230] Avg episode reward: [(0, '5.330'), (1, '5.400')] +[2023-10-11 14:57:02,508][85175] Updated weights for policy 1, policy_version 970 (0.0010) +[2023-10-11 14:57:02,879][85175] Updated weights for policy 1, policy_version 980 (0.0010) +[2023-10-11 14:57:03,242][85175] Updated weights for policy 1, policy_version 990 (0.0009) +[2023-10-11 14:57:04,570][85176] Updated weights for policy 0, policy_version 962 (0.0008) +[2023-10-11 14:57:04,939][85176] Updated weights for policy 0, policy_version 972 (0.0010) +[2023-10-11 14:57:05,312][85176] Updated weights for policy 0, policy_version 982 (0.0007) +[2023-10-11 14:57:05,684][85176] Updated weights for policy 0, policy_version 992 (0.0007) +[2023-10-11 14:57:06,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 12887.0). Total num frames: 2031616. Throughput: 0: 1657.1, 1: 1668.1. Samples: 509528. 
Policy #0 lag: (min: 21.0, avg: 28.9, max: 53.0) +[2023-10-11 14:57:06,063][84230] Avg episode reward: [(0, '5.230'), (1, '5.490')] +[2023-10-11 14:57:07,284][85175] Updated weights for policy 1, policy_version 1000 (0.0007) +[2023-10-11 14:57:07,658][85175] Updated weights for policy 1, policy_version 1010 (0.0007) +[2023-10-11 14:57:08,029][85175] Updated weights for policy 1, policy_version 1020 (0.0007) +[2023-10-11 14:57:09,810][85176] Updated weights for policy 0, policy_version 1002 (0.0009) +[2023-10-11 14:57:10,181][85176] Updated weights for policy 0, policy_version 1012 (0.0008) +[2023-10-11 14:57:10,552][85176] Updated weights for policy 0, policy_version 1022 (0.0007) +[2023-10-11 14:57:11,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 12893.8). Total num frames: 2097152. Throughput: 0: 1649.9, 1: 1694.2. Samples: 529858. Policy #0 lag: (min: 13.0, avg: 24.0, max: 45.0) +[2023-10-11 14:57:11,064][84230] Avg episode reward: [(0, '5.440'), (1, '5.600')] +[2023-10-11 14:57:11,066][85000] Saving new best policy, reward=5.600! +[2023-10-11 14:57:12,236][85175] Updated weights for policy 1, policy_version 1030 (0.0008) +[2023-10-11 14:57:12,608][85175] Updated weights for policy 1, policy_version 1040 (0.0007) +[2023-10-11 14:57:12,974][85175] Updated weights for policy 1, policy_version 1050 (0.0007) +[2023-10-11 14:57:14,589][85176] Updated weights for policy 0, policy_version 1032 (0.0009) +[2023-10-11 14:57:14,963][85176] Updated weights for policy 0, policy_version 1042 (0.0008) +[2023-10-11 14:57:15,337][85176] Updated weights for policy 0, policy_version 1052 (0.0007) +[2023-10-11 14:57:16,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 12900.2). Total num frames: 2162688. Throughput: 0: 1649.9, 1: 1696.3. Samples: 549532. Policy #0 lag: (min: 3.0, avg: 7.9, max: 35.0) +[2023-10-11 14:57:16,064][84230] Avg episode reward: [(0, '5.270'), (1, '5.700')] +[2023-10-11 14:57:16,078][85000] Saving new best policy, reward=5.700! 
+[2023-10-11 14:57:17,204][85175] Updated weights for policy 1, policy_version 1060 (0.0007) +[2023-10-11 14:57:17,574][85175] Updated weights for policy 1, policy_version 1070 (0.0008) +[2023-10-11 14:57:17,937][85175] Updated weights for policy 1, policy_version 1080 (0.0007) +[2023-10-11 14:57:19,511][85176] Updated weights for policy 0, policy_version 1062 (0.0008) +[2023-10-11 14:57:19,895][85176] Updated weights for policy 0, policy_version 1072 (0.0007) +[2023-10-11 14:57:20,268][85176] Updated weights for policy 0, policy_version 1082 (0.0010) +[2023-10-11 14:57:21,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 12906.2). Total num frames: 2228224. Throughput: 0: 1658.3, 1: 1675.3. Samples: 559804. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 14:57:21,063][84230] Avg episode reward: [(0, '5.390'), (1, '5.420')] +[2023-10-11 14:57:21,872][85175] Updated weights for policy 1, policy_version 1090 (0.0007) +[2023-10-11 14:57:22,242][85175] Updated weights for policy 1, policy_version 1100 (0.0009) +[2023-10-11 14:57:22,609][85175] Updated weights for policy 1, policy_version 1110 (0.0008) +[2023-10-11 14:57:22,975][85175] Updated weights for policy 1, policy_version 1120 (0.0010) +[2023-10-11 14:57:24,341][85176] Updated weights for policy 0, policy_version 1092 (0.0007) +[2023-10-11 14:57:24,704][85176] Updated weights for policy 0, policy_version 1102 (0.0007) +[2023-10-11 14:57:25,085][85176] Updated weights for policy 0, policy_version 1112 (0.0008) +[2023-10-11 14:57:26,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 12911.8). Total num frames: 2293760. Throughput: 0: 1648.8, 1: 1696.3. Samples: 580098. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-11 14:57:26,063][84230] Avg episode reward: [(0, '5.330'), (1, '5.580')] +[2023-10-11 14:57:26,779][85175] Updated weights for policy 1, policy_version 1130 (0.0011) +[2023-10-11 14:57:27,153][85175] Updated weights for policy 1, policy_version 1140 (0.0010) +[2023-10-11 14:57:27,523][85175] Updated weights for policy 1, policy_version 1150 (0.0009) +[2023-10-11 14:57:29,186][85176] Updated weights for policy 0, policy_version 1122 (0.0008) +[2023-10-11 14:57:29,554][85176] Updated weights for policy 0, policy_version 1132 (0.0007) +[2023-10-11 14:57:29,927][85176] Updated weights for policy 0, policy_version 1142 (0.0010) +[2023-10-11 14:57:30,296][85176] Updated weights for policy 0, policy_version 1152 (0.0009) +[2023-10-11 14:57:31,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 12917.2). Total num frames: 2359296. Throughput: 0: 1652.3, 1: 1701.8. Samples: 600212. Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) +[2023-10-11 14:57:31,063][84230] Avg episode reward: [(0, '5.320'), (1, '5.630')] +[2023-10-11 14:57:31,508][85175] Updated weights for policy 1, policy_version 1160 (0.0008) +[2023-10-11 14:57:31,869][85175] Updated weights for policy 1, policy_version 1170 (0.0007) +[2023-10-11 14:57:32,236][85175] Updated weights for policy 1, policy_version 1180 (0.0007) +[2023-10-11 14:57:34,348][85176] Updated weights for policy 0, policy_version 1162 (0.0009) +[2023-10-11 14:57:34,713][85176] Updated weights for policy 0, policy_version 1172 (0.0009) +[2023-10-11 14:57:35,089][85176] Updated weights for policy 0, policy_version 1182 (0.0008) +[2023-10-11 14:57:36,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 12922.2). Total num frames: 2424832. Throughput: 0: 1661.2, 1: 1691.5. Samples: 610604. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 14:57:36,063][84230] Avg episode reward: [(0, '5.450'), (1, '5.400')] +[2023-10-11 14:57:36,290][85175] Updated weights for policy 1, policy_version 1190 (0.0009) +[2023-10-11 14:57:36,663][85175] Updated weights for policy 1, policy_version 1200 (0.0010) +[2023-10-11 14:57:37,034][85175] Updated weights for policy 1, policy_version 1210 (0.0010) +[2023-10-11 14:57:39,283][85176] Updated weights for policy 0, policy_version 1192 (0.0009) +[2023-10-11 14:57:39,655][85176] Updated weights for policy 0, policy_version 1202 (0.0011) +[2023-10-11 14:57:40,031][85176] Updated weights for policy 0, policy_version 1212 (0.0010) +[2023-10-11 14:57:41,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 12927.0). Total num frames: 2490368. Throughput: 0: 1648.1, 1: 1700.8. Samples: 630494. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 14:57:41,064][84230] Avg episode reward: [(0, '5.380'), (1, '5.810')] +[2023-10-11 14:57:41,172][85175] Updated weights for policy 1, policy_version 1220 (0.0008) +[2023-10-11 14:57:41,536][85175] Updated weights for policy 1, policy_version 1230 (0.0007) +[2023-10-11 14:57:41,901][85175] Updated weights for policy 1, policy_version 1240 (0.0011) +[2023-10-11 14:57:42,193][85000] Saving new best policy, reward=5.810! +[2023-10-11 14:57:44,203][85176] Updated weights for policy 0, policy_version 1222 (0.0010) +[2023-10-11 14:57:44,604][85176] Updated weights for policy 0, policy_version 1232 (0.0009) +[2023-10-11 14:57:44,987][85176] Updated weights for policy 0, policy_version 1242 (0.0009) +[2023-10-11 14:57:45,991][85175] Updated weights for policy 1, policy_version 1250 (0.0009) +[2023-10-11 14:57:46,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 12931.6). Total num frames: 2555904. Throughput: 0: 1654.8, 1: 1699.8. Samples: 650372. 
Policy #0 lag: (min: 31.0, avg: 31.9, max: 51.0) +[2023-10-11 14:57:46,064][84230] Avg episode reward: [(0, '5.330'), (1, '5.850')] +[2023-10-11 14:57:46,381][85175] Updated weights for policy 1, policy_version 1260 (0.0009) +[2023-10-11 14:57:46,751][85175] Updated weights for policy 1, policy_version 1270 (0.0011) +[2023-10-11 14:57:47,129][85000] Saving new best policy, reward=5.850! +[2023-10-11 14:57:47,129][85175] Updated weights for policy 1, policy_version 1280 (0.0010) +[2023-10-11 14:57:49,053][85176] Updated weights for policy 0, policy_version 1252 (0.0008) +[2023-10-11 14:57:49,424][85176] Updated weights for policy 0, policy_version 1262 (0.0010) +[2023-10-11 14:57:49,805][85176] Updated weights for policy 0, policy_version 1272 (0.0008) +[2023-10-11 14:57:51,042][85175] Updated weights for policy 1, policy_version 1290 (0.0007) +[2023-10-11 14:57:51,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 12935.9). Total num frames: 2621440. Throughput: 0: 1657.7, 1: 1696.8. Samples: 660482. Policy #0 lag: (min: 31.0, avg: 38.2, max: 63.0) +[2023-10-11 14:57:51,063][84230] Avg episode reward: [(0, '5.510'), (1, '5.790')] +[2023-10-11 14:57:51,406][85175] Updated weights for policy 1, policy_version 1300 (0.0008) +[2023-10-11 14:57:51,780][85175] Updated weights for policy 1, policy_version 1310 (0.0010) +[2023-10-11 14:57:54,022][85176] Updated weights for policy 0, policy_version 1282 (0.0010) +[2023-10-11 14:57:54,398][85176] Updated weights for policy 0, policy_version 1292 (0.0009) +[2023-10-11 14:57:54,771][85176] Updated weights for policy 0, policy_version 1302 (0.0009) +[2023-10-11 14:57:55,148][85176] Updated weights for policy 0, policy_version 1312 (0.0009) +[2023-10-11 14:57:55,821][85175] Updated weights for policy 1, policy_version 1320 (0.0008) +[2023-10-11 14:57:56,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 12940.0). Total num frames: 2686976. Throughput: 0: 1650.6, 1: 1695.8. Samples: 680446. 
Policy #0 lag: (min: 26.0, avg: 28.4, max: 51.0) +[2023-10-11 14:57:56,063][84230] Avg episode reward: [(0, '5.490'), (1, '5.880')] +[2023-10-11 14:57:56,192][85175] Updated weights for policy 1, policy_version 1330 (0.0009) +[2023-10-11 14:57:56,559][85175] Updated weights for policy 1, policy_version 1340 (0.0008) +[2023-10-11 14:57:56,706][85000] Saving new best policy, reward=5.880! +[2023-10-11 14:57:59,305][85176] Updated weights for policy 0, policy_version 1322 (0.0009) +[2023-10-11 14:57:59,681][85176] Updated weights for policy 0, policy_version 1332 (0.0008) +[2023-10-11 14:58:00,055][85176] Updated weights for policy 0, policy_version 1342 (0.0008) +[2023-10-11 14:58:00,779][85175] Updated weights for policy 1, policy_version 1350 (0.0008) +[2023-10-11 14:58:01,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 12944.0). Total num frames: 2752512. Throughput: 0: 1657.0, 1: 1695.7. Samples: 700402. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 14:58:01,064][84230] Avg episode reward: [(0, '5.690'), (1, '5.860')] +[2023-10-11 14:58:01,075][84801] Saving new best policy, reward=5.690! +[2023-10-11 14:58:01,148][85175] Updated weights for policy 1, policy_version 1360 (0.0008) +[2023-10-11 14:58:01,515][85175] Updated weights for policy 1, policy_version 1370 (0.0007) +[2023-10-11 14:58:04,060][85176] Updated weights for policy 0, policy_version 1352 (0.0009) +[2023-10-11 14:58:04,433][85176] Updated weights for policy 0, policy_version 1362 (0.0007) +[2023-10-11 14:58:04,806][85176] Updated weights for policy 0, policy_version 1372 (0.0007) +[2023-10-11 14:58:05,530][85175] Updated weights for policy 1, policy_version 1380 (0.0009) +[2023-10-11 14:58:05,896][85175] Updated weights for policy 1, policy_version 1390 (0.0007) +[2023-10-11 14:58:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 12947.7). Total num frames: 2818048. Throughput: 0: 1661.3, 1: 1692.0. Samples: 710704. 
Policy #0 lag: (min: 10.0, avg: 11.3, max: 27.0) +[2023-10-11 14:58:06,063][84230] Avg episode reward: [(0, '5.630'), (1, '6.000')] +[2023-10-11 14:58:06,255][85175] Updated weights for policy 1, policy_version 1400 (0.0007) +[2023-10-11 14:58:06,543][85000] Saving new best policy, reward=6.000! +[2023-10-11 14:58:08,904][85176] Updated weights for policy 0, policy_version 1382 (0.0008) +[2023-10-11 14:58:09,278][85176] Updated weights for policy 0, policy_version 1392 (0.0008) +[2023-10-11 14:58:09,639][85176] Updated weights for policy 0, policy_version 1402 (0.0010) +[2023-10-11 14:58:10,277][85175] Updated weights for policy 1, policy_version 1410 (0.0008) +[2023-10-11 14:58:10,649][85175] Updated weights for policy 1, policy_version 1420 (0.0008) +[2023-10-11 14:58:11,024][85175] Updated weights for policy 1, policy_version 1430 (0.0007) +[2023-10-11 14:58:11,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 12951.3). Total num frames: 2883584. Throughput: 0: 1652.2, 1: 1687.4. Samples: 730382. Policy #0 lag: (min: 10.0, avg: 16.7, max: 42.0) +[2023-10-11 14:58:11,064][84230] Avg episode reward: [(0, '5.670'), (1, '6.010')] +[2023-10-11 14:58:11,384][85175] Updated weights for policy 1, policy_version 1440 (0.0008) +[2023-10-11 14:58:11,384][85000] Saving new best policy, reward=6.010! +[2023-10-11 14:58:13,666][85176] Updated weights for policy 0, policy_version 1412 (0.0009) +[2023-10-11 14:58:14,044][85176] Updated weights for policy 0, policy_version 1422 (0.0009) +[2023-10-11 14:58:14,418][85176] Updated weights for policy 0, policy_version 1432 (0.0008) +[2023-10-11 14:58:15,583][85175] Updated weights for policy 1, policy_version 1450 (0.0007) +[2023-10-11 14:58:15,953][85175] Updated weights for policy 1, policy_version 1460 (0.0009) +[2023-10-11 14:58:16,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 12954.7). Total num frames: 2949120. Throughput: 0: 1663.0, 1: 1669.9. Samples: 750192. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-11 14:58:16,063][84230] Avg episode reward: [(0, '5.860'), (1, '6.290')] +[2023-10-11 14:58:16,071][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000001440_1474560.pth... +[2023-10-11 14:58:16,107][84801] Saving new best policy, reward=5.860! +[2023-10-11 14:58:16,316][85175] Updated weights for policy 1, policy_version 1470 (0.0008) +[2023-10-11 14:58:16,386][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000001472_1507328.pth... +[2023-10-11 14:58:16,415][85000] Saving new best policy, reward=6.290! +[2023-10-11 14:58:18,391][85176] Updated weights for policy 0, policy_version 1442 (0.0008) +[2023-10-11 14:58:18,761][85176] Updated weights for policy 0, policy_version 1452 (0.0008) +[2023-10-11 14:58:19,137][85176] Updated weights for policy 0, policy_version 1462 (0.0009) +[2023-10-11 14:58:19,502][85176] Updated weights for policy 0, policy_version 1472 (0.0008) +[2023-10-11 14:58:20,506][85175] Updated weights for policy 1, policy_version 1480 (0.0009) +[2023-10-11 14:58:20,886][85175] Updated weights for policy 1, policy_version 1490 (0.0009) +[2023-10-11 14:58:21,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 12958.0). Total num frames: 3014656. Throughput: 0: 1654.4, 1: 1675.0. Samples: 760428. Policy #0 lag: (min: 31.0, avg: 36.2, max: 63.0) +[2023-10-11 14:58:21,063][84230] Avg episode reward: [(0, '5.860'), (1, '6.550')] +[2023-10-11 14:58:21,256][85175] Updated weights for policy 1, policy_version 1500 (0.0009) +[2023-10-11 14:58:21,399][85000] Saving new best policy, reward=6.550! 
+[2023-10-11 14:58:23,695][85176] Updated weights for policy 0, policy_version 1482 (0.0009) +[2023-10-11 14:58:24,073][85176] Updated weights for policy 0, policy_version 1492 (0.0011) +[2023-10-11 14:58:24,452][85176] Updated weights for policy 0, policy_version 1502 (0.0010) +[2023-10-11 14:58:25,513][85175] Updated weights for policy 1, policy_version 1510 (0.0008) +[2023-10-11 14:58:25,884][85175] Updated weights for policy 1, policy_version 1520 (0.0009) +[2023-10-11 14:58:26,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 12961.1). Total num frames: 3080192. Throughput: 0: 1649.9, 1: 1672.0. Samples: 779978. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-11 14:58:26,064][84230] Avg episode reward: [(0, '5.980'), (1, '6.240')] +[2023-10-11 14:58:26,065][84801] Saving new best policy, reward=5.980! +[2023-10-11 14:58:26,253][85175] Updated weights for policy 1, policy_version 1530 (0.0007) +[2023-10-11 14:58:28,505][85176] Updated weights for policy 0, policy_version 1512 (0.0008) +[2023-10-11 14:58:28,884][85176] Updated weights for policy 0, policy_version 1522 (0.0009) +[2023-10-11 14:58:29,249][85176] Updated weights for policy 0, policy_version 1532 (0.0009) +[2023-10-11 14:58:30,151][85175] Updated weights for policy 1, policy_version 1540 (0.0008) +[2023-10-11 14:58:30,519][85175] Updated weights for policy 1, policy_version 1550 (0.0008) +[2023-10-11 14:58:30,883][85175] Updated weights for policy 1, policy_version 1560 (0.0007) +[2023-10-11 14:58:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 12964.2). Total num frames: 3145728. Throughput: 0: 1665.7, 1: 1661.3. Samples: 800084. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-10-11 14:58:31,063][84230] Avg episode reward: [(0, '5.970'), (1, '6.010')] +[2023-10-11 14:58:33,352][85176] Updated weights for policy 0, policy_version 1542 (0.0010) +[2023-10-11 14:58:33,722][85176] Updated weights for policy 0, policy_version 1552 (0.0009) +[2023-10-11 14:58:34,090][85176] Updated weights for policy 0, policy_version 1562 (0.0010) +[2023-10-11 14:58:34,993][85175] Updated weights for policy 1, policy_version 1570 (0.0010) +[2023-10-11 14:58:35,396][85175] Updated weights for policy 1, policy_version 1580 (0.0010) +[2023-10-11 14:58:35,766][85175] Updated weights for policy 1, policy_version 1590 (0.0008) +[2023-10-11 14:58:36,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 12967.0). Total num frames: 3211264. Throughput: 0: 1654.9, 1: 1679.4. Samples: 810524. Policy #0 lag: (min: 17.0, avg: 29.5, max: 49.0) +[2023-10-11 14:58:36,064][84230] Avg episode reward: [(0, '6.210'), (1, '6.080')] +[2023-10-11 14:58:36,065][84801] Saving new best policy, reward=6.210! +[2023-10-11 14:58:36,137][85175] Updated weights for policy 1, policy_version 1600 (0.0008) +[2023-10-11 14:58:38,284][85176] Updated weights for policy 0, policy_version 1572 (0.0010) +[2023-10-11 14:58:38,664][85176] Updated weights for policy 0, policy_version 1582 (0.0009) +[2023-10-11 14:58:39,027][85176] Updated weights for policy 0, policy_version 1592 (0.0007) +[2023-10-11 14:58:40,177][85175] Updated weights for policy 1, policy_version 1610 (0.0008) +[2023-10-11 14:58:40,545][85175] Updated weights for policy 1, policy_version 1620 (0.0007) +[2023-10-11 14:58:40,906][85175] Updated weights for policy 1, policy_version 1630 (0.0009) +[2023-10-11 14:58:41,063][84230] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13099.5). Total num frames: 3309568. Throughput: 0: 1649.8, 1: 1676.7. Samples: 830138. 
Policy #0 lag: (min: 30.0, avg: 34.5, max: 62.0) +[2023-10-11 14:58:41,064][84230] Avg episode reward: [(0, '6.210'), (1, '6.020')] +[2023-10-11 14:58:43,176][85176] Updated weights for policy 0, policy_version 1602 (0.0008) +[2023-10-11 14:58:43,543][85176] Updated weights for policy 0, policy_version 1612 (0.0009) +[2023-10-11 14:58:43,915][85176] Updated weights for policy 0, policy_version 1622 (0.0010) +[2023-10-11 14:58:44,283][85176] Updated weights for policy 0, policy_version 1632 (0.0010) +[2023-10-11 14:58:45,065][85175] Updated weights for policy 1, policy_version 1640 (0.0011) +[2023-10-11 14:58:45,431][85175] Updated weights for policy 1, policy_version 1650 (0.0008) +[2023-10-11 14:58:45,804][85175] Updated weights for policy 1, policy_version 1660 (0.0010) +[2023-10-11 14:58:46,062][84230] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 13099.7). Total num frames: 3375104. Throughput: 0: 1665.8, 1: 1657.1. Samples: 849934. Policy #0 lag: (min: 13.0, avg: 20.1, max: 45.0) +[2023-10-11 14:58:46,063][84230] Avg episode reward: [(0, '6.110'), (1, '6.220')] +[2023-10-11 14:58:48,387][85176] Updated weights for policy 0, policy_version 1642 (0.0008) +[2023-10-11 14:58:48,749][85176] Updated weights for policy 0, policy_version 1652 (0.0009) +[2023-10-11 14:58:49,122][85176] Updated weights for policy 0, policy_version 1662 (0.0008) +[2023-10-11 14:58:49,970][85175] Updated weights for policy 1, policy_version 1670 (0.0009) +[2023-10-11 14:58:50,336][85175] Updated weights for policy 1, policy_version 1680 (0.0009) +[2023-10-11 14:58:50,711][85175] Updated weights for policy 1, policy_version 1690 (0.0007) +[2023-10-11 14:58:51,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13099.8). Total num frames: 3440640. Throughput: 0: 1652.9, 1: 1674.0. Samples: 860416. 
Policy #0 lag: (min: 4.0, avg: 12.0, max: 36.0) +[2023-10-11 14:58:51,063][84230] Avg episode reward: [(0, '6.190'), (1, '6.260')] +[2023-10-11 14:58:53,357][85176] Updated weights for policy 0, policy_version 1672 (0.0010) +[2023-10-11 14:58:53,736][85176] Updated weights for policy 0, policy_version 1682 (0.0008) +[2023-10-11 14:58:54,108][85176] Updated weights for policy 0, policy_version 1692 (0.0010) +[2023-10-11 14:58:54,775][85175] Updated weights for policy 1, policy_version 1700 (0.0009) +[2023-10-11 14:58:55,146][85175] Updated weights for policy 1, policy_version 1710 (0.0008) +[2023-10-11 14:58:55,508][85175] Updated weights for policy 1, policy_version 1720 (0.0008) +[2023-10-11 14:58:56,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13100.0). Total num frames: 3506176. Throughput: 0: 1658.7, 1: 1674.3. Samples: 880366. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 14:58:56,063][84230] Avg episode reward: [(0, '6.200'), (1, '6.250')] +[2023-10-11 14:58:58,067][85176] Updated weights for policy 0, policy_version 1702 (0.0011) +[2023-10-11 14:58:58,432][85176] Updated weights for policy 0, policy_version 1712 (0.0010) +[2023-10-11 14:58:58,816][85176] Updated weights for policy 0, policy_version 1722 (0.0009) +[2023-10-11 14:58:59,602][85175] Updated weights for policy 1, policy_version 1730 (0.0007) +[2023-10-11 14:58:59,978][85175] Updated weights for policy 1, policy_version 1740 (0.0007) +[2023-10-11 14:59:00,343][85175] Updated weights for policy 1, policy_version 1750 (0.0007) +[2023-10-11 14:59:00,714][85175] Updated weights for policy 1, policy_version 1760 (0.0007) +[2023-10-11 14:59:01,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13100.1). Total num frames: 3571712. Throughput: 0: 1665.6, 1: 1663.4. Samples: 899994. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 14:59:01,063][84230] Avg episode reward: [(0, '6.490'), (1, '6.010')] +[2023-10-11 14:59:01,071][84801] Saving new best policy, reward=6.490! +[2023-10-11 14:59:02,916][85176] Updated weights for policy 0, policy_version 1732 (0.0009) +[2023-10-11 14:59:03,294][85176] Updated weights for policy 0, policy_version 1742 (0.0007) +[2023-10-11 14:59:03,677][85176] Updated weights for policy 0, policy_version 1752 (0.0009) +[2023-10-11 14:59:04,739][85175] Updated weights for policy 1, policy_version 1770 (0.0011) +[2023-10-11 14:59:05,115][85175] Updated weights for policy 1, policy_version 1780 (0.0009) +[2023-10-11 14:59:05,474][85175] Updated weights for policy 1, policy_version 1790 (0.0008) +[2023-10-11 14:59:06,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13100.2). Total num frames: 3637248. Throughput: 0: 1652.8, 1: 1681.9. Samples: 910488. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 14:59:06,063][84230] Avg episode reward: [(0, '6.440'), (1, '6.270')] +[2023-10-11 14:59:07,797][85176] Updated weights for policy 0, policy_version 1762 (0.0008) +[2023-10-11 14:59:08,189][85176] Updated weights for policy 0, policy_version 1772 (0.0009) +[2023-10-11 14:59:08,563][85176] Updated weights for policy 0, policy_version 1782 (0.0009) +[2023-10-11 14:59:08,933][85176] Updated weights for policy 0, policy_version 1792 (0.0009) +[2023-10-11 14:59:09,582][85175] Updated weights for policy 1, policy_version 1800 (0.0007) +[2023-10-11 14:59:09,948][85175] Updated weights for policy 1, policy_version 1810 (0.0008) +[2023-10-11 14:59:10,317][85175] Updated weights for policy 1, policy_version 1820 (0.0007) +[2023-10-11 14:59:11,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13100.3). Total num frames: 3702784. Throughput: 0: 1665.3, 1: 1677.4. Samples: 930398. 
Policy #0 lag: (min: 8.0, avg: 19.5, max: 40.0) +[2023-10-11 14:59:11,064][84230] Avg episode reward: [(0, '6.530'), (1, '6.450')] +[2023-10-11 14:59:11,065][84801] Saving new best policy, reward=6.530! +[2023-10-11 14:59:12,879][85176] Updated weights for policy 0, policy_version 1802 (0.0010) +[2023-10-11 14:59:13,256][85176] Updated weights for policy 0, policy_version 1812 (0.0008) +[2023-10-11 14:59:13,630][85176] Updated weights for policy 0, policy_version 1822 (0.0008) +[2023-10-11 14:59:14,398][85175] Updated weights for policy 1, policy_version 1830 (0.0009) +[2023-10-11 14:59:14,761][85175] Updated weights for policy 1, policy_version 1840 (0.0008) +[2023-10-11 14:59:15,140][85175] Updated weights for policy 1, policy_version 1850 (0.0008) +[2023-10-11 14:59:16,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13100.4). Total num frames: 3768320. Throughput: 0: 1667.6, 1: 1665.5. Samples: 950076. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 14:59:16,064][84230] Avg episode reward: [(0, '6.560'), (1, '6.170')] +[2023-10-11 14:59:16,076][84801] Saving new best policy, reward=6.560! +[2023-10-11 14:59:18,052][85176] Updated weights for policy 0, policy_version 1832 (0.0007) +[2023-10-11 14:59:18,430][85176] Updated weights for policy 0, policy_version 1842 (0.0007) +[2023-10-11 14:59:18,813][85176] Updated weights for policy 0, policy_version 1852 (0.0008) +[2023-10-11 14:59:19,188][85175] Updated weights for policy 1, policy_version 1860 (0.0008) +[2023-10-11 14:59:19,561][85175] Updated weights for policy 1, policy_version 1870 (0.0010) +[2023-10-11 14:59:19,923][85175] Updated weights for policy 1, policy_version 1880 (0.0008) +[2023-10-11 14:59:21,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13100.6). Total num frames: 3833856. Throughput: 0: 1656.7, 1: 1679.8. Samples: 960664. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 14:59:21,064][84230] Avg episode reward: [(0, '6.420'), (1, '6.360')] +[2023-10-11 14:59:22,838][85176] Updated weights for policy 0, policy_version 1862 (0.0009) +[2023-10-11 14:59:23,205][85176] Updated weights for policy 0, policy_version 1872 (0.0010) +[2023-10-11 14:59:23,572][85176] Updated weights for policy 0, policy_version 1882 (0.0011) +[2023-10-11 14:59:24,194][85175] Updated weights for policy 1, policy_version 1890 (0.0009) +[2023-10-11 14:59:24,596][85175] Updated weights for policy 1, policy_version 1900 (0.0010) +[2023-10-11 14:59:24,963][85175] Updated weights for policy 1, policy_version 1910 (0.0010) +[2023-10-11 14:59:25,343][85175] Updated weights for policy 1, policy_version 1920 (0.0008) +[2023-10-11 14:59:26,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13218.3). Total num frames: 3899392. Throughput: 0: 1666.5, 1: 1667.0. Samples: 980146. Policy #0 lag: (min: 26.0, avg: 30.0, max: 58.0) +[2023-10-11 14:59:26,063][84230] Avg episode reward: [(0, '6.340'), (1, '6.720')] +[2023-10-11 14:59:26,064][85000] Saving new best policy, reward=6.720! +[2023-10-11 14:59:27,721][85176] Updated weights for policy 0, policy_version 1892 (0.0009) +[2023-10-11 14:59:28,095][85176] Updated weights for policy 0, policy_version 1902 (0.0010) +[2023-10-11 14:59:28,459][85176] Updated weights for policy 0, policy_version 1912 (0.0009) +[2023-10-11 14:59:29,349][85175] Updated weights for policy 1, policy_version 1930 (0.0009) +[2023-10-11 14:59:29,721][85175] Updated weights for policy 1, policy_version 1940 (0.0011) +[2023-10-11 14:59:30,085][85175] Updated weights for policy 1, policy_version 1950 (0.0010) +[2023-10-11 14:59:31,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 3964928. Throughput: 0: 1666.8, 1: 1663.8. Samples: 999808. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 14:59:31,063][84230] Avg episode reward: [(0, '6.280'), (1, '6.600')] +[2023-10-11 14:59:32,557][85176] Updated weights for policy 0, policy_version 1922 (0.0008) +[2023-10-11 14:59:32,934][85176] Updated weights for policy 0, policy_version 1932 (0.0007) +[2023-10-11 14:59:33,312][85176] Updated weights for policy 0, policy_version 1942 (0.0008) +[2023-10-11 14:59:33,685][85176] Updated weights for policy 0, policy_version 1952 (0.0010) +[2023-10-11 14:59:34,039][85175] Updated weights for policy 1, policy_version 1960 (0.0010) +[2023-10-11 14:59:34,400][85175] Updated weights for policy 1, policy_version 1970 (0.0009) +[2023-10-11 14:59:34,769][85175] Updated weights for policy 1, policy_version 1980 (0.0007) +[2023-10-11 14:59:36,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 4030464. Throughput: 0: 1650.9, 1: 1685.7. Samples: 1010566. Policy #0 lag: (min: 31.0, avg: 36.3, max: 63.0) +[2023-10-11 14:59:36,063][84230] Avg episode reward: [(0, '6.240'), (1, '6.640')] +[2023-10-11 14:59:37,883][85176] Updated weights for policy 0, policy_version 1962 (0.0010) +[2023-10-11 14:59:38,261][85176] Updated weights for policy 0, policy_version 1972 (0.0008) +[2023-10-11 14:59:38,632][85176] Updated weights for policy 0, policy_version 1982 (0.0009) +[2023-10-11 14:59:38,846][85175] Updated weights for policy 1, policy_version 1990 (0.0009) +[2023-10-11 14:59:39,218][85175] Updated weights for policy 1, policy_version 2000 (0.0008) +[2023-10-11 14:59:39,584][85175] Updated weights for policy 1, policy_version 2010 (0.0009) +[2023-10-11 14:59:41,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 4096000. Throughput: 0: 1657.4, 1: 1666.9. Samples: 1029960. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) +[2023-10-11 14:59:41,064][84230] Avg episode reward: [(0, '6.320'), (1, '6.760')] +[2023-10-11 14:59:41,065][85000] Saving new best policy, reward=6.760! +[2023-10-11 14:59:42,695][85176] Updated weights for policy 0, policy_version 1992 (0.0009) +[2023-10-11 14:59:43,070][85176] Updated weights for policy 0, policy_version 2002 (0.0009) +[2023-10-11 14:59:43,447][85176] Updated weights for policy 0, policy_version 2012 (0.0007) +[2023-10-11 14:59:43,580][85175] Updated weights for policy 1, policy_version 2020 (0.0008) +[2023-10-11 14:59:43,952][85175] Updated weights for policy 1, policy_version 2030 (0.0008) +[2023-10-11 14:59:44,324][85175] Updated weights for policy 1, policy_version 2040 (0.0009) +[2023-10-11 14:59:46,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 4161536. Throughput: 0: 1658.3, 1: 1681.6. Samples: 1050292. Policy #0 lag: (min: 1.0, avg: 13.9, max: 33.0) +[2023-10-11 14:59:46,063][84230] Avg episode reward: [(0, '6.390'), (1, '6.730')] +[2023-10-11 14:59:47,421][85176] Updated weights for policy 0, policy_version 2022 (0.0010) +[2023-10-11 14:59:47,799][85176] Updated weights for policy 0, policy_version 2032 (0.0009) +[2023-10-11 14:59:48,167][85176] Updated weights for policy 0, policy_version 2042 (0.0009) +[2023-10-11 14:59:48,433][85175] Updated weights for policy 1, policy_version 2050 (0.0010) +[2023-10-11 14:59:48,795][85175] Updated weights for policy 1, policy_version 2060 (0.0009) +[2023-10-11 14:59:49,164][85175] Updated weights for policy 1, policy_version 2070 (0.0008) +[2023-10-11 14:59:49,534][85175] Updated weights for policy 1, policy_version 2080 (0.0008) +[2023-10-11 14:59:51,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 4227072. Throughput: 0: 1651.7, 1: 1684.3. Samples: 1060608. 
Policy #0 lag: (min: 31.0, avg: 32.5, max: 57.0) +[2023-10-11 14:59:51,063][84230] Avg episode reward: [(0, '6.200'), (1, '6.280')] +[2023-10-11 14:59:52,331][85176] Updated weights for policy 0, policy_version 2052 (0.0008) +[2023-10-11 14:59:52,706][85176] Updated weights for policy 0, policy_version 2062 (0.0010) +[2023-10-11 14:59:53,083][85176] Updated weights for policy 0, policy_version 2072 (0.0010) +[2023-10-11 14:59:53,429][85175] Updated weights for policy 1, policy_version 2090 (0.0009) +[2023-10-11 14:59:53,795][85175] Updated weights for policy 1, policy_version 2100 (0.0009) +[2023-10-11 14:59:54,165][85175] Updated weights for policy 1, policy_version 2110 (0.0011) +[2023-10-11 14:59:56,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 4292608. Throughput: 0: 1656.2, 1: 1664.4. Samples: 1079826. Policy #0 lag: (min: 31.0, avg: 31.1, max: 38.0) +[2023-10-11 14:59:56,064][84230] Avg episode reward: [(0, '6.500'), (1, '6.170')] +[2023-10-11 14:59:57,237][85176] Updated weights for policy 0, policy_version 2082 (0.0007) +[2023-10-11 14:59:57,599][85176] Updated weights for policy 0, policy_version 2092 (0.0008) +[2023-10-11 14:59:57,972][85176] Updated weights for policy 0, policy_version 2102 (0.0008) +[2023-10-11 14:59:58,339][85176] Updated weights for policy 0, policy_version 2112 (0.0008) +[2023-10-11 14:59:58,402][85175] Updated weights for policy 1, policy_version 2120 (0.0008) +[2023-10-11 14:59:58,768][85175] Updated weights for policy 1, policy_version 2130 (0.0008) +[2023-10-11 14:59:59,136][85175] Updated weights for policy 1, policy_version 2140 (0.0008) +[2023-10-11 15:00:01,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 4358144. Throughput: 0: 1653.8, 1: 1686.1. Samples: 1100374. 
Policy #0 lag: (min: 26.0, avg: 26.3, max: 37.0) +[2023-10-11 15:00:01,063][84230] Avg episode reward: [(0, '6.410'), (1, '6.270')] +[2023-10-11 15:00:02,685][85176] Updated weights for policy 0, policy_version 2122 (0.0009) +[2023-10-11 15:00:03,067][85176] Updated weights for policy 0, policy_version 2132 (0.0009) +[2023-10-11 15:00:03,154][85175] Updated weights for policy 1, policy_version 2150 (0.0008) +[2023-10-11 15:00:03,438][85176] Updated weights for policy 0, policy_version 2142 (0.0009) +[2023-10-11 15:00:03,527][85175] Updated weights for policy 1, policy_version 2160 (0.0007) +[2023-10-11 15:00:03,902][85175] Updated weights for policy 1, policy_version 2170 (0.0009) +[2023-10-11 15:00:06,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 4423680. Throughput: 0: 1645.0, 1: 1674.0. Samples: 1110016. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:00:06,063][84230] Avg episode reward: [(0, '6.670'), (1, '6.130')] +[2023-10-11 15:00:06,064][84801] Saving new best policy, reward=6.670! +[2023-10-11 15:00:07,602][85176] Updated weights for policy 0, policy_version 2152 (0.0008) +[2023-10-11 15:00:07,968][85175] Updated weights for policy 1, policy_version 2180 (0.0010) +[2023-10-11 15:00:07,970][85176] Updated weights for policy 0, policy_version 2162 (0.0008) +[2023-10-11 15:00:08,343][85175] Updated weights for policy 1, policy_version 2190 (0.0008) +[2023-10-11 15:00:08,344][85176] Updated weights for policy 0, policy_version 2172 (0.0007) +[2023-10-11 15:00:08,712][85175] Updated weights for policy 1, policy_version 2200 (0.0007) +[2023-10-11 15:00:11,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 4489216. Throughput: 0: 1651.1, 1: 1672.1. Samples: 1129690. 
Policy #0 lag: (min: 31.0, avg: 31.2, max: 40.0) +[2023-10-11 15:00:11,064][84230] Avg episode reward: [(0, '6.740'), (1, '6.280')] +[2023-10-11 15:00:11,065][84801] Saving new best policy, reward=6.740! +[2023-10-11 15:00:12,386][85176] Updated weights for policy 0, policy_version 2182 (0.0008) +[2023-10-11 15:00:12,766][85176] Updated weights for policy 0, policy_version 2192 (0.0007) +[2023-10-11 15:00:12,819][85175] Updated weights for policy 1, policy_version 2210 (0.0008) +[2023-10-11 15:00:13,140][85176] Updated weights for policy 0, policy_version 2202 (0.0009) +[2023-10-11 15:00:13,243][85175] Updated weights for policy 1, policy_version 2220 (0.0007) +[2023-10-11 15:00:13,614][85175] Updated weights for policy 1, policy_version 2230 (0.0007) +[2023-10-11 15:00:13,976][85175] Updated weights for policy 1, policy_version 2240 (0.0010) +[2023-10-11 15:00:16,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 4554752. Throughput: 0: 1647.6, 1: 1691.7. Samples: 1150078. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:00:16,063][84230] Avg episode reward: [(0, '6.830'), (1, '6.180')] +[2023-10-11 15:00:16,074][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000002240_2293760.pth... +[2023-10-11 15:00:16,074][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000002208_2260992.pth... +[2023-10-11 15:00:16,107][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000000672_688128.pth +[2023-10-11 15:00:16,109][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000000672_688128.pth +[2023-10-11 15:00:16,113][84801] Saving new best policy, reward=6.830! 
+[2023-10-11 15:00:17,192][85176] Updated weights for policy 0, policy_version 2212 (0.0008) +[2023-10-11 15:00:17,560][85176] Updated weights for policy 0, policy_version 2222 (0.0009) +[2023-10-11 15:00:17,882][85175] Updated weights for policy 1, policy_version 2250 (0.0007) +[2023-10-11 15:00:17,935][85176] Updated weights for policy 0, policy_version 2232 (0.0007) +[2023-10-11 15:00:18,254][85175] Updated weights for policy 1, policy_version 2260 (0.0007) +[2023-10-11 15:00:18,622][85175] Updated weights for policy 1, policy_version 2270 (0.0008) +[2023-10-11 15:00:21,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 4620288. Throughput: 0: 1642.7, 1: 1662.8. Samples: 1159314. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:00:21,063][84230] Avg episode reward: [(0, '6.680'), (1, '6.680')] +[2023-10-11 15:00:22,248][85176] Updated weights for policy 0, policy_version 2242 (0.0010) +[2023-10-11 15:00:22,626][85176] Updated weights for policy 0, policy_version 2252 (0.0008) +[2023-10-11 15:00:22,773][85175] Updated weights for policy 1, policy_version 2280 (0.0008) +[2023-10-11 15:00:23,006][85176] Updated weights for policy 0, policy_version 2262 (0.0010) +[2023-10-11 15:00:23,151][85175] Updated weights for policy 1, policy_version 2290 (0.0008) +[2023-10-11 15:00:23,373][85176] Updated weights for policy 0, policy_version 2272 (0.0007) +[2023-10-11 15:00:23,521][85175] Updated weights for policy 1, policy_version 2300 (0.0009) +[2023-10-11 15:00:26,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 4685824. Throughput: 0: 1648.8, 1: 1668.9. Samples: 1179256. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-11 15:00:26,064][84230] Avg episode reward: [(0, '6.450'), (1, '6.570')] +[2023-10-11 15:00:27,506][85175] Updated weights for policy 1, policy_version 2310 (0.0007) +[2023-10-11 15:00:27,646][85176] Updated weights for policy 0, policy_version 2282 (0.0007) +[2023-10-11 15:00:27,867][85175] Updated weights for policy 1, policy_version 2320 (0.0007) +[2023-10-11 15:00:28,014][85176] Updated weights for policy 0, policy_version 2292 (0.0007) +[2023-10-11 15:00:28,245][85175] Updated weights for policy 1, policy_version 2330 (0.0008) +[2023-10-11 15:00:28,392][85176] Updated weights for policy 0, policy_version 2302 (0.0007) +[2023-10-11 15:00:31,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 4751360. Throughput: 0: 1647.6, 1: 1680.8. Samples: 1200070. Policy #0 lag: (min: 9.0, avg: 16.5, max: 41.0) +[2023-10-11 15:00:31,063][84230] Avg episode reward: [(0, '6.280'), (1, '6.730')] +[2023-10-11 15:00:32,286][85175] Updated weights for policy 1, policy_version 2340 (0.0009) +[2023-10-11 15:00:32,396][85176] Updated weights for policy 0, policy_version 2312 (0.0009) +[2023-10-11 15:00:32,659][85175] Updated weights for policy 1, policy_version 2350 (0.0008) +[2023-10-11 15:00:32,766][85176] Updated weights for policy 0, policy_version 2322 (0.0010) +[2023-10-11 15:00:33,024][85175] Updated weights for policy 1, policy_version 2360 (0.0009) +[2023-10-11 15:00:33,143][85176] Updated weights for policy 0, policy_version 2332 (0.0008) +[2023-10-11 15:00:36,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 4816896. Throughput: 0: 1646.5, 1: 1657.0. Samples: 1209268. Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) +[2023-10-11 15:00:36,063][84230] Avg episode reward: [(0, '6.380'), (1, '6.820')] +[2023-10-11 15:00:36,064][85000] Saving new best policy, reward=6.820! 
+[2023-10-11 15:00:37,086][85175] Updated weights for policy 1, policy_version 2370 (0.0008) +[2023-10-11 15:00:37,312][85176] Updated weights for policy 0, policy_version 2342 (0.0008) +[2023-10-11 15:00:37,459][85175] Updated weights for policy 1, policy_version 2380 (0.0008) +[2023-10-11 15:00:37,682][85176] Updated weights for policy 0, policy_version 2352 (0.0009) +[2023-10-11 15:00:37,826][85175] Updated weights for policy 1, policy_version 2390 (0.0009) +[2023-10-11 15:00:38,058][85176] Updated weights for policy 0, policy_version 2362 (0.0009) +[2023-10-11 15:00:38,202][85175] Updated weights for policy 1, policy_version 2400 (0.0008) +[2023-10-11 15:00:41,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 4882432. Throughput: 0: 1644.7, 1: 1687.6. Samples: 1229782. Policy #0 lag: (min: 14.0, avg: 15.4, max: 40.0) +[2023-10-11 15:00:41,063][84230] Avg episode reward: [(0, '6.540'), (1, '6.460')] +[2023-10-11 15:00:42,248][85175] Updated weights for policy 1, policy_version 2410 (0.0007) +[2023-10-11 15:00:42,257][85176] Updated weights for policy 0, policy_version 2372 (0.0008) +[2023-10-11 15:00:42,618][85175] Updated weights for policy 1, policy_version 2420 (0.0008) +[2023-10-11 15:00:42,628][85176] Updated weights for policy 0, policy_version 2382 (0.0008) +[2023-10-11 15:00:42,990][85175] Updated weights for policy 1, policy_version 2430 (0.0008) +[2023-10-11 15:00:42,997][85176] Updated weights for policy 0, policy_version 2392 (0.0009) +[2023-10-11 15:00:46,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 4947968. Throughput: 0: 1648.0, 1: 1687.0. Samples: 1250450. 
Policy #0 lag: (min: 31.0, avg: 32.0, max: 53.0) +[2023-10-11 15:00:46,063][84230] Avg episode reward: [(0, '6.650'), (1, '6.250')] +[2023-10-11 15:00:47,046][85175] Updated weights for policy 1, policy_version 2440 (0.0010) +[2023-10-11 15:00:47,114][85176] Updated weights for policy 0, policy_version 2402 (0.0011) +[2023-10-11 15:00:47,423][85175] Updated weights for policy 1, policy_version 2450 (0.0009) +[2023-10-11 15:00:47,489][85176] Updated weights for policy 0, policy_version 2412 (0.0008) +[2023-10-11 15:00:47,785][85175] Updated weights for policy 1, policy_version 2460 (0.0008) +[2023-10-11 15:00:47,872][85176] Updated weights for policy 0, policy_version 2422 (0.0008) +[2023-10-11 15:00:48,233][85176] Updated weights for policy 0, policy_version 2432 (0.0007) +[2023-10-11 15:00:51,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 5013504. Throughput: 0: 1649.4, 1: 1670.4. Samples: 1259406. Policy #0 lag: (min: 4.0, avg: 10.0, max: 36.0) +[2023-10-11 15:00:51,064][84230] Avg episode reward: [(0, '6.670'), (1, '6.110')] +[2023-10-11 15:00:51,825][85175] Updated weights for policy 1, policy_version 2470 (0.0007) +[2023-10-11 15:00:52,201][85175] Updated weights for policy 1, policy_version 2480 (0.0009) +[2023-10-11 15:00:52,531][85176] Updated weights for policy 0, policy_version 2442 (0.0010) +[2023-10-11 15:00:52,563][85175] Updated weights for policy 1, policy_version 2490 (0.0009) +[2023-10-11 15:00:52,903][85176] Updated weights for policy 0, policy_version 2452 (0.0008) +[2023-10-11 15:00:53,275][85176] Updated weights for policy 0, policy_version 2462 (0.0009) +[2023-10-11 15:00:56,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 5079040. Throughput: 0: 1645.8, 1: 1684.6. Samples: 1279556. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:00:56,063][84230] Avg episode reward: [(0, '6.750'), (1, '6.190')] +[2023-10-11 15:00:56,619][85175] Updated weights for policy 1, policy_version 2500 (0.0009) +[2023-10-11 15:00:56,992][85175] Updated weights for policy 1, policy_version 2510 (0.0007) +[2023-10-11 15:00:57,361][85175] Updated weights for policy 1, policy_version 2520 (0.0007) +[2023-10-11 15:00:57,373][85176] Updated weights for policy 0, policy_version 2472 (0.0008) +[2023-10-11 15:00:57,745][85176] Updated weights for policy 0, policy_version 2482 (0.0008) +[2023-10-11 15:00:58,122][85176] Updated weights for policy 0, policy_version 2492 (0.0007) +[2023-10-11 15:01:01,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 5144576. Throughput: 0: 1646.0, 1: 1694.2. Samples: 1300386. Policy #0 lag: (min: 21.0, avg: 21.4, max: 34.0) +[2023-10-11 15:01:01,064][84230] Avg episode reward: [(0, '6.870'), (1, '6.420')] +[2023-10-11 15:01:01,076][84801] Saving new best policy, reward=6.870! +[2023-10-11 15:01:01,326][85175] Updated weights for policy 1, policy_version 2530 (0.0007) +[2023-10-11 15:01:01,714][85175] Updated weights for policy 1, policy_version 2540 (0.0007) +[2023-10-11 15:01:02,076][85175] Updated weights for policy 1, policy_version 2550 (0.0007) +[2023-10-11 15:01:02,299][85176] Updated weights for policy 0, policy_version 2502 (0.0008) +[2023-10-11 15:01:02,444][85175] Updated weights for policy 1, policy_version 2560 (0.0007) +[2023-10-11 15:01:02,674][85176] Updated weights for policy 0, policy_version 2512 (0.0007) +[2023-10-11 15:01:03,041][85176] Updated weights for policy 0, policy_version 2522 (0.0007) +[2023-10-11 15:01:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 5210112. Throughput: 0: 1648.1, 1: 1689.4. Samples: 1309502. 
Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) +[2023-10-11 15:01:06,063][84230] Avg episode reward: [(0, '6.610'), (1, '6.150')] +[2023-10-11 15:01:06,538][85175] Updated weights for policy 1, policy_version 2570 (0.0007) +[2023-10-11 15:01:06,902][85175] Updated weights for policy 1, policy_version 2580 (0.0008) +[2023-10-11 15:01:07,182][85176] Updated weights for policy 0, policy_version 2532 (0.0007) +[2023-10-11 15:01:07,275][85175] Updated weights for policy 1, policy_version 2590 (0.0009) +[2023-10-11 15:01:07,556][85176] Updated weights for policy 0, policy_version 2542 (0.0009) +[2023-10-11 15:01:07,927][85176] Updated weights for policy 0, policy_version 2552 (0.0010) +[2023-10-11 15:01:11,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 5275648. Throughput: 0: 1650.1, 1: 1700.2. Samples: 1330020. Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) +[2023-10-11 15:01:11,064][84230] Avg episode reward: [(0, '6.760'), (1, '6.310')] +[2023-10-11 15:01:11,294][85175] Updated weights for policy 1, policy_version 2600 (0.0008) +[2023-10-11 15:01:11,671][85175] Updated weights for policy 1, policy_version 2610 (0.0008) +[2023-10-11 15:01:11,980][85176] Updated weights for policy 0, policy_version 2562 (0.0008) +[2023-10-11 15:01:12,035][85175] Updated weights for policy 1, policy_version 2620 (0.0008) +[2023-10-11 15:01:12,356][85176] Updated weights for policy 0, policy_version 2572 (0.0008) +[2023-10-11 15:01:12,731][85176] Updated weights for policy 0, policy_version 2582 (0.0009) +[2023-10-11 15:01:13,101][85176] Updated weights for policy 0, policy_version 2592 (0.0008) +[2023-10-11 15:01:15,996][85175] Updated weights for policy 1, policy_version 2630 (0.0007) +[2023-10-11 15:01:16,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13218.3). Total num frames: 5341184. Throughput: 0: 1654.3, 1: 1701.1. Samples: 1351064. 
Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) +[2023-10-11 15:01:16,064][84230] Avg episode reward: [(0, '6.640'), (1, '6.690')] +[2023-10-11 15:01:16,375][85175] Updated weights for policy 1, policy_version 2640 (0.0009) +[2023-10-11 15:01:16,738][85175] Updated weights for policy 1, policy_version 2650 (0.0008) +[2023-10-11 15:01:17,000][85176] Updated weights for policy 0, policy_version 2602 (0.0008) +[2023-10-11 15:01:17,375][85176] Updated weights for policy 0, policy_version 2612 (0.0009) +[2023-10-11 15:01:17,738][85176] Updated weights for policy 0, policy_version 2622 (0.0009) +[2023-10-11 15:01:20,973][85175] Updated weights for policy 1, policy_version 2660 (0.0009) +[2023-10-11 15:01:21,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 5406720. Throughput: 0: 1655.2, 1: 1699.2. Samples: 1360216. Policy #0 lag: (min: 31.0, avg: 37.5, max: 63.0) +[2023-10-11 15:01:21,064][84230] Avg episode reward: [(0, '6.790'), (1, '6.490')] +[2023-10-11 15:01:21,352][85175] Updated weights for policy 1, policy_version 2670 (0.0008) +[2023-10-11 15:01:21,722][85175] Updated weights for policy 1, policy_version 2680 (0.0008) +[2023-10-11 15:01:21,902][85176] Updated weights for policy 0, policy_version 2632 (0.0008) +[2023-10-11 15:01:22,283][85176] Updated weights for policy 0, policy_version 2642 (0.0009) +[2023-10-11 15:01:22,652][85176] Updated weights for policy 0, policy_version 2652 (0.0009) +[2023-10-11 15:01:25,721][85175] Updated weights for policy 1, policy_version 2690 (0.0009) +[2023-10-11 15:01:26,062][84230] Fps is (10 sec: 13107.7, 60 sec: 13107.3, 300 sec: 13218.3). Total num frames: 5472256. Throughput: 0: 1663.7, 1: 1693.5. Samples: 1380856. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:01:26,063][84230] Avg episode reward: [(0, '6.940'), (1, '6.360')] +[2023-10-11 15:01:26,064][84801] Saving new best policy, reward=6.940! 
+[2023-10-11 15:01:26,087][85175] Updated weights for policy 1, policy_version 2700 (0.0011) +[2023-10-11 15:01:26,463][85175] Updated weights for policy 1, policy_version 2710 (0.0009) +[2023-10-11 15:01:26,759][85176] Updated weights for policy 0, policy_version 2662 (0.0008) +[2023-10-11 15:01:26,831][85175] Updated weights for policy 1, policy_version 2720 (0.0008) +[2023-10-11 15:01:27,132][85176] Updated weights for policy 0, policy_version 2672 (0.0010) +[2023-10-11 15:01:27,505][85176] Updated weights for policy 0, policy_version 2682 (0.0010) +[2023-10-11 15:01:30,833][85175] Updated weights for policy 1, policy_version 2730 (0.0007) +[2023-10-11 15:01:31,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 5537792. Throughput: 0: 1657.0, 1: 1688.4. Samples: 1400992. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:01:31,063][84230] Avg episode reward: [(0, '6.550'), (1, '6.310')] +[2023-10-11 15:01:31,213][85175] Updated weights for policy 1, policy_version 2740 (0.0007) +[2023-10-11 15:01:31,583][85175] Updated weights for policy 1, policy_version 2750 (0.0007) +[2023-10-11 15:01:31,662][85176] Updated weights for policy 0, policy_version 2692 (0.0008) +[2023-10-11 15:01:32,045][85176] Updated weights for policy 0, policy_version 2702 (0.0007) +[2023-10-11 15:01:32,419][85176] Updated weights for policy 0, policy_version 2712 (0.0007) +[2023-10-11 15:01:35,453][85175] Updated weights for policy 1, policy_version 2760 (0.0007) +[2023-10-11 15:01:35,821][85175] Updated weights for policy 1, policy_version 2770 (0.0007) +[2023-10-11 15:01:36,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 5603328. Throughput: 0: 1658.0, 1: 1696.0. Samples: 1410340. 
Policy #0 lag: (min: 14.0, avg: 21.9, max: 46.0) +[2023-10-11 15:01:36,063][84230] Avg episode reward: [(0, '6.240'), (1, '6.240')] +[2023-10-11 15:01:36,185][85175] Updated weights for policy 1, policy_version 2780 (0.0007) +[2023-10-11 15:01:36,523][85176] Updated weights for policy 0, policy_version 2722 (0.0010) +[2023-10-11 15:01:36,897][85176] Updated weights for policy 0, policy_version 2732 (0.0009) +[2023-10-11 15:01:37,267][85176] Updated weights for policy 0, policy_version 2742 (0.0007) +[2023-10-11 15:01:37,639][85176] Updated weights for policy 0, policy_version 2752 (0.0008) +[2023-10-11 15:01:40,330][85175] Updated weights for policy 1, policy_version 2790 (0.0009) +[2023-10-11 15:01:40,709][85175] Updated weights for policy 1, policy_version 2800 (0.0010) +[2023-10-11 15:01:41,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 5668864. Throughput: 0: 1667.2, 1: 1699.3. Samples: 1431048. Policy #0 lag: (min: 15.0, avg: 22.9, max: 47.0) +[2023-10-11 15:01:41,063][84230] Avg episode reward: [(0, '6.630'), (1, '6.310')] +[2023-10-11 15:01:41,074][85175] Updated weights for policy 1, policy_version 2810 (0.0009) +[2023-10-11 15:01:41,762][85176] Updated weights for policy 0, policy_version 2762 (0.0008) +[2023-10-11 15:01:42,144][85176] Updated weights for policy 0, policy_version 2772 (0.0011) +[2023-10-11 15:01:42,514][85176] Updated weights for policy 0, policy_version 2782 (0.0010) +[2023-10-11 15:01:45,198][85175] Updated weights for policy 1, policy_version 2820 (0.0010) +[2023-10-11 15:01:45,567][85175] Updated weights for policy 1, policy_version 2830 (0.0009) +[2023-10-11 15:01:45,937][85175] Updated weights for policy 1, policy_version 2840 (0.0007) +[2023-10-11 15:01:46,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 5734400. Throughput: 0: 1676.0, 1: 1684.2. Samples: 1451596. 
Policy #0 lag: (min: 25.0, avg: 33.4, max: 57.0) +[2023-10-11 15:01:46,064][84230] Avg episode reward: [(0, '6.910'), (1, '6.210')] +[2023-10-11 15:01:46,409][85176] Updated weights for policy 0, policy_version 2792 (0.0008) +[2023-10-11 15:01:46,791][85176] Updated weights for policy 0, policy_version 2802 (0.0008) +[2023-10-11 15:01:47,162][85176] Updated weights for policy 0, policy_version 2812 (0.0008) +[2023-10-11 15:01:50,035][85175] Updated weights for policy 1, policy_version 2850 (0.0008) +[2023-10-11 15:01:50,408][85175] Updated weights for policy 1, policy_version 2860 (0.0011) +[2023-10-11 15:01:50,780][85175] Updated weights for policy 1, policy_version 2870 (0.0010) +[2023-10-11 15:01:51,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 5799936. Throughput: 0: 1676.1, 1: 1693.3. Samples: 1461128. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-11 15:01:51,063][84230] Avg episode reward: [(0, '7.030'), (1, '6.240')] +[2023-10-11 15:01:51,158][85175] Updated weights for policy 1, policy_version 2880 (0.0009) +[2023-10-11 15:01:51,337][85176] Updated weights for policy 0, policy_version 2822 (0.0008) +[2023-10-11 15:01:51,714][85176] Updated weights for policy 0, policy_version 2832 (0.0009) +[2023-10-11 15:01:52,087][85176] Updated weights for policy 0, policy_version 2842 (0.0009) +[2023-10-11 15:01:52,306][84801] Saving new best policy, reward=7.030! +[2023-10-11 15:01:55,130][85175] Updated weights for policy 1, policy_version 2890 (0.0009) +[2023-10-11 15:01:55,502][85175] Updated weights for policy 1, policy_version 2900 (0.0009) +[2023-10-11 15:01:55,869][85175] Updated weights for policy 1, policy_version 2910 (0.0009) +[2023-10-11 15:01:56,063][84230] Fps is (10 sec: 16384.2, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 5898240. Throughput: 0: 1677.0, 1: 1691.2. Samples: 1481588. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-11 15:01:56,063][84230] Avg episode reward: [(0, '6.610'), (1, '6.440')] +[2023-10-11 15:01:56,113][85176] Updated weights for policy 0, policy_version 2852 (0.0008) +[2023-10-11 15:01:56,488][85176] Updated weights for policy 0, policy_version 2862 (0.0008) +[2023-10-11 15:01:56,875][85176] Updated weights for policy 0, policy_version 2872 (0.0011) +[2023-10-11 15:01:59,835][85175] Updated weights for policy 1, policy_version 2920 (0.0008) +[2023-10-11 15:02:00,215][85175] Updated weights for policy 1, policy_version 2930 (0.0008) +[2023-10-11 15:02:00,578][85175] Updated weights for policy 1, policy_version 2940 (0.0007) +[2023-10-11 15:02:00,883][85176] Updated weights for policy 0, policy_version 2882 (0.0009) +[2023-10-11 15:02:01,063][84230] Fps is (10 sec: 16383.8, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 5963776. Throughput: 0: 1674.9, 1: 1667.9. Samples: 1501488. Policy #0 lag: (min: 16.0, avg: 40.7, max: 48.0) +[2023-10-11 15:02:01,063][84230] Avg episode reward: [(0, '6.250'), (1, '6.290')] +[2023-10-11 15:02:01,253][85176] Updated weights for policy 0, policy_version 2892 (0.0007) +[2023-10-11 15:02:01,627][85176] Updated weights for policy 0, policy_version 2902 (0.0007) +[2023-10-11 15:02:01,993][85176] Updated weights for policy 0, policy_version 2912 (0.0010) +[2023-10-11 15:02:04,588][85175] Updated weights for policy 1, policy_version 2950 (0.0008) +[2023-10-11 15:02:04,961][85175] Updated weights for policy 1, policy_version 2960 (0.0009) +[2023-10-11 15:02:05,335][85175] Updated weights for policy 1, policy_version 2970 (0.0007) +[2023-10-11 15:02:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 6029312. Throughput: 0: 1675.5, 1: 1691.1. Samples: 1511712. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-11 15:02:06,063][84230] Avg episode reward: [(0, '6.310'), (1, '6.310')] +[2023-10-11 15:02:06,075][85176] Updated weights for policy 0, policy_version 2922 (0.0007) +[2023-10-11 15:02:06,459][85176] Updated weights for policy 0, policy_version 2932 (0.0009) +[2023-10-11 15:02:06,831][85176] Updated weights for policy 0, policy_version 2942 (0.0007) +[2023-10-11 15:02:09,517][85175] Updated weights for policy 1, policy_version 2980 (0.0010) +[2023-10-11 15:02:09,890][85175] Updated weights for policy 1, policy_version 2990 (0.0008) +[2023-10-11 15:02:10,259][85175] Updated weights for policy 1, policy_version 3000 (0.0008) +[2023-10-11 15:02:10,925][85176] Updated weights for policy 0, policy_version 2952 (0.0009) +[2023-10-11 15:02:11,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 6094848. Throughput: 0: 1673.0, 1: 1685.6. Samples: 1531992. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 15:02:11,063][84230] Avg episode reward: [(0, '6.740'), (1, '6.200')] +[2023-10-11 15:02:11,296][85176] Updated weights for policy 0, policy_version 2962 (0.0007) +[2023-10-11 15:02:11,667][85176] Updated weights for policy 0, policy_version 2972 (0.0007) +[2023-10-11 15:02:14,371][85175] Updated weights for policy 1, policy_version 3010 (0.0009) +[2023-10-11 15:02:14,737][85175] Updated weights for policy 1, policy_version 3020 (0.0010) +[2023-10-11 15:02:15,110][85175] Updated weights for policy 1, policy_version 3030 (0.0008) +[2023-10-11 15:02:15,478][85175] Updated weights for policy 1, policy_version 3040 (0.0008) +[2023-10-11 15:02:15,764][85176] Updated weights for policy 0, policy_version 2982 (0.0009) +[2023-10-11 15:02:16,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 6160384. Throughput: 0: 1675.6, 1: 1668.6. Samples: 1551482. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:02:16,063][84230] Avg episode reward: [(0, '7.040'), (1, '6.250')] +[2023-10-11 15:02:16,073][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000003040_3112960.pth... +[2023-10-11 15:02:16,107][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000001472_1507328.pth +[2023-10-11 15:02:16,144][85176] Updated weights for policy 0, policy_version 2992 (0.0008) +[2023-10-11 15:02:16,520][85176] Updated weights for policy 0, policy_version 3002 (0.0008) +[2023-10-11 15:02:16,739][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000003008_3080192.pth... +[2023-10-11 15:02:16,770][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000001440_1474560.pth +[2023-10-11 15:02:16,773][84801] Saving new best policy, reward=7.040! +[2023-10-11 15:02:19,190][85175] Updated weights for policy 1, policy_version 3050 (0.0010) +[2023-10-11 15:02:19,547][85175] Updated weights for policy 1, policy_version 3060 (0.0010) +[2023-10-11 15:02:19,919][85175] Updated weights for policy 1, policy_version 3070 (0.0007) +[2023-10-11 15:02:20,641][85176] Updated weights for policy 0, policy_version 3012 (0.0009) +[2023-10-11 15:02:21,007][85176] Updated weights for policy 0, policy_version 3022 (0.0009) +[2023-10-11 15:02:21,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 6225920. Throughput: 0: 1675.6, 1: 1697.5. Samples: 1562130. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:02:21,064][84230] Avg episode reward: [(0, '6.940'), (1, '6.080')] +[2023-10-11 15:02:21,391][85176] Updated weights for policy 0, policy_version 3032 (0.0010) +[2023-10-11 15:02:23,989][85175] Updated weights for policy 1, policy_version 3080 (0.0008) +[2023-10-11 15:02:24,361][85175] Updated weights for policy 1, policy_version 3090 (0.0009) +[2023-10-11 15:02:24,727][85175] Updated weights for policy 1, policy_version 3100 (0.0010) +[2023-10-11 15:02:25,647][85176] Updated weights for policy 0, policy_version 3042 (0.0009) +[2023-10-11 15:02:26,033][85176] Updated weights for policy 0, policy_version 3052 (0.0011) +[2023-10-11 15:02:26,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 6291456. Throughput: 0: 1668.4, 1: 1680.5. Samples: 1581748. Policy #0 lag: (min: 31.0, avg: 34.8, max: 63.0) +[2023-10-11 15:02:26,063][84230] Avg episode reward: [(0, '6.480'), (1, '6.300')] +[2023-10-11 15:02:26,411][85176] Updated weights for policy 0, policy_version 3062 (0.0009) +[2023-10-11 15:02:26,786][85176] Updated weights for policy 0, policy_version 3072 (0.0008) +[2023-10-11 15:02:28,816][85175] Updated weights for policy 1, policy_version 3110 (0.0008) +[2023-10-11 15:02:29,191][85175] Updated weights for policy 1, policy_version 3120 (0.0008) +[2023-10-11 15:02:29,561][85175] Updated weights for policy 1, policy_version 3130 (0.0008) +[2023-10-11 15:02:31,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 6356992. Throughput: 0: 1656.9, 1: 1676.7. Samples: 1601606. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:02:31,063][84230] Avg episode reward: [(0, '6.280'), (1, '6.580')] +[2023-10-11 15:02:31,121][85176] Updated weights for policy 0, policy_version 3082 (0.0009) +[2023-10-11 15:02:31,494][85176] Updated weights for policy 0, policy_version 3092 (0.0009) +[2023-10-11 15:02:31,862][85176] Updated weights for policy 0, policy_version 3102 (0.0008) +[2023-10-11 15:02:33,396][85175] Updated weights for policy 1, policy_version 3140 (0.0009) +[2023-10-11 15:02:33,770][85175] Updated weights for policy 1, policy_version 3150 (0.0009) +[2023-10-11 15:02:34,131][85175] Updated weights for policy 1, policy_version 3160 (0.0010) +[2023-10-11 15:02:35,866][85176] Updated weights for policy 0, policy_version 3112 (0.0007) +[2023-10-11 15:02:36,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 6422528. Throughput: 0: 1655.2, 1: 1694.9. Samples: 1611882. Policy #0 lag: (min: 3.0, avg: 6.5, max: 35.0) +[2023-10-11 15:02:36,063][84230] Avg episode reward: [(0, '6.810'), (1, '6.610')] +[2023-10-11 15:02:36,242][85176] Updated weights for policy 0, policy_version 3122 (0.0008) +[2023-10-11 15:02:36,620][85176] Updated weights for policy 0, policy_version 3132 (0.0010) +[2023-10-11 15:02:38,145][85175] Updated weights for policy 1, policy_version 3170 (0.0008) +[2023-10-11 15:02:38,555][85175] Updated weights for policy 1, policy_version 3180 (0.0009) +[2023-10-11 15:02:38,923][85175] Updated weights for policy 1, policy_version 3190 (0.0010) +[2023-10-11 15:02:39,298][85175] Updated weights for policy 1, policy_version 3200 (0.0010) +[2023-10-11 15:02:40,797][85176] Updated weights for policy 0, policy_version 3142 (0.0009) +[2023-10-11 15:02:41,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 6488064. Throughput: 0: 1658.7, 1: 1675.9. Samples: 1631644. 
Policy #0 lag: (min: 31.0, avg: 35.7, max: 63.0) +[2023-10-11 15:02:41,063][84230] Avg episode reward: [(0, '6.850'), (1, '6.240')] +[2023-10-11 15:02:41,162][85176] Updated weights for policy 0, policy_version 3152 (0.0007) +[2023-10-11 15:02:41,539][85176] Updated weights for policy 0, policy_version 3162 (0.0007) +[2023-10-11 15:02:43,372][85175] Updated weights for policy 1, policy_version 3210 (0.0008) +[2023-10-11 15:02:43,734][85175] Updated weights for policy 1, policy_version 3220 (0.0009) +[2023-10-11 15:02:44,102][85175] Updated weights for policy 1, policy_version 3230 (0.0010) +[2023-10-11 15:02:45,586][85176] Updated weights for policy 0, policy_version 3172 (0.0007) +[2023-10-11 15:02:45,954][85176] Updated weights for policy 0, policy_version 3182 (0.0007) +[2023-10-11 15:02:46,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13329.3). Total num frames: 6553600. Throughput: 0: 1654.2, 1: 1692.6. Samples: 1652094. Policy #0 lag: (min: 29.0, avg: 34.0, max: 61.0) +[2023-10-11 15:02:46,064][84230] Avg episode reward: [(0, '7.020'), (1, '6.190')] +[2023-10-11 15:02:46,329][85176] Updated weights for policy 0, policy_version 3192 (0.0007) +[2023-10-11 15:02:48,190][85175] Updated weights for policy 1, policy_version 3240 (0.0009) +[2023-10-11 15:02:48,560][85175] Updated weights for policy 1, policy_version 3250 (0.0009) +[2023-10-11 15:02:48,938][85175] Updated weights for policy 1, policy_version 3260 (0.0010) +[2023-10-11 15:02:50,330][85176] Updated weights for policy 0, policy_version 3202 (0.0009) +[2023-10-11 15:02:50,706][85176] Updated weights for policy 0, policy_version 3212 (0.0008) +[2023-10-11 15:02:51,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 6619136. Throughput: 0: 1653.2, 1: 1685.4. Samples: 1661950. 
Policy #0 lag: (min: 29.0, avg: 34.0, max: 61.0) +[2023-10-11 15:02:51,064][85176] Updated weights for policy 0, policy_version 3222 (0.0010) +[2023-10-11 15:02:51,063][84230] Avg episode reward: [(0, '6.760'), (1, '5.920')] +[2023-10-11 15:02:51,436][85176] Updated weights for policy 0, policy_version 3232 (0.0008) +[2023-10-11 15:02:52,973][85175] Updated weights for policy 1, policy_version 3270 (0.0008) +[2023-10-11 15:02:53,341][85175] Updated weights for policy 1, policy_version 3280 (0.0007) +[2023-10-11 15:02:53,705][85175] Updated weights for policy 1, policy_version 3290 (0.0009) +[2023-10-11 15:02:55,557][85176] Updated weights for policy 0, policy_version 3242 (0.0007) +[2023-10-11 15:02:55,937][85176] Updated weights for policy 0, policy_version 3252 (0.0007) +[2023-10-11 15:02:56,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 6684672. Throughput: 0: 1654.2, 1: 1679.9. Samples: 1682028. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-11 15:02:56,063][84230] Avg episode reward: [(0, '6.500'), (1, '6.030')] +[2023-10-11 15:02:56,309][85176] Updated weights for policy 0, policy_version 3262 (0.0008) +[2023-10-11 15:02:57,835][85175] Updated weights for policy 1, policy_version 3300 (0.0007) +[2023-10-11 15:02:58,199][85175] Updated weights for policy 1, policy_version 3310 (0.0009) +[2023-10-11 15:02:58,558][85175] Updated weights for policy 1, policy_version 3320 (0.0008) +[2023-10-11 15:03:00,351][85176] Updated weights for policy 0, policy_version 3272 (0.0008) +[2023-10-11 15:03:00,719][85176] Updated weights for policy 0, policy_version 3282 (0.0008) +[2023-10-11 15:03:01,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 6750208. Throughput: 0: 1650.2, 1: 1699.6. Samples: 1702226. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-11 15:03:01,063][84230] Avg episode reward: [(0, '6.660'), (1, '6.390')] +[2023-10-11 15:03:01,099][85176] Updated weights for policy 0, policy_version 3292 (0.0008) +[2023-10-11 15:03:02,653][85175] Updated weights for policy 1, policy_version 3330 (0.0007) +[2023-10-11 15:03:03,019][85175] Updated weights for policy 1, policy_version 3340 (0.0007) +[2023-10-11 15:03:03,387][85175] Updated weights for policy 1, policy_version 3350 (0.0007) +[2023-10-11 15:03:03,762][85175] Updated weights for policy 1, policy_version 3360 (0.0009) +[2023-10-11 15:03:05,176][85176] Updated weights for policy 0, policy_version 3302 (0.0008) +[2023-10-11 15:03:05,552][85176] Updated weights for policy 0, policy_version 3312 (0.0007) +[2023-10-11 15:03:05,919][85176] Updated weights for policy 0, policy_version 3322 (0.0010) +[2023-10-11 15:03:06,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 6815744. Throughput: 0: 1664.9, 1: 1672.5. Samples: 1712314. Policy #0 lag: (min: 31.0, avg: 31.4, max: 46.0) +[2023-10-11 15:03:06,064][84230] Avg episode reward: [(0, '6.900'), (1, '6.530')] +[2023-10-11 15:03:07,731][85175] Updated weights for policy 1, policy_version 3370 (0.0009) +[2023-10-11 15:03:08,092][85175] Updated weights for policy 1, policy_version 3380 (0.0008) +[2023-10-11 15:03:08,467][85175] Updated weights for policy 1, policy_version 3390 (0.0010) +[2023-10-11 15:03:10,286][85176] Updated weights for policy 0, policy_version 3332 (0.0007) +[2023-10-11 15:03:10,656][85176] Updated weights for policy 0, policy_version 3342 (0.0010) +[2023-10-11 15:03:11,039][85176] Updated weights for policy 0, policy_version 3352 (0.0009) +[2023-10-11 15:03:11,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 6881280. Throughput: 0: 1667.8, 1: 1685.6. Samples: 1732654. 
Policy #0 lag: (min: 31.0, avg: 31.4, max: 46.0) +[2023-10-11 15:03:11,064][84230] Avg episode reward: [(0, '6.700'), (1, '6.420')] +[2023-10-11 15:03:12,585][85175] Updated weights for policy 1, policy_version 3400 (0.0007) +[2023-10-11 15:03:12,951][85175] Updated weights for policy 1, policy_version 3410 (0.0008) +[2023-10-11 15:03:13,324][85175] Updated weights for policy 1, policy_version 3420 (0.0007) +[2023-10-11 15:03:15,151][85176] Updated weights for policy 0, policy_version 3362 (0.0009) +[2023-10-11 15:03:15,523][85176] Updated weights for policy 0, policy_version 3372 (0.0010) +[2023-10-11 15:03:15,902][85176] Updated weights for policy 0, policy_version 3382 (0.0007) +[2023-10-11 15:03:16,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 6946816. Throughput: 0: 1662.3, 1: 1696.4. Samples: 1752750. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:03:16,063][84230] Avg episode reward: [(0, '6.550'), (1, '6.530')] +[2023-10-11 15:03:16,271][85176] Updated weights for policy 0, policy_version 3392 (0.0007) +[2023-10-11 15:03:17,412][85175] Updated weights for policy 1, policy_version 3430 (0.0009) +[2023-10-11 15:03:17,784][85175] Updated weights for policy 1, policy_version 3440 (0.0008) +[2023-10-11 15:03:18,158][85175] Updated weights for policy 1, policy_version 3450 (0.0007) +[2023-10-11 15:03:20,232][85176] Updated weights for policy 0, policy_version 3402 (0.0009) +[2023-10-11 15:03:20,604][85176] Updated weights for policy 0, policy_version 3412 (0.0008) +[2023-10-11 15:03:20,985][85176] Updated weights for policy 0, policy_version 3422 (0.0010) +[2023-10-11 15:03:21,062][84230] Fps is (10 sec: 16384.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 7045120. Throughput: 0: 1679.2, 1: 1664.9. Samples: 1762366. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 15:03:21,063][84230] Avg episode reward: [(0, '6.810'), (1, '6.370')] +[2023-10-11 15:03:22,230][85175] Updated weights for policy 1, policy_version 3460 (0.0009) +[2023-10-11 15:03:22,607][85175] Updated weights for policy 1, policy_version 3470 (0.0008) +[2023-10-11 15:03:22,967][85175] Updated weights for policy 1, policy_version 3480 (0.0007) +[2023-10-11 15:03:25,131][85176] Updated weights for policy 0, policy_version 3432 (0.0009) +[2023-10-11 15:03:25,506][85176] Updated weights for policy 0, policy_version 3442 (0.0008) +[2023-10-11 15:03:25,880][85176] Updated weights for policy 0, policy_version 3452 (0.0008) +[2023-10-11 15:03:26,062][84230] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 7110656. Throughput: 0: 1672.0, 1: 1686.6. Samples: 1782780. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 15:03:26,063][84230] Avg episode reward: [(0, '6.860'), (1, '6.160')] +[2023-10-11 15:03:27,055][85175] Updated weights for policy 1, policy_version 3490 (0.0008) +[2023-10-11 15:03:27,472][85175] Updated weights for policy 1, policy_version 3500 (0.0008) +[2023-10-11 15:03:27,856][85175] Updated weights for policy 1, policy_version 3510 (0.0007) +[2023-10-11 15:03:28,222][85175] Updated weights for policy 1, policy_version 3520 (0.0007) +[2023-10-11 15:03:29,903][85176] Updated weights for policy 0, policy_version 3462 (0.0009) +[2023-10-11 15:03:30,280][85176] Updated weights for policy 0, policy_version 3472 (0.0009) +[2023-10-11 15:03:30,658][85176] Updated weights for policy 0, policy_version 3482 (0.0007) +[2023-10-11 15:03:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 7176192. Throughput: 0: 1652.4, 1: 1690.4. Samples: 1802518. 
Policy #0 lag: (min: 7.0, avg: 7.1, max: 13.0) +[2023-10-11 15:03:31,063][84230] Avg episode reward: [(0, '6.680'), (1, '6.220')] +[2023-10-11 15:03:32,040][85175] Updated weights for policy 1, policy_version 3530 (0.0008) +[2023-10-11 15:03:32,409][85175] Updated weights for policy 1, policy_version 3540 (0.0007) +[2023-10-11 15:03:32,780][85175] Updated weights for policy 1, policy_version 3550 (0.0007) +[2023-10-11 15:03:34,798][85176] Updated weights for policy 0, policy_version 3492 (0.0008) +[2023-10-11 15:03:35,175][85176] Updated weights for policy 0, policy_version 3502 (0.0007) +[2023-10-11 15:03:35,551][85176] Updated weights for policy 0, policy_version 3512 (0.0008) +[2023-10-11 15:03:36,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 7241728. Throughput: 0: 1670.0, 1: 1676.0. Samples: 1812518. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:03:36,063][84230] Avg episode reward: [(0, '6.500'), (1, '6.260')] +[2023-10-11 15:03:36,724][85175] Updated weights for policy 1, policy_version 3560 (0.0009) +[2023-10-11 15:03:37,096][85175] Updated weights for policy 1, policy_version 3570 (0.0007) +[2023-10-11 15:03:37,466][85175] Updated weights for policy 1, policy_version 3580 (0.0009) +[2023-10-11 15:03:39,691][85176] Updated weights for policy 0, policy_version 3522 (0.0009) +[2023-10-11 15:03:40,061][85176] Updated weights for policy 0, policy_version 3532 (0.0007) +[2023-10-11 15:03:40,441][85176] Updated weights for policy 0, policy_version 3542 (0.0008) +[2023-10-11 15:03:40,806][85176] Updated weights for policy 0, policy_version 3552 (0.0008) +[2023-10-11 15:03:41,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 7307264. Throughput: 0: 1666.2, 1: 1696.0. Samples: 1833326. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:03:41,063][84230] Avg episode reward: [(0, '6.640'), (1, '6.520')] +[2023-10-11 15:03:41,437][85175] Updated weights for policy 1, policy_version 3590 (0.0009) +[2023-10-11 15:03:41,808][85175] Updated weights for policy 1, policy_version 3600 (0.0010) +[2023-10-11 15:03:42,176][85175] Updated weights for policy 1, policy_version 3610 (0.0010) +[2023-10-11 15:03:44,895][85176] Updated weights for policy 0, policy_version 3562 (0.0009) +[2023-10-11 15:03:45,269][85176] Updated weights for policy 0, policy_version 3572 (0.0009) +[2023-10-11 15:03:45,644][85176] Updated weights for policy 0, policy_version 3582 (0.0008) +[2023-10-11 15:03:46,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 7372800. Throughput: 0: 1652.2, 1: 1695.5. Samples: 1852872. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:03:46,063][84230] Avg episode reward: [(0, '6.940'), (1, '6.530')] +[2023-10-11 15:03:46,325][85175] Updated weights for policy 1, policy_version 3620 (0.0008) +[2023-10-11 15:03:46,700][85175] Updated weights for policy 1, policy_version 3630 (0.0009) +[2023-10-11 15:03:47,073][85175] Updated weights for policy 1, policy_version 3640 (0.0009) +[2023-10-11 15:03:49,755][85176] Updated weights for policy 0, policy_version 3592 (0.0009) +[2023-10-11 15:03:50,130][85176] Updated weights for policy 0, policy_version 3602 (0.0007) +[2023-10-11 15:03:50,508][85176] Updated weights for policy 0, policy_version 3612 (0.0008) +[2023-10-11 15:03:51,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 7438336. Throughput: 0: 1660.5, 1: 1683.4. Samples: 1862788. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:03:51,063][84230] Avg episode reward: [(0, '6.960'), (1, '6.530')] +[2023-10-11 15:03:51,330][85175] Updated weights for policy 1, policy_version 3650 (0.0009) +[2023-10-11 15:03:51,702][85175] Updated weights for policy 1, policy_version 3660 (0.0008) +[2023-10-11 15:03:52,062][85175] Updated weights for policy 1, policy_version 3670 (0.0008) +[2023-10-11 15:03:52,430][85175] Updated weights for policy 1, policy_version 3680 (0.0010) +[2023-10-11 15:03:54,758][85176] Updated weights for policy 0, policy_version 3622 (0.0008) +[2023-10-11 15:03:55,127][85176] Updated weights for policy 0, policy_version 3632 (0.0009) +[2023-10-11 15:03:55,500][85176] Updated weights for policy 0, policy_version 3642 (0.0008) +[2023-10-11 15:03:56,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13329.3). Total num frames: 7503872. Throughput: 0: 1661.8, 1: 1684.0. Samples: 1883214. Policy #0 lag: (min: 31.0, avg: 32.5, max: 58.0) +[2023-10-11 15:03:56,064][84230] Avg episode reward: [(0, '6.710'), (1, '6.090')] +[2023-10-11 15:03:56,625][85175] Updated weights for policy 1, policy_version 3690 (0.0011) +[2023-10-11 15:03:56,992][85175] Updated weights for policy 1, policy_version 3700 (0.0010) +[2023-10-11 15:03:57,372][85175] Updated weights for policy 1, policy_version 3710 (0.0011) +[2023-10-11 15:03:59,498][85176] Updated weights for policy 0, policy_version 3652 (0.0008) +[2023-10-11 15:03:59,861][85176] Updated weights for policy 0, policy_version 3662 (0.0011) +[2023-10-11 15:04:00,236][85176] Updated weights for policy 0, policy_version 3672 (0.0008) +[2023-10-11 15:04:01,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 7569408. Throughput: 0: 1645.9, 1: 1687.2. Samples: 1902736. 
Policy #0 lag: (min: 31.0, avg: 32.5, max: 58.0) +[2023-10-11 15:04:01,063][84230] Avg episode reward: [(0, '6.520'), (1, '6.010')] +[2023-10-11 15:04:01,406][85175] Updated weights for policy 1, policy_version 3720 (0.0008) +[2023-10-11 15:04:01,785][85175] Updated weights for policy 1, policy_version 3730 (0.0007) +[2023-10-11 15:04:02,155][85175] Updated weights for policy 1, policy_version 3740 (0.0009) +[2023-10-11 15:04:04,414][85176] Updated weights for policy 0, policy_version 3682 (0.0010) +[2023-10-11 15:04:04,816][85176] Updated weights for policy 0, policy_version 3692 (0.0007) +[2023-10-11 15:04:05,190][85176] Updated weights for policy 0, policy_version 3702 (0.0007) +[2023-10-11 15:04:05,570][85176] Updated weights for policy 0, policy_version 3712 (0.0007) +[2023-10-11 15:04:06,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 7634944. Throughput: 0: 1659.8, 1: 1689.0. Samples: 1913060. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:04:06,064][84230] Avg episode reward: [(0, '6.630'), (1, '5.940')] +[2023-10-11 15:04:06,249][85175] Updated weights for policy 1, policy_version 3750 (0.0010) +[2023-10-11 15:04:06,625][85175] Updated weights for policy 1, policy_version 3760 (0.0011) +[2023-10-11 15:04:07,002][85175] Updated weights for policy 1, policy_version 3770 (0.0008) +[2023-10-11 15:04:09,530][85176] Updated weights for policy 0, policy_version 3722 (0.0010) +[2023-10-11 15:04:09,910][85176] Updated weights for policy 0, policy_version 3732 (0.0010) +[2023-10-11 15:04:10,277][85176] Updated weights for policy 0, policy_version 3742 (0.0008) +[2023-10-11 15:04:11,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 7700480. Throughput: 0: 1656.8, 1: 1689.5. Samples: 1933364. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:04:11,063][84230] Avg episode reward: [(0, '6.990'), (1, '6.280')] +[2023-10-11 15:04:11,111][85175] Updated weights for policy 1, policy_version 3780 (0.0008) +[2023-10-11 15:04:11,478][85175] Updated weights for policy 1, policy_version 3790 (0.0009) +[2023-10-11 15:04:11,854][85175] Updated weights for policy 1, policy_version 3800 (0.0011) +[2023-10-11 15:04:14,268][85176] Updated weights for policy 0, policy_version 3752 (0.0008) +[2023-10-11 15:04:14,641][85176] Updated weights for policy 0, policy_version 3762 (0.0009) +[2023-10-11 15:04:15,008][85176] Updated weights for policy 0, policy_version 3772 (0.0008) +[2023-10-11 15:04:16,049][85175] Updated weights for policy 1, policy_version 3810 (0.0008) +[2023-10-11 15:04:16,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 7766016. Throughput: 0: 1661.6, 1: 1680.8. Samples: 1952922. Policy #0 lag: (min: 16.0, avg: 41.1, max: 48.0) +[2023-10-11 15:04:16,063][84230] Avg episode reward: [(0, '7.170'), (1, '6.470')] +[2023-10-11 15:04:16,069][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000003776_3866624.pth... +[2023-10-11 15:04:16,104][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000002208_2260992.pth +[2023-10-11 15:04:16,107][84801] Saving new best policy, reward=7.170! +[2023-10-11 15:04:16,443][85175] Updated weights for policy 1, policy_version 3820 (0.0008) +[2023-10-11 15:04:16,823][85175] Updated weights for policy 1, policy_version 3830 (0.0008) +[2023-10-11 15:04:17,192][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000003840_3932160.pth... 
+[2023-10-11 15:04:17,196][85175] Updated weights for policy 1, policy_version 3840 (0.0008) +[2023-10-11 15:04:17,231][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000002240_2293760.pth +[2023-10-11 15:04:19,024][85176] Updated weights for policy 0, policy_version 3782 (0.0010) +[2023-10-11 15:04:19,396][85176] Updated weights for policy 0, policy_version 3792 (0.0010) +[2023-10-11 15:04:19,765][85176] Updated weights for policy 0, policy_version 3802 (0.0010) +[2023-10-11 15:04:21,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 7831552. Throughput: 0: 1674.5, 1: 1676.0. Samples: 1963292. Policy #0 lag: (min: 31.0, avg: 31.1, max: 36.0) +[2023-10-11 15:04:21,063][84230] Avg episode reward: [(0, '6.770'), (1, '6.350')] +[2023-10-11 15:04:21,208][85175] Updated weights for policy 1, policy_version 3850 (0.0009) +[2023-10-11 15:04:21,580][85175] Updated weights for policy 1, policy_version 3860 (0.0007) +[2023-10-11 15:04:21,941][85175] Updated weights for policy 1, policy_version 3870 (0.0008) +[2023-10-11 15:04:23,887][85176] Updated weights for policy 0, policy_version 3812 (0.0009) +[2023-10-11 15:04:24,258][85176] Updated weights for policy 0, policy_version 3822 (0.0010) +[2023-10-11 15:04:24,628][85176] Updated weights for policy 0, policy_version 3832 (0.0009) +[2023-10-11 15:04:25,889][85175] Updated weights for policy 1, policy_version 3880 (0.0009) +[2023-10-11 15:04:26,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 7897088. Throughput: 0: 1658.3, 1: 1673.2. Samples: 1983240. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 36.0) +[2023-10-11 15:04:26,064][84230] Avg episode reward: [(0, '6.420'), (1, '6.220')] +[2023-10-11 15:04:26,256][85175] Updated weights for policy 1, policy_version 3890 (0.0008) +[2023-10-11 15:04:26,627][85175] Updated weights for policy 1, policy_version 3900 (0.0010) +[2023-10-11 15:04:28,514][85176] Updated weights for policy 0, policy_version 3842 (0.0010) +[2023-10-11 15:04:28,881][85176] Updated weights for policy 0, policy_version 3852 (0.0008) +[2023-10-11 15:04:29,262][85176] Updated weights for policy 0, policy_version 3862 (0.0008) +[2023-10-11 15:04:29,627][85176] Updated weights for policy 0, policy_version 3872 (0.0009) +[2023-10-11 15:04:30,712][85175] Updated weights for policy 1, policy_version 3910 (0.0009) +[2023-10-11 15:04:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 7962624. Throughput: 0: 1676.9, 1: 1678.4. Samples: 2003860. Policy #0 lag: (min: 17.0, avg: 25.1, max: 49.0) +[2023-10-11 15:04:31,063][84230] Avg episode reward: [(0, '6.670'), (1, '6.050')] +[2023-10-11 15:04:31,086][85175] Updated weights for policy 1, policy_version 3920 (0.0009) +[2023-10-11 15:04:31,452][85175] Updated weights for policy 1, policy_version 3930 (0.0009) +[2023-10-11 15:04:33,756][85176] Updated weights for policy 0, policy_version 3882 (0.0011) +[2023-10-11 15:04:34,136][85176] Updated weights for policy 0, policy_version 3892 (0.0007) +[2023-10-11 15:04:34,512][85176] Updated weights for policy 0, policy_version 3902 (0.0009) +[2023-10-11 15:04:35,497][85175] Updated weights for policy 1, policy_version 3940 (0.0009) +[2023-10-11 15:04:35,872][85175] Updated weights for policy 1, policy_version 3950 (0.0010) +[2023-10-11 15:04:36,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 8028160. Throughput: 0: 1678.3, 1: 1684.2. Samples: 2014098. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:04:36,063][84230] Avg episode reward: [(0, '6.630'), (1, '6.150')] +[2023-10-11 15:04:36,238][85175] Updated weights for policy 1, policy_version 3960 (0.0010) +[2023-10-11 15:04:38,563][85176] Updated weights for policy 0, policy_version 3912 (0.0008) +[2023-10-11 15:04:38,925][85176] Updated weights for policy 0, policy_version 3922 (0.0007) +[2023-10-11 15:04:39,297][85176] Updated weights for policy 0, policy_version 3932 (0.0010) +[2023-10-11 15:04:40,110][85175] Updated weights for policy 1, policy_version 3970 (0.0010) +[2023-10-11 15:04:40,488][85175] Updated weights for policy 1, policy_version 3980 (0.0010) +[2023-10-11 15:04:40,854][85175] Updated weights for policy 1, policy_version 3990 (0.0010) +[2023-10-11 15:04:41,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 8093696. Throughput: 0: 1656.9, 1: 1690.0. Samples: 2033824. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:04:41,063][84230] Avg episode reward: [(0, '6.710'), (1, '6.280')] +[2023-10-11 15:04:41,222][85175] Updated weights for policy 1, policy_version 4000 (0.0011) +[2023-10-11 15:04:43,521][85176] Updated weights for policy 0, policy_version 3942 (0.0009) +[2023-10-11 15:04:43,891][85176] Updated weights for policy 0, policy_version 3952 (0.0007) +[2023-10-11 15:04:44,276][85176] Updated weights for policy 0, policy_version 3962 (0.0008) +[2023-10-11 15:04:45,207][85175] Updated weights for policy 1, policy_version 4010 (0.0008) +[2023-10-11 15:04:45,588][85175] Updated weights for policy 1, policy_version 4020 (0.0007) +[2023-10-11 15:04:45,960][85175] Updated weights for policy 1, policy_version 4030 (0.0007) +[2023-10-11 15:04:46,063][84230] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 8192000. Throughput: 0: 1683.3, 1: 1680.7. Samples: 2054116. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:04:46,064][84230] Avg episode reward: [(0, '6.820'), (1, '6.560')] +[2023-10-11 15:04:48,468][85176] Updated weights for policy 0, policy_version 3972 (0.0007) +[2023-10-11 15:04:48,842][85176] Updated weights for policy 0, policy_version 3982 (0.0009) +[2023-10-11 15:04:49,214][85176] Updated weights for policy 0, policy_version 3992 (0.0007) +[2023-10-11 15:04:49,948][85175] Updated weights for policy 1, policy_version 4040 (0.0007) +[2023-10-11 15:04:50,316][85175] Updated weights for policy 1, policy_version 4050 (0.0008) +[2023-10-11 15:04:50,695][85175] Updated weights for policy 1, policy_version 4060 (0.0009) +[2023-10-11 15:04:51,063][84230] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 8257536. Throughput: 0: 1672.2, 1: 1696.9. Samples: 2064672. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:04:51,063][84230] Avg episode reward: [(0, '6.960'), (1, '6.430')] +[2023-10-11 15:04:53,285][85176] Updated weights for policy 0, policy_version 4002 (0.0009) +[2023-10-11 15:04:53,650][85176] Updated weights for policy 0, policy_version 4012 (0.0008) +[2023-10-11 15:04:54,020][85176] Updated weights for policy 0, policy_version 4022 (0.0007) +[2023-10-11 15:04:54,392][85176] Updated weights for policy 0, policy_version 4032 (0.0007) +[2023-10-11 15:04:54,742][85175] Updated weights for policy 1, policy_version 4070 (0.0010) +[2023-10-11 15:04:55,109][85175] Updated weights for policy 1, policy_version 4080 (0.0008) +[2023-10-11 15:04:55,480][85175] Updated weights for policy 1, policy_version 4090 (0.0007) +[2023-10-11 15:04:56,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 8323072. Throughput: 0: 1658.1, 1: 1697.2. Samples: 2084352. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:04:56,063][84230] Avg episode reward: [(0, '6.670'), (1, '6.250')] +[2023-10-11 15:04:58,536][85176] Updated weights for policy 0, policy_version 4042 (0.0007) +[2023-10-11 15:04:58,916][85176] Updated weights for policy 0, policy_version 4052 (0.0009) +[2023-10-11 15:04:59,282][85176] Updated weights for policy 0, policy_version 4062 (0.0011) +[2023-10-11 15:04:59,496][85175] Updated weights for policy 1, policy_version 4100 (0.0010) +[2023-10-11 15:04:59,875][85175] Updated weights for policy 1, policy_version 4110 (0.0010) +[2023-10-11 15:05:00,242][85175] Updated weights for policy 1, policy_version 4120 (0.0007) +[2023-10-11 15:05:01,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 8388608. Throughput: 0: 1675.1, 1: 1676.7. Samples: 2103752. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:05:01,063][84230] Avg episode reward: [(0, '6.630'), (1, '6.390')] +[2023-10-11 15:05:03,222][85176] Updated weights for policy 0, policy_version 4072 (0.0010) +[2023-10-11 15:05:03,597][85176] Updated weights for policy 0, policy_version 4082 (0.0009) +[2023-10-11 15:05:03,971][85176] Updated weights for policy 0, policy_version 4092 (0.0009) +[2023-10-11 15:05:04,405][85175] Updated weights for policy 1, policy_version 4130 (0.0009) +[2023-10-11 15:05:04,785][85175] Updated weights for policy 1, policy_version 4140 (0.0007) +[2023-10-11 15:05:05,154][85175] Updated weights for policy 1, policy_version 4150 (0.0007) +[2023-10-11 15:05:05,523][85175] Updated weights for policy 1, policy_version 4160 (0.0010) +[2023-10-11 15:05:06,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 8454144. Throughput: 0: 1661.4, 1: 1708.4. Samples: 2114932. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-10-11 15:05:06,063][84230] Avg episode reward: [(0, '6.990'), (1, '6.310')] +[2023-10-11 15:05:07,971][85176] Updated weights for policy 0, policy_version 4102 (0.0009) +[2023-10-11 15:05:08,340][85176] Updated weights for policy 0, policy_version 4112 (0.0008) +[2023-10-11 15:05:08,715][85176] Updated weights for policy 0, policy_version 4122 (0.0007) +[2023-10-11 15:05:09,407][85175] Updated weights for policy 1, policy_version 4170 (0.0008) +[2023-10-11 15:05:09,777][85175] Updated weights for policy 1, policy_version 4180 (0.0009) +[2023-10-11 15:05:10,144][85175] Updated weights for policy 1, policy_version 4190 (0.0009) +[2023-10-11 15:05:11,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 8519680. Throughput: 0: 1670.7, 1: 1691.2. Samples: 2134524. Policy #0 lag: (min: 7.0, avg: 7.0, max: 11.0) +[2023-10-11 15:05:11,064][84230] Avg episode reward: [(0, '6.980'), (1, '6.030')] +[2023-10-11 15:05:12,895][85176] Updated weights for policy 0, policy_version 4132 (0.0009) +[2023-10-11 15:05:13,267][85176] Updated weights for policy 0, policy_version 4142 (0.0007) +[2023-10-11 15:05:13,643][85176] Updated weights for policy 0, policy_version 4152 (0.0009) +[2023-10-11 15:05:13,869][85175] Updated weights for policy 1, policy_version 4200 (0.0008) +[2023-10-11 15:05:14,246][85175] Updated weights for policy 1, policy_version 4210 (0.0008) +[2023-10-11 15:05:14,618][85175] Updated weights for policy 1, policy_version 4220 (0.0010) +[2023-10-11 15:05:16,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 8585216. Throughput: 0: 1673.8, 1: 1677.5. Samples: 2154670. 
Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) +[2023-10-11 15:05:16,064][84230] Avg episode reward: [(0, '6.860'), (1, '6.390')] +[2023-10-11 15:05:17,947][85176] Updated weights for policy 0, policy_version 4162 (0.0008) +[2023-10-11 15:05:18,321][85176] Updated weights for policy 0, policy_version 4172 (0.0008) +[2023-10-11 15:05:18,672][85175] Updated weights for policy 1, policy_version 4230 (0.0007) +[2023-10-11 15:05:18,693][85176] Updated weights for policy 0, policy_version 4182 (0.0007) +[2023-10-11 15:05:19,039][85175] Updated weights for policy 1, policy_version 4240 (0.0009) +[2023-10-11 15:05:19,063][85176] Updated weights for policy 0, policy_version 4192 (0.0010) +[2023-10-11 15:05:19,413][85175] Updated weights for policy 1, policy_version 4250 (0.0008) +[2023-10-11 15:05:21,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 8650752. Throughput: 0: 1658.3, 1: 1706.2. Samples: 2165500. Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) +[2023-10-11 15:05:21,063][84230] Avg episode reward: [(0, '6.430'), (1, '6.560')] +[2023-10-11 15:05:23,382][85176] Updated weights for policy 0, policy_version 4202 (0.0009) +[2023-10-11 15:05:23,484][85175] Updated weights for policy 1, policy_version 4260 (0.0007) +[2023-10-11 15:05:23,770][85176] Updated weights for policy 0, policy_version 4212 (0.0008) +[2023-10-11 15:05:23,849][85175] Updated weights for policy 1, policy_version 4270 (0.0009) +[2023-10-11 15:05:24,138][85176] Updated weights for policy 0, policy_version 4222 (0.0008) +[2023-10-11 15:05:24,208][85175] Updated weights for policy 1, policy_version 4280 (0.0009) +[2023-10-11 15:05:26,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 8716288. Throughput: 0: 1662.9, 1: 1677.5. Samples: 2184144. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:05:26,064][84230] Avg episode reward: [(0, '6.520'), (1, '6.630')] +[2023-10-11 15:05:28,348][85175] Updated weights for policy 1, policy_version 4290 (0.0010) +[2023-10-11 15:05:28,427][85176] Updated weights for policy 0, policy_version 4232 (0.0009) +[2023-10-11 15:05:28,706][85175] Updated weights for policy 1, policy_version 4300 (0.0008) +[2023-10-11 15:05:28,798][85176] Updated weights for policy 0, policy_version 4242 (0.0008) +[2023-10-11 15:05:29,069][85175] Updated weights for policy 1, policy_version 4310 (0.0009) +[2023-10-11 15:05:29,168][85176] Updated weights for policy 0, policy_version 4252 (0.0010) +[2023-10-11 15:05:29,433][85175] Updated weights for policy 1, policy_version 4320 (0.0009) +[2023-10-11 15:05:31,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 8781824. Throughput: 0: 1660.5, 1: 1684.1. Samples: 2204626. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:05:31,064][84230] Avg episode reward: [(0, '7.140'), (1, '6.490')] +[2023-10-11 15:05:33,013][85176] Updated weights for policy 0, policy_version 4262 (0.0010) +[2023-10-11 15:05:33,384][85176] Updated weights for policy 0, policy_version 4272 (0.0010) +[2023-10-11 15:05:33,422][85175] Updated weights for policy 1, policy_version 4330 (0.0010) +[2023-10-11 15:05:33,757][85176] Updated weights for policy 0, policy_version 4282 (0.0008) +[2023-10-11 15:05:33,795][85175] Updated weights for policy 1, policy_version 4340 (0.0009) +[2023-10-11 15:05:34,169][85175] Updated weights for policy 1, policy_version 4350 (0.0008) +[2023-10-11 15:05:36,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 8847360. Throughput: 0: 1654.0, 1: 1689.9. Samples: 2215146. 
Policy #0 lag: (min: 31.0, avg: 33.4, max: 63.0) +[2023-10-11 15:05:36,063][84230] Avg episode reward: [(0, '7.280'), (1, '6.400')] +[2023-10-11 15:05:36,064][84801] Saving new best policy, reward=7.280! +[2023-10-11 15:05:37,845][85176] Updated weights for policy 0, policy_version 4292 (0.0009) +[2023-10-11 15:05:38,225][85176] Updated weights for policy 0, policy_version 4302 (0.0008) +[2023-10-11 15:05:38,302][85175] Updated weights for policy 1, policy_version 4360 (0.0007) +[2023-10-11 15:05:38,596][85176] Updated weights for policy 0, policy_version 4312 (0.0008) +[2023-10-11 15:05:38,666][85175] Updated weights for policy 1, policy_version 4370 (0.0010) +[2023-10-11 15:05:39,041][85175] Updated weights for policy 1, policy_version 4380 (0.0009) +[2023-10-11 15:05:41,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 8912896. Throughput: 0: 1664.8, 1: 1666.6. Samples: 2234264. Policy #0 lag: (min: 31.0, avg: 33.4, max: 63.0) +[2023-10-11 15:05:41,064][84230] Avg episode reward: [(0, '6.500'), (1, '6.390')] +[2023-10-11 15:05:42,710][85176] Updated weights for policy 0, policy_version 4322 (0.0008) +[2023-10-11 15:05:42,972][85175] Updated weights for policy 1, policy_version 4390 (0.0009) +[2023-10-11 15:05:43,113][85176] Updated weights for policy 0, policy_version 4332 (0.0007) +[2023-10-11 15:05:43,343][85175] Updated weights for policy 1, policy_version 4400 (0.0008) +[2023-10-11 15:05:43,481][85176] Updated weights for policy 0, policy_version 4342 (0.0007) +[2023-10-11 15:05:43,714][85175] Updated weights for policy 1, policy_version 4410 (0.0007) +[2023-10-11 15:05:43,863][85176] Updated weights for policy 0, policy_version 4352 (0.0007) +[2023-10-11 15:05:46,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 8978432. Throughput: 0: 1662.0, 1: 1698.8. Samples: 2254988. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:05:46,063][84230] Avg episode reward: [(0, '6.050'), (1, '6.290')] +[2023-10-11 15:05:47,780][85175] Updated weights for policy 1, policy_version 4420 (0.0010) +[2023-10-11 15:05:48,054][85176] Updated weights for policy 0, policy_version 4362 (0.0007) +[2023-10-11 15:05:48,143][85175] Updated weights for policy 1, policy_version 4430 (0.0010) +[2023-10-11 15:05:48,433][85176] Updated weights for policy 0, policy_version 4372 (0.0008) +[2023-10-11 15:05:48,515][85175] Updated weights for policy 1, policy_version 4440 (0.0009) +[2023-10-11 15:05:48,805][85176] Updated weights for policy 0, policy_version 4382 (0.0008) +[2023-10-11 15:05:51,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 9043968. Throughput: 0: 1650.0, 1: 1680.7. Samples: 2264814. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:05:51,063][84230] Avg episode reward: [(0, '6.150'), (1, '6.560')] +[2023-10-11 15:05:52,635][85175] Updated weights for policy 1, policy_version 4450 (0.0008) +[2023-10-11 15:05:52,876][85176] Updated weights for policy 0, policy_version 4392 (0.0007) +[2023-10-11 15:05:53,001][85175] Updated weights for policy 1, policy_version 4460 (0.0007) +[2023-10-11 15:05:53,248][85176] Updated weights for policy 0, policy_version 4402 (0.0009) +[2023-10-11 15:05:53,378][85175] Updated weights for policy 1, policy_version 4470 (0.0007) +[2023-10-11 15:05:53,618][85176] Updated weights for policy 0, policy_version 4412 (0.0008) +[2023-10-11 15:05:53,740][85175] Updated weights for policy 1, policy_version 4480 (0.0010) +[2023-10-11 15:05:56,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 9109504. Throughput: 0: 1650.7, 1: 1682.7. Samples: 2284524. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:05:56,063][84230] Avg episode reward: [(0, '6.830'), (1, '6.530')] +[2023-10-11 15:05:57,898][85176] Updated weights for policy 0, policy_version 4422 (0.0009) +[2023-10-11 15:05:58,026][85175] Updated weights for policy 1, policy_version 4490 (0.0008) +[2023-10-11 15:05:58,272][85176] Updated weights for policy 0, policy_version 4432 (0.0009) +[2023-10-11 15:05:58,405][85175] Updated weights for policy 1, policy_version 4500 (0.0008) +[2023-10-11 15:05:58,640][85176] Updated weights for policy 0, policy_version 4442 (0.0008) +[2023-10-11 15:05:58,779][85175] Updated weights for policy 1, policy_version 4510 (0.0008) +[2023-10-11 15:06:01,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 9175040. Throughput: 0: 1644.5, 1: 1686.4. Samples: 2304564. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:06:01,063][84230] Avg episode reward: [(0, '7.190'), (1, '6.430')] +[2023-10-11 15:06:02,648][85176] Updated weights for policy 0, policy_version 4452 (0.0008) +[2023-10-11 15:06:02,845][85175] Updated weights for policy 1, policy_version 4520 (0.0008) +[2023-10-11 15:06:03,023][85176] Updated weights for policy 0, policy_version 4462 (0.0009) +[2023-10-11 15:06:03,204][85175] Updated weights for policy 1, policy_version 4530 (0.0009) +[2023-10-11 15:06:03,396][85176] Updated weights for policy 0, policy_version 4472 (0.0007) +[2023-10-11 15:06:03,574][85175] Updated weights for policy 1, policy_version 4540 (0.0008) +[2023-10-11 15:06:06,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 9240576. Throughput: 0: 1642.9, 1: 1663.0. Samples: 2314264. 
Policy #0 lag: (min: 31.0, avg: 34.5, max: 63.0) +[2023-10-11 15:06:06,063][84230] Avg episode reward: [(0, '7.190'), (1, '6.460')] +[2023-10-11 15:06:07,302][85176] Updated weights for policy 0, policy_version 4482 (0.0007) +[2023-10-11 15:06:07,671][85176] Updated weights for policy 0, policy_version 4492 (0.0010) +[2023-10-11 15:06:07,684][85175] Updated weights for policy 1, policy_version 4550 (0.0008) +[2023-10-11 15:06:08,053][85175] Updated weights for policy 1, policy_version 4560 (0.0009) +[2023-10-11 15:06:08,055][85176] Updated weights for policy 0, policy_version 4502 (0.0007) +[2023-10-11 15:06:08,420][85175] Updated weights for policy 1, policy_version 4570 (0.0007) +[2023-10-11 15:06:08,431][85176] Updated weights for policy 0, policy_version 4512 (0.0008) +[2023-10-11 15:06:11,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 9306112. Throughput: 0: 1658.7, 1: 1684.6. Samples: 2334594. Policy #0 lag: (min: 31.0, avg: 34.5, max: 63.0) +[2023-10-11 15:06:11,063][84230] Avg episode reward: [(0, '6.850'), (1, '6.340')] +[2023-10-11 15:06:12,396][85175] Updated weights for policy 1, policy_version 4580 (0.0008) +[2023-10-11 15:06:12,599][85176] Updated weights for policy 0, policy_version 4522 (0.0008) +[2023-10-11 15:06:12,774][85175] Updated weights for policy 1, policy_version 4590 (0.0008) +[2023-10-11 15:06:12,970][85176] Updated weights for policy 0, policy_version 4532 (0.0008) +[2023-10-11 15:06:13,138][85175] Updated weights for policy 1, policy_version 4600 (0.0007) +[2023-10-11 15:06:13,351][85176] Updated weights for policy 0, policy_version 4542 (0.0008) +[2023-10-11 15:06:16,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 9371648. Throughput: 0: 1662.0, 1: 1689.5. Samples: 2355440. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:06:16,063][84230] Avg episode reward: [(0, '6.610'), (1, '6.530')] +[2023-10-11 15:06:16,073][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000004544_4653056.pth... +[2023-10-11 15:06:16,073][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000004608_4718592.pth... +[2023-10-11 15:06:16,126][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000003008_3080192.pth +[2023-10-11 15:06:16,126][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000003040_3112960.pth +[2023-10-11 15:06:17,089][85175] Updated weights for policy 1, policy_version 4610 (0.0007) +[2023-10-11 15:06:17,385][85176] Updated weights for policy 0, policy_version 4552 (0.0009) +[2023-10-11 15:06:17,453][85175] Updated weights for policy 1, policy_version 4620 (0.0007) +[2023-10-11 15:06:17,759][85176] Updated weights for policy 0, policy_version 4562 (0.0008) +[2023-10-11 15:06:17,827][85175] Updated weights for policy 1, policy_version 4630 (0.0008) +[2023-10-11 15:06:18,127][85176] Updated weights for policy 0, policy_version 4572 (0.0008) +[2023-10-11 15:06:18,193][85175] Updated weights for policy 1, policy_version 4640 (0.0008) +[2023-10-11 15:06:21,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 9437184. Throughput: 0: 1650.1, 1: 1666.3. Samples: 2364384. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:06:21,063][84230] Avg episode reward: [(0, '6.530'), (1, '6.500')] +[2023-10-11 15:06:22,200][85175] Updated weights for policy 1, policy_version 4650 (0.0009) +[2023-10-11 15:06:22,299][85176] Updated weights for policy 0, policy_version 4582 (0.0008) +[2023-10-11 15:06:22,574][85175] Updated weights for policy 1, policy_version 4660 (0.0010) +[2023-10-11 15:06:22,676][85176] Updated weights for policy 0, policy_version 4592 (0.0008) +[2023-10-11 15:06:22,942][85175] Updated weights for policy 1, policy_version 4670 (0.0009) +[2023-10-11 15:06:23,041][85176] Updated weights for policy 0, policy_version 4602 (0.0007) +[2023-10-11 15:06:26,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 9502720. Throughput: 0: 1660.0, 1: 1691.7. Samples: 2385092. Policy #0 lag: (min: 10.0, avg: 14.1, max: 42.0) +[2023-10-11 15:06:26,064][84230] Avg episode reward: [(0, '6.530'), (1, '6.520')] +[2023-10-11 15:06:26,971][85176] Updated weights for policy 0, policy_version 4612 (0.0009) +[2023-10-11 15:06:27,115][85175] Updated weights for policy 1, policy_version 4680 (0.0007) +[2023-10-11 15:06:27,345][85176] Updated weights for policy 0, policy_version 4622 (0.0009) +[2023-10-11 15:06:27,486][85175] Updated weights for policy 1, policy_version 4690 (0.0009) +[2023-10-11 15:06:27,713][85176] Updated weights for policy 0, policy_version 4632 (0.0008) +[2023-10-11 15:06:27,853][85175] Updated weights for policy 1, policy_version 4700 (0.0008) +[2023-10-11 15:06:31,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 9568256. Throughput: 0: 1661.4, 1: 1684.3. Samples: 2405548. 
Policy #0 lag: (min: 10.0, avg: 14.1, max: 42.0) +[2023-10-11 15:06:31,064][84230] Avg episode reward: [(0, '6.750'), (1, '6.420')] +[2023-10-11 15:06:31,851][85175] Updated weights for policy 1, policy_version 4710 (0.0009) +[2023-10-11 15:06:31,949][85176] Updated weights for policy 0, policy_version 4642 (0.0008) +[2023-10-11 15:06:32,224][85175] Updated weights for policy 1, policy_version 4720 (0.0010) +[2023-10-11 15:06:32,330][85176] Updated weights for policy 0, policy_version 4652 (0.0008) +[2023-10-11 15:06:32,590][85175] Updated weights for policy 1, policy_version 4730 (0.0008) +[2023-10-11 15:06:32,697][85176] Updated weights for policy 0, policy_version 4662 (0.0009) +[2023-10-11 15:06:33,072][85176] Updated weights for policy 0, policy_version 4672 (0.0010) +[2023-10-11 15:06:36,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 9633792. Throughput: 0: 1655.9, 1: 1675.9. Samples: 2414746. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:06:36,063][84230] Avg episode reward: [(0, '7.070'), (1, '6.250')] +[2023-10-11 15:06:36,589][85175] Updated weights for policy 1, policy_version 4740 (0.0009) +[2023-10-11 15:06:36,954][85175] Updated weights for policy 1, policy_version 4750 (0.0010) +[2023-10-11 15:06:37,198][85176] Updated weights for policy 0, policy_version 4682 (0.0008) +[2023-10-11 15:06:37,318][85175] Updated weights for policy 1, policy_version 4760 (0.0009) +[2023-10-11 15:06:37,568][85176] Updated weights for policy 0, policy_version 4692 (0.0007) +[2023-10-11 15:06:37,943][85176] Updated weights for policy 0, policy_version 4702 (0.0010) +[2023-10-11 15:06:41,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 9699328. Throughput: 0: 1666.3, 1: 1688.7. Samples: 2435500. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:06:41,064][84230] Avg episode reward: [(0, '7.070'), (1, '6.380')] +[2023-10-11 15:06:41,377][85175] Updated weights for policy 1, policy_version 4770 (0.0009) +[2023-10-11 15:06:41,753][85175] Updated weights for policy 1, policy_version 4780 (0.0008) +[2023-10-11 15:06:42,100][85176] Updated weights for policy 0, policy_version 4712 (0.0009) +[2023-10-11 15:06:42,115][85175] Updated weights for policy 1, policy_version 4790 (0.0008) +[2023-10-11 15:06:42,471][85176] Updated weights for policy 0, policy_version 4722 (0.0009) +[2023-10-11 15:06:42,475][85175] Updated weights for policy 1, policy_version 4800 (0.0008) +[2023-10-11 15:06:42,846][85176] Updated weights for policy 0, policy_version 4732 (0.0008) +[2023-10-11 15:06:46,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 9764864. Throughput: 0: 1674.0, 1: 1702.9. Samples: 2456526. Policy #0 lag: (min: 31.0, avg: 38.0, max: 63.0) +[2023-10-11 15:06:46,064][84230] Avg episode reward: [(0, '6.820'), (1, '6.520')] +[2023-10-11 15:06:46,468][85175] Updated weights for policy 1, policy_version 4810 (0.0007) +[2023-10-11 15:06:46,833][85175] Updated weights for policy 1, policy_version 4820 (0.0009) +[2023-10-11 15:06:46,994][85176] Updated weights for policy 0, policy_version 4742 (0.0010) +[2023-10-11 15:06:47,202][85175] Updated weights for policy 1, policy_version 4830 (0.0009) +[2023-10-11 15:06:47,362][85176] Updated weights for policy 0, policy_version 4752 (0.0009) +[2023-10-11 15:06:47,740][85176] Updated weights for policy 0, policy_version 4762 (0.0009) +[2023-10-11 15:06:51,062][84230] Fps is (10 sec: 13107.7, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 9830400. Throughput: 0: 1666.6, 1: 1693.9. Samples: 2465484. 
Policy #0 lag: (min: 31.0, avg: 38.0, max: 63.0) +[2023-10-11 15:06:51,063][84230] Avg episode reward: [(0, '6.520'), (1, '6.690')] +[2023-10-11 15:06:51,320][85175] Updated weights for policy 1, policy_version 4840 (0.0008) +[2023-10-11 15:06:51,682][85175] Updated weights for policy 1, policy_version 4850 (0.0009) +[2023-10-11 15:06:51,878][85176] Updated weights for policy 0, policy_version 4772 (0.0011) +[2023-10-11 15:06:52,057][85175] Updated weights for policy 1, policy_version 4860 (0.0008) +[2023-10-11 15:06:52,257][85176] Updated weights for policy 0, policy_version 4782 (0.0010) +[2023-10-11 15:06:52,633][85176] Updated weights for policy 0, policy_version 4792 (0.0008) +[2023-10-11 15:06:56,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 9895936. Throughput: 0: 1672.0, 1: 1697.2. Samples: 2486206. Policy #0 lag: (min: 7.0, avg: 9.2, max: 39.0) +[2023-10-11 15:06:56,063][84230] Avg episode reward: [(0, '6.870'), (1, '6.630')] +[2023-10-11 15:06:56,142][85175] Updated weights for policy 1, policy_version 4870 (0.0007) +[2023-10-11 15:06:56,516][85175] Updated weights for policy 1, policy_version 4880 (0.0009) +[2023-10-11 15:06:56,602][85176] Updated weights for policy 0, policy_version 4802 (0.0008) +[2023-10-11 15:06:56,884][85175] Updated weights for policy 1, policy_version 4890 (0.0007) +[2023-10-11 15:06:56,977][85176] Updated weights for policy 0, policy_version 4812 (0.0007) +[2023-10-11 15:06:57,345][85176] Updated weights for policy 0, policy_version 4822 (0.0007) +[2023-10-11 15:06:57,719][85176] Updated weights for policy 0, policy_version 4832 (0.0010) +[2023-10-11 15:07:00,973][85175] Updated weights for policy 1, policy_version 4900 (0.0009) +[2023-10-11 15:07:01,063][84230] Fps is (10 sec: 13106.2, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 9961472. Throughput: 0: 1670.6, 1: 1693.7. Samples: 2506832. 
Policy #0 lag: (min: 7.0, avg: 9.2, max: 39.0) +[2023-10-11 15:07:01,064][84230] Avg episode reward: [(0, '6.970'), (1, '6.230')] +[2023-10-11 15:07:01,343][85175] Updated weights for policy 1, policy_version 4910 (0.0008) +[2023-10-11 15:07:01,715][85175] Updated weights for policy 1, policy_version 4920 (0.0007) +[2023-10-11 15:07:01,801][85176] Updated weights for policy 0, policy_version 4842 (0.0010) +[2023-10-11 15:07:02,189][85176] Updated weights for policy 0, policy_version 4852 (0.0007) +[2023-10-11 15:07:02,561][85176] Updated weights for policy 0, policy_version 4862 (0.0010) +[2023-10-11 15:07:05,586][85175] Updated weights for policy 1, policy_version 4930 (0.0007) +[2023-10-11 15:07:05,951][85175] Updated weights for policy 1, policy_version 4940 (0.0007) +[2023-10-11 15:07:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 10027008. Throughput: 0: 1669.8, 1: 1696.9. Samples: 2515884. Policy #0 lag: (min: 12.0, avg: 20.0, max: 44.0) +[2023-10-11 15:07:06,063][84230] Avg episode reward: [(0, '6.600'), (1, '6.270')] +[2023-10-11 15:07:06,317][85175] Updated weights for policy 1, policy_version 4950 (0.0007) +[2023-10-11 15:07:06,694][85175] Updated weights for policy 1, policy_version 4960 (0.0008) +[2023-10-11 15:07:06,697][85176] Updated weights for policy 0, policy_version 4872 (0.0009) +[2023-10-11 15:07:07,066][85176] Updated weights for policy 0, policy_version 4882 (0.0010) +[2023-10-11 15:07:07,440][85176] Updated weights for policy 0, policy_version 4892 (0.0010) +[2023-10-11 15:07:10,609][85175] Updated weights for policy 1, policy_version 4970 (0.0009) +[2023-10-11 15:07:10,974][85175] Updated weights for policy 1, policy_version 4980 (0.0007) +[2023-10-11 15:07:11,063][84230] Fps is (10 sec: 13108.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 10092544. Throughput: 0: 1665.4, 1: 1698.6. Samples: 2536470. 
Policy #0 lag: (min: 12.0, avg: 20.0, max: 44.0) +[2023-10-11 15:07:11,063][84230] Avg episode reward: [(0, '6.640'), (1, '6.680')] +[2023-10-11 15:07:11,360][85175] Updated weights for policy 1, policy_version 4990 (0.0009) +[2023-10-11 15:07:11,534][85176] Updated weights for policy 0, policy_version 4902 (0.0010) +[2023-10-11 15:07:11,910][85176] Updated weights for policy 0, policy_version 4912 (0.0010) +[2023-10-11 15:07:12,279][85176] Updated weights for policy 0, policy_version 4922 (0.0009) +[2023-10-11 15:07:15,595][85175] Updated weights for policy 1, policy_version 5000 (0.0010) +[2023-10-11 15:07:15,964][85175] Updated weights for policy 1, policy_version 5010 (0.0010) +[2023-10-11 15:07:16,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13329.4). Total num frames: 10158080. Throughput: 0: 1669.4, 1: 1695.3. Samples: 2556960. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:07:16,064][84230] Avg episode reward: [(0, '6.480'), (1, '6.600')] +[2023-10-11 15:07:16,336][85175] Updated weights for policy 1, policy_version 5020 (0.0009) +[2023-10-11 15:07:16,523][85176] Updated weights for policy 0, policy_version 4932 (0.0007) +[2023-10-11 15:07:16,919][85176] Updated weights for policy 0, policy_version 4942 (0.0009) +[2023-10-11 15:07:17,296][85176] Updated weights for policy 0, policy_version 4952 (0.0009) +[2023-10-11 15:07:20,340][85175] Updated weights for policy 1, policy_version 5030 (0.0008) +[2023-10-11 15:07:20,707][85175] Updated weights for policy 1, policy_version 5040 (0.0009) +[2023-10-11 15:07:21,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 10223616. Throughput: 0: 1667.6, 1: 1701.6. Samples: 2566360. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:07:21,063][84230] Avg episode reward: [(0, '6.640'), (1, '6.530')] +[2023-10-11 15:07:21,077][85175] Updated weights for policy 1, policy_version 5050 (0.0009) +[2023-10-11 15:07:21,307][85176] Updated weights for policy 0, policy_version 4962 (0.0009) +[2023-10-11 15:07:21,686][85176] Updated weights for policy 0, policy_version 4972 (0.0009) +[2023-10-11 15:07:22,069][85176] Updated weights for policy 0, policy_version 4982 (0.0008) +[2023-10-11 15:07:22,439][85176] Updated weights for policy 0, policy_version 4992 (0.0008) +[2023-10-11 15:07:25,020][85175] Updated weights for policy 1, policy_version 5060 (0.0008) +[2023-10-11 15:07:25,383][85175] Updated weights for policy 1, policy_version 5070 (0.0008) +[2023-10-11 15:07:25,751][85175] Updated weights for policy 1, policy_version 5080 (0.0012) +[2023-10-11 15:07:26,063][84230] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 10321920. Throughput: 0: 1663.3, 1: 1698.4. Samples: 2586776. Policy #0 lag: (min: 8.0, avg: 35.6, max: 40.0) +[2023-10-11 15:07:26,064][84230] Avg episode reward: [(0, '6.940'), (1, '6.530')] +[2023-10-11 15:07:26,718][85176] Updated weights for policy 0, policy_version 5002 (0.0008) +[2023-10-11 15:07:27,097][85176] Updated weights for policy 0, policy_version 5012 (0.0007) +[2023-10-11 15:07:27,470][85176] Updated weights for policy 0, policy_version 5022 (0.0009) +[2023-10-11 15:07:29,965][85175] Updated weights for policy 1, policy_version 5090 (0.0010) +[2023-10-11 15:07:30,341][85175] Updated weights for policy 1, policy_version 5100 (0.0008) +[2023-10-11 15:07:30,710][85175] Updated weights for policy 1, policy_version 5110 (0.0007) +[2023-10-11 15:07:31,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 10354688. Throughput: 0: 1658.6, 1: 1673.9. Samples: 2606488. 
Policy #0 lag: (min: 8.0, avg: 35.6, max: 40.0) +[2023-10-11 15:07:31,063][84230] Avg episode reward: [(0, '6.590'), (1, '6.760')] +[2023-10-11 15:07:31,083][85175] Updated weights for policy 1, policy_version 5120 (0.0008) +[2023-10-11 15:07:31,344][85176] Updated weights for policy 0, policy_version 5032 (0.0009) +[2023-10-11 15:07:31,715][85176] Updated weights for policy 0, policy_version 5042 (0.0008) +[2023-10-11 15:07:32,086][85176] Updated weights for policy 0, policy_version 5052 (0.0008) +[2023-10-11 15:07:35,094][85175] Updated weights for policy 1, policy_version 5130 (0.0009) +[2023-10-11 15:07:35,464][85175] Updated weights for policy 1, policy_version 5140 (0.0009) +[2023-10-11 15:07:35,831][85175] Updated weights for policy 1, policy_version 5150 (0.0007) +[2023-10-11 15:07:36,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 10452992. Throughput: 0: 1663.5, 1: 1694.1. Samples: 2616574. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-11 15:07:36,063][84230] Avg episode reward: [(0, '6.510'), (1, '6.320')] +[2023-10-11 15:07:36,095][85176] Updated weights for policy 0, policy_version 5062 (0.0007) +[2023-10-11 15:07:36,470][85176] Updated weights for policy 0, policy_version 5072 (0.0009) +[2023-10-11 15:07:36,834][85176] Updated weights for policy 0, policy_version 5082 (0.0008) +[2023-10-11 15:07:39,711][85175] Updated weights for policy 1, policy_version 5160 (0.0008) +[2023-10-11 15:07:40,081][85175] Updated weights for policy 1, policy_version 5170 (0.0007) +[2023-10-11 15:07:40,462][85175] Updated weights for policy 1, policy_version 5180 (0.0007) +[2023-10-11 15:07:40,996][85176] Updated weights for policy 0, policy_version 5092 (0.0009) +[2023-10-11 15:07:41,063][84230] Fps is (10 sec: 16383.8, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 10518528. Throughput: 0: 1660.2, 1: 1697.1. Samples: 2637284. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-11 15:07:41,064][84230] Avg episode reward: [(0, '6.830'), (1, '6.320')] +[2023-10-11 15:07:41,361][85176] Updated weights for policy 0, policy_version 5102 (0.0009) +[2023-10-11 15:07:41,730][85176] Updated weights for policy 0, policy_version 5112 (0.0008) +[2023-10-11 15:07:44,464][85175] Updated weights for policy 1, policy_version 5190 (0.0008) +[2023-10-11 15:07:44,842][85175] Updated weights for policy 1, policy_version 5200 (0.0009) +[2023-10-11 15:07:45,222][85175] Updated weights for policy 1, policy_version 5210 (0.0009) +[2023-10-11 15:07:45,877][85176] Updated weights for policy 0, policy_version 5122 (0.0009) +[2023-10-11 15:07:46,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 10584064. Throughput: 0: 1664.8, 1: 1670.2. Samples: 2656908. Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) +[2023-10-11 15:07:46,064][84230] Avg episode reward: [(0, '6.750'), (1, '6.440')] +[2023-10-11 15:07:46,245][85176] Updated weights for policy 0, policy_version 5132 (0.0007) +[2023-10-11 15:07:46,630][85176] Updated weights for policy 0, policy_version 5142 (0.0008) +[2023-10-11 15:07:47,004][85176] Updated weights for policy 0, policy_version 5152 (0.0010) +[2023-10-11 15:07:49,332][85175] Updated weights for policy 1, policy_version 5220 (0.0009) +[2023-10-11 15:07:49,696][85175] Updated weights for policy 1, policy_version 5230 (0.0008) +[2023-10-11 15:07:50,074][85175] Updated weights for policy 1, policy_version 5240 (0.0008) +[2023-10-11 15:07:51,058][85176] Updated weights for policy 0, policy_version 5162 (0.0011) +[2023-10-11 15:07:51,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 10649600. Throughput: 0: 1667.4, 1: 1694.0. Samples: 2667146. 
Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) +[2023-10-11 15:07:51,063][84230] Avg episode reward: [(0, '6.630'), (1, '6.640')] +[2023-10-11 15:07:51,428][85176] Updated weights for policy 0, policy_version 5172 (0.0008) +[2023-10-11 15:07:51,798][85176] Updated weights for policy 0, policy_version 5182 (0.0009) +[2023-10-11 15:07:54,085][85175] Updated weights for policy 1, policy_version 5250 (0.0009) +[2023-10-11 15:07:54,461][85175] Updated weights for policy 1, policy_version 5260 (0.0008) +[2023-10-11 15:07:54,825][85175] Updated weights for policy 1, policy_version 5270 (0.0009) +[2023-10-11 15:07:55,189][85175] Updated weights for policy 1, policy_version 5280 (0.0008) +[2023-10-11 15:07:55,783][85176] Updated weights for policy 0, policy_version 5192 (0.0007) +[2023-10-11 15:07:56,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 10715136. Throughput: 0: 1677.4, 1: 1680.5. Samples: 2687576. Policy #0 lag: (min: 26.0, avg: 33.4, max: 58.0) +[2023-10-11 15:07:56,063][84230] Avg episode reward: [(0, '6.640'), (1, '6.510')] +[2023-10-11 15:07:56,154][85176] Updated weights for policy 0, policy_version 5202 (0.0007) +[2023-10-11 15:07:56,537][85176] Updated weights for policy 0, policy_version 5212 (0.0007) +[2023-10-11 15:07:59,217][85175] Updated weights for policy 1, policy_version 5290 (0.0008) +[2023-10-11 15:07:59,585][85175] Updated weights for policy 1, policy_version 5300 (0.0008) +[2023-10-11 15:07:59,961][85175] Updated weights for policy 1, policy_version 5310 (0.0009) +[2023-10-11 15:08:00,549][85176] Updated weights for policy 0, policy_version 5222 (0.0008) +[2023-10-11 15:08:00,919][85176] Updated weights for policy 0, policy_version 5232 (0.0009) +[2023-10-11 15:08:01,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.5, 300 sec: 13440.4). Total num frames: 10780672. Throughput: 0: 1668.1, 1: 1670.1. Samples: 2707180. 
Policy #0 lag: (min: 26.0, avg: 33.4, max: 58.0) +[2023-10-11 15:08:01,063][84230] Avg episode reward: [(0, '6.860'), (1, '6.540')] +[2023-10-11 15:08:01,293][85176] Updated weights for policy 0, policy_version 5242 (0.0009) +[2023-10-11 15:08:03,936][85175] Updated weights for policy 1, policy_version 5320 (0.0011) +[2023-10-11 15:08:04,305][85175] Updated weights for policy 1, policy_version 5330 (0.0008) +[2023-10-11 15:08:04,671][85175] Updated weights for policy 1, policy_version 5340 (0.0007) +[2023-10-11 15:08:05,575][85176] Updated weights for policy 0, policy_version 5252 (0.0007) +[2023-10-11 15:08:05,971][85176] Updated weights for policy 0, policy_version 5262 (0.0008) +[2023-10-11 15:08:06,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 10846208. Throughput: 0: 1673.0, 1: 1693.7. Samples: 2717862. Policy #0 lag: (min: 1.0, avg: 1.6, max: 17.0) +[2023-10-11 15:08:06,063][84230] Avg episode reward: [(0, '6.820'), (1, '6.760')] +[2023-10-11 15:08:06,347][85176] Updated weights for policy 0, policy_version 5272 (0.0008) +[2023-10-11 15:08:08,638][85175] Updated weights for policy 1, policy_version 5350 (0.0007) +[2023-10-11 15:08:09,003][85175] Updated weights for policy 1, policy_version 5360 (0.0009) +[2023-10-11 15:08:09,378][85175] Updated weights for policy 1, policy_version 5370 (0.0009) +[2023-10-11 15:08:10,555][85176] Updated weights for policy 0, policy_version 5282 (0.0008) +[2023-10-11 15:08:10,921][85176] Updated weights for policy 0, policy_version 5292 (0.0009) +[2023-10-11 15:08:11,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 10911744. Throughput: 0: 1670.9, 1: 1674.3. Samples: 2737308. 
Policy #0 lag: (min: 1.0, avg: 1.6, max: 17.0) +[2023-10-11 15:08:11,063][84230] Avg episode reward: [(0, '6.520'), (1, '6.580')] +[2023-10-11 15:08:11,286][85176] Updated weights for policy 0, policy_version 5302 (0.0008) +[2023-10-11 15:08:11,666][85176] Updated weights for policy 0, policy_version 5312 (0.0009) +[2023-10-11 15:08:13,478][85175] Updated weights for policy 1, policy_version 5380 (0.0009) +[2023-10-11 15:08:13,858][85175] Updated weights for policy 1, policy_version 5390 (0.0008) +[2023-10-11 15:08:14,222][85175] Updated weights for policy 1, policy_version 5400 (0.0009) +[2023-10-11 15:08:15,791][85176] Updated weights for policy 0, policy_version 5322 (0.0008) +[2023-10-11 15:08:16,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13329.3). Total num frames: 10977280. Throughput: 0: 1668.2, 1: 1693.1. Samples: 2757746. Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) +[2023-10-11 15:08:16,064][84230] Avg episode reward: [(0, '6.640'), (1, '6.580')] +[2023-10-11 15:08:16,073][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000005408_5537792.pth... +[2023-10-11 15:08:16,105][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000003840_3932160.pth +[2023-10-11 15:08:16,158][85176] Updated weights for policy 0, policy_version 5332 (0.0009) +[2023-10-11 15:08:16,544][85176] Updated weights for policy 0, policy_version 5342 (0.0007) +[2023-10-11 15:08:16,615][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000005344_5472256.pth... 
+[2023-10-11 15:08:16,646][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000003776_3866624.pth +[2023-10-11 15:08:18,320][85175] Updated weights for policy 1, policy_version 5410 (0.0007) +[2023-10-11 15:08:18,699][85175] Updated weights for policy 1, policy_version 5420 (0.0009) +[2023-10-11 15:08:19,066][85175] Updated weights for policy 1, policy_version 5430 (0.0009) +[2023-10-11 15:08:19,439][85175] Updated weights for policy 1, policy_version 5440 (0.0011) +[2023-10-11 15:08:20,631][85176] Updated weights for policy 0, policy_version 5352 (0.0008) +[2023-10-11 15:08:21,007][85176] Updated weights for policy 0, policy_version 5362 (0.0009) +[2023-10-11 15:08:21,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 11042816. Throughput: 0: 1666.4, 1: 1697.7. Samples: 2767956. Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) +[2023-10-11 15:08:21,063][84230] Avg episode reward: [(0, '6.590'), (1, '6.550')] +[2023-10-11 15:08:21,381][85176] Updated weights for policy 0, policy_version 5372 (0.0010) +[2023-10-11 15:08:23,523][85175] Updated weights for policy 1, policy_version 5450 (0.0008) +[2023-10-11 15:08:23,898][85175] Updated weights for policy 1, policy_version 5460 (0.0008) +[2023-10-11 15:08:24,256][85175] Updated weights for policy 1, policy_version 5470 (0.0010) +[2023-10-11 15:08:25,330][85176] Updated weights for policy 0, policy_version 5382 (0.0009) +[2023-10-11 15:08:25,694][85176] Updated weights for policy 0, policy_version 5392 (0.0007) +[2023-10-11 15:08:26,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 11108352. Throughput: 0: 1669.1, 1: 1671.3. Samples: 2787602. 
Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-10-11 15:08:26,063][84230] Avg episode reward: [(0, '6.620'), (1, '6.480')] +[2023-10-11 15:08:26,066][85176] Updated weights for policy 0, policy_version 5402 (0.0009) +[2023-10-11 15:08:28,332][85175] Updated weights for policy 1, policy_version 5480 (0.0009) +[2023-10-11 15:08:28,705][85175] Updated weights for policy 1, policy_version 5490 (0.0008) +[2023-10-11 15:08:29,068][85175] Updated weights for policy 1, policy_version 5500 (0.0010) +[2023-10-11 15:08:30,205][85176] Updated weights for policy 0, policy_version 5412 (0.0008) +[2023-10-11 15:08:30,572][85176] Updated weights for policy 0, policy_version 5422 (0.0010) +[2023-10-11 15:08:30,945][85176] Updated weights for policy 0, policy_version 5432 (0.0009) +[2023-10-11 15:08:31,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13329.3). Total num frames: 11173888. Throughput: 0: 1649.1, 1: 1702.6. Samples: 2807732. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) +[2023-10-11 15:08:31,064][84230] Avg episode reward: [(0, '6.530'), (1, '6.270')] +[2023-10-11 15:08:33,177][85175] Updated weights for policy 1, policy_version 5510 (0.0010) +[2023-10-11 15:08:33,547][85175] Updated weights for policy 1, policy_version 5520 (0.0009) +[2023-10-11 15:08:33,913][85175] Updated weights for policy 1, policy_version 5530 (0.0009) +[2023-10-11 15:08:35,056][85176] Updated weights for policy 0, policy_version 5442 (0.0007) +[2023-10-11 15:08:35,427][85176] Updated weights for policy 0, policy_version 5452 (0.0007) +[2023-10-11 15:08:35,806][85176] Updated weights for policy 0, policy_version 5462 (0.0007) +[2023-10-11 15:08:36,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 11239424. Throughput: 0: 1660.6, 1: 1689.5. Samples: 2817898. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-10-11 15:08:36,063][84230] Avg episode reward: [(0, '6.640'), (1, '6.560')] +[2023-10-11 15:08:36,176][85176] Updated weights for policy 0, policy_version 5472 (0.0009) +[2023-10-11 15:08:37,915][85175] Updated weights for policy 1, policy_version 5540 (0.0009) +[2023-10-11 15:08:38,286][85175] Updated weights for policy 1, policy_version 5550 (0.0008) +[2023-10-11 15:08:38,662][85175] Updated weights for policy 1, policy_version 5560 (0.0010) +[2023-10-11 15:08:40,265][85176] Updated weights for policy 0, policy_version 5482 (0.0009) +[2023-10-11 15:08:40,645][85176] Updated weights for policy 0, policy_version 5492 (0.0010) +[2023-10-11 15:08:41,027][85176] Updated weights for policy 0, policy_version 5502 (0.0011) +[2023-10-11 15:08:41,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 11304960. Throughput: 0: 1658.0, 1: 1679.6. Samples: 2837768. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-10-11 15:08:41,063][84230] Avg episode reward: [(0, '6.850'), (1, '6.580')] +[2023-10-11 15:08:42,837][85175] Updated weights for policy 1, policy_version 5570 (0.0009) +[2023-10-11 15:08:43,216][85175] Updated weights for policy 1, policy_version 5580 (0.0010) +[2023-10-11 15:08:43,581][85175] Updated weights for policy 1, policy_version 5590 (0.0008) +[2023-10-11 15:08:43,957][85175] Updated weights for policy 1, policy_version 5600 (0.0010) +[2023-10-11 15:08:45,224][85176] Updated weights for policy 0, policy_version 5512 (0.0008) +[2023-10-11 15:08:45,598][85176] Updated weights for policy 0, policy_version 5522 (0.0010) +[2023-10-11 15:08:45,973][85176] Updated weights for policy 0, policy_version 5532 (0.0007) +[2023-10-11 15:08:46,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 11370496. Throughput: 0: 1651.4, 1: 1698.5. Samples: 2857924. 
Policy #0 lag: (min: 17.0, avg: 28.2, max: 49.0) +[2023-10-11 15:08:46,063][84230] Avg episode reward: [(0, '6.960'), (1, '6.620')] +[2023-10-11 15:08:47,895][85175] Updated weights for policy 1, policy_version 5610 (0.0010) +[2023-10-11 15:08:48,270][85175] Updated weights for policy 1, policy_version 5620 (0.0009) +[2023-10-11 15:08:48,629][85175] Updated weights for policy 1, policy_version 5630 (0.0010) +[2023-10-11 15:08:50,118][85176] Updated weights for policy 0, policy_version 5542 (0.0008) +[2023-10-11 15:08:50,494][85176] Updated weights for policy 0, policy_version 5552 (0.0009) +[2023-10-11 15:08:50,867][85176] Updated weights for policy 0, policy_version 5562 (0.0009) +[2023-10-11 15:08:51,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13329.4). Total num frames: 11436032. Throughput: 0: 1661.0, 1: 1671.3. Samples: 2867816. Policy #0 lag: (min: 17.0, avg: 28.2, max: 49.0) +[2023-10-11 15:08:51,064][84230] Avg episode reward: [(0, '7.160'), (1, '6.410')] +[2023-10-11 15:08:52,601][85175] Updated weights for policy 1, policy_version 5640 (0.0009) +[2023-10-11 15:08:52,970][85175] Updated weights for policy 1, policy_version 5650 (0.0008) +[2023-10-11 15:08:53,382][85175] Updated weights for policy 1, policy_version 5660 (0.0008) +[2023-10-11 15:08:55,256][85176] Updated weights for policy 0, policy_version 5572 (0.0009) +[2023-10-11 15:08:55,634][85176] Updated weights for policy 0, policy_version 5582 (0.0008) +[2023-10-11 15:08:56,004][85176] Updated weights for policy 0, policy_version 5592 (0.0007) +[2023-10-11 15:08:56,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 11501568. Throughput: 0: 1660.8, 1: 1686.9. Samples: 2887956. 
Policy #0 lag: (min: 17.0, avg: 22.3, max: 49.0) +[2023-10-11 15:08:56,064][84230] Avg episode reward: [(0, '7.150'), (1, '6.110')] +[2023-10-11 15:08:57,580][85175] Updated weights for policy 1, policy_version 5670 (0.0009) +[2023-10-11 15:08:57,938][85175] Updated weights for policy 1, policy_version 5680 (0.0010) +[2023-10-11 15:08:58,320][85175] Updated weights for policy 1, policy_version 5690 (0.0008) +[2023-10-11 15:09:00,059][85176] Updated weights for policy 0, policy_version 5602 (0.0007) +[2023-10-11 15:09:00,423][85176] Updated weights for policy 0, policy_version 5612 (0.0009) +[2023-10-11 15:09:00,802][85176] Updated weights for policy 0, policy_version 5622 (0.0010) +[2023-10-11 15:09:01,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 11567104. Throughput: 0: 1650.3, 1: 1686.4. Samples: 2907896. Policy #0 lag: (min: 17.0, avg: 22.3, max: 49.0) +[2023-10-11 15:09:01,064][84230] Avg episode reward: [(0, '6.840'), (1, '6.430')] +[2023-10-11 15:09:01,178][85176] Updated weights for policy 0, policy_version 5632 (0.0010) +[2023-10-11 15:09:02,352][85175] Updated weights for policy 1, policy_version 5700 (0.0008) +[2023-10-11 15:09:02,723][85175] Updated weights for policy 1, policy_version 5710 (0.0009) +[2023-10-11 15:09:03,091][85175] Updated weights for policy 1, policy_version 5720 (0.0007) +[2023-10-11 15:09:05,360][85176] Updated weights for policy 0, policy_version 5642 (0.0012) +[2023-10-11 15:09:05,724][85176] Updated weights for policy 0, policy_version 5652 (0.0009) +[2023-10-11 15:09:06,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 11632640. Throughput: 0: 1661.3, 1: 1663.2. Samples: 2917558. 
Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) +[2023-10-11 15:09:06,063][84230] Avg episode reward: [(0, '6.790'), (1, '6.640')] +[2023-10-11 15:09:06,104][85176] Updated weights for policy 0, policy_version 5662 (0.0009) +[2023-10-11 15:09:07,094][85175] Updated weights for policy 1, policy_version 5730 (0.0009) +[2023-10-11 15:09:07,469][85175] Updated weights for policy 1, policy_version 5740 (0.0010) +[2023-10-11 15:09:07,833][85175] Updated weights for policy 1, policy_version 5750 (0.0007) +[2023-10-11 15:09:08,201][85175] Updated weights for policy 1, policy_version 5760 (0.0010) +[2023-10-11 15:09:10,115][85176] Updated weights for policy 0, policy_version 5672 (0.0010) +[2023-10-11 15:09:10,489][85176] Updated weights for policy 0, policy_version 5682 (0.0010) +[2023-10-11 15:09:10,875][85176] Updated weights for policy 0, policy_version 5692 (0.0010) +[2023-10-11 15:09:11,063][84230] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 11730944. Throughput: 0: 1657.3, 1: 1688.9. Samples: 2938180. Policy #0 lag: (min: 31.0, avg: 31.2, max: 42.0) +[2023-10-11 15:09:11,064][84230] Avg episode reward: [(0, '6.550'), (1, '6.860')] +[2023-10-11 15:09:11,065][85000] Saving new best policy, reward=6.860! +[2023-10-11 15:09:12,196][85175] Updated weights for policy 1, policy_version 5770 (0.0008) +[2023-10-11 15:09:12,572][85175] Updated weights for policy 1, policy_version 5780 (0.0010) +[2023-10-11 15:09:12,947][85175] Updated weights for policy 1, policy_version 5790 (0.0008) +[2023-10-11 15:09:14,911][85176] Updated weights for policy 0, policy_version 5702 (0.0009) +[2023-10-11 15:09:15,282][85176] Updated weights for policy 0, policy_version 5712 (0.0009) +[2023-10-11 15:09:15,662][85176] Updated weights for policy 0, policy_version 5722 (0.0009) +[2023-10-11 15:09:16,063][84230] Fps is (10 sec: 16383.7, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 11796480. Throughput: 0: 1648.5, 1: 1684.1. Samples: 2957700. 
Policy #0 lag: (min: 31.0, avg: 31.2, max: 42.0) +[2023-10-11 15:09:16,063][84230] Avg episode reward: [(0, '6.600'), (1, '6.510')] +[2023-10-11 15:09:17,066][85175] Updated weights for policy 1, policy_version 5800 (0.0007) +[2023-10-11 15:09:17,422][85175] Updated weights for policy 1, policy_version 5810 (0.0007) +[2023-10-11 15:09:17,801][85175] Updated weights for policy 1, policy_version 5820 (0.0009) +[2023-10-11 15:09:19,902][85176] Updated weights for policy 0, policy_version 5732 (0.0008) +[2023-10-11 15:09:20,276][85176] Updated weights for policy 0, policy_version 5742 (0.0007) +[2023-10-11 15:09:20,656][85176] Updated weights for policy 0, policy_version 5752 (0.0009) +[2023-10-11 15:09:21,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 11862016. Throughput: 0: 1658.2, 1: 1669.3. Samples: 2967638. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:09:21,064][84230] Avg episode reward: [(0, '6.630'), (1, '6.360')] +[2023-10-11 15:09:21,826][85175] Updated weights for policy 1, policy_version 5830 (0.0009) +[2023-10-11 15:09:22,201][85175] Updated weights for policy 1, policy_version 5840 (0.0008) +[2023-10-11 15:09:22,575][85175] Updated weights for policy 1, policy_version 5850 (0.0008) +[2023-10-11 15:09:24,547][85176] Updated weights for policy 0, policy_version 5762 (0.0008) +[2023-10-11 15:09:24,920][85176] Updated weights for policy 0, policy_version 5772 (0.0010) +[2023-10-11 15:09:25,298][85176] Updated weights for policy 0, policy_version 5782 (0.0009) +[2023-10-11 15:09:25,668][85176] Updated weights for policy 0, policy_version 5792 (0.0007) +[2023-10-11 15:09:26,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 11927552. Throughput: 0: 1655.3, 1: 1685.7. Samples: 2988114. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:09:26,063][84230] Avg episode reward: [(0, '6.640'), (1, '6.470')] +[2023-10-11 15:09:26,643][85175] Updated weights for policy 1, policy_version 5860 (0.0009) +[2023-10-11 15:09:27,014][85175] Updated weights for policy 1, policy_version 5870 (0.0009) +[2023-10-11 15:09:27,381][85175] Updated weights for policy 1, policy_version 5880 (0.0009) +[2023-10-11 15:09:29,918][85176] Updated weights for policy 0, policy_version 5802 (0.0010) +[2023-10-11 15:09:30,289][85176] Updated weights for policy 0, policy_version 5812 (0.0010) +[2023-10-11 15:09:30,657][85176] Updated weights for policy 0, policy_version 5822 (0.0009) +[2023-10-11 15:09:31,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 11993088. Throughput: 0: 1649.5, 1: 1689.3. Samples: 3008170. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:09:31,063][84230] Avg episode reward: [(0, '6.750'), (1, '6.590')] +[2023-10-11 15:09:31,451][85175] Updated weights for policy 1, policy_version 5890 (0.0009) +[2023-10-11 15:09:31,819][85175] Updated weights for policy 1, policy_version 5900 (0.0009) +[2023-10-11 15:09:32,183][85175] Updated weights for policy 1, policy_version 5910 (0.0009) +[2023-10-11 15:09:32,556][85175] Updated weights for policy 1, policy_version 5920 (0.0007) +[2023-10-11 15:09:34,704][85176] Updated weights for policy 0, policy_version 5832 (0.0008) +[2023-10-11 15:09:35,081][85176] Updated weights for policy 0, policy_version 5842 (0.0007) +[2023-10-11 15:09:35,458][85176] Updated weights for policy 0, policy_version 5852 (0.0009) +[2023-10-11 15:09:36,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 12058624. Throughput: 0: 1663.2, 1: 1683.8. Samples: 3018430. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:09:36,063][84230] Avg episode reward: [(0, '6.750'), (1, '6.440')] +[2023-10-11 15:09:36,596][85175] Updated weights for policy 1, policy_version 5930 (0.0009) +[2023-10-11 15:09:36,961][85175] Updated weights for policy 1, policy_version 5940 (0.0010) +[2023-10-11 15:09:37,331][85175] Updated weights for policy 1, policy_version 5950 (0.0008) +[2023-10-11 15:09:39,505][85176] Updated weights for policy 0, policy_version 5862 (0.0010) +[2023-10-11 15:09:39,903][85176] Updated weights for policy 0, policy_version 5872 (0.0008) +[2023-10-11 15:09:40,274][85176] Updated weights for policy 0, policy_version 5882 (0.0008) +[2023-10-11 15:09:41,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 12124160. Throughput: 0: 1662.3, 1: 1691.7. Samples: 3038884. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:09:41,063][84230] Avg episode reward: [(0, '6.700'), (1, '6.720')] +[2023-10-11 15:09:41,133][85175] Updated weights for policy 1, policy_version 5960 (0.0010) +[2023-10-11 15:09:41,499][85175] Updated weights for policy 1, policy_version 5970 (0.0009) +[2023-10-11 15:09:41,869][85175] Updated weights for policy 1, policy_version 5980 (0.0007) +[2023-10-11 15:09:44,253][85176] Updated weights for policy 0, policy_version 5892 (0.0011) +[2023-10-11 15:09:44,631][85176] Updated weights for policy 0, policy_version 5902 (0.0010) +[2023-10-11 15:09:44,997][85176] Updated weights for policy 0, policy_version 5912 (0.0010) +[2023-10-11 15:09:45,897][85175] Updated weights for policy 1, policy_version 5990 (0.0009) +[2023-10-11 15:09:46,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 12189696. Throughput: 0: 1656.1, 1: 1693.3. Samples: 3058622. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:09:46,063][84230] Avg episode reward: [(0, '6.520'), (1, '6.590')] +[2023-10-11 15:09:46,263][85175] Updated weights for policy 1, policy_version 6000 (0.0009) +[2023-10-11 15:09:46,634][85175] Updated weights for policy 1, policy_version 6010 (0.0007) +[2023-10-11 15:09:49,265][85176] Updated weights for policy 0, policy_version 5922 (0.0007) +[2023-10-11 15:09:49,644][85176] Updated weights for policy 0, policy_version 5932 (0.0007) +[2023-10-11 15:09:50,008][85176] Updated weights for policy 0, policy_version 5942 (0.0008) +[2023-10-11 15:09:50,386][85176] Updated weights for policy 0, policy_version 5952 (0.0010) +[2023-10-11 15:09:50,764][85175] Updated weights for policy 1, policy_version 6020 (0.0008) +[2023-10-11 15:09:51,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 12255232. Throughput: 0: 1669.6, 1: 1692.1. Samples: 3068832. Policy #0 lag: (min: 17.0, avg: 23.0, max: 49.0) +[2023-10-11 15:09:51,063][84230] Avg episode reward: [(0, '6.530'), (1, '6.550')] +[2023-10-11 15:09:51,132][85175] Updated weights for policy 1, policy_version 6030 (0.0007) +[2023-10-11 15:09:51,506][85175] Updated weights for policy 1, policy_version 6040 (0.0007) +[2023-10-11 15:09:54,257][85176] Updated weights for policy 0, policy_version 5962 (0.0008) +[2023-10-11 15:09:54,635][85176] Updated weights for policy 0, policy_version 5972 (0.0007) +[2023-10-11 15:09:55,018][85176] Updated weights for policy 0, policy_version 5982 (0.0007) +[2023-10-11 15:09:55,563][85175] Updated weights for policy 1, policy_version 6050 (0.0009) +[2023-10-11 15:09:55,942][85175] Updated weights for policy 1, policy_version 6060 (0.0008) +[2023-10-11 15:09:56,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 12320768. Throughput: 0: 1662.2, 1: 1693.8. Samples: 3089200. 
Policy #0 lag: (min: 17.0, avg: 23.0, max: 49.0) +[2023-10-11 15:09:56,063][84230] Avg episode reward: [(0, '6.380'), (1, '6.640')] +[2023-10-11 15:09:56,308][85175] Updated weights for policy 1, policy_version 6070 (0.0007) +[2023-10-11 15:09:56,685][85175] Updated weights for policy 1, policy_version 6080 (0.0008) +[2023-10-11 15:09:59,064][85176] Updated weights for policy 0, policy_version 5992 (0.0008) +[2023-10-11 15:09:59,429][85176] Updated weights for policy 0, policy_version 6002 (0.0008) +[2023-10-11 15:09:59,809][85176] Updated weights for policy 0, policy_version 6012 (0.0010) +[2023-10-11 15:10:00,557][85175] Updated weights for policy 1, policy_version 6090 (0.0008) +[2023-10-11 15:10:00,925][85175] Updated weights for policy 1, policy_version 6100 (0.0007) +[2023-10-11 15:10:01,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 12386304. Throughput: 0: 1671.9, 1: 1692.0. Samples: 3109076. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) +[2023-10-11 15:10:01,063][84230] Avg episode reward: [(0, '6.730'), (1, '6.380')] +[2023-10-11 15:10:01,297][85175] Updated weights for policy 1, policy_version 6110 (0.0008) +[2023-10-11 15:10:03,914][85176] Updated weights for policy 0, policy_version 6022 (0.0009) +[2023-10-11 15:10:04,277][85176] Updated weights for policy 0, policy_version 6032 (0.0009) +[2023-10-11 15:10:04,653][85176] Updated weights for policy 0, policy_version 6042 (0.0007) +[2023-10-11 15:10:05,480][85175] Updated weights for policy 1, policy_version 6120 (0.0007) +[2023-10-11 15:10:05,856][85175] Updated weights for policy 1, policy_version 6130 (0.0008) +[2023-10-11 15:10:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 12451840. Throughput: 0: 1682.7, 1: 1701.7. Samples: 3119936. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) +[2023-10-11 15:10:06,063][84230] Avg episode reward: [(0, '6.860'), (1, '6.400')] +[2023-10-11 15:10:06,229][85175] Updated weights for policy 1, policy_version 6140 (0.0009) +[2023-10-11 15:10:08,660][85176] Updated weights for policy 0, policy_version 6052 (0.0007) +[2023-10-11 15:10:09,034][85176] Updated weights for policy 0, policy_version 6062 (0.0008) +[2023-10-11 15:10:09,412][85176] Updated weights for policy 0, policy_version 6072 (0.0008) +[2023-10-11 15:10:10,284][85175] Updated weights for policy 1, policy_version 6150 (0.0009) +[2023-10-11 15:10:10,651][85175] Updated weights for policy 1, policy_version 6160 (0.0007) +[2023-10-11 15:10:11,019][85175] Updated weights for policy 1, policy_version 6170 (0.0010) +[2023-10-11 15:10:11,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 12517376. Throughput: 0: 1663.5, 1: 1703.4. Samples: 3139622. Policy #0 lag: (min: 31.0, avg: 36.4, max: 63.0) +[2023-10-11 15:10:11,063][84230] Avg episode reward: [(0, '6.720'), (1, '6.560')] +[2023-10-11 15:10:13,416][85176] Updated weights for policy 0, policy_version 6082 (0.0009) +[2023-10-11 15:10:13,793][85176] Updated weights for policy 0, policy_version 6092 (0.0008) +[2023-10-11 15:10:14,166][85176] Updated weights for policy 0, policy_version 6102 (0.0010) +[2023-10-11 15:10:14,543][85176] Updated weights for policy 0, policy_version 6112 (0.0009) +[2023-10-11 15:10:14,872][85175] Updated weights for policy 1, policy_version 6180 (0.0008) +[2023-10-11 15:10:15,242][85175] Updated weights for policy 1, policy_version 6190 (0.0008) +[2023-10-11 15:10:15,614][85175] Updated weights for policy 1, policy_version 6200 (0.0008) +[2023-10-11 15:10:16,062][84230] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 12615680. Throughput: 0: 1679.7, 1: 1678.3. Samples: 3159280. 
Policy #0 lag: (min: 31.0, avg: 36.4, max: 63.0) +[2023-10-11 15:10:16,063][84230] Avg episode reward: [(0, '6.570'), (1, '6.760')] +[2023-10-11 15:10:16,071][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000006112_6258688.pth... +[2023-10-11 15:10:16,071][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000006208_6356992.pth... +[2023-10-11 15:10:16,110][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000004608_4718592.pth +[2023-10-11 15:10:16,111][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000004544_4653056.pth +[2023-10-11 15:10:18,724][85176] Updated weights for policy 0, policy_version 6122 (0.0010) +[2023-10-11 15:10:19,104][85176] Updated weights for policy 0, policy_version 6132 (0.0008) +[2023-10-11 15:10:19,476][85176] Updated weights for policy 0, policy_version 6142 (0.0008) +[2023-10-11 15:10:19,694][85175] Updated weights for policy 1, policy_version 6210 (0.0008) +[2023-10-11 15:10:20,069][85175] Updated weights for policy 1, policy_version 6220 (0.0010) +[2023-10-11 15:10:20,444][85175] Updated weights for policy 1, policy_version 6230 (0.0011) +[2023-10-11 15:10:20,810][85175] Updated weights for policy 1, policy_version 6240 (0.0009) +[2023-10-11 15:10:21,063][84230] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 12681216. Throughput: 0: 1674.5, 1: 1695.3. Samples: 3170072. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-10-11 15:10:21,064][84230] Avg episode reward: [(0, '6.590'), (1, '6.640')] +[2023-10-11 15:10:23,596][85176] Updated weights for policy 0, policy_version 6152 (0.0009) +[2023-10-11 15:10:23,975][85176] Updated weights for policy 0, policy_version 6162 (0.0009) +[2023-10-11 15:10:24,346][85176] Updated weights for policy 0, policy_version 6172 (0.0008) +[2023-10-11 15:10:25,115][85175] Updated weights for policy 1, policy_version 6250 (0.0009) +[2023-10-11 15:10:25,497][85175] Updated weights for policy 1, policy_version 6260 (0.0007) +[2023-10-11 15:10:25,861][85175] Updated weights for policy 1, policy_version 6270 (0.0011) +[2023-10-11 15:10:26,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 12746752. Throughput: 0: 1658.7, 1: 1695.0. Samples: 3189800. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) +[2023-10-11 15:10:26,064][84230] Avg episode reward: [(0, '6.630'), (1, '6.360')] +[2023-10-11 15:10:28,519][85176] Updated weights for policy 0, policy_version 6182 (0.0009) +[2023-10-11 15:10:28,902][85176] Updated weights for policy 0, policy_version 6192 (0.0009) +[2023-10-11 15:10:29,281][85176] Updated weights for policy 0, policy_version 6202 (0.0010) +[2023-10-11 15:10:29,861][85175] Updated weights for policy 1, policy_version 6280 (0.0010) +[2023-10-11 15:10:30,228][85175] Updated weights for policy 1, policy_version 6290 (0.0008) +[2023-10-11 15:10:30,603][85175] Updated weights for policy 1, policy_version 6300 (0.0007) +[2023-10-11 15:10:31,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 12812288. Throughput: 0: 1675.8, 1: 1675.2. Samples: 3209416. 
Policy #0 lag: (min: 30.0, avg: 32.8, max: 62.0) +[2023-10-11 15:10:31,063][84230] Avg episode reward: [(0, '6.740'), (1, '6.390')] +[2023-10-11 15:10:33,406][85176] Updated weights for policy 0, policy_version 6212 (0.0008) +[2023-10-11 15:10:33,784][85176] Updated weights for policy 0, policy_version 6222 (0.0009) +[2023-10-11 15:10:34,155][85176] Updated weights for policy 0, policy_version 6232 (0.0008) +[2023-10-11 15:10:34,538][85175] Updated weights for policy 1, policy_version 6310 (0.0007) +[2023-10-11 15:10:34,897][85175] Updated weights for policy 1, policy_version 6320 (0.0008) +[2023-10-11 15:10:35,275][85175] Updated weights for policy 1, policy_version 6330 (0.0010) +[2023-10-11 15:10:36,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 12877824. Throughput: 0: 1665.8, 1: 1704.3. Samples: 3220484. Policy #0 lag: (min: 30.0, avg: 32.8, max: 62.0) +[2023-10-11 15:10:36,064][84230] Avg episode reward: [(0, '6.960'), (1, '6.460')] +[2023-10-11 15:10:38,234][85176] Updated weights for policy 0, policy_version 6242 (0.0009) +[2023-10-11 15:10:38,596][85176] Updated weights for policy 0, policy_version 6252 (0.0011) +[2023-10-11 15:10:38,973][85176] Updated weights for policy 0, policy_version 6262 (0.0011) +[2023-10-11 15:10:39,346][85176] Updated weights for policy 0, policy_version 6272 (0.0008) +[2023-10-11 15:10:39,387][85175] Updated weights for policy 1, policy_version 6340 (0.0011) +[2023-10-11 15:10:39,749][85175] Updated weights for policy 1, policy_version 6350 (0.0007) +[2023-10-11 15:10:40,123][85175] Updated weights for policy 1, policy_version 6360 (0.0009) +[2023-10-11 15:10:41,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 12943360. Throughput: 0: 1650.0, 1: 1696.0. Samples: 3239770. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:10:41,063][84230] Avg episode reward: [(0, '7.190'), (1, '6.690')] +[2023-10-11 15:10:43,448][85176] Updated weights for policy 0, policy_version 6282 (0.0008) +[2023-10-11 15:10:43,829][85176] Updated weights for policy 0, policy_version 6292 (0.0008) +[2023-10-11 15:10:44,032][85175] Updated weights for policy 1, policy_version 6370 (0.0010) +[2023-10-11 15:10:44,197][85176] Updated weights for policy 0, policy_version 6302 (0.0009) +[2023-10-11 15:10:44,398][85175] Updated weights for policy 1, policy_version 6380 (0.0008) +[2023-10-11 15:10:44,763][85175] Updated weights for policy 1, policy_version 6390 (0.0008) +[2023-10-11 15:10:45,128][85175] Updated weights for policy 1, policy_version 6400 (0.0007) +[2023-10-11 15:10:46,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 13008896. Throughput: 0: 1669.3, 1: 1674.0. Samples: 3259528. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:10:46,064][84230] Avg episode reward: [(0, '7.300'), (1, '6.520')] +[2023-10-11 15:10:46,077][84801] Saving new best policy, reward=7.300! +[2023-10-11 15:10:48,214][85176] Updated weights for policy 0, policy_version 6312 (0.0008) +[2023-10-11 15:10:48,587][85176] Updated weights for policy 0, policy_version 6322 (0.0008) +[2023-10-11 15:10:48,968][85176] Updated weights for policy 0, policy_version 6332 (0.0008) +[2023-10-11 15:10:49,288][85175] Updated weights for policy 1, policy_version 6410 (0.0009) +[2023-10-11 15:10:49,656][85175] Updated weights for policy 1, policy_version 6420 (0.0007) +[2023-10-11 15:10:50,031][85175] Updated weights for policy 1, policy_version 6430 (0.0010) +[2023-10-11 15:10:51,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 13074432. Throughput: 0: 1646.4, 1: 1692.5. Samples: 3270188. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:10:51,064][84230] Avg episode reward: [(0, '6.970'), (1, '6.580')] +[2023-10-11 15:10:53,089][85176] Updated weights for policy 0, policy_version 6342 (0.0008) +[2023-10-11 15:10:53,468][85176] Updated weights for policy 0, policy_version 6352 (0.0009) +[2023-10-11 15:10:53,840][85176] Updated weights for policy 0, policy_version 6362 (0.0008) +[2023-10-11 15:10:54,104][85175] Updated weights for policy 1, policy_version 6440 (0.0007) +[2023-10-11 15:10:54,501][85175] Updated weights for policy 1, policy_version 6450 (0.0010) +[2023-10-11 15:10:54,867][85175] Updated weights for policy 1, policy_version 6460 (0.0010) +[2023-10-11 15:10:56,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 13139968. Throughput: 0: 1654.8, 1: 1675.2. Samples: 3289468. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:10:56,063][84230] Avg episode reward: [(0, '6.860'), (1, '6.530')] +[2023-10-11 15:10:57,889][85176] Updated weights for policy 0, policy_version 6372 (0.0008) +[2023-10-11 15:10:58,264][85176] Updated weights for policy 0, policy_version 6382 (0.0007) +[2023-10-11 15:10:58,635][85176] Updated weights for policy 0, policy_version 6392 (0.0007) +[2023-10-11 15:10:58,794][85175] Updated weights for policy 1, policy_version 6470 (0.0008) +[2023-10-11 15:10:59,159][85175] Updated weights for policy 1, policy_version 6480 (0.0008) +[2023-10-11 15:10:59,518][85175] Updated weights for policy 1, policy_version 6490 (0.0010) +[2023-10-11 15:11:01,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 13205504. Throughput: 0: 1662.9, 1: 1684.8. Samples: 3309928. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:11:01,064][84230] Avg episode reward: [(0, '6.640'), (1, '6.400')] +[2023-10-11 15:11:02,820][85176] Updated weights for policy 0, policy_version 6402 (0.0007) +[2023-10-11 15:11:03,189][85176] Updated weights for policy 0, policy_version 6412 (0.0007) +[2023-10-11 15:11:03,462][85175] Updated weights for policy 1, policy_version 6500 (0.0008) +[2023-10-11 15:11:03,564][85176] Updated weights for policy 0, policy_version 6422 (0.0008) +[2023-10-11 15:11:03,831][85175] Updated weights for policy 1, policy_version 6510 (0.0008) +[2023-10-11 15:11:03,934][85176] Updated weights for policy 0, policy_version 6432 (0.0008) +[2023-10-11 15:11:04,190][85175] Updated weights for policy 1, policy_version 6520 (0.0010) +[2023-10-11 15:11:06,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 13271040. Throughput: 0: 1650.8, 1: 1696.9. Samples: 3320718. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:11:06,063][84230] Avg episode reward: [(0, '6.530'), (1, '6.500')] +[2023-10-11 15:11:07,932][85176] Updated weights for policy 0, policy_version 6442 (0.0009) +[2023-10-11 15:11:08,213][85175] Updated weights for policy 1, policy_version 6530 (0.0010) +[2023-10-11 15:11:08,297][85176] Updated weights for policy 0, policy_version 6452 (0.0009) +[2023-10-11 15:11:08,580][85175] Updated weights for policy 1, policy_version 6540 (0.0008) +[2023-10-11 15:11:08,674][85176] Updated weights for policy 0, policy_version 6462 (0.0008) +[2023-10-11 15:11:08,946][85175] Updated weights for policy 1, policy_version 6550 (0.0008) +[2023-10-11 15:11:09,318][85175] Updated weights for policy 1, policy_version 6560 (0.0010) +[2023-10-11 15:11:11,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 13336576. Throughput: 0: 1665.5, 1: 1670.6. Samples: 3339924. 
Policy #0 lag: (min: 31.0, avg: 37.5, max: 63.0) +[2023-10-11 15:11:11,064][84230] Avg episode reward: [(0, '6.500'), (1, '6.600')] +[2023-10-11 15:11:12,848][85176] Updated weights for policy 0, policy_version 6472 (0.0008) +[2023-10-11 15:11:13,135][85175] Updated weights for policy 1, policy_version 6570 (0.0007) +[2023-10-11 15:11:13,219][85176] Updated weights for policy 0, policy_version 6482 (0.0007) +[2023-10-11 15:11:13,497][85175] Updated weights for policy 1, policy_version 6580 (0.0007) +[2023-10-11 15:11:13,593][85176] Updated weights for policy 0, policy_version 6492 (0.0007) +[2023-10-11 15:11:13,868][85175] Updated weights for policy 1, policy_version 6590 (0.0009) +[2023-10-11 15:11:16,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 13402112. Throughput: 0: 1663.1, 1: 1698.8. Samples: 3360702. Policy #0 lag: (min: 31.0, avg: 37.5, max: 63.0) +[2023-10-11 15:11:16,064][84230] Avg episode reward: [(0, '6.720'), (1, '6.440')] +[2023-10-11 15:11:17,774][85175] Updated weights for policy 1, policy_version 6600 (0.0008) +[2023-10-11 15:11:17,908][85176] Updated weights for policy 0, policy_version 6502 (0.0009) +[2023-10-11 15:11:18,145][85175] Updated weights for policy 1, policy_version 6610 (0.0007) +[2023-10-11 15:11:18,283][85176] Updated weights for policy 0, policy_version 6512 (0.0007) +[2023-10-11 15:11:18,518][85175] Updated weights for policy 1, policy_version 6620 (0.0008) +[2023-10-11 15:11:18,662][85176] Updated weights for policy 0, policy_version 6522 (0.0008) +[2023-10-11 15:11:21,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 13467648. Throughput: 0: 1651.8, 1: 1676.6. Samples: 3370262. 
Policy #0 lag: (min: 0.0, avg: 18.5, max: 32.0) +[2023-10-11 15:11:21,063][84230] Avg episode reward: [(0, '6.860'), (1, '6.550')] +[2023-10-11 15:11:22,583][85175] Updated weights for policy 1, policy_version 6630 (0.0007) +[2023-10-11 15:11:22,784][85176] Updated weights for policy 0, policy_version 6532 (0.0007) +[2023-10-11 15:11:22,956][85175] Updated weights for policy 1, policy_version 6640 (0.0008) +[2023-10-11 15:11:23,156][85176] Updated weights for policy 0, policy_version 6542 (0.0007) +[2023-10-11 15:11:23,331][85175] Updated weights for policy 1, policy_version 6650 (0.0008) +[2023-10-11 15:11:23,530][85176] Updated weights for policy 0, policy_version 6552 (0.0007) +[2023-10-11 15:11:26,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 13533184. Throughput: 0: 1666.0, 1: 1674.5. Samples: 3390090. Policy #0 lag: (min: 0.0, avg: 18.5, max: 32.0) +[2023-10-11 15:11:26,063][84230] Avg episode reward: [(0, '6.730'), (1, '6.630')] +[2023-10-11 15:11:27,337][85175] Updated weights for policy 1, policy_version 6660 (0.0009) +[2023-10-11 15:11:27,706][85175] Updated weights for policy 1, policy_version 6670 (0.0009) +[2023-10-11 15:11:27,711][85176] Updated weights for policy 0, policy_version 6562 (0.0008) +[2023-10-11 15:11:28,072][85175] Updated weights for policy 1, policy_version 6680 (0.0008) +[2023-10-11 15:11:28,088][85176] Updated weights for policy 0, policy_version 6572 (0.0007) +[2023-10-11 15:11:28,448][85176] Updated weights for policy 0, policy_version 6582 (0.0008) +[2023-10-11 15:11:28,819][85176] Updated weights for policy 0, policy_version 6592 (0.0009) +[2023-10-11 15:11:31,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 13598720. Throughput: 0: 1660.8, 1: 1703.2. Samples: 3410910. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:11:31,064][84230] Avg episode reward: [(0, '6.530'), (1, '6.780')] +[2023-10-11 15:11:32,186][85175] Updated weights for policy 1, policy_version 6690 (0.0007) +[2023-10-11 15:11:32,561][85175] Updated weights for policy 1, policy_version 6700 (0.0009) +[2023-10-11 15:11:32,933][85175] Updated weights for policy 1, policy_version 6710 (0.0007) +[2023-10-11 15:11:32,936][85176] Updated weights for policy 0, policy_version 6602 (0.0009) +[2023-10-11 15:11:33,295][85175] Updated weights for policy 1, policy_version 6720 (0.0007) +[2023-10-11 15:11:33,321][85176] Updated weights for policy 0, policy_version 6612 (0.0008) +[2023-10-11 15:11:33,690][85176] Updated weights for policy 0, policy_version 6622 (0.0009) +[2023-10-11 15:11:36,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 13664256. Throughput: 0: 1658.4, 1: 1679.1. Samples: 3420374. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:11:36,064][84230] Avg episode reward: [(0, '6.640'), (1, '6.660')] +[2023-10-11 15:11:37,254][85175] Updated weights for policy 1, policy_version 6730 (0.0007) +[2023-10-11 15:11:37,623][85175] Updated weights for policy 1, policy_version 6740 (0.0008) +[2023-10-11 15:11:37,781][85176] Updated weights for policy 0, policy_version 6632 (0.0008) +[2023-10-11 15:11:37,996][85175] Updated weights for policy 1, policy_version 6750 (0.0008) +[2023-10-11 15:11:38,163][85176] Updated weights for policy 0, policy_version 6642 (0.0007) +[2023-10-11 15:11:38,535][85176] Updated weights for policy 0, policy_version 6652 (0.0010) +[2023-10-11 15:11:41,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 13729792. Throughput: 0: 1660.7, 1: 1699.9. Samples: 3440694. 
Policy #0 lag: (min: 31.0, avg: 46.7, max: 63.0) +[2023-10-11 15:11:41,064][84230] Avg episode reward: [(0, '6.640'), (1, '6.280')] +[2023-10-11 15:11:42,355][85175] Updated weights for policy 1, policy_version 6760 (0.0008) +[2023-10-11 15:11:42,626][85176] Updated weights for policy 0, policy_version 6662 (0.0010) +[2023-10-11 15:11:42,729][85175] Updated weights for policy 1, policy_version 6770 (0.0008) +[2023-10-11 15:11:43,006][85176] Updated weights for policy 0, policy_version 6672 (0.0008) +[2023-10-11 15:11:43,098][85175] Updated weights for policy 1, policy_version 6780 (0.0008) +[2023-10-11 15:11:43,379][85176] Updated weights for policy 0, policy_version 6682 (0.0009) +[2023-10-11 15:11:46,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 13795328. Throughput: 0: 1656.6, 1: 1700.0. Samples: 3460976. Policy #0 lag: (min: 31.0, avg: 46.7, max: 63.0) +[2023-10-11 15:11:46,064][84230] Avg episode reward: [(0, '6.580'), (1, '6.150')] +[2023-10-11 15:11:47,300][85175] Updated weights for policy 1, policy_version 6790 (0.0008) +[2023-10-11 15:11:47,475][85176] Updated weights for policy 0, policy_version 6692 (0.0008) +[2023-10-11 15:11:47,675][85175] Updated weights for policy 1, policy_version 6800 (0.0009) +[2023-10-11 15:11:47,845][85176] Updated weights for policy 0, policy_version 6702 (0.0007) +[2023-10-11 15:11:48,041][85175] Updated weights for policy 1, policy_version 6810 (0.0008) +[2023-10-11 15:11:48,208][85176] Updated weights for policy 0, policy_version 6712 (0.0008) +[2023-10-11 15:11:51,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 13860864. Throughput: 0: 1649.7, 1: 1667.9. Samples: 3470008. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:11:51,063][84230] Avg episode reward: [(0, '6.620'), (1, '6.340')] +[2023-10-11 15:11:52,016][85175] Updated weights for policy 1, policy_version 6820 (0.0009) +[2023-10-11 15:11:52,210][85176] Updated weights for policy 0, policy_version 6722 (0.0008) +[2023-10-11 15:11:52,386][85175] Updated weights for policy 1, policy_version 6830 (0.0009) +[2023-10-11 15:11:52,575][85176] Updated weights for policy 0, policy_version 6732 (0.0007) +[2023-10-11 15:11:52,749][85175] Updated weights for policy 1, policy_version 6840 (0.0008) +[2023-10-11 15:11:52,946][85176] Updated weights for policy 0, policy_version 6742 (0.0007) +[2023-10-11 15:11:53,319][85176] Updated weights for policy 0, policy_version 6752 (0.0007) +[2023-10-11 15:11:56,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.5). Total num frames: 13926400. Throughput: 0: 1661.4, 1: 1695.3. Samples: 3490976. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:11:56,063][84230] Avg episode reward: [(0, '6.510'), (1, '6.230')] +[2023-10-11 15:11:56,787][85175] Updated weights for policy 1, policy_version 6850 (0.0007) +[2023-10-11 15:11:57,149][85175] Updated weights for policy 1, policy_version 6860 (0.0007) +[2023-10-11 15:11:57,342][85176] Updated weights for policy 0, policy_version 6762 (0.0008) +[2023-10-11 15:11:57,517][85175] Updated weights for policy 1, policy_version 6870 (0.0008) +[2023-10-11 15:11:57,708][85176] Updated weights for policy 0, policy_version 6772 (0.0009) +[2023-10-11 15:11:57,877][85175] Updated weights for policy 1, policy_version 6880 (0.0007) +[2023-10-11 15:11:58,077][85176] Updated weights for policy 0, policy_version 6782 (0.0009) +[2023-10-11 15:12:01,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 13991936. Throughput: 0: 1668.6, 1: 1688.9. Samples: 3511788. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:12:01,063][84230] Avg episode reward: [(0, '6.490'), (1, '6.600')] +[2023-10-11 15:12:01,957][85175] Updated weights for policy 1, policy_version 6890 (0.0009) +[2023-10-11 15:12:02,201][85176] Updated weights for policy 0, policy_version 6792 (0.0007) +[2023-10-11 15:12:02,320][85175] Updated weights for policy 1, policy_version 6900 (0.0007) +[2023-10-11 15:12:02,581][85176] Updated weights for policy 0, policy_version 6802 (0.0009) +[2023-10-11 15:12:02,697][85175] Updated weights for policy 1, policy_version 6910 (0.0008) +[2023-10-11 15:12:02,964][85176] Updated weights for policy 0, policy_version 6812 (0.0008) +[2023-10-11 15:12:06,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 14057472. Throughput: 0: 1662.8, 1: 1683.7. Samples: 3520854. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:12:06,063][84230] Avg episode reward: [(0, '6.300'), (1, '6.710')] +[2023-10-11 15:12:06,795][85175] Updated weights for policy 1, policy_version 6920 (0.0008) +[2023-10-11 15:12:06,795][85176] Updated weights for policy 0, policy_version 6822 (0.0007) +[2023-10-11 15:12:07,174][85175] Updated weights for policy 1, policy_version 6930 (0.0008) +[2023-10-11 15:12:07,181][85176] Updated weights for policy 0, policy_version 6832 (0.0007) +[2023-10-11 15:12:07,543][85175] Updated weights for policy 1, policy_version 6940 (0.0008) +[2023-10-11 15:12:07,550][85176] Updated weights for policy 0, policy_version 6842 (0.0008) +[2023-10-11 15:12:11,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 14123008. Throughput: 0: 1677.4, 1: 1688.2. Samples: 3541542. 
Policy #0 lag: (min: 14.0, avg: 17.3, max: 46.0) +[2023-10-11 15:12:11,063][84230] Avg episode reward: [(0, '6.350'), (1, '6.540')] +[2023-10-11 15:12:11,486][85175] Updated weights for policy 1, policy_version 6950 (0.0009) +[2023-10-11 15:12:11,559][85176] Updated weights for policy 0, policy_version 6852 (0.0008) +[2023-10-11 15:12:11,854][85175] Updated weights for policy 1, policy_version 6960 (0.0008) +[2023-10-11 15:12:11,930][85176] Updated weights for policy 0, policy_version 6862 (0.0007) +[2023-10-11 15:12:12,219][85175] Updated weights for policy 1, policy_version 6970 (0.0007) +[2023-10-11 15:12:12,299][85176] Updated weights for policy 0, policy_version 6872 (0.0008) +[2023-10-11 15:12:16,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 14188544. Throughput: 0: 1674.5, 1: 1688.7. Samples: 3562252. Policy #0 lag: (min: 14.0, avg: 17.3, max: 46.0) +[2023-10-11 15:12:16,063][84230] Avg episode reward: [(0, '6.750'), (1, '6.630')] +[2023-10-11 15:12:16,073][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000006880_7045120.pth... +[2023-10-11 15:12:16,102][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000005344_5472256.pth +[2023-10-11 15:12:16,304][85175] Updated weights for policy 1, policy_version 6980 (0.0008) +[2023-10-11 15:12:16,617][85176] Updated weights for policy 0, policy_version 6882 (0.0007) +[2023-10-11 15:12:16,670][85175] Updated weights for policy 1, policy_version 6990 (0.0007) +[2023-10-11 15:12:16,980][85176] Updated weights for policy 0, policy_version 6892 (0.0009) +[2023-10-11 15:12:17,039][85175] Updated weights for policy 1, policy_version 7000 (0.0010) +[2023-10-11 15:12:17,330][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000007008_7176192.pth... 
+[2023-10-11 15:12:17,351][85176] Updated weights for policy 0, policy_version 6902 (0.0009) +[2023-10-11 15:12:17,360][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000005408_5537792.pth +[2023-10-11 15:12:17,722][85176] Updated weights for policy 0, policy_version 6912 (0.0009) +[2023-10-11 15:12:21,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 14254080. Throughput: 0: 1664.5, 1: 1686.4. Samples: 3571166. Policy #0 lag: (min: 24.0, avg: 46.6, max: 56.0) +[2023-10-11 15:12:21,063][84230] Avg episode reward: [(0, '6.860'), (1, '6.530')] +[2023-10-11 15:12:21,202][85175] Updated weights for policy 1, policy_version 7010 (0.0008) +[2023-10-11 15:12:21,566][85175] Updated weights for policy 1, policy_version 7020 (0.0010) +[2023-10-11 15:12:21,874][85176] Updated weights for policy 0, policy_version 6922 (0.0009) +[2023-10-11 15:12:21,940][85175] Updated weights for policy 1, policy_version 7030 (0.0009) +[2023-10-11 15:12:22,254][85176] Updated weights for policy 0, policy_version 6932 (0.0007) +[2023-10-11 15:12:22,302][85175] Updated weights for policy 1, policy_version 7040 (0.0008) +[2023-10-11 15:12:22,619][85176] Updated weights for policy 0, policy_version 6942 (0.0010) +[2023-10-11 15:12:26,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 14319616. Throughput: 0: 1677.8, 1: 1678.0. Samples: 3591704. 
Policy #0 lag: (min: 24.0, avg: 46.6, max: 56.0) +[2023-10-11 15:12:26,064][84230] Avg episode reward: [(0, '6.580'), (1, '6.490')] +[2023-10-11 15:12:26,398][85175] Updated weights for policy 1, policy_version 7050 (0.0007) +[2023-10-11 15:12:26,749][85176] Updated weights for policy 0, policy_version 6952 (0.0009) +[2023-10-11 15:12:26,763][85175] Updated weights for policy 1, policy_version 7060 (0.0008) +[2023-10-11 15:12:27,124][85176] Updated weights for policy 0, policy_version 6962 (0.0008) +[2023-10-11 15:12:27,146][85175] Updated weights for policy 1, policy_version 7070 (0.0007) +[2023-10-11 15:12:27,495][85176] Updated weights for policy 0, policy_version 6972 (0.0009) +[2023-10-11 15:12:31,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 14385152. Throughput: 0: 1679.0, 1: 1686.4. Samples: 3612418. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:12:31,063][84230] Avg episode reward: [(0, '6.680'), (1, '6.680')] +[2023-10-11 15:12:31,408][85175] Updated weights for policy 1, policy_version 7080 (0.0010) +[2023-10-11 15:12:31,594][85176] Updated weights for policy 0, policy_version 6982 (0.0008) +[2023-10-11 15:12:31,782][85175] Updated weights for policy 1, policy_version 7090 (0.0007) +[2023-10-11 15:12:31,969][85176] Updated weights for policy 0, policy_version 6992 (0.0009) +[2023-10-11 15:12:32,143][85175] Updated weights for policy 1, policy_version 7100 (0.0007) +[2023-10-11 15:12:32,332][85176] Updated weights for policy 0, policy_version 7002 (0.0009) +[2023-10-11 15:12:35,978][85175] Updated weights for policy 1, policy_version 7110 (0.0007) +[2023-10-11 15:12:36,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 14450688. Throughput: 0: 1673.3, 1: 1688.5. Samples: 3621290. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:12:36,063][84230] Avg episode reward: [(0, '7.000'), (1, '6.280')] +[2023-10-11 15:12:36,346][85175] Updated weights for policy 1, policy_version 7120 (0.0011) +[2023-10-11 15:12:36,500][85176] Updated weights for policy 0, policy_version 7012 (0.0010) +[2023-10-11 15:12:36,718][85175] Updated weights for policy 1, policy_version 7130 (0.0007) +[2023-10-11 15:12:36,871][85176] Updated weights for policy 0, policy_version 7022 (0.0008) +[2023-10-11 15:12:37,241][85176] Updated weights for policy 0, policy_version 7032 (0.0010) +[2023-10-11 15:12:40,771][85175] Updated weights for policy 1, policy_version 7140 (0.0008) +[2023-10-11 15:12:41,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 14516224. Throughput: 0: 1663.1, 1: 1687.8. Samples: 3641764. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:12:41,063][84230] Avg episode reward: [(0, '6.860'), (1, '6.350')] +[2023-10-11 15:12:41,145][85175] Updated weights for policy 1, policy_version 7150 (0.0011) +[2023-10-11 15:12:41,504][85175] Updated weights for policy 1, policy_version 7160 (0.0010) +[2023-10-11 15:12:41,553][85176] Updated weights for policy 0, policy_version 7042 (0.0009) +[2023-10-11 15:12:41,924][85176] Updated weights for policy 0, policy_version 7052 (0.0007) +[2023-10-11 15:12:42,307][85176] Updated weights for policy 0, policy_version 7062 (0.0008) +[2023-10-11 15:12:42,673][85176] Updated weights for policy 0, policy_version 7072 (0.0008) +[2023-10-11 15:12:45,514][85175] Updated weights for policy 1, policy_version 7170 (0.0007) +[2023-10-11 15:12:45,879][85175] Updated weights for policy 1, policy_version 7180 (0.0008) +[2023-10-11 15:12:46,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 14581760. Throughput: 0: 1660.7, 1: 1683.5. Samples: 3662278. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:12:46,063][84230] Avg episode reward: [(0, '6.750'), (1, '6.790')] +[2023-10-11 15:12:46,246][85175] Updated weights for policy 1, policy_version 7190 (0.0007) +[2023-10-11 15:12:46,611][85175] Updated weights for policy 1, policy_version 7200 (0.0008) +[2023-10-11 15:12:46,762][85176] Updated weights for policy 0, policy_version 7082 (0.0011) +[2023-10-11 15:12:47,136][85176] Updated weights for policy 0, policy_version 7092 (0.0007) +[2023-10-11 15:12:47,510][85176] Updated weights for policy 0, policy_version 7102 (0.0008) +[2023-10-11 15:12:50,575][85175] Updated weights for policy 1, policy_version 7210 (0.0008) +[2023-10-11 15:12:50,936][85175] Updated weights for policy 1, policy_version 7220 (0.0008) +[2023-10-11 15:12:51,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 14647296. Throughput: 0: 1661.3, 1: 1689.1. Samples: 3671620. Policy #0 lag: (min: 31.0, avg: 33.6, max: 63.0) +[2023-10-11 15:12:51,063][84230] Avg episode reward: [(0, '6.860'), (1, '6.810')] +[2023-10-11 15:12:51,310][85175] Updated weights for policy 1, policy_version 7230 (0.0008) +[2023-10-11 15:12:51,683][85176] Updated weights for policy 0, policy_version 7112 (0.0008) +[2023-10-11 15:12:52,072][85176] Updated weights for policy 0, policy_version 7122 (0.0008) +[2023-10-11 15:12:52,439][85176] Updated weights for policy 0, policy_version 7132 (0.0009) +[2023-10-11 15:12:55,352][85175] Updated weights for policy 1, policy_version 7240 (0.0009) +[2023-10-11 15:12:55,722][85175] Updated weights for policy 1, policy_version 7250 (0.0010) +[2023-10-11 15:12:56,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 14712832. Throughput: 0: 1652.1, 1: 1696.3. Samples: 3692218. 
Policy #0 lag: (min: 31.0, avg: 33.6, max: 63.0) +[2023-10-11 15:12:56,063][84230] Avg episode reward: [(0, '6.790'), (1, '6.460')] +[2023-10-11 15:12:56,098][85175] Updated weights for policy 1, policy_version 7260 (0.0007) +[2023-10-11 15:12:56,463][85176] Updated weights for policy 0, policy_version 7142 (0.0011) +[2023-10-11 15:12:56,826][85176] Updated weights for policy 0, policy_version 7152 (0.0011) +[2023-10-11 15:12:57,210][85176] Updated weights for policy 0, policy_version 7162 (0.0011) +[2023-10-11 15:13:00,157][85175] Updated weights for policy 1, policy_version 7270 (0.0007) +[2023-10-11 15:13:00,537][85175] Updated weights for policy 1, policy_version 7280 (0.0009) +[2023-10-11 15:13:00,912][85175] Updated weights for policy 1, policy_version 7290 (0.0009) +[2023-10-11 15:13:01,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 14778368. Throughput: 0: 1654.2, 1: 1677.3. Samples: 3712170. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:13:01,064][84230] Avg episode reward: [(0, '6.830'), (1, '6.420')] +[2023-10-11 15:13:01,446][85176] Updated weights for policy 0, policy_version 7172 (0.0009) +[2023-10-11 15:13:01,814][85176] Updated weights for policy 0, policy_version 7182 (0.0010) +[2023-10-11 15:13:02,195][85176] Updated weights for policy 0, policy_version 7192 (0.0009) +[2023-10-11 15:13:05,030][85175] Updated weights for policy 1, policy_version 7300 (0.0009) +[2023-10-11 15:13:05,391][85175] Updated weights for policy 1, policy_version 7310 (0.0008) +[2023-10-11 15:13:05,765][85175] Updated weights for policy 1, policy_version 7320 (0.0010) +[2023-10-11 15:13:06,063][84230] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 14876672. Throughput: 0: 1656.9, 1: 1690.9. Samples: 3721820. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:13:06,063][84230] Avg episode reward: [(0, '6.850'), (1, '6.140')] +[2023-10-11 15:13:06,104][85176] Updated weights for policy 0, policy_version 7202 (0.0011) +[2023-10-11 15:13:06,478][85176] Updated weights for policy 0, policy_version 7212 (0.0009) +[2023-10-11 15:13:06,849][85176] Updated weights for policy 0, policy_version 7222 (0.0010) +[2023-10-11 15:13:07,234][85176] Updated weights for policy 0, policy_version 7232 (0.0009) +[2023-10-11 15:13:09,732][85175] Updated weights for policy 1, policy_version 7330 (0.0009) +[2023-10-11 15:13:10,113][85175] Updated weights for policy 1, policy_version 7340 (0.0008) +[2023-10-11 15:13:10,469][85175] Updated weights for policy 1, policy_version 7350 (0.0007) +[2023-10-11 15:13:10,836][85175] Updated weights for policy 1, policy_version 7360 (0.0010) +[2023-10-11 15:13:11,062][84230] Fps is (10 sec: 16384.4, 60 sec: 13653.4, 300 sec: 13440.5). Total num frames: 14942208. Throughput: 0: 1652.2, 1: 1693.6. Samples: 3742262. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-11 15:13:11,063][84230] Avg episode reward: [(0, '7.080'), (1, '6.130')] +[2023-10-11 15:13:11,495][85176] Updated weights for policy 0, policy_version 7242 (0.0010) +[2023-10-11 15:13:11,861][85176] Updated weights for policy 0, policy_version 7252 (0.0011) +[2023-10-11 15:13:12,233][85176] Updated weights for policy 0, policy_version 7262 (0.0011) +[2023-10-11 15:13:14,897][85175] Updated weights for policy 1, policy_version 7370 (0.0007) +[2023-10-11 15:13:15,263][85175] Updated weights for policy 1, policy_version 7380 (0.0007) +[2023-10-11 15:13:15,645][85175] Updated weights for policy 1, policy_version 7390 (0.0008) +[2023-10-11 15:13:16,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 15007744. Throughput: 0: 1656.8, 1: 1670.7. Samples: 3762158. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-11 15:13:16,064][84230] Avg episode reward: [(0, '7.190'), (1, '6.680')] +[2023-10-11 15:13:16,317][85176] Updated weights for policy 0, policy_version 7272 (0.0009) +[2023-10-11 15:13:16,698][85176] Updated weights for policy 0, policy_version 7282 (0.0009) +[2023-10-11 15:13:17,055][85176] Updated weights for policy 0, policy_version 7292 (0.0011) +[2023-10-11 15:13:19,819][85175] Updated weights for policy 1, policy_version 7400 (0.0007) +[2023-10-11 15:13:20,197][85175] Updated weights for policy 1, policy_version 7410 (0.0007) +[2023-10-11 15:13:20,569][85175] Updated weights for policy 1, policy_version 7420 (0.0009) +[2023-10-11 15:13:21,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 15073280. Throughput: 0: 1656.4, 1: 1700.4. Samples: 3772350. Policy #0 lag: (min: 1.0, avg: 10.7, max: 33.0) +[2023-10-11 15:13:21,063][84230] Avg episode reward: [(0, '7.000'), (1, '6.860')] +[2023-10-11 15:13:21,250][85176] Updated weights for policy 0, policy_version 7302 (0.0008) +[2023-10-11 15:13:21,630][85176] Updated weights for policy 0, policy_version 7312 (0.0008) +[2023-10-11 15:13:21,991][85176] Updated weights for policy 0, policy_version 7322 (0.0007) +[2023-10-11 15:13:24,808][85175] Updated weights for policy 1, policy_version 7430 (0.0008) +[2023-10-11 15:13:25,176][85175] Updated weights for policy 1, policy_version 7440 (0.0009) +[2023-10-11 15:13:25,539][85175] Updated weights for policy 1, policy_version 7450 (0.0007) +[2023-10-11 15:13:25,969][85176] Updated weights for policy 0, policy_version 7332 (0.0008) +[2023-10-11 15:13:26,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 15138816. Throughput: 0: 1663.1, 1: 1695.9. Samples: 3792918. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 33.0) +[2023-10-11 15:13:26,063][84230] Avg episode reward: [(0, '7.020'), (1, '6.870')] +[2023-10-11 15:13:26,064][85000] Saving new best policy, reward=6.870! +[2023-10-11 15:13:26,349][85176] Updated weights for policy 0, policy_version 7342 (0.0010) +[2023-10-11 15:13:26,715][85176] Updated weights for policy 0, policy_version 7352 (0.0008) +[2023-10-11 15:13:29,553][85175] Updated weights for policy 1, policy_version 7460 (0.0007) +[2023-10-11 15:13:29,917][85175] Updated weights for policy 1, policy_version 7470 (0.0009) +[2023-10-11 15:13:30,288][85175] Updated weights for policy 1, policy_version 7480 (0.0007) +[2023-10-11 15:13:30,851][85176] Updated weights for policy 0, policy_version 7362 (0.0008) +[2023-10-11 15:13:31,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 15204352. Throughput: 0: 1668.6, 1: 1670.5. Samples: 3812538. Policy #0 lag: (min: 13.0, avg: 14.9, max: 42.0) +[2023-10-11 15:13:31,063][84230] Avg episode reward: [(0, '7.040'), (1, '6.670')] +[2023-10-11 15:13:31,220][85176] Updated weights for policy 0, policy_version 7372 (0.0008) +[2023-10-11 15:13:31,587][85176] Updated weights for policy 0, policy_version 7382 (0.0007) +[2023-10-11 15:13:31,964][85176] Updated weights for policy 0, policy_version 7392 (0.0008) +[2023-10-11 15:13:34,172][85175] Updated weights for policy 1, policy_version 7490 (0.0007) +[2023-10-11 15:13:34,534][85175] Updated weights for policy 1, policy_version 7500 (0.0008) +[2023-10-11 15:13:34,903][85175] Updated weights for policy 1, policy_version 7510 (0.0008) +[2023-10-11 15:13:35,267][85175] Updated weights for policy 1, policy_version 7520 (0.0008) +[2023-10-11 15:13:36,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 15269888. Throughput: 0: 1669.3, 1: 1696.1. Samples: 3823064. 
Policy #0 lag: (min: 13.0, avg: 14.9, max: 42.0) +[2023-10-11 15:13:36,063][84230] Avg episode reward: [(0, '6.710'), (1, '6.510')] +[2023-10-11 15:13:36,066][85176] Updated weights for policy 0, policy_version 7402 (0.0010) +[2023-10-11 15:13:36,444][85176] Updated weights for policy 0, policy_version 7412 (0.0011) +[2023-10-11 15:13:36,821][85176] Updated weights for policy 0, policy_version 7422 (0.0010) +[2023-10-11 15:13:39,456][85175] Updated weights for policy 1, policy_version 7530 (0.0008) +[2023-10-11 15:13:39,829][85175] Updated weights for policy 1, policy_version 7540 (0.0009) +[2023-10-11 15:13:40,198][85175] Updated weights for policy 1, policy_version 7550 (0.0010) +[2023-10-11 15:13:40,961][85176] Updated weights for policy 0, policy_version 7432 (0.0009) +[2023-10-11 15:13:41,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 15335424. Throughput: 0: 1667.8, 1: 1682.6. Samples: 3842986. Policy #0 lag: (min: 31.0, avg: 39.5, max: 63.0) +[2023-10-11 15:13:41,063][84230] Avg episode reward: [(0, '6.600'), (1, '6.510')] +[2023-10-11 15:13:41,340][85176] Updated weights for policy 0, policy_version 7442 (0.0010) +[2023-10-11 15:13:41,710][85176] Updated weights for policy 0, policy_version 7452 (0.0009) +[2023-10-11 15:13:44,206][85175] Updated weights for policy 1, policy_version 7560 (0.0010) +[2023-10-11 15:13:44,579][85175] Updated weights for policy 1, policy_version 7570 (0.0010) +[2023-10-11 15:13:44,949][85175] Updated weights for policy 1, policy_version 7580 (0.0008) +[2023-10-11 15:13:46,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 15400960. Throughput: 0: 1664.4, 1: 1677.2. Samples: 3862540. 
Policy #0 lag: (min: 31.0, avg: 39.5, max: 63.0) +[2023-10-11 15:13:46,063][84230] Avg episode reward: [(0, '6.750'), (1, '6.630')] +[2023-10-11 15:13:46,078][85176] Updated weights for policy 0, policy_version 7462 (0.0009) +[2023-10-11 15:13:46,449][85176] Updated weights for policy 0, policy_version 7472 (0.0009) +[2023-10-11 15:13:46,822][85176] Updated weights for policy 0, policy_version 7482 (0.0009) +[2023-10-11 15:13:49,048][85175] Updated weights for policy 1, policy_version 7590 (0.0008) +[2023-10-11 15:13:49,422][85175] Updated weights for policy 1, policy_version 7600 (0.0008) +[2023-10-11 15:13:49,792][85175] Updated weights for policy 1, policy_version 7610 (0.0010) +[2023-10-11 15:13:50,905][85176] Updated weights for policy 0, policy_version 7492 (0.0007) +[2023-10-11 15:13:51,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 15466496. Throughput: 0: 1663.9, 1: 1693.1. Samples: 3872886. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:13:51,064][84230] Avg episode reward: [(0, '6.310'), (1, '6.780')] +[2023-10-11 15:13:51,267][85176] Updated weights for policy 0, policy_version 7502 (0.0009) +[2023-10-11 15:13:51,638][85176] Updated weights for policy 0, policy_version 7512 (0.0010) +[2023-10-11 15:13:53,688][85175] Updated weights for policy 1, policy_version 7620 (0.0008) +[2023-10-11 15:13:54,057][85175] Updated weights for policy 1, policy_version 7630 (0.0009) +[2023-10-11 15:13:54,416][85175] Updated weights for policy 1, policy_version 7640 (0.0007) +[2023-10-11 15:13:55,711][85176] Updated weights for policy 0, policy_version 7522 (0.0011) +[2023-10-11 15:13:56,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 15532032. Throughput: 0: 1664.0, 1: 1673.4. Samples: 3892446. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:13:56,063][84230] Avg episode reward: [(0, '6.420'), (1, '6.860')] +[2023-10-11 15:13:56,084][85176] Updated weights for policy 0, policy_version 7532 (0.0010) +[2023-10-11 15:13:56,457][85176] Updated weights for policy 0, policy_version 7542 (0.0010) +[2023-10-11 15:13:56,835][85176] Updated weights for policy 0, policy_version 7552 (0.0009) +[2023-10-11 15:13:58,269][85175] Updated weights for policy 1, policy_version 7650 (0.0007) +[2023-10-11 15:13:58,644][85175] Updated weights for policy 1, policy_version 7660 (0.0011) +[2023-10-11 15:13:59,002][85175] Updated weights for policy 1, policy_version 7670 (0.0011) +[2023-10-11 15:13:59,382][85175] Updated weights for policy 1, policy_version 7680 (0.0009) +[2023-10-11 15:14:00,795][85176] Updated weights for policy 0, policy_version 7562 (0.0007) +[2023-10-11 15:14:01,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 15597568. Throughput: 0: 1654.9, 1: 1695.4. Samples: 3912924. Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) +[2023-10-11 15:14:01,064][84230] Avg episode reward: [(0, '6.420'), (1, '6.550')] +[2023-10-11 15:14:01,171][85176] Updated weights for policy 0, policy_version 7572 (0.0009) +[2023-10-11 15:14:01,538][85176] Updated weights for policy 0, policy_version 7582 (0.0009) +[2023-10-11 15:14:03,446][85175] Updated weights for policy 1, policy_version 7690 (0.0009) +[2023-10-11 15:14:03,816][85175] Updated weights for policy 1, policy_version 7700 (0.0009) +[2023-10-11 15:14:04,182][85175] Updated weights for policy 1, policy_version 7710 (0.0010) +[2023-10-11 15:14:05,663][85176] Updated weights for policy 0, policy_version 7592 (0.0008) +[2023-10-11 15:14:06,041][85176] Updated weights for policy 0, policy_version 7602 (0.0008) +[2023-10-11 15:14:06,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 15663104. Throughput: 0: 1663.8, 1: 1684.2. 
Samples: 3923010. Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) +[2023-10-11 15:14:06,064][84230] Avg episode reward: [(0, '6.640'), (1, '6.320')] +[2023-10-11 15:14:06,422][85176] Updated weights for policy 0, policy_version 7612 (0.0007) +[2023-10-11 15:14:08,182][85175] Updated weights for policy 1, policy_version 7720 (0.0009) +[2023-10-11 15:14:08,559][85175] Updated weights for policy 1, policy_version 7730 (0.0007) +[2023-10-11 15:14:08,927][85175] Updated weights for policy 1, policy_version 7740 (0.0009) +[2023-10-11 15:14:10,456][85176] Updated weights for policy 0, policy_version 7622 (0.0010) +[2023-10-11 15:14:10,839][85176] Updated weights for policy 0, policy_version 7632 (0.0009) +[2023-10-11 15:14:11,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.1, 300 sec: 13329.4). Total num frames: 15728640. Throughput: 0: 1668.3, 1: 1664.9. Samples: 3942910. Policy #0 lag: (min: 28.0, avg: 33.9, max: 60.0) +[2023-10-11 15:14:11,064][84230] Avg episode reward: [(0, '6.860'), (1, '6.390')] +[2023-10-11 15:14:11,212][85176] Updated weights for policy 0, policy_version 7642 (0.0008) +[2023-10-11 15:14:13,027][85175] Updated weights for policy 1, policy_version 7750 (0.0010) +[2023-10-11 15:14:13,396][85175] Updated weights for policy 1, policy_version 7760 (0.0011) +[2023-10-11 15:14:13,760][85175] Updated weights for policy 1, policy_version 7770 (0.0008) +[2023-10-11 15:14:15,243][85176] Updated weights for policy 0, policy_version 7652 (0.0008) +[2023-10-11 15:14:15,617][85176] Updated weights for policy 0, policy_version 7662 (0.0007) +[2023-10-11 15:14:15,988][85176] Updated weights for policy 0, policy_version 7672 (0.0011) +[2023-10-11 15:14:16,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 15794176. Throughput: 0: 1654.1, 1: 1692.8. Samples: 3963150. 
Policy #0 lag: (min: 28.0, avg: 33.9, max: 60.0) +[2023-10-11 15:14:16,064][84230] Avg episode reward: [(0, '6.860'), (1, '6.560')] +[2023-10-11 15:14:16,075][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000007776_7962624.pth... +[2023-10-11 15:14:16,106][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000006208_6356992.pth +[2023-10-11 15:14:16,110][85000] Saving a milestone ./train_atari/atari_frostbite_APPO/checkpoint_p1/milestones/checkpoint_000007776_7962624.pth +[2023-10-11 15:14:16,285][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000007680_7864320.pth... +[2023-10-11 15:14:16,313][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000006112_6258688.pth +[2023-10-11 15:14:16,317][84801] Saving a milestone ./train_atari/atari_frostbite_APPO/checkpoint_p0/milestones/checkpoint_000007680_7864320.pth +[2023-10-11 15:14:17,843][85175] Updated weights for policy 1, policy_version 7780 (0.0009) +[2023-10-11 15:14:18,210][85175] Updated weights for policy 1, policy_version 7790 (0.0008) +[2023-10-11 15:14:18,587][85175] Updated weights for policy 1, policy_version 7800 (0.0009) +[2023-10-11 15:14:19,990][85176] Updated weights for policy 0, policy_version 7682 (0.0009) +[2023-10-11 15:14:20,359][85176] Updated weights for policy 0, policy_version 7692 (0.0009) +[2023-10-11 15:14:20,734][85176] Updated weights for policy 0, policy_version 7702 (0.0008) +[2023-10-11 15:14:21,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 15859712. Throughput: 0: 1665.4, 1: 1668.6. Samples: 3973092. 
Policy #0 lag: (min: 25.0, avg: 38.3, max: 57.0) +[2023-10-11 15:14:21,063][84230] Avg episode reward: [(0, '6.860'), (1, '6.360')] +[2023-10-11 15:14:21,113][85176] Updated weights for policy 0, policy_version 7712 (0.0008) +[2023-10-11 15:14:22,713][85175] Updated weights for policy 1, policy_version 7810 (0.0008) +[2023-10-11 15:14:23,095][85175] Updated weights for policy 1, policy_version 7820 (0.0007) +[2023-10-11 15:14:23,462][85175] Updated weights for policy 1, policy_version 7830 (0.0007) +[2023-10-11 15:14:23,833][85175] Updated weights for policy 1, policy_version 7840 (0.0007) +[2023-10-11 15:14:25,224][85176] Updated weights for policy 0, policy_version 7722 (0.0010) +[2023-10-11 15:14:25,598][85176] Updated weights for policy 0, policy_version 7732 (0.0010) +[2023-10-11 15:14:25,978][85176] Updated weights for policy 0, policy_version 7742 (0.0007) +[2023-10-11 15:14:26,062][84230] Fps is (10 sec: 16384.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 15958016. Throughput: 0: 1673.1, 1: 1664.7. Samples: 3993186. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:14:26,063][84230] Avg episode reward: [(0, '7.080'), (1, '6.500')] +[2023-10-11 15:14:28,108][85175] Updated weights for policy 1, policy_version 7850 (0.0007) +[2023-10-11 15:14:28,485][85175] Updated weights for policy 1, policy_version 7860 (0.0008) +[2023-10-11 15:14:28,852][85175] Updated weights for policy 1, policy_version 7870 (0.0008) +[2023-10-11 15:14:30,028][85176] Updated weights for policy 0, policy_version 7752 (0.0007) +[2023-10-11 15:14:30,401][85176] Updated weights for policy 0, policy_version 7762 (0.0008) +[2023-10-11 15:14:30,783][85176] Updated weights for policy 0, policy_version 7772 (0.0007) +[2023-10-11 15:14:31,062][84230] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 16023552. Throughput: 0: 1659.4, 1: 1684.2. Samples: 4013004. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:14:31,063][84230] Avg episode reward: [(0, '6.650'), (1, '6.630')] +[2023-10-11 15:14:32,836][85175] Updated weights for policy 1, policy_version 7880 (0.0008) +[2023-10-11 15:14:33,214][85175] Updated weights for policy 1, policy_version 7890 (0.0008) +[2023-10-11 15:14:33,592][85175] Updated weights for policy 1, policy_version 7900 (0.0008) +[2023-10-11 15:14:34,787][85176] Updated weights for policy 0, policy_version 7782 (0.0008) +[2023-10-11 15:14:35,165][85176] Updated weights for policy 0, policy_version 7792 (0.0008) +[2023-10-11 15:14:35,533][85176] Updated weights for policy 0, policy_version 7802 (0.0007) +[2023-10-11 15:14:36,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 16089088. Throughput: 0: 1681.1, 1: 1660.9. Samples: 4023278. Policy #0 lag: (min: 31.0, avg: 31.1, max: 38.0) +[2023-10-11 15:14:36,063][84230] Avg episode reward: [(0, '6.610'), (1, '6.560')] +[2023-10-11 15:14:37,600][85175] Updated weights for policy 1, policy_version 7910 (0.0008) +[2023-10-11 15:14:37,974][85175] Updated weights for policy 1, policy_version 7920 (0.0007) +[2023-10-11 15:14:38,350][85175] Updated weights for policy 1, policy_version 7930 (0.0008) +[2023-10-11 15:14:39,488][85176] Updated weights for policy 0, policy_version 7812 (0.0007) +[2023-10-11 15:14:39,866][85176] Updated weights for policy 0, policy_version 7822 (0.0008) +[2023-10-11 15:14:40,244][85176] Updated weights for policy 0, policy_version 7832 (0.0007) +[2023-10-11 15:14:41,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 16154624. Throughput: 0: 1682.3, 1: 1678.3. Samples: 4043672. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 38.0) +[2023-10-11 15:14:41,064][84230] Avg episode reward: [(0, '6.420'), (1, '6.400')] +[2023-10-11 15:14:42,399][85175] Updated weights for policy 1, policy_version 7940 (0.0009) +[2023-10-11 15:14:42,767][85175] Updated weights for policy 1, policy_version 7950 (0.0008) +[2023-10-11 15:14:43,139][85175] Updated weights for policy 1, policy_version 7960 (0.0009) +[2023-10-11 15:14:44,432][85176] Updated weights for policy 0, policy_version 7842 (0.0008) +[2023-10-11 15:14:44,806][85176] Updated weights for policy 0, policy_version 7852 (0.0008) +[2023-10-11 15:14:45,181][85176] Updated weights for policy 0, policy_version 7862 (0.0008) +[2023-10-11 15:14:45,560][85176] Updated weights for policy 0, policy_version 7872 (0.0007) +[2023-10-11 15:14:46,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 16220160. Throughput: 0: 1664.7, 1: 1682.6. Samples: 4063552. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:14:46,063][84230] Avg episode reward: [(0, '6.310'), (1, '6.450')] +[2023-10-11 15:14:47,154][85175] Updated weights for policy 1, policy_version 7970 (0.0009) +[2023-10-11 15:14:47,515][85175] Updated weights for policy 1, policy_version 7980 (0.0010) +[2023-10-11 15:14:47,882][85175] Updated weights for policy 1, policy_version 7990 (0.0010) +[2023-10-11 15:14:48,244][85175] Updated weights for policy 1, policy_version 8000 (0.0009) +[2023-10-11 15:14:49,640][85176] Updated weights for policy 0, policy_version 7882 (0.0007) +[2023-10-11 15:14:50,021][85176] Updated weights for policy 0, policy_version 7892 (0.0007) +[2023-10-11 15:14:50,379][85176] Updated weights for policy 0, policy_version 7902 (0.0008) +[2023-10-11 15:14:51,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 16285696. Throughput: 0: 1686.2, 1: 1666.4. Samples: 4073876. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:14:51,063][84230] Avg episode reward: [(0, '6.530'), (1, '6.490')] +[2023-10-11 15:14:52,195][85175] Updated weights for policy 1, policy_version 8010 (0.0009) +[2023-10-11 15:14:52,566][85175] Updated weights for policy 1, policy_version 8020 (0.0008) +[2023-10-11 15:14:52,929][85175] Updated weights for policy 1, policy_version 8030 (0.0009) +[2023-10-11 15:14:54,378][85176] Updated weights for policy 0, policy_version 7912 (0.0007) +[2023-10-11 15:14:54,745][85176] Updated weights for policy 0, policy_version 7922 (0.0007) +[2023-10-11 15:14:55,115][85176] Updated weights for policy 0, policy_version 7932 (0.0010) +[2023-10-11 15:14:56,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 16351232. Throughput: 0: 1672.2, 1: 1690.0. Samples: 4094208. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:14:56,064][84230] Avg episode reward: [(0, '6.640'), (1, '6.700')] +[2023-10-11 15:14:57,097][85175] Updated weights for policy 1, policy_version 8040 (0.0009) +[2023-10-11 15:14:57,465][85175] Updated weights for policy 1, policy_version 8050 (0.0008) +[2023-10-11 15:14:57,835][85175] Updated weights for policy 1, policy_version 8060 (0.0008) +[2023-10-11 15:14:59,210][85176] Updated weights for policy 0, policy_version 7942 (0.0008) +[2023-10-11 15:14:59,595][85176] Updated weights for policy 0, policy_version 7952 (0.0009) +[2023-10-11 15:14:59,964][85176] Updated weights for policy 0, policy_version 7962 (0.0009) +[2023-10-11 15:15:01,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 16416768. Throughput: 0: 1668.5, 1: 1688.3. Samples: 4114206. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:15:01,064][84230] Avg episode reward: [(0, '6.970'), (1, '6.800')] +[2023-10-11 15:15:01,898][85175] Updated weights for policy 1, policy_version 8070 (0.0009) +[2023-10-11 15:15:02,269][85175] Updated weights for policy 1, policy_version 8080 (0.0008) +[2023-10-11 15:15:02,640][85175] Updated weights for policy 1, policy_version 8090 (0.0009) +[2023-10-11 15:15:04,027][85176] Updated weights for policy 0, policy_version 7972 (0.0008) +[2023-10-11 15:15:04,400][85176] Updated weights for policy 0, policy_version 7982 (0.0010) +[2023-10-11 15:15:04,773][85176] Updated weights for policy 0, policy_version 7992 (0.0009) +[2023-10-11 15:15:06,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 16482304. Throughput: 0: 1683.9, 1: 1680.7. Samples: 4124498. Policy #0 lag: (min: 1.0, avg: 7.2, max: 33.0) +[2023-10-11 15:15:06,063][84230] Avg episode reward: [(0, '6.850'), (1, '6.440')] +[2023-10-11 15:15:06,523][85175] Updated weights for policy 1, policy_version 8100 (0.0007) +[2023-10-11 15:15:06,891][85175] Updated weights for policy 1, policy_version 8110 (0.0010) +[2023-10-11 15:15:07,253][85175] Updated weights for policy 1, policy_version 8120 (0.0009) +[2023-10-11 15:15:09,048][85176] Updated weights for policy 0, policy_version 8002 (0.0010) +[2023-10-11 15:15:09,423][85176] Updated weights for policy 0, policy_version 8012 (0.0008) +[2023-10-11 15:15:09,802][85176] Updated weights for policy 0, policy_version 8022 (0.0007) +[2023-10-11 15:15:10,178][85176] Updated weights for policy 0, policy_version 8032 (0.0009) +[2023-10-11 15:15:11,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 16547840. Throughput: 0: 1664.8, 1: 1699.1. Samples: 4144558. 
Policy #0 lag: (min: 1.0, avg: 7.2, max: 33.0) +[2023-10-11 15:15:11,063][84230] Avg episode reward: [(0, '7.070'), (1, '6.140')] +[2023-10-11 15:15:11,363][85175] Updated weights for policy 1, policy_version 8130 (0.0010) +[2023-10-11 15:15:11,727][85175] Updated weights for policy 1, policy_version 8140 (0.0007) +[2023-10-11 15:15:12,104][85175] Updated weights for policy 1, policy_version 8150 (0.0008) +[2023-10-11 15:15:12,470][85175] Updated weights for policy 1, policy_version 8160 (0.0008) +[2023-10-11 15:15:14,115][85176] Updated weights for policy 0, policy_version 8042 (0.0011) +[2023-10-11 15:15:14,491][85176] Updated weights for policy 0, policy_version 8052 (0.0010) +[2023-10-11 15:15:14,873][85176] Updated weights for policy 0, policy_version 8062 (0.0007) +[2023-10-11 15:15:16,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 16613376. Throughput: 0: 1666.0, 1: 1709.4. Samples: 4164898. Policy #0 lag: (min: 21.0, avg: 24.1, max: 53.0) +[2023-10-11 15:15:16,063][84230] Avg episode reward: [(0, '6.860'), (1, '6.420')] +[2023-10-11 15:15:16,235][85175] Updated weights for policy 1, policy_version 8170 (0.0009) +[2023-10-11 15:15:16,611][85175] Updated weights for policy 1, policy_version 8180 (0.0009) +[2023-10-11 15:15:16,977][85175] Updated weights for policy 1, policy_version 8190 (0.0010) +[2023-10-11 15:15:19,014][85176] Updated weights for policy 0, policy_version 8072 (0.0007) +[2023-10-11 15:15:19,396][85176] Updated weights for policy 0, policy_version 8082 (0.0007) +[2023-10-11 15:15:19,759][85176] Updated weights for policy 0, policy_version 8092 (0.0008) +[2023-10-11 15:15:21,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 16678912. Throughput: 0: 1675.0, 1: 1701.8. Samples: 4175236. 
Policy #0 lag: (min: 21.0, avg: 24.1, max: 53.0) +[2023-10-11 15:15:21,064][84230] Avg episode reward: [(0, '6.750'), (1, '6.540')] +[2023-10-11 15:15:21,137][85175] Updated weights for policy 1, policy_version 8200 (0.0008) +[2023-10-11 15:15:21,502][85175] Updated weights for policy 1, policy_version 8210 (0.0007) +[2023-10-11 15:15:21,877][85175] Updated weights for policy 1, policy_version 8220 (0.0010) +[2023-10-11 15:15:23,818][85176] Updated weights for policy 0, policy_version 8102 (0.0008) +[2023-10-11 15:15:24,188][85176] Updated weights for policy 0, policy_version 8112 (0.0010) +[2023-10-11 15:15:24,570][85176] Updated weights for policy 0, policy_version 8122 (0.0008) +[2023-10-11 15:15:25,825][85175] Updated weights for policy 1, policy_version 8230 (0.0009) +[2023-10-11 15:15:26,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 16744448. Throughput: 0: 1652.3, 1: 1710.2. Samples: 4194986. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-10-11 15:15:26,063][84230] Avg episode reward: [(0, '6.640'), (1, '6.720')] +[2023-10-11 15:15:26,197][85175] Updated weights for policy 1, policy_version 8240 (0.0007) +[2023-10-11 15:15:26,567][85175] Updated weights for policy 1, policy_version 8250 (0.0007) +[2023-10-11 15:15:28,741][85176] Updated weights for policy 0, policy_version 8132 (0.0010) +[2023-10-11 15:15:29,117][85176] Updated weights for policy 0, policy_version 8142 (0.0008) +[2023-10-11 15:15:29,494][85176] Updated weights for policy 0, policy_version 8152 (0.0008) +[2023-10-11 15:15:30,464][85175] Updated weights for policy 1, policy_version 8260 (0.0009) +[2023-10-11 15:15:30,838][85175] Updated weights for policy 1, policy_version 8270 (0.0007) +[2023-10-11 15:15:31,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 16809984. Throughput: 0: 1664.7, 1: 1705.4. Samples: 4215206. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-10-11 15:15:31,063][84230] Avg episode reward: [(0, '6.960'), (1, '6.440')] +[2023-10-11 15:15:31,205][85175] Updated weights for policy 1, policy_version 8280 (0.0009) +[2023-10-11 15:15:33,506][85176] Updated weights for policy 0, policy_version 8162 (0.0008) +[2023-10-11 15:15:33,881][85176] Updated weights for policy 0, policy_version 8172 (0.0007) +[2023-10-11 15:15:34,253][85176] Updated weights for policy 0, policy_version 8182 (0.0010) +[2023-10-11 15:15:34,631][85176] Updated weights for policy 0, policy_version 8192 (0.0007) +[2023-10-11 15:15:35,230][85175] Updated weights for policy 1, policy_version 8290 (0.0008) +[2023-10-11 15:15:35,591][85175] Updated weights for policy 1, policy_version 8300 (0.0010) +[2023-10-11 15:15:35,953][85175] Updated weights for policy 1, policy_version 8310 (0.0009) +[2023-10-11 15:15:36,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 16875520. Throughput: 0: 1664.7, 1: 1708.5. Samples: 4225670. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-11 15:15:36,063][84230] Avg episode reward: [(0, '7.080'), (1, '6.500')] +[2023-10-11 15:15:36,321][85175] Updated weights for policy 1, policy_version 8320 (0.0007) +[2023-10-11 15:15:38,804][85176] Updated weights for policy 0, policy_version 8202 (0.0009) +[2023-10-11 15:15:39,181][85176] Updated weights for policy 0, policy_version 8212 (0.0007) +[2023-10-11 15:15:39,554][85176] Updated weights for policy 0, policy_version 8222 (0.0007) +[2023-10-11 15:15:40,354][85175] Updated weights for policy 1, policy_version 8330 (0.0007) +[2023-10-11 15:15:40,723][85175] Updated weights for policy 1, policy_version 8340 (0.0008) +[2023-10-11 15:15:41,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 16941056. Throughput: 0: 1653.0, 1: 1707.3. Samples: 4245422. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-11 15:15:41,063][84230] Avg episode reward: [(0, '6.820'), (1, '6.470')] +[2023-10-11 15:15:41,092][85175] Updated weights for policy 1, policy_version 8350 (0.0007) +[2023-10-11 15:15:43,475][85176] Updated weights for policy 0, policy_version 8232 (0.0008) +[2023-10-11 15:15:43,841][85176] Updated weights for policy 0, policy_version 8242 (0.0008) +[2023-10-11 15:15:44,214][85176] Updated weights for policy 0, policy_version 8252 (0.0009) +[2023-10-11 15:15:45,253][85175] Updated weights for policy 1, policy_version 8360 (0.0008) +[2023-10-11 15:15:45,638][85175] Updated weights for policy 1, policy_version 8370 (0.0007) +[2023-10-11 15:15:46,000][85175] Updated weights for policy 1, policy_version 8380 (0.0007) +[2023-10-11 15:15:46,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13329.4). Total num frames: 17006592. Throughput: 0: 1667.6, 1: 1694.3. Samples: 4265494. Policy #0 lag: (min: 30.0, avg: 32.3, max: 62.0) +[2023-10-11 15:15:46,064][84230] Avg episode reward: [(0, '6.850'), (1, '6.670')] +[2023-10-11 15:15:48,433][85176] Updated weights for policy 0, policy_version 8262 (0.0009) +[2023-10-11 15:15:48,806][85176] Updated weights for policy 0, policy_version 8272 (0.0010) +[2023-10-11 15:15:49,185][85176] Updated weights for policy 0, policy_version 8282 (0.0007) +[2023-10-11 15:15:49,962][85175] Updated weights for policy 1, policy_version 8390 (0.0010) +[2023-10-11 15:15:50,324][85175] Updated weights for policy 1, policy_version 8400 (0.0008) +[2023-10-11 15:15:50,694][85175] Updated weights for policy 1, policy_version 8410 (0.0008) +[2023-10-11 15:15:51,062][84230] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 17104896. Throughput: 0: 1660.6, 1: 1708.8. Samples: 4276124. 
Policy #0 lag: (min: 30.0, avg: 32.3, max: 62.0) +[2023-10-11 15:15:51,063][84230] Avg episode reward: [(0, '6.640'), (1, '6.710')] +[2023-10-11 15:15:53,217][85176] Updated weights for policy 0, policy_version 8292 (0.0009) +[2023-10-11 15:15:53,581][85176] Updated weights for policy 0, policy_version 8302 (0.0007) +[2023-10-11 15:15:53,961][85176] Updated weights for policy 0, policy_version 8312 (0.0010) +[2023-10-11 15:15:54,853][85175] Updated weights for policy 1, policy_version 8420 (0.0008) +[2023-10-11 15:15:55,218][85175] Updated weights for policy 1, policy_version 8430 (0.0009) +[2023-10-11 15:15:55,589][85175] Updated weights for policy 1, policy_version 8440 (0.0008) +[2023-10-11 15:15:56,062][84230] Fps is (10 sec: 16384.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 17170432. Throughput: 0: 1661.3, 1: 1705.7. Samples: 4296074. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:15:56,063][84230] Avg episode reward: [(0, '6.410'), (1, '6.640')] +[2023-10-11 15:15:57,891][85176] Updated weights for policy 0, policy_version 8322 (0.0009) +[2023-10-11 15:15:58,262][85176] Updated weights for policy 0, policy_version 8332 (0.0009) +[2023-10-11 15:15:58,649][85176] Updated weights for policy 0, policy_version 8342 (0.0010) +[2023-10-11 15:15:59,017][85176] Updated weights for policy 0, policy_version 8352 (0.0010) +[2023-10-11 15:15:59,635][85175] Updated weights for policy 1, policy_version 8450 (0.0007) +[2023-10-11 15:15:59,998][85175] Updated weights for policy 1, policy_version 8460 (0.0007) +[2023-10-11 15:16:00,376][85175] Updated weights for policy 1, policy_version 8470 (0.0010) +[2023-10-11 15:16:00,749][85175] Updated weights for policy 1, policy_version 8480 (0.0009) +[2023-10-11 15:16:01,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 17235968. Throughput: 0: 1679.6, 1: 1672.5. Samples: 4315744. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:16:01,063][84230] Avg episode reward: [(0, '6.750'), (1, '6.430')] +[2023-10-11 15:16:03,068][85176] Updated weights for policy 0, policy_version 8362 (0.0011) +[2023-10-11 15:16:03,444][85176] Updated weights for policy 0, policy_version 8372 (0.0011) +[2023-10-11 15:16:03,823][85176] Updated weights for policy 0, policy_version 8382 (0.0009) +[2023-10-11 15:16:04,780][85175] Updated weights for policy 1, policy_version 8490 (0.0009) +[2023-10-11 15:16:05,148][85175] Updated weights for policy 1, policy_version 8500 (0.0010) +[2023-10-11 15:16:05,513][85175] Updated weights for policy 1, policy_version 8510 (0.0007) +[2023-10-11 15:16:06,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 17301504. Throughput: 0: 1658.8, 1: 1696.7. Samples: 4326232. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:16:06,064][84230] Avg episode reward: [(0, '7.070'), (1, '6.480')] +[2023-10-11 15:16:08,009][85176] Updated weights for policy 0, policy_version 8392 (0.0009) +[2023-10-11 15:16:08,397][85176] Updated weights for policy 0, policy_version 8402 (0.0008) +[2023-10-11 15:16:08,769][85176] Updated weights for policy 0, policy_version 8412 (0.0009) +[2023-10-11 15:16:09,562][85175] Updated weights for policy 1, policy_version 8520 (0.0007) +[2023-10-11 15:16:09,930][85175] Updated weights for policy 1, policy_version 8530 (0.0008) +[2023-10-11 15:16:10,295][85175] Updated weights for policy 1, policy_version 8540 (0.0009) +[2023-10-11 15:16:11,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 17367040. Throughput: 0: 1671.7, 1: 1689.9. Samples: 4346256. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:16:11,064][84230] Avg episode reward: [(0, '7.180'), (1, '6.580')] +[2023-10-11 15:16:12,843][85176] Updated weights for policy 0, policy_version 8422 (0.0008) +[2023-10-11 15:16:13,218][85176] Updated weights for policy 0, policy_version 8432 (0.0007) +[2023-10-11 15:16:13,586][85176] Updated weights for policy 0, policy_version 8442 (0.0007) +[2023-10-11 15:16:14,552][85175] Updated weights for policy 1, policy_version 8550 (0.0009) +[2023-10-11 15:16:14,926][85175] Updated weights for policy 1, policy_version 8560 (0.0007) +[2023-10-11 15:16:15,288][85175] Updated weights for policy 1, policy_version 8570 (0.0007) +[2023-10-11 15:16:16,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 17432576. Throughput: 0: 1680.8, 1: 1669.0. Samples: 4365946. Policy #0 lag: (min: 21.0, avg: 23.1, max: 52.0) +[2023-10-11 15:16:16,063][84230] Avg episode reward: [(0, '6.860'), (1, '6.500')] +[2023-10-11 15:16:16,071][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000008448_8650752.pth... +[2023-10-11 15:16:16,071][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000008576_8781824.pth... 
+[2023-10-11 15:16:16,101][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000007008_7176192.pth +[2023-10-11 15:16:16,111][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000006880_7045120.pth +[2023-10-11 15:16:17,903][85176] Updated weights for policy 0, policy_version 8452 (0.0010) +[2023-10-11 15:16:18,278][85176] Updated weights for policy 0, policy_version 8462 (0.0010) +[2023-10-11 15:16:18,639][85176] Updated weights for policy 0, policy_version 8472 (0.0011) +[2023-10-11 15:16:19,248][85175] Updated weights for policy 1, policy_version 8580 (0.0009) +[2023-10-11 15:16:19,624][85175] Updated weights for policy 1, policy_version 8590 (0.0010) +[2023-10-11 15:16:20,003][85175] Updated weights for policy 1, policy_version 8600 (0.0010) +[2023-10-11 15:16:21,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 17498112. Throughput: 0: 1659.7, 1: 1695.7. Samples: 4376664. Policy #0 lag: (min: 21.0, avg: 23.1, max: 52.0) +[2023-10-11 15:16:21,064][84230] Avg episode reward: [(0, '6.530'), (1, '6.630')] +[2023-10-11 15:16:22,579][85176] Updated weights for policy 0, policy_version 8482 (0.0010) +[2023-10-11 15:16:22,954][85176] Updated weights for policy 0, policy_version 8492 (0.0008) +[2023-10-11 15:16:23,329][85176] Updated weights for policy 0, policy_version 8502 (0.0009) +[2023-10-11 15:16:23,699][85176] Updated weights for policy 0, policy_version 8512 (0.0008) +[2023-10-11 15:16:24,004][85175] Updated weights for policy 1, policy_version 8610 (0.0008) +[2023-10-11 15:16:24,367][85175] Updated weights for policy 1, policy_version 8620 (0.0007) +[2023-10-11 15:16:24,730][85175] Updated weights for policy 1, policy_version 8630 (0.0007) +[2023-10-11 15:16:25,109][85175] Updated weights for policy 1, policy_version 8640 (0.0008) +[2023-10-11 15:16:26,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 17563648. 
Throughput: 0: 1671.8, 1: 1682.6. Samples: 4396372. Policy #0 lag: (min: 26.0, avg: 26.0, max: 29.0) +[2023-10-11 15:16:26,064][84230] Avg episode reward: [(0, '6.640'), (1, '6.630')] +[2023-10-11 15:16:27,665][85176] Updated weights for policy 0, policy_version 8522 (0.0009) +[2023-10-11 15:16:28,030][85176] Updated weights for policy 0, policy_version 8532 (0.0011) +[2023-10-11 15:16:28,413][85176] Updated weights for policy 0, policy_version 8542 (0.0010) +[2023-10-11 15:16:29,148][85175] Updated weights for policy 1, policy_version 8650 (0.0009) +[2023-10-11 15:16:29,513][85175] Updated weights for policy 1, policy_version 8660 (0.0009) +[2023-10-11 15:16:29,882][85175] Updated weights for policy 1, policy_version 8670 (0.0009) +[2023-10-11 15:16:31,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 17629184. Throughput: 0: 1669.3, 1: 1679.3. Samples: 4416180. Policy #0 lag: (min: 26.0, avg: 26.0, max: 29.0) +[2023-10-11 15:16:31,063][84230] Avg episode reward: [(0, '6.970'), (1, '6.440')] +[2023-10-11 15:16:32,474][85176] Updated weights for policy 0, policy_version 8552 (0.0010) +[2023-10-11 15:16:32,840][85176] Updated weights for policy 0, policy_version 8562 (0.0009) +[2023-10-11 15:16:33,220][85176] Updated weights for policy 0, policy_version 8572 (0.0008) +[2023-10-11 15:16:33,982][85175] Updated weights for policy 1, policy_version 8680 (0.0008) +[2023-10-11 15:16:34,359][85175] Updated weights for policy 1, policy_version 8690 (0.0009) +[2023-10-11 15:16:34,728][85175] Updated weights for policy 1, policy_version 8700 (0.0007) +[2023-10-11 15:16:36,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 17694720. Throughput: 0: 1648.2, 1: 1694.4. Samples: 4426542. 
Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) +[2023-10-11 15:16:36,063][84230] Avg episode reward: [(0, '6.970'), (1, '6.410')] +[2023-10-11 15:16:37,469][85176] Updated weights for policy 0, policy_version 8582 (0.0009) +[2023-10-11 15:16:37,839][85176] Updated weights for policy 0, policy_version 8592 (0.0007) +[2023-10-11 15:16:38,210][85176] Updated weights for policy 0, policy_version 8602 (0.0007) +[2023-10-11 15:16:38,705][85175] Updated weights for policy 1, policy_version 8710 (0.0011) +[2023-10-11 15:16:39,068][85175] Updated weights for policy 1, policy_version 8720 (0.0008) +[2023-10-11 15:16:39,432][85175] Updated weights for policy 1, policy_version 8730 (0.0008) +[2023-10-11 15:16:41,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 17760256. Throughput: 0: 1667.6, 1: 1664.6. Samples: 4446026. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) +[2023-10-11 15:16:41,064][84230] Avg episode reward: [(0, '6.530'), (1, '6.420')] +[2023-10-11 15:16:42,395][85176] Updated weights for policy 0, policy_version 8612 (0.0009) +[2023-10-11 15:16:42,773][85176] Updated weights for policy 0, policy_version 8622 (0.0009) +[2023-10-11 15:16:43,157][85176] Updated weights for policy 0, policy_version 8632 (0.0007) +[2023-10-11 15:16:43,597][85175] Updated weights for policy 1, policy_version 8740 (0.0009) +[2023-10-11 15:16:43,963][85175] Updated weights for policy 1, policy_version 8750 (0.0008) +[2023-10-11 15:16:44,326][85175] Updated weights for policy 1, policy_version 8760 (0.0008) +[2023-10-11 15:16:46,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 17825792. Throughput: 0: 1664.4, 1: 1685.9. Samples: 4466508. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-11 15:16:46,064][84230] Avg episode reward: [(0, '6.640'), (1, '6.610')] +[2023-10-11 15:16:47,230][85176] Updated weights for policy 0, policy_version 8642 (0.0007) +[2023-10-11 15:16:47,611][85176] Updated weights for policy 0, policy_version 8652 (0.0010) +[2023-10-11 15:16:47,975][85176] Updated weights for policy 0, policy_version 8662 (0.0010) +[2023-10-11 15:16:48,350][85176] Updated weights for policy 0, policy_version 8672 (0.0009) +[2023-10-11 15:16:48,390][85175] Updated weights for policy 1, policy_version 8770 (0.0008) +[2023-10-11 15:16:48,754][85175] Updated weights for policy 1, policy_version 8780 (0.0008) +[2023-10-11 15:16:49,127][85175] Updated weights for policy 1, policy_version 8790 (0.0010) +[2023-10-11 15:16:49,507][85175] Updated weights for policy 1, policy_version 8800 (0.0010) +[2023-10-11 15:16:51,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 17891328. Throughput: 0: 1656.6, 1: 1687.3. Samples: 4476710. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-11 15:16:51,063][84230] Avg episode reward: [(0, '6.750'), (1, '6.780')] +[2023-10-11 15:16:52,432][85176] Updated weights for policy 0, policy_version 8682 (0.0008) +[2023-10-11 15:16:52,803][85176] Updated weights for policy 0, policy_version 8692 (0.0008) +[2023-10-11 15:16:53,171][85176] Updated weights for policy 0, policy_version 8702 (0.0007) +[2023-10-11 15:16:53,623][85175] Updated weights for policy 1, policy_version 8810 (0.0010) +[2023-10-11 15:16:53,988][85175] Updated weights for policy 1, policy_version 8820 (0.0009) +[2023-10-11 15:16:54,364][85175] Updated weights for policy 1, policy_version 8830 (0.0010) +[2023-10-11 15:16:56,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 17956864. Throughput: 0: 1671.8, 1: 1668.5. Samples: 4496568. 
Policy #0 lag: (min: 15.0, avg: 17.8, max: 47.0) +[2023-10-11 15:16:56,063][84230] Avg episode reward: [(0, '6.750'), (1, '6.580')] +[2023-10-11 15:16:57,302][85176] Updated weights for policy 0, policy_version 8712 (0.0009) +[2023-10-11 15:16:57,678][85176] Updated weights for policy 0, policy_version 8722 (0.0009) +[2023-10-11 15:16:58,057][85176] Updated weights for policy 0, policy_version 8732 (0.0009) +[2023-10-11 15:16:58,210][85175] Updated weights for policy 1, policy_version 8840 (0.0008) +[2023-10-11 15:16:58,572][85175] Updated weights for policy 1, policy_version 8850 (0.0009) +[2023-10-11 15:16:58,937][85175] Updated weights for policy 1, policy_version 8860 (0.0009) +[2023-10-11 15:17:01,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 18022400. Throughput: 0: 1665.0, 1: 1698.1. Samples: 4517284. Policy #0 lag: (min: 15.0, avg: 17.8, max: 47.0) +[2023-10-11 15:17:01,063][84230] Avg episode reward: [(0, '6.860'), (1, '6.690')] +[2023-10-11 15:17:02,154][85176] Updated weights for policy 0, policy_version 8742 (0.0008) +[2023-10-11 15:17:02,523][85176] Updated weights for policy 0, policy_version 8752 (0.0010) +[2023-10-11 15:17:02,895][85176] Updated weights for policy 0, policy_version 8762 (0.0007) +[2023-10-11 15:17:02,907][85175] Updated weights for policy 1, policy_version 8870 (0.0009) +[2023-10-11 15:17:03,274][85175] Updated weights for policy 1, policy_version 8880 (0.0008) +[2023-10-11 15:17:03,644][85175] Updated weights for policy 1, policy_version 8890 (0.0007) +[2023-10-11 15:17:06,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 18087936. Throughput: 0: 1658.6, 1: 1678.1. Samples: 4526816. 
Policy #0 lag: (min: 9.0, avg: 17.2, max: 41.0) +[2023-10-11 15:17:06,063][84230] Avg episode reward: [(0, '6.640'), (1, '6.410')] +[2023-10-11 15:17:07,090][85176] Updated weights for policy 0, policy_version 8772 (0.0007) +[2023-10-11 15:17:07,470][85176] Updated weights for policy 0, policy_version 8782 (0.0010) +[2023-10-11 15:17:07,664][85175] Updated weights for policy 1, policy_version 8900 (0.0009) +[2023-10-11 15:17:07,839][85176] Updated weights for policy 0, policy_version 8792 (0.0007) +[2023-10-11 15:17:08,028][85175] Updated weights for policy 1, policy_version 8910 (0.0009) +[2023-10-11 15:17:08,402][85175] Updated weights for policy 1, policy_version 8920 (0.0007) +[2023-10-11 15:17:11,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 18153472. Throughput: 0: 1667.2, 1: 1681.8. Samples: 4547076. Policy #0 lag: (min: 9.0, avg: 17.2, max: 41.0) +[2023-10-11 15:17:11,064][84230] Avg episode reward: [(0, '6.640'), (1, '6.350')] +[2023-10-11 15:17:11,828][85176] Updated weights for policy 0, policy_version 8802 (0.0009) +[2023-10-11 15:17:12,202][85176] Updated weights for policy 0, policy_version 8812 (0.0008) +[2023-10-11 15:17:12,430][85175] Updated weights for policy 1, policy_version 8930 (0.0007) +[2023-10-11 15:17:12,568][85176] Updated weights for policy 0, policy_version 8822 (0.0008) +[2023-10-11 15:17:12,795][85175] Updated weights for policy 1, policy_version 8940 (0.0008) +[2023-10-11 15:17:12,938][85176] Updated weights for policy 0, policy_version 8832 (0.0007) +[2023-10-11 15:17:13,159][85175] Updated weights for policy 1, policy_version 8950 (0.0007) +[2023-10-11 15:17:13,526][85175] Updated weights for policy 1, policy_version 8960 (0.0007) +[2023-10-11 15:17:16,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 18219008. Throughput: 0: 1674.3, 1: 1702.0. Samples: 4568114. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:17:16,065][84230] Avg episode reward: [(0, '6.640'), (1, '6.430')] +[2023-10-11 15:17:16,855][85176] Updated weights for policy 0, policy_version 8842 (0.0007) +[2023-10-11 15:17:17,221][85176] Updated weights for policy 0, policy_version 8852 (0.0008) +[2023-10-11 15:17:17,520][85175] Updated weights for policy 1, policy_version 8970 (0.0008) +[2023-10-11 15:17:17,612][85176] Updated weights for policy 0, policy_version 8862 (0.0010) +[2023-10-11 15:17:17,897][85175] Updated weights for policy 1, policy_version 8980 (0.0008) +[2023-10-11 15:17:18,264][85175] Updated weights for policy 1, policy_version 8990 (0.0009) +[2023-10-11 15:17:21,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 18284544. Throughput: 0: 1674.5, 1: 1671.0. Samples: 4577090. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:17:21,063][84230] Avg episode reward: [(0, '6.640'), (1, '6.480')] +[2023-10-11 15:17:21,870][85176] Updated weights for policy 0, policy_version 8872 (0.0008) +[2023-10-11 15:17:22,206][85175] Updated weights for policy 1, policy_version 9000 (0.0008) +[2023-10-11 15:17:22,241][85176] Updated weights for policy 0, policy_version 8882 (0.0009) +[2023-10-11 15:17:22,565][85175] Updated weights for policy 1, policy_version 9010 (0.0007) +[2023-10-11 15:17:22,606][85176] Updated weights for policy 0, policy_version 8892 (0.0009) +[2023-10-11 15:17:22,936][85175] Updated weights for policy 1, policy_version 9020 (0.0008) +[2023-10-11 15:17:26,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 18350080. Throughput: 0: 1668.3, 1: 1702.1. Samples: 4597694. 
Policy #0 lag: (min: 17.0, avg: 17.1, max: 24.0) +[2023-10-11 15:17:26,063][84230] Avg episode reward: [(0, '6.750'), (1, '6.590')] +[2023-10-11 15:17:26,744][85176] Updated weights for policy 0, policy_version 8902 (0.0008) +[2023-10-11 15:17:27,122][85176] Updated weights for policy 0, policy_version 8912 (0.0008) +[2023-10-11 15:17:27,149][85175] Updated weights for policy 1, policy_version 9030 (0.0007) +[2023-10-11 15:17:27,493][85176] Updated weights for policy 0, policy_version 8922 (0.0007) +[2023-10-11 15:17:27,537][85175] Updated weights for policy 1, policy_version 9040 (0.0007) +[2023-10-11 15:17:27,916][85175] Updated weights for policy 1, policy_version 9050 (0.0009) +[2023-10-11 15:17:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 18415616. Throughput: 0: 1670.5, 1: 1696.5. Samples: 4618024. Policy #0 lag: (min: 17.0, avg: 17.1, max: 24.0) +[2023-10-11 15:17:31,063][84230] Avg episode reward: [(0, '6.750'), (1, '6.820')] +[2023-10-11 15:17:31,568][85176] Updated weights for policy 0, policy_version 8932 (0.0011) +[2023-10-11 15:17:31,940][85176] Updated weights for policy 0, policy_version 8942 (0.0009) +[2023-10-11 15:17:32,021][85175] Updated weights for policy 1, policy_version 9060 (0.0010) +[2023-10-11 15:17:32,305][85176] Updated weights for policy 0, policy_version 8952 (0.0010) +[2023-10-11 15:17:32,386][85175] Updated weights for policy 1, policy_version 9070 (0.0007) +[2023-10-11 15:17:32,746][85175] Updated weights for policy 1, policy_version 9080 (0.0008) +[2023-10-11 15:17:36,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 18481152. Throughput: 0: 1670.9, 1: 1671.6. Samples: 4627120. 
Policy #0 lag: (min: 31.0, avg: 37.6, max: 63.0) +[2023-10-11 15:17:36,064][84230] Avg episode reward: [(0, '6.970'), (1, '6.510')] +[2023-10-11 15:17:36,536][85176] Updated weights for policy 0, policy_version 8962 (0.0010) +[2023-10-11 15:17:36,913][85176] Updated weights for policy 0, policy_version 8972 (0.0008) +[2023-10-11 15:17:36,927][85175] Updated weights for policy 1, policy_version 9090 (0.0009) +[2023-10-11 15:17:37,286][85176] Updated weights for policy 0, policy_version 8982 (0.0007) +[2023-10-11 15:17:37,291][85175] Updated weights for policy 1, policy_version 9100 (0.0009) +[2023-10-11 15:17:37,652][85175] Updated weights for policy 1, policy_version 9110 (0.0009) +[2023-10-11 15:17:37,658][85176] Updated weights for policy 0, policy_version 8992 (0.0008) +[2023-10-11 15:17:38,018][85175] Updated weights for policy 1, policy_version 9120 (0.0008) +[2023-10-11 15:17:41,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 18546688. Throughput: 0: 1665.5, 1: 1695.7. Samples: 4647820. Policy #0 lag: (min: 31.0, avg: 37.6, max: 63.0) +[2023-10-11 15:17:41,064][84230] Avg episode reward: [(0, '6.970'), (1, '6.480')] +[2023-10-11 15:17:41,765][85176] Updated weights for policy 0, policy_version 9002 (0.0008) +[2023-10-11 15:17:42,067][85175] Updated weights for policy 1, policy_version 9130 (0.0009) +[2023-10-11 15:17:42,144][85176] Updated weights for policy 0, policy_version 9012 (0.0007) +[2023-10-11 15:17:42,441][85175] Updated weights for policy 1, policy_version 9140 (0.0009) +[2023-10-11 15:17:42,520][85176] Updated weights for policy 0, policy_version 9022 (0.0007) +[2023-10-11 15:17:42,802][85175] Updated weights for policy 1, policy_version 9150 (0.0008) +[2023-10-11 15:17:46,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 18612224. Throughput: 0: 1671.7, 1: 1692.1. Samples: 4668652. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:17:46,064][84230] Avg episode reward: [(0, '6.740'), (1, '6.530')] +[2023-10-11 15:17:46,626][85176] Updated weights for policy 0, policy_version 9032 (0.0008) +[2023-10-11 15:17:46,784][85175] Updated weights for policy 1, policy_version 9160 (0.0007) +[2023-10-11 15:17:46,995][85176] Updated weights for policy 0, policy_version 9042 (0.0007) +[2023-10-11 15:17:47,153][85175] Updated weights for policy 1, policy_version 9170 (0.0007) +[2023-10-11 15:17:47,376][85176] Updated weights for policy 0, policy_version 9052 (0.0008) +[2023-10-11 15:17:47,513][85175] Updated weights for policy 1, policy_version 9180 (0.0008) +[2023-10-11 15:17:51,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 18677760. Throughput: 0: 1668.6, 1: 1678.9. Samples: 4677456. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:17:51,063][84230] Avg episode reward: [(0, '6.630'), (1, '6.680')] +[2023-10-11 15:17:51,467][85175] Updated weights for policy 1, policy_version 9190 (0.0009) +[2023-10-11 15:17:51,494][85176] Updated weights for policy 0, policy_version 9062 (0.0007) +[2023-10-11 15:17:51,838][85175] Updated weights for policy 1, policy_version 9200 (0.0007) +[2023-10-11 15:17:51,867][85176] Updated weights for policy 0, policy_version 9072 (0.0008) +[2023-10-11 15:17:52,204][85175] Updated weights for policy 1, policy_version 9210 (0.0007) +[2023-10-11 15:17:52,240][85176] Updated weights for policy 0, policy_version 9082 (0.0009) +[2023-10-11 15:17:56,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 18743296. Throughput: 0: 1671.4, 1: 1691.2. Samples: 4698394. 
Policy #0 lag: (min: 26.0, avg: 30.5, max: 58.0) +[2023-10-11 15:17:56,063][84230] Avg episode reward: [(0, '6.750'), (1, '6.680')] +[2023-10-11 15:17:56,281][85176] Updated weights for policy 0, policy_version 9092 (0.0007) +[2023-10-11 15:17:56,330][85175] Updated weights for policy 1, policy_version 9220 (0.0008) +[2023-10-11 15:17:56,653][85176] Updated weights for policy 0, policy_version 9102 (0.0008) +[2023-10-11 15:17:56,699][85175] Updated weights for policy 1, policy_version 9230 (0.0008) +[2023-10-11 15:17:57,031][85176] Updated weights for policy 0, policy_version 9112 (0.0010) +[2023-10-11 15:17:57,068][85175] Updated weights for policy 1, policy_version 9240 (0.0008) +[2023-10-11 15:18:01,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 18808832. Throughput: 0: 1666.6, 1: 1687.7. Samples: 4719060. Policy #0 lag: (min: 26.0, avg: 30.5, max: 58.0) +[2023-10-11 15:18:01,063][84230] Avg episode reward: [(0, '6.530'), (1, '6.690')] +[2023-10-11 15:18:01,131][85176] Updated weights for policy 0, policy_version 9122 (0.0009) +[2023-10-11 15:18:01,188][85175] Updated weights for policy 1, policy_version 9250 (0.0008) +[2023-10-11 15:18:01,502][85176] Updated weights for policy 0, policy_version 9132 (0.0010) +[2023-10-11 15:18:01,558][85175] Updated weights for policy 1, policy_version 9260 (0.0008) +[2023-10-11 15:18:01,888][85176] Updated weights for policy 0, policy_version 9142 (0.0008) +[2023-10-11 15:18:01,931][85175] Updated weights for policy 1, policy_version 9270 (0.0008) +[2023-10-11 15:18:02,255][85176] Updated weights for policy 0, policy_version 9152 (0.0008) +[2023-10-11 15:18:02,299][85175] Updated weights for policy 1, policy_version 9280 (0.0007) +[2023-10-11 15:18:06,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 18874368. Throughput: 0: 1670.0, 1: 1688.8. Samples: 4728238. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:18:06,064][84230] Avg episode reward: [(0, '6.530'), (1, '6.540')] +[2023-10-11 15:18:06,166][85176] Updated weights for policy 0, policy_version 9162 (0.0009) +[2023-10-11 15:18:06,334][85175] Updated weights for policy 1, policy_version 9290 (0.0008) +[2023-10-11 15:18:06,547][85176] Updated weights for policy 0, policy_version 9172 (0.0009) +[2023-10-11 15:18:06,704][85175] Updated weights for policy 1, policy_version 9300 (0.0010) +[2023-10-11 15:18:06,923][85176] Updated weights for policy 0, policy_version 9182 (0.0009) +[2023-10-11 15:18:07,061][85175] Updated weights for policy 1, policy_version 9310 (0.0009) +[2023-10-11 15:18:10,893][85176] Updated weights for policy 0, policy_version 9192 (0.0008) +[2023-10-11 15:18:11,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 18939904. Throughput: 0: 1674.2, 1: 1682.7. Samples: 4748754. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:18:11,063][84230] Avg episode reward: [(0, '6.530'), (1, '6.140')] +[2023-10-11 15:18:11,211][85175] Updated weights for policy 1, policy_version 9320 (0.0009) +[2023-10-11 15:18:11,261][85176] Updated weights for policy 0, policy_version 9202 (0.0007) +[2023-10-11 15:18:11,581][85175] Updated weights for policy 1, policy_version 9330 (0.0007) +[2023-10-11 15:18:11,647][85176] Updated weights for policy 0, policy_version 9212 (0.0007) +[2023-10-11 15:18:11,954][85175] Updated weights for policy 1, policy_version 9340 (0.0007) +[2023-10-11 15:18:15,753][85176] Updated weights for policy 0, policy_version 9222 (0.0007) +[2023-10-11 15:18:15,852][85175] Updated weights for policy 1, policy_version 9350 (0.0008) +[2023-10-11 15:18:16,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 19005440. Throughput: 0: 1674.4, 1: 1690.7. Samples: 4769454. 
Policy #0 lag: (min: 31.0, avg: 31.3, max: 43.0) +[2023-10-11 15:18:16,063][84230] Avg episode reward: [(0, '6.640'), (1, '6.580')] +[2023-10-11 15:18:16,126][85176] Updated weights for policy 0, policy_version 9232 (0.0007) +[2023-10-11 15:18:16,232][85175] Updated weights for policy 1, policy_version 9360 (0.0007) +[2023-10-11 15:18:16,491][85176] Updated weights for policy 0, policy_version 9242 (0.0009) +[2023-10-11 15:18:16,607][85175] Updated weights for policy 1, policy_version 9370 (0.0009) +[2023-10-11 15:18:16,710][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000009248_9469952.pth... +[2023-10-11 15:18:16,749][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000007680_7864320.pth +[2023-10-11 15:18:16,817][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000009376_9601024.pth... +[2023-10-11 15:18:16,855][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000007776_7962624.pth +[2023-10-11 15:18:20,474][85176] Updated weights for policy 0, policy_version 9252 (0.0009) +[2023-10-11 15:18:20,584][85175] Updated weights for policy 1, policy_version 9380 (0.0008) +[2023-10-11 15:18:20,845][85176] Updated weights for policy 0, policy_version 9262 (0.0008) +[2023-10-11 15:18:20,952][85175] Updated weights for policy 1, policy_version 9390 (0.0008) +[2023-10-11 15:18:21,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 19070976. Throughput: 0: 1679.0, 1: 1690.5. Samples: 4778750. 
Policy #0 lag: (min: 31.0, avg: 31.3, max: 43.0) +[2023-10-11 15:18:21,063][84230] Avg episode reward: [(0, '6.750'), (1, '6.510')] +[2023-10-11 15:18:21,221][85176] Updated weights for policy 0, policy_version 9272 (0.0008) +[2023-10-11 15:18:21,329][85175] Updated weights for policy 1, policy_version 9400 (0.0008) +[2023-10-11 15:18:25,295][85175] Updated weights for policy 1, policy_version 9410 (0.0007) +[2023-10-11 15:18:25,392][85176] Updated weights for policy 0, policy_version 9282 (0.0008) +[2023-10-11 15:18:25,658][85175] Updated weights for policy 1, policy_version 9420 (0.0009) +[2023-10-11 15:18:25,767][85176] Updated weights for policy 0, policy_version 9292 (0.0008) +[2023-10-11 15:18:26,025][85175] Updated weights for policy 1, policy_version 9430 (0.0009) +[2023-10-11 15:18:26,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 19136512. Throughput: 0: 1677.3, 1: 1693.2. Samples: 4799494. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:18:26,063][84230] Avg episode reward: [(0, '6.980'), (1, '6.840')] +[2023-10-11 15:18:26,144][85176] Updated weights for policy 0, policy_version 9302 (0.0007) +[2023-10-11 15:18:26,391][85175] Updated weights for policy 1, policy_version 9440 (0.0008) +[2023-10-11 15:18:26,514][85176] Updated weights for policy 0, policy_version 9312 (0.0007) +[2023-10-11 15:18:30,537][85175] Updated weights for policy 1, policy_version 9450 (0.0007) +[2023-10-11 15:18:30,608][85176] Updated weights for policy 0, policy_version 9322 (0.0009) +[2023-10-11 15:18:30,908][85175] Updated weights for policy 1, policy_version 9460 (0.0007) +[2023-10-11 15:18:30,987][85176] Updated weights for policy 0, policy_version 9332 (0.0009) +[2023-10-11 15:18:31,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 19202048. Throughput: 0: 1663.2, 1: 1684.3. Samples: 4819292. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:18:31,063][84230] Avg episode reward: [(0, '6.970'), (1, '6.870')] +[2023-10-11 15:18:31,277][85175] Updated weights for policy 1, policy_version 9470 (0.0009) +[2023-10-11 15:18:31,365][85176] Updated weights for policy 0, policy_version 9342 (0.0008) +[2023-10-11 15:18:35,269][85175] Updated weights for policy 1, policy_version 9480 (0.0008) +[2023-10-11 15:18:35,451][85176] Updated weights for policy 0, policy_version 9352 (0.0009) +[2023-10-11 15:18:35,640][85175] Updated weights for policy 1, policy_version 9490 (0.0007) +[2023-10-11 15:18:35,827][85176] Updated weights for policy 0, policy_version 9362 (0.0009) +[2023-10-11 15:18:35,997][85175] Updated weights for policy 1, policy_version 9500 (0.0007) +[2023-10-11 15:18:36,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 19267584. Throughput: 0: 1674.2, 1: 1693.1. Samples: 4828986. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:18:36,063][84230] Avg episode reward: [(0, '6.850'), (1, '6.560')] +[2023-10-11 15:18:36,198][85176] Updated weights for policy 0, policy_version 9372 (0.0010) +[2023-10-11 15:18:40,085][85175] Updated weights for policy 1, policy_version 9510 (0.0008) +[2023-10-11 15:18:40,455][85175] Updated weights for policy 1, policy_version 9520 (0.0008) +[2023-10-11 15:18:40,549][85176] Updated weights for policy 0, policy_version 9382 (0.0009) +[2023-10-11 15:18:40,826][85175] Updated weights for policy 1, policy_version 9530 (0.0008) +[2023-10-11 15:18:40,932][85176] Updated weights for policy 0, policy_version 9392 (0.0007) +[2023-10-11 15:18:41,062][84230] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 19365888. Throughput: 0: 1669.9, 1: 1690.5. Samples: 4849612. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:18:41,063][84230] Avg episode reward: [(0, '6.530'), (1, '6.580')] +[2023-10-11 15:18:41,301][85176] Updated weights for policy 0, policy_version 9402 (0.0009) +[2023-10-11 15:18:44,819][85175] Updated weights for policy 1, policy_version 9540 (0.0009) +[2023-10-11 15:18:45,195][85175] Updated weights for policy 1, policy_version 9550 (0.0008) +[2023-10-11 15:18:45,443][85176] Updated weights for policy 0, policy_version 9412 (0.0008) +[2023-10-11 15:18:45,559][85175] Updated weights for policy 1, policy_version 9560 (0.0009) +[2023-10-11 15:18:45,818][85176] Updated weights for policy 0, policy_version 9422 (0.0007) +[2023-10-11 15:18:46,063][84230] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 19431424. Throughput: 0: 1667.5, 1: 1668.2. Samples: 4869168. Policy #0 lag: (min: 31.0, avg: 37.6, max: 63.0) +[2023-10-11 15:18:46,064][84230] Avg episode reward: [(0, '6.640'), (1, '6.480')] +[2023-10-11 15:18:46,198][85176] Updated weights for policy 0, policy_version 9432 (0.0007) +[2023-10-11 15:18:49,532][85175] Updated weights for policy 1, policy_version 9570 (0.0009) +[2023-10-11 15:18:49,896][85175] Updated weights for policy 1, policy_version 9580 (0.0009) +[2023-10-11 15:18:50,189][85176] Updated weights for policy 0, policy_version 9442 (0.0008) +[2023-10-11 15:18:50,271][85175] Updated weights for policy 1, policy_version 9590 (0.0009) +[2023-10-11 15:18:50,568][85176] Updated weights for policy 0, policy_version 9452 (0.0007) +[2023-10-11 15:18:50,645][85175] Updated weights for policy 1, policy_version 9600 (0.0007) +[2023-10-11 15:18:50,944][85176] Updated weights for policy 0, policy_version 9462 (0.0008) +[2023-10-11 15:18:51,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 19496960. Throughput: 0: 1669.0, 1: 1690.6. Samples: 4879420. 
Policy #0 lag: (min: 31.0, avg: 37.6, max: 63.0) +[2023-10-11 15:18:51,063][84230] Avg episode reward: [(0, '6.530'), (1, '6.400')] +[2023-10-11 15:18:51,319][85176] Updated weights for policy 0, policy_version 9472 (0.0008) +[2023-10-11 15:18:54,691][85175] Updated weights for policy 1, policy_version 9610 (0.0010) +[2023-10-11 15:18:55,049][85175] Updated weights for policy 1, policy_version 9620 (0.0009) +[2023-10-11 15:18:55,418][85175] Updated weights for policy 1, policy_version 9630 (0.0008) +[2023-10-11 15:18:55,449][85176] Updated weights for policy 0, policy_version 9482 (0.0007) +[2023-10-11 15:18:55,823][85176] Updated weights for policy 0, policy_version 9492 (0.0007) +[2023-10-11 15:18:56,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 19562496. Throughput: 0: 1668.9, 1: 1693.3. Samples: 4900054. Policy #0 lag: (min: 31.0, avg: 41.6, max: 63.0) +[2023-10-11 15:18:56,063][84230] Avg episode reward: [(0, '6.630'), (1, '6.480')] +[2023-10-11 15:18:56,196][85176] Updated weights for policy 0, policy_version 9502 (0.0008) +[2023-10-11 15:18:59,582][85175] Updated weights for policy 1, policy_version 9640 (0.0011) +[2023-10-11 15:18:59,957][85175] Updated weights for policy 1, policy_version 9650 (0.0010) +[2023-10-11 15:19:00,318][85175] Updated weights for policy 1, policy_version 9660 (0.0007) +[2023-10-11 15:19:00,319][85176] Updated weights for policy 0, policy_version 9512 (0.0007) +[2023-10-11 15:19:00,702][85176] Updated weights for policy 0, policy_version 9522 (0.0008) +[2023-10-11 15:19:01,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 19628032. Throughput: 0: 1657.2, 1: 1666.4. Samples: 4919018. 
Policy #0 lag: (min: 31.0, avg: 41.6, max: 63.0) +[2023-10-11 15:19:01,064][84230] Avg episode reward: [(0, '6.600'), (1, '6.680')] +[2023-10-11 15:19:01,073][85176] Updated weights for policy 0, policy_version 9532 (0.0007) +[2023-10-11 15:19:04,479][85175] Updated weights for policy 1, policy_version 9670 (0.0007) +[2023-10-11 15:19:04,860][85175] Updated weights for policy 1, policy_version 9680 (0.0008) +[2023-10-11 15:19:05,235][85175] Updated weights for policy 1, policy_version 9690 (0.0007) +[2023-10-11 15:19:05,260][85176] Updated weights for policy 0, policy_version 9542 (0.0007) +[2023-10-11 15:19:05,633][85176] Updated weights for policy 0, policy_version 9552 (0.0009) +[2023-10-11 15:19:06,008][85176] Updated weights for policy 0, policy_version 9562 (0.0008) +[2023-10-11 15:19:06,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 19693568. Throughput: 0: 1661.6, 1: 1694.8. Samples: 4929786. Policy #0 lag: (min: 9.0, avg: 26.9, max: 41.0) +[2023-10-11 15:19:06,063][84230] Avg episode reward: [(0, '6.970'), (1, '6.960')] +[2023-10-11 15:19:06,064][85000] Saving new best policy, reward=6.960! +[2023-10-11 15:19:09,177][85175] Updated weights for policy 1, policy_version 9700 (0.0009) +[2023-10-11 15:19:09,551][85175] Updated weights for policy 1, policy_version 9710 (0.0009) +[2023-10-11 15:19:09,922][85175] Updated weights for policy 1, policy_version 9720 (0.0008) +[2023-10-11 15:19:09,985][85176] Updated weights for policy 0, policy_version 9572 (0.0009) +[2023-10-11 15:19:10,351][85176] Updated weights for policy 0, policy_version 9582 (0.0008) +[2023-10-11 15:19:10,727][85176] Updated weights for policy 0, policy_version 9592 (0.0011) +[2023-10-11 15:19:11,063][84230] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 19791872. Throughput: 0: 1663.5, 1: 1680.0. Samples: 4949954. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 15:19:11,064][84230] Avg episode reward: [(0, '7.300'), (1, '6.880')] +[2023-10-11 15:19:13,896][85175] Updated weights for policy 1, policy_version 9730 (0.0007) +[2023-10-11 15:19:14,264][85175] Updated weights for policy 1, policy_version 9740 (0.0008) +[2023-10-11 15:19:14,633][85175] Updated weights for policy 1, policy_version 9750 (0.0008) +[2023-10-11 15:19:14,809][85176] Updated weights for policy 0, policy_version 9602 (0.0010) +[2023-10-11 15:19:15,006][85175] Updated weights for policy 1, policy_version 9760 (0.0007) +[2023-10-11 15:19:15,174][85176] Updated weights for policy 0, policy_version 9612 (0.0009) +[2023-10-11 15:19:15,556][85176] Updated weights for policy 0, policy_version 9622 (0.0009) +[2023-10-11 15:19:15,925][85176] Updated weights for policy 0, policy_version 9632 (0.0008) +[2023-10-11 15:19:16,063][84230] Fps is (10 sec: 16383.5, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 19857408. Throughput: 0: 1658.2, 1: 1671.9. Samples: 4969150. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 15:19:16,064][84230] Avg episode reward: [(0, '6.970'), (1, '6.790')] +[2023-10-11 15:19:19,034][85175] Updated weights for policy 1, policy_version 9770 (0.0007) +[2023-10-11 15:19:19,405][85175] Updated weights for policy 1, policy_version 9780 (0.0009) +[2023-10-11 15:19:19,777][85175] Updated weights for policy 1, policy_version 9790 (0.0007) +[2023-10-11 15:19:20,126][85176] Updated weights for policy 0, policy_version 9642 (0.0010) +[2023-10-11 15:19:20,500][85176] Updated weights for policy 0, policy_version 9652 (0.0007) +[2023-10-11 15:19:20,866][85176] Updated weights for policy 0, policy_version 9662 (0.0007) +[2023-10-11 15:19:21,062][84230] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 19922944. Throughput: 0: 1667.1, 1: 1695.2. Samples: 4980288. 
Policy #0 lag: (min: 25.0, avg: 30.6, max: 57.0) +[2023-10-11 15:19:21,063][84230] Avg episode reward: [(0, '6.640'), (1, '6.730')] +[2023-10-11 15:19:23,863][85175] Updated weights for policy 1, policy_version 9800 (0.0007) +[2023-10-11 15:19:24,229][85175] Updated weights for policy 1, policy_version 9810 (0.0007) +[2023-10-11 15:19:24,586][85175] Updated weights for policy 1, policy_version 9820 (0.0009) +[2023-10-11 15:19:24,959][85176] Updated weights for policy 0, policy_version 9672 (0.0008) +[2023-10-11 15:19:25,328][85176] Updated weights for policy 0, policy_version 9682 (0.0009) +[2023-10-11 15:19:25,695][85176] Updated weights for policy 0, policy_version 9692 (0.0011) +[2023-10-11 15:19:26,063][84230] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 19988480. Throughput: 0: 1664.0, 1: 1670.2. Samples: 4999650. Policy #0 lag: (min: 25.0, avg: 30.6, max: 57.0) +[2023-10-11 15:19:26,064][84230] Avg episode reward: [(0, '6.530'), (1, '6.310')] +[2023-10-11 15:19:28,681][85175] Updated weights for policy 1, policy_version 9830 (0.0008) +[2023-10-11 15:19:29,052][85175] Updated weights for policy 1, policy_version 9840 (0.0009) +[2023-10-11 15:19:29,414][85175] Updated weights for policy 1, policy_version 9850 (0.0009) +[2023-10-11 15:19:29,759][85176] Updated weights for policy 0, policy_version 9702 (0.0008) +[2023-10-11 15:19:30,135][85176] Updated weights for policy 0, policy_version 9712 (0.0007) +[2023-10-11 15:19:30,517][85176] Updated weights for policy 0, policy_version 9722 (0.0009) +[2023-10-11 15:19:31,063][84230] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 20054016. Throughput: 0: 1643.3, 1: 1684.8. Samples: 5018934. 
Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) +[2023-10-11 15:19:31,064][84230] Avg episode reward: [(0, '6.530'), (1, '6.120')] +[2023-10-11 15:19:33,516][85175] Updated weights for policy 1, policy_version 9860 (0.0009) +[2023-10-11 15:19:33,871][85175] Updated weights for policy 1, policy_version 9870 (0.0008) +[2023-10-11 15:19:34,240][85175] Updated weights for policy 1, policy_version 9880 (0.0007) +[2023-10-11 15:19:34,554][85176] Updated weights for policy 0, policy_version 9732 (0.0008) +[2023-10-11 15:19:34,934][85176] Updated weights for policy 0, policy_version 9742 (0.0010) +[2023-10-11 15:19:35,313][85176] Updated weights for policy 0, policy_version 9752 (0.0010) +[2023-10-11 15:19:36,063][84230] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 20119552. Throughput: 0: 1665.8, 1: 1689.1. Samples: 5030390. Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) +[2023-10-11 15:19:36,064][84230] Avg episode reward: [(0, '6.420'), (1, '6.410')] +[2023-10-11 15:19:38,193][85175] Updated weights for policy 1, policy_version 9890 (0.0008) +[2023-10-11 15:19:38,561][85175] Updated weights for policy 1, policy_version 9900 (0.0007) +[2023-10-11 15:19:38,917][85175] Updated weights for policy 1, policy_version 9910 (0.0007) +[2023-10-11 15:19:39,287][85175] Updated weights for policy 1, policy_version 9920 (0.0010) +[2023-10-11 15:19:39,463][85176] Updated weights for policy 0, policy_version 9762 (0.0010) +[2023-10-11 15:19:39,832][85176] Updated weights for policy 0, policy_version 9772 (0.0007) +[2023-10-11 15:19:40,202][85176] Updated weights for policy 0, policy_version 9782 (0.0008) +[2023-10-11 15:19:40,573][85176] Updated weights for policy 0, policy_version 9792 (0.0009) +[2023-10-11 15:19:41,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 20185088. Throughput: 0: 1661.1, 1: 1668.3. Samples: 5049874. 
Policy #0 lag: (min: 31.0, avg: 32.7, max: 60.0) +[2023-10-11 15:19:41,064][84230] Avg episode reward: [(0, '6.530'), (1, '6.540')] +[2023-10-11 15:19:43,327][85175] Updated weights for policy 1, policy_version 9930 (0.0007) +[2023-10-11 15:19:43,701][85175] Updated weights for policy 1, policy_version 9940 (0.0010) +[2023-10-11 15:19:44,062][85175] Updated weights for policy 1, policy_version 9950 (0.0007) +[2023-10-11 15:19:44,529][85176] Updated weights for policy 0, policy_version 9802 (0.0007) +[2023-10-11 15:19:44,919][85176] Updated weights for policy 0, policy_version 9812 (0.0008) +[2023-10-11 15:19:45,286][85176] Updated weights for policy 0, policy_version 9822 (0.0007) +[2023-10-11 15:19:46,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 20250624. Throughput: 0: 1652.2, 1: 1697.1. Samples: 5069736. Policy #0 lag: (min: 31.0, avg: 32.7, max: 60.0) +[2023-10-11 15:19:46,064][84230] Avg episode reward: [(0, '6.530'), (1, '6.770')] +[2023-10-11 15:19:47,817][85175] Updated weights for policy 1, policy_version 9960 (0.0009) +[2023-10-11 15:19:48,185][85175] Updated weights for policy 1, policy_version 9970 (0.0011) +[2023-10-11 15:19:48,556][85175] Updated weights for policy 1, policy_version 9980 (0.0007) +[2023-10-11 15:19:49,280][85176] Updated weights for policy 0, policy_version 9832 (0.0010) +[2023-10-11 15:19:49,649][85176] Updated weights for policy 0, policy_version 9842 (0.0008) +[2023-10-11 15:19:50,023][85176] Updated weights for policy 0, policy_version 9852 (0.0007) +[2023-10-11 15:19:51,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 20316160. Throughput: 0: 1672.3, 1: 1676.8. Samples: 5080494. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 15:19:51,064][84230] Avg episode reward: [(0, '6.790'), (1, '6.830')] +[2023-10-11 15:19:52,701][85175] Updated weights for policy 1, policy_version 9990 (0.0008) +[2023-10-11 15:19:53,060][85175] Updated weights for policy 1, policy_version 10000 (0.0009) +[2023-10-11 15:19:53,426][85175] Updated weights for policy 1, policy_version 10010 (0.0007) +[2023-10-11 15:19:54,200][85176] Updated weights for policy 0, policy_version 9862 (0.0009) +[2023-10-11 15:19:54,579][85176] Updated weights for policy 0, policy_version 9872 (0.0008) +[2023-10-11 15:19:54,954][85176] Updated weights for policy 0, policy_version 9882 (0.0007) +[2023-10-11 15:19:56,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 20381696. Throughput: 0: 1657.1, 1: 1681.2. Samples: 5100176. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 15:19:56,063][84230] Avg episode reward: [(0, '6.530'), (1, '6.980')] +[2023-10-11 15:19:56,064][85000] Saving new best policy, reward=6.980! +[2023-10-11 15:19:57,534][85175] Updated weights for policy 1, policy_version 10020 (0.0009) +[2023-10-11 15:19:57,939][85175] Updated weights for policy 1, policy_version 10030 (0.0010) +[2023-10-11 15:19:58,300][85175] Updated weights for policy 1, policy_version 10040 (0.0010) +[2023-10-11 15:19:59,028][85176] Updated weights for policy 0, policy_version 9892 (0.0007) +[2023-10-11 15:19:59,394][85176] Updated weights for policy 0, policy_version 9902 (0.0010) +[2023-10-11 15:19:59,767][85176] Updated weights for policy 0, policy_version 9912 (0.0009) +[2023-10-11 15:20:01,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 20447232. Throughput: 0: 1659.3, 1: 1692.7. Samples: 5119990. 
Policy #0 lag: (min: 24.0, avg: 38.2, max: 56.0) +[2023-10-11 15:20:01,064][84230] Avg episode reward: [(0, '6.820'), (1, '6.760')] +[2023-10-11 15:20:02,263][85175] Updated weights for policy 1, policy_version 10050 (0.0010) +[2023-10-11 15:20:02,637][85175] Updated weights for policy 1, policy_version 10060 (0.0008) +[2023-10-11 15:20:03,011][85175] Updated weights for policy 1, policy_version 10070 (0.0009) +[2023-10-11 15:20:03,381][85175] Updated weights for policy 1, policy_version 10080 (0.0009) +[2023-10-11 15:20:03,829][85176] Updated weights for policy 0, policy_version 9922 (0.0007) +[2023-10-11 15:20:04,190][85176] Updated weights for policy 0, policy_version 9932 (0.0010) +[2023-10-11 15:20:04,572][85176] Updated weights for policy 0, policy_version 9942 (0.0010) +[2023-10-11 15:20:04,941][85176] Updated weights for policy 0, policy_version 9952 (0.0010) +[2023-10-11 15:20:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 20512768. Throughput: 0: 1676.7, 1: 1663.0. Samples: 5130574. Policy #0 lag: (min: 24.0, avg: 38.2, max: 56.0) +[2023-10-11 15:20:06,063][84230] Avg episode reward: [(0, '6.810'), (1, '6.680')] +[2023-10-11 15:20:07,586][85175] Updated weights for policy 1, policy_version 10090 (0.0008) +[2023-10-11 15:20:07,956][85175] Updated weights for policy 1, policy_version 10100 (0.0007) +[2023-10-11 15:20:08,327][85175] Updated weights for policy 1, policy_version 10110 (0.0010) +[2023-10-11 15:20:09,156][85176] Updated weights for policy 0, policy_version 9962 (0.0009) +[2023-10-11 15:20:09,526][85176] Updated weights for policy 0, policy_version 9972 (0.0010) +[2023-10-11 15:20:09,909][85176] Updated weights for policy 0, policy_version 9982 (0.0007) +[2023-10-11 15:20:11,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 20578304. Throughput: 0: 1661.3, 1: 1683.2. Samples: 5150154. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:20:11,064][84230] Avg episode reward: [(0, '6.740'), (1, '6.700')] +[2023-10-11 15:20:12,416][85175] Updated weights for policy 1, policy_version 10120 (0.0009) +[2023-10-11 15:20:12,786][85175] Updated weights for policy 1, policy_version 10130 (0.0008) +[2023-10-11 15:20:13,154][85175] Updated weights for policy 1, policy_version 10140 (0.0009) +[2023-10-11 15:20:14,057][85176] Updated weights for policy 0, policy_version 9992 (0.0008) +[2023-10-11 15:20:14,434][85176] Updated weights for policy 0, policy_version 10002 (0.0008) +[2023-10-11 15:20:14,802][85176] Updated weights for policy 0, policy_version 10012 (0.0008) +[2023-10-11 15:20:16,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 20643840. Throughput: 0: 1672.1, 1: 1696.8. Samples: 5170536. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:20:16,063][84230] Avg episode reward: [(0, '6.700'), (1, '6.730')] +[2023-10-11 15:20:16,077][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000010016_10256384.pth... +[2023-10-11 15:20:16,077][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000010144_10387456.pth... 
+[2023-10-11 15:20:16,113][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000008576_8781824.pth +[2023-10-11 15:20:16,114][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000008448_8650752.pth +[2023-10-11 15:20:17,151][85175] Updated weights for policy 1, policy_version 10150 (0.0011) +[2023-10-11 15:20:17,525][85175] Updated weights for policy 1, policy_version 10160 (0.0010) +[2023-10-11 15:20:17,901][85175] Updated weights for policy 1, policy_version 10170 (0.0009) +[2023-10-11 15:20:18,805][85176] Updated weights for policy 0, policy_version 10022 (0.0008) +[2023-10-11 15:20:19,176][85176] Updated weights for policy 0, policy_version 10032 (0.0009) +[2023-10-11 15:20:19,551][85176] Updated weights for policy 0, policy_version 10042 (0.0011) +[2023-10-11 15:20:21,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 20709376. Throughput: 0: 1674.7, 1: 1668.6. Samples: 5180838. Policy #0 lag: (min: 31.0, avg: 41.8, max: 63.0) +[2023-10-11 15:20:21,064][84230] Avg episode reward: [(0, '6.950'), (1, '6.630')] +[2023-10-11 15:20:21,891][85175] Updated weights for policy 1, policy_version 10180 (0.0009) +[2023-10-11 15:20:22,273][85175] Updated weights for policy 1, policy_version 10190 (0.0009) +[2023-10-11 15:20:22,639][85175] Updated weights for policy 1, policy_version 10200 (0.0008) +[2023-10-11 15:20:23,755][85176] Updated weights for policy 0, policy_version 10052 (0.0010) +[2023-10-11 15:20:24,118][85176] Updated weights for policy 0, policy_version 10062 (0.0010) +[2023-10-11 15:20:24,498][85176] Updated weights for policy 0, policy_version 10072 (0.0010) +[2023-10-11 15:20:26,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 20774912. Throughput: 0: 1653.8, 1: 1693.2. Samples: 5200486. 
Policy #0 lag: (min: 31.0, avg: 41.8, max: 63.0) +[2023-10-11 15:20:26,064][84230] Avg episode reward: [(0, '6.940'), (1, '6.830')] +[2023-10-11 15:20:26,760][85175] Updated weights for policy 1, policy_version 10210 (0.0009) +[2023-10-11 15:20:27,134][85175] Updated weights for policy 1, policy_version 10220 (0.0008) +[2023-10-11 15:20:27,504][85175] Updated weights for policy 1, policy_version 10230 (0.0008) +[2023-10-11 15:20:27,876][85175] Updated weights for policy 1, policy_version 10240 (0.0007) +[2023-10-11 15:20:28,700][85176] Updated weights for policy 0, policy_version 10082 (0.0008) +[2023-10-11 15:20:29,071][85176] Updated weights for policy 0, policy_version 10092 (0.0009) +[2023-10-11 15:20:29,434][85176] Updated weights for policy 0, policy_version 10102 (0.0008) +[2023-10-11 15:20:29,816][85176] Updated weights for policy 0, policy_version 10112 (0.0008) +[2023-10-11 15:20:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 20840448. Throughput: 0: 1665.9, 1: 1694.7. Samples: 5220962. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:20:31,064][84230] Avg episode reward: [(0, '6.970'), (1, '6.550')] +[2023-10-11 15:20:31,875][85175] Updated weights for policy 1, policy_version 10250 (0.0009) +[2023-10-11 15:20:32,249][85175] Updated weights for policy 1, policy_version 10260 (0.0007) +[2023-10-11 15:20:32,624][85175] Updated weights for policy 1, policy_version 10270 (0.0008) +[2023-10-11 15:20:33,975][85176] Updated weights for policy 0, policy_version 10122 (0.0009) +[2023-10-11 15:20:34,348][85176] Updated weights for policy 0, policy_version 10132 (0.0011) +[2023-10-11 15:20:34,715][85176] Updated weights for policy 0, policy_version 10142 (0.0009) +[2023-10-11 15:20:36,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 20905984. Throughput: 0: 1663.6, 1: 1688.2. Samples: 5231326. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:20:36,063][84230] Avg episode reward: [(0, '7.140'), (1, '6.800')] +[2023-10-11 15:20:36,660][85175] Updated weights for policy 1, policy_version 10280 (0.0008) +[2023-10-11 15:20:37,035][85175] Updated weights for policy 1, policy_version 10290 (0.0009) +[2023-10-11 15:20:37,401][85175] Updated weights for policy 1, policy_version 10300 (0.0007) +[2023-10-11 15:20:38,707][85176] Updated weights for policy 0, policy_version 10152 (0.0008) +[2023-10-11 15:20:39,076][85176] Updated weights for policy 0, policy_version 10162 (0.0007) +[2023-10-11 15:20:39,442][85176] Updated weights for policy 0, policy_version 10172 (0.0009) +[2023-10-11 15:20:41,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 20971520. Throughput: 0: 1653.2, 1: 1695.2. Samples: 5250854. Policy #0 lag: (min: 31.0, avg: 32.0, max: 53.0) +[2023-10-11 15:20:41,063][84230] Avg episode reward: [(0, '7.080'), (1, '6.920')] +[2023-10-11 15:20:41,431][85175] Updated weights for policy 1, policy_version 10310 (0.0008) +[2023-10-11 15:20:41,791][85175] Updated weights for policy 1, policy_version 10320 (0.0008) +[2023-10-11 15:20:42,171][85175] Updated weights for policy 1, policy_version 10330 (0.0007) +[2023-10-11 15:20:43,421][85176] Updated weights for policy 0, policy_version 10182 (0.0007) +[2023-10-11 15:20:43,793][85176] Updated weights for policy 0, policy_version 10192 (0.0007) +[2023-10-11 15:20:44,171][85176] Updated weights for policy 0, policy_version 10202 (0.0010) +[2023-10-11 15:20:46,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 21037056. Throughput: 0: 1667.1, 1: 1703.2. Samples: 5271652. 
Policy #0 lag: (min: 31.0, avg: 32.0, max: 53.0) +[2023-10-11 15:20:46,064][84230] Avg episode reward: [(0, '6.970'), (1, '6.510')] +[2023-10-11 15:20:46,267][85175] Updated weights for policy 1, policy_version 10340 (0.0009) +[2023-10-11 15:20:46,664][85175] Updated weights for policy 1, policy_version 10350 (0.0008) +[2023-10-11 15:20:47,043][85175] Updated weights for policy 1, policy_version 10360 (0.0009) +[2023-10-11 15:20:48,309][85176] Updated weights for policy 0, policy_version 10212 (0.0009) +[2023-10-11 15:20:48,685][85176] Updated weights for policy 0, policy_version 10222 (0.0008) +[2023-10-11 15:20:49,057][85176] Updated weights for policy 0, policy_version 10232 (0.0009) +[2023-10-11 15:20:50,999][85175] Updated weights for policy 1, policy_version 10370 (0.0007) +[2023-10-11 15:20:51,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 21102592. Throughput: 0: 1649.9, 1: 1699.6. Samples: 5281304. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-10-11 15:20:51,063][84230] Avg episode reward: [(0, '7.070'), (1, '6.340')] +[2023-10-11 15:20:51,371][85175] Updated weights for policy 1, policy_version 10380 (0.0007) +[2023-10-11 15:20:51,743][85175] Updated weights for policy 1, policy_version 10390 (0.0007) +[2023-10-11 15:20:52,107][85175] Updated weights for policy 1, policy_version 10400 (0.0007) +[2023-10-11 15:20:53,132][85176] Updated weights for policy 0, policy_version 10242 (0.0009) +[2023-10-11 15:20:53,501][85176] Updated weights for policy 0, policy_version 10252 (0.0008) +[2023-10-11 15:20:53,881][85176] Updated weights for policy 0, policy_version 10262 (0.0008) +[2023-10-11 15:20:54,251][85176] Updated weights for policy 0, policy_version 10272 (0.0008) +[2023-10-11 15:20:56,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 21168128. Throughput: 0: 1656.1, 1: 1706.8. Samples: 5301484. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-10-11 15:20:56,063][84230] Avg episode reward: [(0, '6.730'), (1, '6.280')] +[2023-10-11 15:20:56,225][85175] Updated weights for policy 1, policy_version 10410 (0.0008) +[2023-10-11 15:20:56,592][85175] Updated weights for policy 1, policy_version 10420 (0.0007) +[2023-10-11 15:20:56,963][85175] Updated weights for policy 1, policy_version 10430 (0.0009) +[2023-10-11 15:20:58,559][85176] Updated weights for policy 0, policy_version 10282 (0.0007) +[2023-10-11 15:20:58,942][85176] Updated weights for policy 0, policy_version 10292 (0.0008) +[2023-10-11 15:20:59,313][85176] Updated weights for policy 0, policy_version 10302 (0.0009) +[2023-10-11 15:21:00,788][85175] Updated weights for policy 1, policy_version 10440 (0.0008) +[2023-10-11 15:21:01,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 21233664. Throughput: 0: 1665.4, 1: 1704.8. Samples: 5322192. Policy #0 lag: (min: 19.0, avg: 25.6, max: 51.0) +[2023-10-11 15:21:01,063][84230] Avg episode reward: [(0, '6.750'), (1, '6.450')] +[2023-10-11 15:21:01,159][85175] Updated weights for policy 1, policy_version 10450 (0.0009) +[2023-10-11 15:21:01,534][85175] Updated weights for policy 1, policy_version 10460 (0.0008) +[2023-10-11 15:21:03,306][85176] Updated weights for policy 0, policy_version 10312 (0.0010) +[2023-10-11 15:21:03,676][85176] Updated weights for policy 0, policy_version 10322 (0.0010) +[2023-10-11 15:21:04,051][85176] Updated weights for policy 0, policy_version 10332 (0.0009) +[2023-10-11 15:21:05,566][85175] Updated weights for policy 1, policy_version 10470 (0.0009) +[2023-10-11 15:21:05,934][85175] Updated weights for policy 1, policy_version 10480 (0.0010) +[2023-10-11 15:21:06,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 21299200. Throughput: 0: 1653.5, 1: 1706.8. Samples: 5332048. 
Policy #0 lag: (min: 19.0, avg: 25.6, max: 51.0) +[2023-10-11 15:21:06,063][84230] Avg episode reward: [(0, '6.750'), (1, '6.840')] +[2023-10-11 15:21:06,308][85175] Updated weights for policy 1, policy_version 10490 (0.0009) +[2023-10-11 15:21:08,084][85176] Updated weights for policy 0, policy_version 10342 (0.0008) +[2023-10-11 15:21:08,459][85176] Updated weights for policy 0, policy_version 10352 (0.0008) +[2023-10-11 15:21:08,841][85176] Updated weights for policy 0, policy_version 10362 (0.0009) +[2023-10-11 15:21:10,412][85175] Updated weights for policy 1, policy_version 10500 (0.0007) +[2023-10-11 15:21:10,778][85175] Updated weights for policy 1, policy_version 10510 (0.0011) +[2023-10-11 15:21:11,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 21364736. Throughput: 0: 1667.2, 1: 1703.3. Samples: 5352160. Policy #0 lag: (min: 22.0, avg: 23.6, max: 45.0) +[2023-10-11 15:21:11,064][84230] Avg episode reward: [(0, '6.640'), (1, '7.150')] +[2023-10-11 15:21:11,149][85175] Updated weights for policy 1, policy_version 10520 (0.0009) +[2023-10-11 15:21:11,443][85000] Saving new best policy, reward=7.150! +[2023-10-11 15:21:12,765][85176] Updated weights for policy 0, policy_version 10372 (0.0009) +[2023-10-11 15:21:13,149][85176] Updated weights for policy 0, policy_version 10382 (0.0009) +[2023-10-11 15:21:13,526][85176] Updated weights for policy 0, policy_version 10392 (0.0010) +[2023-10-11 15:21:15,416][85175] Updated weights for policy 1, policy_version 10530 (0.0008) +[2023-10-11 15:21:15,788][85175] Updated weights for policy 1, policy_version 10540 (0.0009) +[2023-10-11 15:21:16,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 21430272. Throughput: 0: 1681.4, 1: 1691.2. Samples: 5372730. 
Policy #0 lag: (min: 22.0, avg: 23.6, max: 45.0) +[2023-10-11 15:21:16,063][84230] Avg episode reward: [(0, '6.640'), (1, '6.680')] +[2023-10-11 15:21:16,163][85175] Updated weights for policy 1, policy_version 10550 (0.0007) +[2023-10-11 15:21:16,537][85175] Updated weights for policy 1, policy_version 10560 (0.0009) +[2023-10-11 15:21:17,681][85176] Updated weights for policy 0, policy_version 10402 (0.0010) +[2023-10-11 15:21:18,051][85176] Updated weights for policy 0, policy_version 10412 (0.0007) +[2023-10-11 15:21:18,429][85176] Updated weights for policy 0, policy_version 10422 (0.0007) +[2023-10-11 15:21:18,800][85176] Updated weights for policy 0, policy_version 10432 (0.0007) +[2023-10-11 15:21:20,593][85175] Updated weights for policy 1, policy_version 10570 (0.0010) +[2023-10-11 15:21:20,966][85175] Updated weights for policy 1, policy_version 10580 (0.0008) +[2023-10-11 15:21:21,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 21495808. Throughput: 0: 1658.8, 1: 1695.0. Samples: 5382246. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-11 15:21:21,063][84230] Avg episode reward: [(0, '6.750'), (1, '6.520')] +[2023-10-11 15:21:21,332][85175] Updated weights for policy 1, policy_version 10590 (0.0009) +[2023-10-11 15:21:22,800][85176] Updated weights for policy 0, policy_version 10442 (0.0007) +[2023-10-11 15:21:23,173][85176] Updated weights for policy 0, policy_version 10452 (0.0007) +[2023-10-11 15:21:23,535][85176] Updated weights for policy 0, policy_version 10462 (0.0008) +[2023-10-11 15:21:25,304][85175] Updated weights for policy 1, policy_version 10600 (0.0010) +[2023-10-11 15:21:25,676][85175] Updated weights for policy 1, policy_version 10610 (0.0007) +[2023-10-11 15:21:26,053][85175] Updated weights for policy 1, policy_version 10620 (0.0008) +[2023-10-11 15:21:26,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 21561344. 
Throughput: 0: 1677.0, 1: 1693.1. Samples: 5402506. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-11 15:21:26,063][84230] Avg episode reward: [(0, '6.740'), (1, '6.420')] +[2023-10-11 15:21:27,657][85176] Updated weights for policy 0, policy_version 10472 (0.0007) +[2023-10-11 15:21:28,027][85176] Updated weights for policy 0, policy_version 10482 (0.0007) +[2023-10-11 15:21:28,396][85176] Updated weights for policy 0, policy_version 10492 (0.0007) +[2023-10-11 15:21:29,874][85175] Updated weights for policy 1, policy_version 10630 (0.0008) +[2023-10-11 15:21:30,249][85175] Updated weights for policy 1, policy_version 10640 (0.0009) +[2023-10-11 15:21:30,618][85175] Updated weights for policy 1, policy_version 10650 (0.0008) +[2023-10-11 15:21:31,062][84230] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 21659648. Throughput: 0: 1682.9, 1: 1672.6. Samples: 5422646. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 15:21:31,063][84230] Avg episode reward: [(0, '6.850'), (1, '6.500')] +[2023-10-11 15:21:32,250][85176] Updated weights for policy 0, policy_version 10502 (0.0007) +[2023-10-11 15:21:32,632][85176] Updated weights for policy 0, policy_version 10512 (0.0009) +[2023-10-11 15:21:33,011][85176] Updated weights for policy 0, policy_version 10522 (0.0010) +[2023-10-11 15:21:34,813][85175] Updated weights for policy 1, policy_version 10660 (0.0008) +[2023-10-11 15:21:35,203][85175] Updated weights for policy 1, policy_version 10670 (0.0008) +[2023-10-11 15:21:35,573][85175] Updated weights for policy 1, policy_version 10680 (0.0008) +[2023-10-11 15:21:36,063][84230] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 21725184. Throughput: 0: 1668.4, 1: 1694.8. Samples: 5432650. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 15:21:36,064][84230] Avg episode reward: [(0, '7.070'), (1, '6.780')] +[2023-10-11 15:21:37,084][85176] Updated weights for policy 0, policy_version 10532 (0.0009) +[2023-10-11 15:21:37,456][85176] Updated weights for policy 0, policy_version 10542 (0.0011) +[2023-10-11 15:21:37,841][85176] Updated weights for policy 0, policy_version 10552 (0.0010) +[2023-10-11 15:21:39,607][85175] Updated weights for policy 1, policy_version 10690 (0.0010) +[2023-10-11 15:21:39,973][85175] Updated weights for policy 1, policy_version 10700 (0.0009) +[2023-10-11 15:21:40,338][85175] Updated weights for policy 1, policy_version 10710 (0.0008) +[2023-10-11 15:21:40,703][85175] Updated weights for policy 1, policy_version 10720 (0.0007) +[2023-10-11 15:21:41,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 21790720. Throughput: 0: 1686.2, 1: 1689.8. Samples: 5453404. Policy #0 lag: (min: 25.0, avg: 36.1, max: 57.0) +[2023-10-11 15:21:41,063][84230] Avg episode reward: [(0, '7.180'), (1, '6.580')] +[2023-10-11 15:21:42,011][85176] Updated weights for policy 0, policy_version 10562 (0.0008) +[2023-10-11 15:21:42,390][85176] Updated weights for policy 0, policy_version 10572 (0.0009) +[2023-10-11 15:21:42,762][85176] Updated weights for policy 0, policy_version 10582 (0.0011) +[2023-10-11 15:21:43,141][85176] Updated weights for policy 0, policy_version 10592 (0.0009) +[2023-10-11 15:21:44,823][85175] Updated weights for policy 1, policy_version 10730 (0.0008) +[2023-10-11 15:21:45,187][85175] Updated weights for policy 1, policy_version 10740 (0.0008) +[2023-10-11 15:21:45,559][85175] Updated weights for policy 1, policy_version 10750 (0.0009) +[2023-10-11 15:21:46,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 21856256. Throughput: 0: 1686.3, 1: 1656.4. Samples: 5472618. 
Policy #0 lag: (min: 25.0, avg: 36.1, max: 57.0) +[2023-10-11 15:21:46,064][84230] Avg episode reward: [(0, '6.740'), (1, '6.720')] +[2023-10-11 15:21:47,340][85176] Updated weights for policy 0, policy_version 10602 (0.0010) +[2023-10-11 15:21:47,715][85176] Updated weights for policy 0, policy_version 10612 (0.0009) +[2023-10-11 15:21:48,086][85176] Updated weights for policy 0, policy_version 10622 (0.0008) +[2023-10-11 15:21:49,597][85175] Updated weights for policy 1, policy_version 10760 (0.0008) +[2023-10-11 15:21:49,957][85175] Updated weights for policy 1, policy_version 10770 (0.0008) +[2023-10-11 15:21:50,334][85175] Updated weights for policy 1, policy_version 10780 (0.0008) +[2023-10-11 15:21:51,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 21921792. Throughput: 0: 1666.8, 1: 1682.7. Samples: 5482774. Policy #0 lag: (min: 30.0, avg: 33.1, max: 62.0) +[2023-10-11 15:21:51,063][84230] Avg episode reward: [(0, '6.420'), (1, '6.520')] +[2023-10-11 15:21:52,113][85176] Updated weights for policy 0, policy_version 10632 (0.0009) +[2023-10-11 15:21:52,484][85176] Updated weights for policy 0, policy_version 10642 (0.0010) +[2023-10-11 15:21:52,856][85176] Updated weights for policy 0, policy_version 10652 (0.0007) +[2023-10-11 15:21:54,460][85175] Updated weights for policy 1, policy_version 10790 (0.0008) +[2023-10-11 15:21:54,829][85175] Updated weights for policy 1, policy_version 10800 (0.0007) +[2023-10-11 15:21:55,195][85175] Updated weights for policy 1, policy_version 10810 (0.0007) +[2023-10-11 15:21:56,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 21987328. Throughput: 0: 1677.3, 1: 1670.7. Samples: 5502818. 
Policy #0 lag: (min: 30.0, avg: 33.1, max: 62.0) +[2023-10-11 15:21:56,063][84230] Avg episode reward: [(0, '6.190'), (1, '6.720')] +[2023-10-11 15:21:57,163][85176] Updated weights for policy 0, policy_version 10662 (0.0007) +[2023-10-11 15:21:57,524][85176] Updated weights for policy 0, policy_version 10672 (0.0010) +[2023-10-11 15:21:57,893][85176] Updated weights for policy 0, policy_version 10682 (0.0008) +[2023-10-11 15:21:59,296][85175] Updated weights for policy 1, policy_version 10820 (0.0009) +[2023-10-11 15:21:59,665][85175] Updated weights for policy 1, policy_version 10830 (0.0010) +[2023-10-11 15:22:00,032][85175] Updated weights for policy 1, policy_version 10840 (0.0009) +[2023-10-11 15:22:01,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 22052864. Throughput: 0: 1674.4, 1: 1657.1. Samples: 5522648. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-11 15:22:01,064][84230] Avg episode reward: [(0, '6.520'), (1, '6.580')] +[2023-10-11 15:22:01,787][85176] Updated weights for policy 0, policy_version 10692 (0.0008) +[2023-10-11 15:22:02,169][85176] Updated weights for policy 0, policy_version 10702 (0.0011) +[2023-10-11 15:22:02,545][85176] Updated weights for policy 0, policy_version 10712 (0.0010) +[2023-10-11 15:22:04,200][85175] Updated weights for policy 1, policy_version 10850 (0.0008) +[2023-10-11 15:22:04,571][85175] Updated weights for policy 1, policy_version 10860 (0.0007) +[2023-10-11 15:22:04,942][85175] Updated weights for policy 1, policy_version 10870 (0.0007) +[2023-10-11 15:22:05,313][85175] Updated weights for policy 1, policy_version 10880 (0.0008) +[2023-10-11 15:22:06,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 22118400. Throughput: 0: 1669.3, 1: 1682.7. Samples: 5533084. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-11 15:22:06,063][84230] Avg episode reward: [(0, '6.420'), (1, '6.620')] +[2023-10-11 15:22:06,592][85176] Updated weights for policy 0, policy_version 10722 (0.0008) +[2023-10-11 15:22:06,978][85176] Updated weights for policy 0, policy_version 10732 (0.0007) +[2023-10-11 15:22:07,357][85176] Updated weights for policy 0, policy_version 10742 (0.0008) +[2023-10-11 15:22:07,719][85176] Updated weights for policy 0, policy_version 10752 (0.0008) +[2023-10-11 15:22:09,424][85175] Updated weights for policy 1, policy_version 10890 (0.0009) +[2023-10-11 15:22:09,795][85175] Updated weights for policy 1, policy_version 10900 (0.0007) +[2023-10-11 15:22:10,164][85175] Updated weights for policy 1, policy_version 10910 (0.0007) +[2023-10-11 15:22:11,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 22183936. Throughput: 0: 1676.9, 1: 1676.3. Samples: 5553398. Policy #0 lag: (min: 2.0, avg: 9.7, max: 34.0) +[2023-10-11 15:22:11,063][84230] Avg episode reward: [(0, '6.500'), (1, '6.570')] +[2023-10-11 15:22:11,927][85176] Updated weights for policy 0, policy_version 10762 (0.0007) +[2023-10-11 15:22:12,291][85176] Updated weights for policy 0, policy_version 10772 (0.0009) +[2023-10-11 15:22:12,656][85176] Updated weights for policy 0, policy_version 10782 (0.0008) +[2023-10-11 15:22:14,258][85175] Updated weights for policy 1, policy_version 10920 (0.0008) +[2023-10-11 15:22:14,617][85175] Updated weights for policy 1, policy_version 10930 (0.0007) +[2023-10-11 15:22:14,987][85175] Updated weights for policy 1, policy_version 10940 (0.0008) +[2023-10-11 15:22:16,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 22249472. Throughput: 0: 1678.5, 1: 1669.3. Samples: 5573300. 
Policy #0 lag: (min: 2.0, avg: 9.7, max: 34.0) +[2023-10-11 15:22:16,064][84230] Avg episode reward: [(0, '6.780'), (1, '6.760')] +[2023-10-11 15:22:16,078][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000010784_11042816.pth... +[2023-10-11 15:22:16,078][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000010944_11206656.pth... +[2023-10-11 15:22:16,109][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000009248_9469952.pth +[2023-10-11 15:22:16,111][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000009376_9601024.pth +[2023-10-11 15:22:16,568][85176] Updated weights for policy 0, policy_version 10792 (0.0009) +[2023-10-11 15:22:16,932][85176] Updated weights for policy 0, policy_version 10802 (0.0010) +[2023-10-11 15:22:17,311][85176] Updated weights for policy 0, policy_version 10812 (0.0008) +[2023-10-11 15:22:19,071][85175] Updated weights for policy 1, policy_version 10950 (0.0009) +[2023-10-11 15:22:19,441][85175] Updated weights for policy 1, policy_version 10960 (0.0009) +[2023-10-11 15:22:19,810][85175] Updated weights for policy 1, policy_version 10970 (0.0009) +[2023-10-11 15:22:21,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 22315008. Throughput: 0: 1677.0, 1: 1679.8. Samples: 5583706. 
Policy #0 lag: (min: 29.0, avg: 35.7, max: 61.0) +[2023-10-11 15:22:21,063][84230] Avg episode reward: [(0, '6.590'), (1, '6.810')] +[2023-10-11 15:22:21,378][85176] Updated weights for policy 0, policy_version 10822 (0.0010) +[2023-10-11 15:22:21,748][85176] Updated weights for policy 0, policy_version 10832 (0.0007) +[2023-10-11 15:22:22,127][85176] Updated weights for policy 0, policy_version 10842 (0.0007) +[2023-10-11 15:22:23,917][85175] Updated weights for policy 1, policy_version 10980 (0.0008) +[2023-10-11 15:22:24,281][85175] Updated weights for policy 1, policy_version 10990 (0.0009) +[2023-10-11 15:22:24,650][85175] Updated weights for policy 1, policy_version 11000 (0.0007) +[2023-10-11 15:22:25,938][85176] Updated weights for policy 0, policy_version 10852 (0.0010) +[2023-10-11 15:22:26,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 22380544. Throughput: 0: 1678.1, 1: 1660.5. Samples: 5603642. Policy #0 lag: (min: 29.0, avg: 35.7, max: 61.0) +[2023-10-11 15:22:26,064][84230] Avg episode reward: [(0, '6.520'), (1, '6.510')] +[2023-10-11 15:22:26,318][85176] Updated weights for policy 0, policy_version 10862 (0.0007) +[2023-10-11 15:22:26,683][85176] Updated weights for policy 0, policy_version 10872 (0.0009) +[2023-10-11 15:22:28,658][85175] Updated weights for policy 1, policy_version 11010 (0.0008) +[2023-10-11 15:22:29,028][85175] Updated weights for policy 1, policy_version 11020 (0.0009) +[2023-10-11 15:22:29,401][85175] Updated weights for policy 1, policy_version 11030 (0.0008) +[2023-10-11 15:22:29,771][85175] Updated weights for policy 1, policy_version 11040 (0.0008) +[2023-10-11 15:22:30,803][85176] Updated weights for policy 0, policy_version 10882 (0.0008) +[2023-10-11 15:22:31,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 22446080. Throughput: 0: 1684.6, 1: 1678.0. Samples: 5623934. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:22:31,064][84230] Avg episode reward: [(0, '6.640'), (1, '6.500')] +[2023-10-11 15:22:31,183][85176] Updated weights for policy 0, policy_version 10892 (0.0008) +[2023-10-11 15:22:31,552][85176] Updated weights for policy 0, policy_version 10902 (0.0008) +[2023-10-11 15:22:31,931][85176] Updated weights for policy 0, policy_version 10912 (0.0007) +[2023-10-11 15:22:33,772][85175] Updated weights for policy 1, policy_version 11050 (0.0008) +[2023-10-11 15:22:34,145][85175] Updated weights for policy 1, policy_version 11060 (0.0008) +[2023-10-11 15:22:34,509][85175] Updated weights for policy 1, policy_version 11070 (0.0009) +[2023-10-11 15:22:35,968][85176] Updated weights for policy 0, policy_version 10922 (0.0008) +[2023-10-11 15:22:36,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 22511616. Throughput: 0: 1686.6, 1: 1678.9. Samples: 5634222. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:22:36,063][84230] Avg episode reward: [(0, '6.970'), (1, '6.760')] +[2023-10-11 15:22:36,351][85176] Updated weights for policy 0, policy_version 10932 (0.0010) +[2023-10-11 15:22:36,739][85176] Updated weights for policy 0, policy_version 10942 (0.0009) +[2023-10-11 15:22:38,466][85175] Updated weights for policy 1, policy_version 11080 (0.0009) +[2023-10-11 15:22:38,826][85175] Updated weights for policy 1, policy_version 11090 (0.0007) +[2023-10-11 15:22:39,200][85175] Updated weights for policy 1, policy_version 11100 (0.0009) +[2023-10-11 15:22:40,909][85176] Updated weights for policy 0, policy_version 10952 (0.0008) +[2023-10-11 15:22:41,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 22577152. Throughput: 0: 1686.3, 1: 1671.2. Samples: 5653904. 
Policy #0 lag: (min: 31.0, avg: 39.3, max: 63.0) +[2023-10-11 15:22:41,063][84230] Avg episode reward: [(0, '7.150'), (1, '6.610')] +[2023-10-11 15:22:41,285][85176] Updated weights for policy 0, policy_version 10962 (0.0007) +[2023-10-11 15:22:41,660][85176] Updated weights for policy 0, policy_version 10972 (0.0009) +[2023-10-11 15:22:43,341][85175] Updated weights for policy 1, policy_version 11110 (0.0011) +[2023-10-11 15:22:43,709][85175] Updated weights for policy 1, policy_version 11120 (0.0010) +[2023-10-11 15:22:44,072][85175] Updated weights for policy 1, policy_version 11130 (0.0009) +[2023-10-11 15:22:45,829][85176] Updated weights for policy 0, policy_version 10982 (0.0009) +[2023-10-11 15:22:46,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 22642688. Throughput: 0: 1679.7, 1: 1694.9. Samples: 5674504. Policy #0 lag: (min: 31.0, avg: 39.3, max: 63.0) +[2023-10-11 15:22:46,064][84230] Avg episode reward: [(0, '7.060'), (1, '6.520')] +[2023-10-11 15:22:46,201][85176] Updated weights for policy 0, policy_version 10992 (0.0009) +[2023-10-11 15:22:46,566][85176] Updated weights for policy 0, policy_version 11002 (0.0010) +[2023-10-11 15:22:48,031][85175] Updated weights for policy 1, policy_version 11140 (0.0009) +[2023-10-11 15:22:48,408][85175] Updated weights for policy 1, policy_version 11150 (0.0010) +[2023-10-11 15:22:48,771][85175] Updated weights for policy 1, policy_version 11160 (0.0011) +[2023-10-11 15:22:50,934][85176] Updated weights for policy 0, policy_version 11012 (0.0008) +[2023-10-11 15:22:51,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 22708224. Throughput: 0: 1678.1, 1: 1682.5. Samples: 5684314. 
Policy #0 lag: (min: 31.0, avg: 31.2, max: 40.0) +[2023-10-11 15:22:51,063][84230] Avg episode reward: [(0, '7.120'), (1, '6.440')] +[2023-10-11 15:22:51,302][85176] Updated weights for policy 0, policy_version 11022 (0.0010) +[2023-10-11 15:22:51,685][85176] Updated weights for policy 0, policy_version 11032 (0.0007) +[2023-10-11 15:22:52,719][85175] Updated weights for policy 1, policy_version 11170 (0.0011) +[2023-10-11 15:22:53,086][85175] Updated weights for policy 1, policy_version 11180 (0.0010) +[2023-10-11 15:22:53,460][85175] Updated weights for policy 1, policy_version 11190 (0.0011) +[2023-10-11 15:22:53,827][85175] Updated weights for policy 1, policy_version 11200 (0.0009) +[2023-10-11 15:22:55,731][85176] Updated weights for policy 0, policy_version 11042 (0.0007) +[2023-10-11 15:22:56,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 22773760. Throughput: 0: 1676.7, 1: 1674.4. Samples: 5704200. Policy #0 lag: (min: 31.0, avg: 31.2, max: 40.0) +[2023-10-11 15:22:56,063][84230] Avg episode reward: [(0, '7.010'), (1, '6.350')] +[2023-10-11 15:22:56,116][85176] Updated weights for policy 0, policy_version 11052 (0.0007) +[2023-10-11 15:22:56,481][85176] Updated weights for policy 0, policy_version 11062 (0.0008) +[2023-10-11 15:22:56,853][85176] Updated weights for policy 0, policy_version 11072 (0.0009) +[2023-10-11 15:22:57,692][85175] Updated weights for policy 1, policy_version 11210 (0.0009) +[2023-10-11 15:22:58,048][85175] Updated weights for policy 1, policy_version 11220 (0.0009) +[2023-10-11 15:22:58,420][85175] Updated weights for policy 1, policy_version 11230 (0.0007) +[2023-10-11 15:23:01,000][85176] Updated weights for policy 0, policy_version 11082 (0.0008) +[2023-10-11 15:23:01,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 22839296. Throughput: 0: 1667.3, 1: 1701.3. Samples: 5724886. 
Policy #0 lag: (min: 31.0, avg: 44.4, max: 63.0) +[2023-10-11 15:23:01,063][84230] Avg episode reward: [(0, '7.080'), (1, '6.830')] +[2023-10-11 15:23:01,384][85176] Updated weights for policy 0, policy_version 11092 (0.0008) +[2023-10-11 15:23:01,756][85176] Updated weights for policy 0, policy_version 11102 (0.0008) +[2023-10-11 15:23:02,458][85175] Updated weights for policy 1, policy_version 11240 (0.0010) +[2023-10-11 15:23:02,825][85175] Updated weights for policy 1, policy_version 11250 (0.0008) +[2023-10-11 15:23:03,196][85175] Updated weights for policy 1, policy_version 11260 (0.0007) +[2023-10-11 15:23:05,922][85176] Updated weights for policy 0, policy_version 11112 (0.0007) +[2023-10-11 15:23:06,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 22904832. Throughput: 0: 1664.5, 1: 1673.5. Samples: 5733914. Policy #0 lag: (min: 31.0, avg: 44.4, max: 63.0) +[2023-10-11 15:23:06,064][84230] Avg episode reward: [(0, '7.080'), (1, '6.790')] +[2023-10-11 15:23:06,295][85176] Updated weights for policy 0, policy_version 11122 (0.0007) +[2023-10-11 15:23:06,660][85176] Updated weights for policy 0, policy_version 11132 (0.0008) +[2023-10-11 15:23:07,214][85175] Updated weights for policy 1, policy_version 11270 (0.0009) +[2023-10-11 15:23:07,579][85175] Updated weights for policy 1, policy_version 11280 (0.0010) +[2023-10-11 15:23:07,954][85175] Updated weights for policy 1, policy_version 11290 (0.0010) +[2023-10-11 15:23:10,717][85176] Updated weights for policy 0, policy_version 11142 (0.0008) +[2023-10-11 15:23:11,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 22970368. Throughput: 0: 1658.5, 1: 1696.9. Samples: 5754638. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:23:11,063][84230] Avg episode reward: [(0, '7.150'), (1, '6.680')] +[2023-10-11 15:23:11,082][85176] Updated weights for policy 0, policy_version 11152 (0.0007) +[2023-10-11 15:23:11,463][85176] Updated weights for policy 0, policy_version 11162 (0.0008) +[2023-10-11 15:23:11,924][85175] Updated weights for policy 1, policy_version 11300 (0.0010) +[2023-10-11 15:23:12,323][85175] Updated weights for policy 1, policy_version 11310 (0.0009) +[2023-10-11 15:23:12,704][85175] Updated weights for policy 1, policy_version 11320 (0.0010) +[2023-10-11 15:23:15,595][85176] Updated weights for policy 0, policy_version 11172 (0.0008) +[2023-10-11 15:23:15,977][85176] Updated weights for policy 0, policy_version 11182 (0.0007) +[2023-10-11 15:23:16,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 23035904. Throughput: 0: 1650.5, 1: 1707.8. Samples: 5775054. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:23:16,063][84230] Avg episode reward: [(0, '6.770'), (1, '7.040')] +[2023-10-11 15:23:16,348][85176] Updated weights for policy 0, policy_version 11192 (0.0009) +[2023-10-11 15:23:16,806][85175] Updated weights for policy 1, policy_version 11330 (0.0009) +[2023-10-11 15:23:17,174][85175] Updated weights for policy 1, policy_version 11340 (0.0008) +[2023-10-11 15:23:17,536][85175] Updated weights for policy 1, policy_version 11350 (0.0009) +[2023-10-11 15:23:17,909][85175] Updated weights for policy 1, policy_version 11360 (0.0008) +[2023-10-11 15:23:20,575][85176] Updated weights for policy 0, policy_version 11202 (0.0008) +[2023-10-11 15:23:20,982][85176] Updated weights for policy 0, policy_version 11212 (0.0010) +[2023-10-11 15:23:21,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 23101440. Throughput: 0: 1654.0, 1: 1681.3. Samples: 5784308. 
Policy #0 lag: (min: 10.0, avg: 13.1, max: 42.0) +[2023-10-11 15:23:21,063][84230] Avg episode reward: [(0, '6.480'), (1, '6.770')] +[2023-10-11 15:23:21,362][85176] Updated weights for policy 0, policy_version 11222 (0.0009) +[2023-10-11 15:23:21,728][85176] Updated weights for policy 0, policy_version 11232 (0.0007) +[2023-10-11 15:23:22,165][85175] Updated weights for policy 1, policy_version 11370 (0.0009) +[2023-10-11 15:23:22,536][85175] Updated weights for policy 1, policy_version 11380 (0.0008) +[2023-10-11 15:23:22,896][85175] Updated weights for policy 1, policy_version 11390 (0.0008) +[2023-10-11 15:23:25,764][85176] Updated weights for policy 0, policy_version 11242 (0.0009) +[2023-10-11 15:23:26,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 23166976. Throughput: 0: 1650.1, 1: 1706.8. Samples: 5804964. Policy #0 lag: (min: 10.0, avg: 13.1, max: 42.0) +[2023-10-11 15:23:26,063][84230] Avg episode reward: [(0, '5.980'), (1, '6.670')] +[2023-10-11 15:23:26,149][85176] Updated weights for policy 0, policy_version 11252 (0.0008) +[2023-10-11 15:23:26,523][85176] Updated weights for policy 0, policy_version 11262 (0.0009) +[2023-10-11 15:23:26,765][85175] Updated weights for policy 1, policy_version 11400 (0.0010) +[2023-10-11 15:23:27,137][85175] Updated weights for policy 1, policy_version 11410 (0.0011) +[2023-10-11 15:23:27,502][85175] Updated weights for policy 1, policy_version 11420 (0.0010) +[2023-10-11 15:23:30,532][85176] Updated weights for policy 0, policy_version 11272 (0.0008) +[2023-10-11 15:23:30,901][85176] Updated weights for policy 0, policy_version 11282 (0.0007) +[2023-10-11 15:23:31,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 23232512. Throughput: 0: 1650.3, 1: 1701.6. Samples: 5825340. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 15:23:31,064][84230] Avg episode reward: [(0, '6.090'), (1, '6.720')] +[2023-10-11 15:23:31,284][85176] Updated weights for policy 0, policy_version 11292 (0.0008) +[2023-10-11 15:23:31,490][85175] Updated weights for policy 1, policy_version 11430 (0.0008) +[2023-10-11 15:23:31,871][85175] Updated weights for policy 1, policy_version 11440 (0.0007) +[2023-10-11 15:23:32,243][85175] Updated weights for policy 1, policy_version 11450 (0.0007) +[2023-10-11 15:23:35,253][85176] Updated weights for policy 0, policy_version 11302 (0.0007) +[2023-10-11 15:23:35,633][85176] Updated weights for policy 0, policy_version 11312 (0.0008) +[2023-10-11 15:23:36,004][85176] Updated weights for policy 0, policy_version 11322 (0.0008) +[2023-10-11 15:23:36,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 23298048. Throughput: 0: 1658.4, 1: 1685.1. Samples: 5834770. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 15:23:36,064][84230] Avg episode reward: [(0, '6.420'), (1, '6.650')] +[2023-10-11 15:23:36,320][85175] Updated weights for policy 1, policy_version 11460 (0.0009) +[2023-10-11 15:23:36,685][85175] Updated weights for policy 1, policy_version 11470 (0.0008) +[2023-10-11 15:23:37,055][85175] Updated weights for policy 1, policy_version 11480 (0.0007) +[2023-10-11 15:23:40,285][85176] Updated weights for policy 0, policy_version 11332 (0.0008) +[2023-10-11 15:23:40,666][85176] Updated weights for policy 0, policy_version 11342 (0.0008) +[2023-10-11 15:23:41,041][85176] Updated weights for policy 0, policy_version 11352 (0.0009) +[2023-10-11 15:23:41,042][85175] Updated weights for policy 1, policy_version 11490 (0.0009) +[2023-10-11 15:23:41,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 23363584. Throughput: 0: 1653.6, 1: 1705.4. Samples: 5855352. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:23:41,063][84230] Avg episode reward: [(0, '6.860'), (1, '6.810')] +[2023-10-11 15:23:41,413][85175] Updated weights for policy 1, policy_version 11500 (0.0008) +[2023-10-11 15:23:41,782][85175] Updated weights for policy 1, policy_version 11510 (0.0007) +[2023-10-11 15:23:42,146][85175] Updated weights for policy 1, policy_version 11520 (0.0008) +[2023-10-11 15:23:45,156][85176] Updated weights for policy 0, policy_version 11362 (0.0009) +[2023-10-11 15:23:45,525][85176] Updated weights for policy 0, policy_version 11372 (0.0008) +[2023-10-11 15:23:45,903][85176] Updated weights for policy 0, policy_version 11382 (0.0007) +[2023-10-11 15:23:46,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 23429120. Throughput: 0: 1644.1, 1: 1700.0. Samples: 5875372. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:23:46,064][84230] Avg episode reward: [(0, '7.450'), (1, '6.650')] +[2023-10-11 15:23:46,254][85175] Updated weights for policy 1, policy_version 11530 (0.0007) +[2023-10-11 15:23:46,268][84801] Saving new best policy, reward=7.450! +[2023-10-11 15:23:46,271][85176] Updated weights for policy 0, policy_version 11392 (0.0008) +[2023-10-11 15:23:46,624][85175] Updated weights for policy 1, policy_version 11540 (0.0009) +[2023-10-11 15:23:46,987][85175] Updated weights for policy 1, policy_version 11550 (0.0011) +[2023-10-11 15:23:50,329][85176] Updated weights for policy 0, policy_version 11402 (0.0009) +[2023-10-11 15:23:50,700][85176] Updated weights for policy 0, policy_version 11412 (0.0010) +[2023-10-11 15:23:51,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 23494656. Throughput: 0: 1656.4, 1: 1696.2. Samples: 5884780. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:23:51,063][84230] Avg episode reward: [(0, '7.490'), (1, '6.690')] +[2023-10-11 15:23:51,072][85176] Updated weights for policy 0, policy_version 11422 (0.0009) +[2023-10-11 15:23:51,134][85175] Updated weights for policy 1, policy_version 11560 (0.0008) +[2023-10-11 15:23:51,142][84801] Saving new best policy, reward=7.490! +[2023-10-11 15:23:51,490][85175] Updated weights for policy 1, policy_version 11570 (0.0008) +[2023-10-11 15:23:51,860][85175] Updated weights for policy 1, policy_version 11580 (0.0008) +[2023-10-11 15:23:55,160][85176] Updated weights for policy 0, policy_version 11432 (0.0007) +[2023-10-11 15:23:55,543][85176] Updated weights for policy 0, policy_version 11442 (0.0009) +[2023-10-11 15:23:55,909][85176] Updated weights for policy 0, policy_version 11452 (0.0009) +[2023-10-11 15:23:55,911][85175] Updated weights for policy 1, policy_version 11590 (0.0008) +[2023-10-11 15:23:56,062][84230] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 23592960. Throughput: 0: 1659.3, 1: 1694.3. Samples: 5905552. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:23:56,063][84230] Avg episode reward: [(0, '6.970'), (1, '6.820')] +[2023-10-11 15:23:56,273][85175] Updated weights for policy 1, policy_version 11600 (0.0008) +[2023-10-11 15:23:56,646][85175] Updated weights for policy 1, policy_version 11610 (0.0008) +[2023-10-11 15:24:00,209][85176] Updated weights for policy 0, policy_version 11462 (0.0008) +[2023-10-11 15:24:00,586][85176] Updated weights for policy 0, policy_version 11472 (0.0007) +[2023-10-11 15:24:00,776][85175] Updated weights for policy 1, policy_version 11620 (0.0008) +[2023-10-11 15:24:00,950][85176] Updated weights for policy 0, policy_version 11482 (0.0007) +[2023-10-11 15:24:01,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 23625728. Throughput: 0: 1652.5, 1: 1696.6. 
Samples: 5925764. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:24:01,063][84230] Avg episode reward: [(0, '6.850'), (1, '7.270')] +[2023-10-11 15:24:01,177][85175] Updated weights for policy 1, policy_version 11630 (0.0007) +[2023-10-11 15:24:01,536][85175] Updated weights for policy 1, policy_version 11640 (0.0008) +[2023-10-11 15:24:01,829][85000] Saving new best policy, reward=7.270! +[2023-10-11 15:24:05,033][85176] Updated weights for policy 0, policy_version 11492 (0.0008) +[2023-10-11 15:24:05,409][85176] Updated weights for policy 0, policy_version 11502 (0.0007) +[2023-10-11 15:24:05,623][85175] Updated weights for policy 1, policy_version 11650 (0.0007) +[2023-10-11 15:24:05,780][85176] Updated weights for policy 0, policy_version 11512 (0.0007) +[2023-10-11 15:24:05,990][85175] Updated weights for policy 1, policy_version 11660 (0.0007) +[2023-10-11 15:24:06,062][84230] Fps is (10 sec: 9830.4, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 23691264. Throughput: 0: 1662.0, 1: 1692.4. Samples: 5935254. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:24:06,063][84230] Avg episode reward: [(0, '6.590'), (1, '7.300')] +[2023-10-11 15:24:06,366][85175] Updated weights for policy 1, policy_version 11670 (0.0007) +[2023-10-11 15:24:06,731][85000] Saving new best policy, reward=7.300! 
+[2023-10-11 15:24:06,735][85175] Updated weights for policy 1, policy_version 11680 (0.0009) +[2023-10-11 15:24:09,871][85176] Updated weights for policy 0, policy_version 11522 (0.0007) +[2023-10-11 15:24:10,265][85176] Updated weights for policy 0, policy_version 11532 (0.0009) +[2023-10-11 15:24:10,632][85176] Updated weights for policy 0, policy_version 11542 (0.0009) +[2023-10-11 15:24:10,822][85175] Updated weights for policy 1, policy_version 11690 (0.0007) +[2023-10-11 15:24:11,005][85176] Updated weights for policy 0, policy_version 11552 (0.0009) +[2023-10-11 15:24:11,063][84230] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 23789568. Throughput: 0: 1669.2, 1: 1685.1. Samples: 5955912. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:24:11,064][84230] Avg episode reward: [(0, '6.350'), (1, '7.070')] +[2023-10-11 15:24:11,200][85175] Updated weights for policy 1, policy_version 11700 (0.0009) +[2023-10-11 15:24:11,569][85175] Updated weights for policy 1, policy_version 11710 (0.0011) +[2023-10-11 15:24:15,104][85176] Updated weights for policy 0, policy_version 11562 (0.0009) +[2023-10-11 15:24:15,482][85176] Updated weights for policy 0, policy_version 11572 (0.0010) +[2023-10-11 15:24:15,572][85175] Updated weights for policy 1, policy_version 11720 (0.0009) +[2023-10-11 15:24:15,852][85176] Updated weights for policy 0, policy_version 11582 (0.0007) +[2023-10-11 15:24:15,940][85175] Updated weights for policy 1, policy_version 11730 (0.0008) +[2023-10-11 15:24:16,062][84230] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 23855104. Throughput: 0: 1648.7, 1: 1680.3. Samples: 5975144. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:24:16,063][84230] Avg episode reward: [(0, '6.130'), (1, '6.870')] +[2023-10-11 15:24:16,074][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000011584_11862016.pth... 
+[2023-10-11 15:24:16,110][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000010016_10256384.pth +[2023-10-11 15:24:16,310][85175] Updated weights for policy 1, policy_version 11740 (0.0008) +[2023-10-11 15:24:16,450][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000011744_12025856.pth... +[2023-10-11 15:24:16,479][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000010144_10387456.pth +[2023-10-11 15:24:19,825][85176] Updated weights for policy 0, policy_version 11592 (0.0008) +[2023-10-11 15:24:20,191][85176] Updated weights for policy 0, policy_version 11602 (0.0008) +[2023-10-11 15:24:20,309][85175] Updated weights for policy 1, policy_version 11750 (0.0008) +[2023-10-11 15:24:20,569][85176] Updated weights for policy 0, policy_version 11612 (0.0009) +[2023-10-11 15:24:20,680][85175] Updated weights for policy 1, policy_version 11760 (0.0009) +[2023-10-11 15:24:21,053][85175] Updated weights for policy 1, policy_version 11770 (0.0007) +[2023-10-11 15:24:21,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 23920640. Throughput: 0: 1662.0, 1: 1685.7. Samples: 5985416. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:24:21,064][84230] Avg episode reward: [(0, '6.350'), (1, '7.080')] +[2023-10-11 15:24:24,830][85176] Updated weights for policy 0, policy_version 11622 (0.0008) +[2023-10-11 15:24:25,059][85175] Updated weights for policy 1, policy_version 11780 (0.0007) +[2023-10-11 15:24:25,193][85176] Updated weights for policy 0, policy_version 11632 (0.0008) +[2023-10-11 15:24:25,421][85175] Updated weights for policy 1, policy_version 11790 (0.0007) +[2023-10-11 15:24:25,574][85176] Updated weights for policy 0, policy_version 11642 (0.0007) +[2023-10-11 15:24:25,788][85175] Updated weights for policy 1, policy_version 11800 (0.0009) +[2023-10-11 15:24:26,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 23986176. Throughput: 0: 1663.4, 1: 1686.8. Samples: 6006108. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:24:26,063][84230] Avg episode reward: [(0, '6.410'), (1, '7.150')] +[2023-10-11 15:24:29,697][85176] Updated weights for policy 0, policy_version 11652 (0.0008) +[2023-10-11 15:24:29,775][85175] Updated weights for policy 1, policy_version 11810 (0.0007) +[2023-10-11 15:24:30,074][85176] Updated weights for policy 0, policy_version 11662 (0.0007) +[2023-10-11 15:24:30,148][85175] Updated weights for policy 1, policy_version 11820 (0.0007) +[2023-10-11 15:24:30,447][85176] Updated weights for policy 0, policy_version 11672 (0.0007) +[2023-10-11 15:24:30,514][85175] Updated weights for policy 1, policy_version 11830 (0.0010) +[2023-10-11 15:24:30,891][85175] Updated weights for policy 1, policy_version 11840 (0.0007) +[2023-10-11 15:24:31,063][84230] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 24084480. Throughput: 0: 1650.8, 1: 1674.0. Samples: 6024990. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:24:31,064][84230] Avg episode reward: [(0, '6.540'), (1, '7.300')] +[2023-10-11 15:24:34,617][85176] Updated weights for policy 0, policy_version 11682 (0.0007) +[2023-10-11 15:24:34,993][85176] Updated weights for policy 0, policy_version 11692 (0.0008) +[2023-10-11 15:24:35,024][85175] Updated weights for policy 1, policy_version 11850 (0.0007) +[2023-10-11 15:24:35,363][85176] Updated weights for policy 0, policy_version 11702 (0.0007) +[2023-10-11 15:24:35,395][85175] Updated weights for policy 1, policy_version 11860 (0.0008) +[2023-10-11 15:24:35,723][85176] Updated weights for policy 0, policy_version 11712 (0.0008) +[2023-10-11 15:24:35,759][85175] Updated weights for policy 1, policy_version 11870 (0.0009) +[2023-10-11 15:24:36,062][84230] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 24150016. Throughput: 0: 1661.2, 1: 1696.3. Samples: 6035870. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:24:36,063][84230] Avg episode reward: [(0, '6.670'), (1, '7.220')] +[2023-10-11 15:24:39,727][85175] Updated weights for policy 1, policy_version 11880 (0.0009) +[2023-10-11 15:24:39,879][85176] Updated weights for policy 0, policy_version 11722 (0.0008) +[2023-10-11 15:24:40,093][85175] Updated weights for policy 1, policy_version 11890 (0.0007) +[2023-10-11 15:24:40,250][85176] Updated weights for policy 0, policy_version 11732 (0.0010) +[2023-10-11 15:24:40,457][85175] Updated weights for policy 1, policy_version 11900 (0.0008) +[2023-10-11 15:24:40,630][85176] Updated weights for policy 0, policy_version 11742 (0.0008) +[2023-10-11 15:24:41,063][84230] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 24215552. Throughput: 0: 1657.7, 1: 1688.3. Samples: 6056126. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:24:41,064][84230] Avg episode reward: [(0, '6.790'), (1, '6.870')] +[2023-10-11 15:24:44,617][85175] Updated weights for policy 1, policy_version 11910 (0.0008) +[2023-10-11 15:24:44,680][85176] Updated weights for policy 0, policy_version 11752 (0.0009) +[2023-10-11 15:24:44,979][85175] Updated weights for policy 1, policy_version 11920 (0.0009) +[2023-10-11 15:24:45,056][85176] Updated weights for policy 0, policy_version 11762 (0.0008) +[2023-10-11 15:24:45,341][85175] Updated weights for policy 1, policy_version 11930 (0.0007) +[2023-10-11 15:24:45,424][85176] Updated weights for policy 0, policy_version 11772 (0.0008) +[2023-10-11 15:24:46,062][84230] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 24281088. Throughput: 0: 1642.0, 1: 1663.4. Samples: 6074510. Policy #0 lag: (min: 9.0, avg: 17.4, max: 41.0) +[2023-10-11 15:24:46,063][84230] Avg episode reward: [(0, '6.860'), (1, '6.730')] +[2023-10-11 15:24:49,287][85176] Updated weights for policy 0, policy_version 11782 (0.0009) +[2023-10-11 15:24:49,451][85175] Updated weights for policy 1, policy_version 11940 (0.0008) +[2023-10-11 15:24:49,664][85176] Updated weights for policy 0, policy_version 11792 (0.0009) +[2023-10-11 15:24:49,842][85175] Updated weights for policy 1, policy_version 11950 (0.0008) +[2023-10-11 15:24:50,042][85176] Updated weights for policy 0, policy_version 11802 (0.0008) +[2023-10-11 15:24:50,208][85175] Updated weights for policy 1, policy_version 11960 (0.0007) +[2023-10-11 15:24:51,063][84230] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 24346624. Throughput: 0: 1656.8, 1: 1692.1. Samples: 6085954. 
Policy #0 lag: (min: 9.0, avg: 17.4, max: 41.0) +[2023-10-11 15:24:51,063][84230] Avg episode reward: [(0, '7.080'), (1, '6.880')] +[2023-10-11 15:24:54,203][85175] Updated weights for policy 1, policy_version 11970 (0.0007) +[2023-10-11 15:24:54,419][85176] Updated weights for policy 0, policy_version 11812 (0.0008) +[2023-10-11 15:24:54,560][85175] Updated weights for policy 1, policy_version 11980 (0.0008) +[2023-10-11 15:24:54,806][85176] Updated weights for policy 0, policy_version 11822 (0.0007) +[2023-10-11 15:24:54,927][85175] Updated weights for policy 1, policy_version 11990 (0.0007) +[2023-10-11 15:24:55,171][85176] Updated weights for policy 0, policy_version 11832 (0.0008) +[2023-10-11 15:24:55,298][85175] Updated weights for policy 1, policy_version 12000 (0.0009) +[2023-10-11 15:24:56,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 24412160. Throughput: 0: 1643.2, 1: 1682.4. Samples: 6105568. Policy #0 lag: (min: 31.0, avg: 35.1, max: 63.0) +[2023-10-11 15:24:56,063][84230] Avg episode reward: [(0, '6.990'), (1, '6.990')] +[2023-10-11 15:24:59,348][85175] Updated weights for policy 1, policy_version 12010 (0.0007) +[2023-10-11 15:24:59,505][85176] Updated weights for policy 0, policy_version 11842 (0.0007) +[2023-10-11 15:24:59,710][85175] Updated weights for policy 1, policy_version 12020 (0.0008) +[2023-10-11 15:24:59,877][85176] Updated weights for policy 0, policy_version 11852 (0.0007) +[2023-10-11 15:25:00,079][85175] Updated weights for policy 1, policy_version 12030 (0.0008) +[2023-10-11 15:25:00,259][85176] Updated weights for policy 0, policy_version 11862 (0.0008) +[2023-10-11 15:25:00,623][85176] Updated weights for policy 0, policy_version 11872 (0.0010) +[2023-10-11 15:25:01,063][84230] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 24477696. Throughput: 0: 1642.4, 1: 1669.4. Samples: 6124176. 
Policy #0 lag: (min: 31.0, avg: 35.1, max: 63.0) +[2023-10-11 15:25:01,064][84230] Avg episode reward: [(0, '6.620'), (1, '7.330')] +[2023-10-11 15:25:01,077][85000] Saving new best policy, reward=7.330! +[2023-10-11 15:25:04,173][85175] Updated weights for policy 1, policy_version 12040 (0.0010) +[2023-10-11 15:25:04,550][85175] Updated weights for policy 1, policy_version 12050 (0.0009) +[2023-10-11 15:25:04,785][85176] Updated weights for policy 0, policy_version 11882 (0.0009) +[2023-10-11 15:25:04,919][85175] Updated weights for policy 1, policy_version 12060 (0.0007) +[2023-10-11 15:25:05,153][85176] Updated weights for policy 0, policy_version 11892 (0.0008) +[2023-10-11 15:25:05,521][85176] Updated weights for policy 0, policy_version 11902 (0.0007) +[2023-10-11 15:25:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 24543232. Throughput: 0: 1650.5, 1: 1690.1. Samples: 6135740. Policy #0 lag: (min: 18.0, avg: 19.0, max: 34.0) +[2023-10-11 15:25:06,063][84230] Avg episode reward: [(0, '6.710'), (1, '6.960')] +[2023-10-11 15:25:08,776][85175] Updated weights for policy 1, policy_version 12070 (0.0009) +[2023-10-11 15:25:09,145][85175] Updated weights for policy 1, policy_version 12080 (0.0007) +[2023-10-11 15:25:09,514][85175] Updated weights for policy 1, policy_version 12090 (0.0008) +[2023-10-11 15:25:09,677][85176] Updated weights for policy 0, policy_version 11912 (0.0008) +[2023-10-11 15:25:10,047][85176] Updated weights for policy 0, policy_version 11922 (0.0008) +[2023-10-11 15:25:10,419][85176] Updated weights for policy 0, policy_version 11932 (0.0009) +[2023-10-11 15:25:11,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 24608768. Throughput: 0: 1652.2, 1: 1663.9. Samples: 6155336. 
Policy #0 lag: (min: 18.0, avg: 19.0, max: 34.0) +[2023-10-11 15:25:11,064][84230] Avg episode reward: [(0, '6.750'), (1, '6.870')] +[2023-10-11 15:25:13,705][85175] Updated weights for policy 1, policy_version 12100 (0.0009) +[2023-10-11 15:25:14,072][85175] Updated weights for policy 1, policy_version 12110 (0.0009) +[2023-10-11 15:25:14,410][85176] Updated weights for policy 0, policy_version 11942 (0.0008) +[2023-10-11 15:25:14,440][85175] Updated weights for policy 1, policy_version 12120 (0.0008) +[2023-10-11 15:25:14,779][85176] Updated weights for policy 0, policy_version 11952 (0.0007) +[2023-10-11 15:25:15,154][85176] Updated weights for policy 0, policy_version 11962 (0.0007) +[2023-10-11 15:25:16,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 24674304. Throughput: 0: 1653.4, 1: 1666.7. Samples: 6174392. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-11 15:25:16,063][84230] Avg episode reward: [(0, '6.570'), (1, '6.560')] +[2023-10-11 15:25:18,630][85175] Updated weights for policy 1, policy_version 12130 (0.0008) +[2023-10-11 15:25:19,007][85175] Updated weights for policy 1, policy_version 12140 (0.0011) +[2023-10-11 15:25:19,151][85176] Updated weights for policy 0, policy_version 11972 (0.0007) +[2023-10-11 15:25:19,370][85175] Updated weights for policy 1, policy_version 12150 (0.0008) +[2023-10-11 15:25:19,521][85176] Updated weights for policy 0, policy_version 11982 (0.0009) +[2023-10-11 15:25:19,747][85175] Updated weights for policy 1, policy_version 12160 (0.0008) +[2023-10-11 15:25:19,893][85176] Updated weights for policy 0, policy_version 11992 (0.0009) +[2023-10-11 15:25:21,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 24739840. Throughput: 0: 1661.4, 1: 1675.3. Samples: 6186020. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-11 15:25:21,064][84230] Avg episode reward: [(0, '6.610'), (1, '6.600')] +[2023-10-11 15:25:23,767][85175] Updated weights for policy 1, policy_version 12170 (0.0007) +[2023-10-11 15:25:24,134][85176] Updated weights for policy 0, policy_version 12002 (0.0008) +[2023-10-11 15:25:24,135][85175] Updated weights for policy 1, policy_version 12180 (0.0008) +[2023-10-11 15:25:24,501][85175] Updated weights for policy 1, policy_version 12190 (0.0007) +[2023-10-11 15:25:24,510][85176] Updated weights for policy 0, policy_version 12012 (0.0008) +[2023-10-11 15:25:24,896][85176] Updated weights for policy 0, policy_version 12022 (0.0009) +[2023-10-11 15:25:25,269][85176] Updated weights for policy 0, policy_version 12032 (0.0009) +[2023-10-11 15:25:26,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 24805376. Throughput: 0: 1649.6, 1: 1661.1. Samples: 6205108. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-11 15:25:26,064][84230] Avg episode reward: [(0, '6.810'), (1, '7.120')] +[2023-10-11 15:25:28,542][85175] Updated weights for policy 1, policy_version 12200 (0.0010) +[2023-10-11 15:25:28,914][85175] Updated weights for policy 1, policy_version 12210 (0.0010) +[2023-10-11 15:25:29,275][85175] Updated weights for policy 1, policy_version 12220 (0.0007) +[2023-10-11 15:25:29,288][85176] Updated weights for policy 0, policy_version 12042 (0.0007) +[2023-10-11 15:25:29,649][85176] Updated weights for policy 0, policy_version 12052 (0.0009) +[2023-10-11 15:25:30,023][85176] Updated weights for policy 0, policy_version 12062 (0.0009) +[2023-10-11 15:25:31,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 24870912. Throughput: 0: 1662.8, 1: 1685.4. Samples: 6225178. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-11 15:25:31,064][84230] Avg episode reward: [(0, '6.900'), (1, '7.610')] +[2023-10-11 15:25:31,076][85000] Saving new best policy, reward=7.610! +[2023-10-11 15:25:33,289][85175] Updated weights for policy 1, policy_version 12230 (0.0008) +[2023-10-11 15:25:33,668][85175] Updated weights for policy 1, policy_version 12240 (0.0011) +[2023-10-11 15:25:34,042][85175] Updated weights for policy 1, policy_version 12250 (0.0009) +[2023-10-11 15:25:34,185][85176] Updated weights for policy 0, policy_version 12072 (0.0007) +[2023-10-11 15:25:34,554][85176] Updated weights for policy 0, policy_version 12082 (0.0007) +[2023-10-11 15:25:34,921][85176] Updated weights for policy 0, policy_version 12092 (0.0007) +[2023-10-11 15:25:36,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 24936448. Throughput: 0: 1662.8, 1: 1677.2. Samples: 6236256. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-11 15:25:36,064][84230] Avg episode reward: [(0, '6.910'), (1, '7.290')] +[2023-10-11 15:25:38,257][85175] Updated weights for policy 1, policy_version 12260 (0.0008) +[2023-10-11 15:25:38,657][85175] Updated weights for policy 1, policy_version 12270 (0.0008) +[2023-10-11 15:25:38,809][85176] Updated weights for policy 0, policy_version 12102 (0.0007) +[2023-10-11 15:25:39,018][85175] Updated weights for policy 1, policy_version 12280 (0.0009) +[2023-10-11 15:25:39,186][85176] Updated weights for policy 0, policy_version 12112 (0.0008) +[2023-10-11 15:25:39,556][85176] Updated weights for policy 0, policy_version 12122 (0.0010) +[2023-10-11 15:25:41,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 25001984. Throughput: 0: 1656.7, 1: 1662.6. Samples: 6254938. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:25:41,064][84230] Avg episode reward: [(0, '7.040'), (1, '7.050')] +[2023-10-11 15:25:42,800][85175] Updated weights for policy 1, policy_version 12290 (0.0010) +[2023-10-11 15:25:43,166][85175] Updated weights for policy 1, policy_version 12300 (0.0009) +[2023-10-11 15:25:43,531][85175] Updated weights for policy 1, policy_version 12310 (0.0009) +[2023-10-11 15:25:43,775][85176] Updated weights for policy 0, policy_version 12132 (0.0009) +[2023-10-11 15:25:43,906][85175] Updated weights for policy 1, policy_version 12320 (0.0009) +[2023-10-11 15:25:44,159][85176] Updated weights for policy 0, policy_version 12142 (0.0010) +[2023-10-11 15:25:44,547][85176] Updated weights for policy 0, policy_version 12152 (0.0010) +[2023-10-11 15:25:46,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 25067520. Throughput: 0: 1671.5, 1: 1691.3. Samples: 6275502. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:25:46,064][84230] Avg episode reward: [(0, '6.840'), (1, '6.750')] +[2023-10-11 15:25:48,119][85175] Updated weights for policy 1, policy_version 12330 (0.0007) +[2023-10-11 15:25:48,486][85175] Updated weights for policy 1, policy_version 12340 (0.0009) +[2023-10-11 15:25:48,746][85176] Updated weights for policy 0, policy_version 12162 (0.0010) +[2023-10-11 15:25:48,865][85175] Updated weights for policy 1, policy_version 12350 (0.0007) +[2023-10-11 15:25:49,112][85176] Updated weights for policy 0, policy_version 12172 (0.0010) +[2023-10-11 15:25:49,493][85176] Updated weights for policy 0, policy_version 12182 (0.0007) +[2023-10-11 15:25:49,860][85176] Updated weights for policy 0, policy_version 12192 (0.0009) +[2023-10-11 15:25:51,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 25133056. Throughput: 0: 1671.0, 1: 1671.0. Samples: 6286130. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:25:51,063][84230] Avg episode reward: [(0, '6.680'), (1, '7.300')] +[2023-10-11 15:25:52,774][85175] Updated weights for policy 1, policy_version 12360 (0.0009) +[2023-10-11 15:25:53,144][85175] Updated weights for policy 1, policy_version 12370 (0.0010) +[2023-10-11 15:25:53,516][85175] Updated weights for policy 1, policy_version 12380 (0.0009) +[2023-10-11 15:25:53,844][85176] Updated weights for policy 0, policy_version 12202 (0.0008) +[2023-10-11 15:25:54,214][85176] Updated weights for policy 0, policy_version 12212 (0.0009) +[2023-10-11 15:25:54,589][85176] Updated weights for policy 0, policy_version 12222 (0.0007) +[2023-10-11 15:25:56,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 25198592. Throughput: 0: 1648.1, 1: 1686.4. Samples: 6305384. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-10-11 15:25:56,063][84230] Avg episode reward: [(0, '6.680'), (1, '7.280')] +[2023-10-11 15:25:57,507][85175] Updated weights for policy 1, policy_version 12390 (0.0008) +[2023-10-11 15:25:57,881][85175] Updated weights for policy 1, policy_version 12400 (0.0008) +[2023-10-11 15:25:58,250][85175] Updated weights for policy 1, policy_version 12410 (0.0008) +[2023-10-11 15:25:58,770][85176] Updated weights for policy 0, policy_version 12232 (0.0010) +[2023-10-11 15:25:59,142][85176] Updated weights for policy 0, policy_version 12242 (0.0009) +[2023-10-11 15:25:59,522][85176] Updated weights for policy 0, policy_version 12252 (0.0008) +[2023-10-11 15:26:01,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 25264128. Throughput: 0: 1668.9, 1: 1701.8. Samples: 6326074. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) +[2023-10-11 15:26:01,063][84230] Avg episode reward: [(0, '6.860'), (1, '7.130')] +[2023-10-11 15:26:02,278][85175] Updated weights for policy 1, policy_version 12420 (0.0007) +[2023-10-11 15:26:02,652][85175] Updated weights for policy 1, policy_version 12430 (0.0007) +[2023-10-11 15:26:03,023][85175] Updated weights for policy 1, policy_version 12440 (0.0009) +[2023-10-11 15:26:03,546][85176] Updated weights for policy 0, policy_version 12262 (0.0008) +[2023-10-11 15:26:03,924][85176] Updated weights for policy 0, policy_version 12272 (0.0009) +[2023-10-11 15:26:04,291][85176] Updated weights for policy 0, policy_version 12282 (0.0009) +[2023-10-11 15:26:06,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 25329664. Throughput: 0: 1663.3, 1: 1674.4. Samples: 6336218. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 15:26:06,064][84230] Avg episode reward: [(0, '7.080'), (1, '7.270')] +[2023-10-11 15:26:06,994][85175] Updated weights for policy 1, policy_version 12450 (0.0009) +[2023-10-11 15:26:07,362][85175] Updated weights for policy 1, policy_version 12460 (0.0009) +[2023-10-11 15:26:07,733][85175] Updated weights for policy 1, policy_version 12470 (0.0008) +[2023-10-11 15:26:08,107][85175] Updated weights for policy 1, policy_version 12480 (0.0008) +[2023-10-11 15:26:08,316][85176] Updated weights for policy 0, policy_version 12292 (0.0009) +[2023-10-11 15:26:08,689][85176] Updated weights for policy 0, policy_version 12302 (0.0007) +[2023-10-11 15:26:09,064][85176] Updated weights for policy 0, policy_version 12312 (0.0008) +[2023-10-11 15:26:11,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 25395200. Throughput: 0: 1655.7, 1: 1699.9. Samples: 6356110. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 15:26:11,063][84230] Avg episode reward: [(0, '6.960'), (1, '6.690')] +[2023-10-11 15:26:12,222][85175] Updated weights for policy 1, policy_version 12490 (0.0007) +[2023-10-11 15:26:12,601][85175] Updated weights for policy 1, policy_version 12500 (0.0009) +[2023-10-11 15:26:12,966][85175] Updated weights for policy 1, policy_version 12510 (0.0009) +[2023-10-11 15:26:13,256][85176] Updated weights for policy 0, policy_version 12322 (0.0009) +[2023-10-11 15:26:13,633][85176] Updated weights for policy 0, policy_version 12332 (0.0008) +[2023-10-11 15:26:14,005][85176] Updated weights for policy 0, policy_version 12342 (0.0008) +[2023-10-11 15:26:14,380][85176] Updated weights for policy 0, policy_version 12352 (0.0007) +[2023-10-11 15:26:16,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 25460736. Throughput: 0: 1668.4, 1: 1699.7. Samples: 6376744. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 15:26:16,063][84230] Avg episode reward: [(0, '7.100'), (1, '7.000')] +[2023-10-11 15:26:16,073][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000012512_12812288.pth... +[2023-10-11 15:26:16,074][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000012352_12648448.pth... 
+[2023-10-11 15:26:16,111][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000010784_11042816.pth +[2023-10-11 15:26:16,113][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000010944_11206656.pth +[2023-10-11 15:26:16,997][85175] Updated weights for policy 1, policy_version 12520 (0.0008) +[2023-10-11 15:26:17,372][85175] Updated weights for policy 1, policy_version 12530 (0.0007) +[2023-10-11 15:26:17,737][85175] Updated weights for policy 1, policy_version 12540 (0.0007) +[2023-10-11 15:26:18,413][85176] Updated weights for policy 0, policy_version 12362 (0.0009) +[2023-10-11 15:26:18,781][85176] Updated weights for policy 0, policy_version 12372 (0.0009) +[2023-10-11 15:26:19,149][85176] Updated weights for policy 0, policy_version 12382 (0.0007) +[2023-10-11 15:26:21,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 25526272. Throughput: 0: 1658.3, 1: 1681.8. Samples: 6386562. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:26:21,064][84230] Avg episode reward: [(0, '7.220'), (1, '7.550')] +[2023-10-11 15:26:21,998][85175] Updated weights for policy 1, policy_version 12550 (0.0009) +[2023-10-11 15:26:22,369][85175] Updated weights for policy 1, policy_version 12560 (0.0008) +[2023-10-11 15:26:22,747][85175] Updated weights for policy 1, policy_version 12570 (0.0009) +[2023-10-11 15:26:23,266][85176] Updated weights for policy 0, policy_version 12392 (0.0008) +[2023-10-11 15:26:23,648][85176] Updated weights for policy 0, policy_version 12402 (0.0007) +[2023-10-11 15:26:24,020][85176] Updated weights for policy 0, policy_version 12412 (0.0008) +[2023-10-11 15:26:26,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 25591808. Throughput: 0: 1661.1, 1: 1705.3. Samples: 6406428. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:26:26,064][84230] Avg episode reward: [(0, '7.080'), (1, '7.150')] +[2023-10-11 15:26:26,649][85175] Updated weights for policy 1, policy_version 12580 (0.0007) +[2023-10-11 15:26:27,026][85175] Updated weights for policy 1, policy_version 12590 (0.0007) +[2023-10-11 15:26:27,399][85175] Updated weights for policy 1, policy_version 12600 (0.0008) +[2023-10-11 15:26:28,226][85176] Updated weights for policy 0, policy_version 12422 (0.0009) +[2023-10-11 15:26:28,593][85176] Updated weights for policy 0, policy_version 12432 (0.0010) +[2023-10-11 15:26:28,974][85176] Updated weights for policy 0, policy_version 12442 (0.0009) +[2023-10-11 15:26:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 25657344. Throughput: 0: 1669.6, 1: 1699.5. Samples: 6427110. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:26:31,064][84230] Avg episode reward: [(0, '6.960'), (1, '6.780')] +[2023-10-11 15:26:31,425][85175] Updated weights for policy 1, policy_version 12610 (0.0009) +[2023-10-11 15:26:31,808][85175] Updated weights for policy 1, policy_version 12620 (0.0008) +[2023-10-11 15:26:32,175][85175] Updated weights for policy 1, policy_version 12630 (0.0007) +[2023-10-11 15:26:32,549][85175] Updated weights for policy 1, policy_version 12640 (0.0010) +[2023-10-11 15:26:32,955][85176] Updated weights for policy 0, policy_version 12452 (0.0009) +[2023-10-11 15:26:33,342][85176] Updated weights for policy 0, policy_version 12462 (0.0007) +[2023-10-11 15:26:33,715][85176] Updated weights for policy 0, policy_version 12472 (0.0010) +[2023-10-11 15:26:36,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 25722880. Throughput: 0: 1655.5, 1: 1695.1. Samples: 6436908. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:26:36,063][84230] Avg episode reward: [(0, '6.840'), (1, '6.870')] +[2023-10-11 15:26:36,420][85175] Updated weights for policy 1, policy_version 12650 (0.0007) +[2023-10-11 15:26:36,798][85175] Updated weights for policy 1, policy_version 12660 (0.0007) +[2023-10-11 15:26:37,165][85175] Updated weights for policy 1, policy_version 12670 (0.0010) +[2023-10-11 15:26:37,786][85176] Updated weights for policy 0, policy_version 12482 (0.0008) +[2023-10-11 15:26:38,143][85176] Updated weights for policy 0, policy_version 12492 (0.0010) +[2023-10-11 15:26:38,518][85176] Updated weights for policy 0, policy_version 12502 (0.0010) +[2023-10-11 15:26:38,888][85176] Updated weights for policy 0, policy_version 12512 (0.0007) +[2023-10-11 15:26:41,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 25788416. Throughput: 0: 1669.7, 1: 1703.8. Samples: 6457192. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:26:41,063][84230] Avg episode reward: [(0, '6.640'), (1, '7.370')] +[2023-10-11 15:26:41,198][85175] Updated weights for policy 1, policy_version 12680 (0.0010) +[2023-10-11 15:26:41,562][85175] Updated weights for policy 1, policy_version 12690 (0.0010) +[2023-10-11 15:26:41,932][85175] Updated weights for policy 1, policy_version 12700 (0.0008) +[2023-10-11 15:26:42,893][85176] Updated weights for policy 0, policy_version 12522 (0.0007) +[2023-10-11 15:26:43,264][85176] Updated weights for policy 0, policy_version 12532 (0.0007) +[2023-10-11 15:26:43,638][85176] Updated weights for policy 0, policy_version 12542 (0.0008) +[2023-10-11 15:26:45,972][85175] Updated weights for policy 1, policy_version 12710 (0.0007) +[2023-10-11 15:26:46,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 25853952. Throughput: 0: 1678.1, 1: 1699.6. Samples: 6478072. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:26:46,063][84230] Avg episode reward: [(0, '6.530'), (1, '7.200')] +[2023-10-11 15:26:46,345][85175] Updated weights for policy 1, policy_version 12720 (0.0008) +[2023-10-11 15:26:46,720][85175] Updated weights for policy 1, policy_version 12730 (0.0010) +[2023-10-11 15:26:47,610][85176] Updated weights for policy 0, policy_version 12552 (0.0009) +[2023-10-11 15:26:47,976][85176] Updated weights for policy 0, policy_version 12562 (0.0008) +[2023-10-11 15:26:48,357][85176] Updated weights for policy 0, policy_version 12572 (0.0009) +[2023-10-11 15:26:50,645][85175] Updated weights for policy 1, policy_version 12740 (0.0009) +[2023-10-11 15:26:51,015][85175] Updated weights for policy 1, policy_version 12750 (0.0009) +[2023-10-11 15:26:51,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 25919488. Throughput: 0: 1656.0, 1: 1698.4. Samples: 6487164. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-11 15:26:51,063][84230] Avg episode reward: [(0, '6.490'), (1, '7.230')] +[2023-10-11 15:26:51,381][85175] Updated weights for policy 1, policy_version 12760 (0.0009) +[2023-10-11 15:26:52,329][85176] Updated weights for policy 0, policy_version 12582 (0.0009) +[2023-10-11 15:26:52,703][85176] Updated weights for policy 0, policy_version 12592 (0.0007) +[2023-10-11 15:26:53,071][85176] Updated weights for policy 0, policy_version 12602 (0.0007) +[2023-10-11 15:26:55,500][85175] Updated weights for policy 1, policy_version 12770 (0.0008) +[2023-10-11 15:26:55,861][85175] Updated weights for policy 1, policy_version 12780 (0.0009) +[2023-10-11 15:26:56,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 25985024. Throughput: 0: 1678.2, 1: 1697.7. Samples: 6508024. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-11 15:26:56,063][84230] Avg episode reward: [(0, '6.270'), (1, '6.950')] +[2023-10-11 15:26:56,237][85175] Updated weights for policy 1, policy_version 12790 (0.0008) +[2023-10-11 15:26:56,609][85175] Updated weights for policy 1, policy_version 12800 (0.0007) +[2023-10-11 15:26:57,169][85176] Updated weights for policy 0, policy_version 12612 (0.0008) +[2023-10-11 15:26:57,543][85176] Updated weights for policy 0, policy_version 12622 (0.0008) +[2023-10-11 15:26:57,915][85176] Updated weights for policy 0, policy_version 12632 (0.0007) +[2023-10-11 15:27:00,771][85175] Updated weights for policy 1, policy_version 12810 (0.0009) +[2023-10-11 15:27:01,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 26050560. Throughput: 0: 1675.6, 1: 1690.0. Samples: 6528200. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-11 15:27:01,064][84230] Avg episode reward: [(0, '6.420'), (1, '6.950')] +[2023-10-11 15:27:01,143][85175] Updated weights for policy 1, policy_version 12820 (0.0007) +[2023-10-11 15:27:01,508][85175] Updated weights for policy 1, policy_version 12830 (0.0009) +[2023-10-11 15:27:02,094][85176] Updated weights for policy 0, policy_version 12642 (0.0010) +[2023-10-11 15:27:02,473][85176] Updated weights for policy 0, policy_version 12652 (0.0009) +[2023-10-11 15:27:02,852][85176] Updated weights for policy 0, policy_version 12662 (0.0010) +[2023-10-11 15:27:03,220][85176] Updated weights for policy 0, policy_version 12672 (0.0007) +[2023-10-11 15:27:05,510][85175] Updated weights for policy 1, policy_version 12840 (0.0008) +[2023-10-11 15:27:05,875][85175] Updated weights for policy 1, policy_version 12850 (0.0009) +[2023-10-11 15:27:06,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 26116096. Throughput: 0: 1656.4, 1: 1697.6. Samples: 6537490. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:27:06,064][84230] Avg episode reward: [(0, '6.640'), (1, '7.140')] +[2023-10-11 15:27:06,246][85175] Updated weights for policy 1, policy_version 12860 (0.0008) +[2023-10-11 15:27:07,432][85176] Updated weights for policy 0, policy_version 12682 (0.0008) +[2023-10-11 15:27:07,810][85176] Updated weights for policy 0, policy_version 12692 (0.0009) +[2023-10-11 15:27:08,187][85176] Updated weights for policy 0, policy_version 12702 (0.0009) +[2023-10-11 15:27:10,443][85175] Updated weights for policy 1, policy_version 12870 (0.0008) +[2023-10-11 15:27:10,819][85175] Updated weights for policy 1, policy_version 12880 (0.0008) +[2023-10-11 15:27:11,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 26181632. Throughput: 0: 1668.3, 1: 1697.4. Samples: 6557882. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:27:11,063][84230] Avg episode reward: [(0, '6.750'), (1, '7.110')] +[2023-10-11 15:27:11,192][85175] Updated weights for policy 1, policy_version 12890 (0.0010) +[2023-10-11 15:27:12,292][85176] Updated weights for policy 0, policy_version 12712 (0.0007) +[2023-10-11 15:27:12,673][85176] Updated weights for policy 0, policy_version 12722 (0.0007) +[2023-10-11 15:27:13,039][85176] Updated weights for policy 0, policy_version 12732 (0.0008) +[2023-10-11 15:27:15,308][85175] Updated weights for policy 1, policy_version 12900 (0.0007) +[2023-10-11 15:27:15,712][85175] Updated weights for policy 1, policy_version 12910 (0.0008) +[2023-10-11 15:27:16,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 26247168. Throughput: 0: 1669.7, 1: 1683.6. Samples: 6578006. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:27:16,063][84230] Avg episode reward: [(0, '6.970'), (1, '6.960')] +[2023-10-11 15:27:16,079][85175] Updated weights for policy 1, policy_version 12920 (0.0008) +[2023-10-11 15:27:17,126][85176] Updated weights for policy 0, policy_version 12742 (0.0008) +[2023-10-11 15:27:17,493][85176] Updated weights for policy 0, policy_version 12752 (0.0007) +[2023-10-11 15:27:17,865][85176] Updated weights for policy 0, policy_version 12762 (0.0008) +[2023-10-11 15:27:19,951][85175] Updated weights for policy 1, policy_version 12930 (0.0008) +[2023-10-11 15:27:20,323][85175] Updated weights for policy 1, policy_version 12940 (0.0012) +[2023-10-11 15:27:20,690][85175] Updated weights for policy 1, policy_version 12950 (0.0008) +[2023-10-11 15:27:21,063][84230] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 26345472. Throughput: 0: 1655.9, 1: 1690.6. Samples: 6587502. Policy #0 lag: (min: 31.0, avg: 42.3, max: 63.0) +[2023-10-11 15:27:21,064][84230] Avg episode reward: [(0, '7.160'), (1, '6.770')] +[2023-10-11 15:27:21,065][85175] Updated weights for policy 1, policy_version 12960 (0.0008) +[2023-10-11 15:27:22,162][85176] Updated weights for policy 0, policy_version 12772 (0.0007) +[2023-10-11 15:27:22,538][85176] Updated weights for policy 0, policy_version 12782 (0.0007) +[2023-10-11 15:27:22,920][85176] Updated weights for policy 0, policy_version 12792 (0.0008) +[2023-10-11 15:27:25,061][85175] Updated weights for policy 1, policy_version 12970 (0.0007) +[2023-10-11 15:27:25,422][85175] Updated weights for policy 1, policy_version 12980 (0.0007) +[2023-10-11 15:27:25,785][85175] Updated weights for policy 1, policy_version 12990 (0.0010) +[2023-10-11 15:27:26,063][84230] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 26411008. Throughput: 0: 1665.0, 1: 1687.6. Samples: 6608058. 
Policy #0 lag: (min: 31.0, avg: 42.3, max: 63.0) +[2023-10-11 15:27:26,063][84230] Avg episode reward: [(0, '7.170'), (1, '6.850')] +[2023-10-11 15:27:27,229][85176] Updated weights for policy 0, policy_version 12802 (0.0009) +[2023-10-11 15:27:27,631][85176] Updated weights for policy 0, policy_version 12812 (0.0008) +[2023-10-11 15:27:28,004][85176] Updated weights for policy 0, policy_version 12822 (0.0007) +[2023-10-11 15:27:28,377][85176] Updated weights for policy 0, policy_version 12832 (0.0007) +[2023-10-11 15:27:29,852][85175] Updated weights for policy 1, policy_version 13000 (0.0008) +[2023-10-11 15:27:30,227][85175] Updated weights for policy 1, policy_version 13010 (0.0008) +[2023-10-11 15:27:30,604][85175] Updated weights for policy 1, policy_version 13020 (0.0010) +[2023-10-11 15:27:31,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 26476544. Throughput: 0: 1655.8, 1: 1666.3. Samples: 6627566. Policy #0 lag: (min: 31.0, avg: 42.3, max: 63.0) +[2023-10-11 15:27:31,063][84230] Avg episode reward: [(0, '6.820'), (1, '6.980')] +[2023-10-11 15:27:32,500][85176] Updated weights for policy 0, policy_version 12842 (0.0009) +[2023-10-11 15:27:32,868][85176] Updated weights for policy 0, policy_version 12852 (0.0010) +[2023-10-11 15:27:33,248][85176] Updated weights for policy 0, policy_version 12862 (0.0007) +[2023-10-11 15:27:34,489][85175] Updated weights for policy 1, policy_version 13030 (0.0007) +[2023-10-11 15:27:34,850][85175] Updated weights for policy 1, policy_version 13040 (0.0007) +[2023-10-11 15:27:35,214][85175] Updated weights for policy 1, policy_version 13050 (0.0008) +[2023-10-11 15:27:36,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 26542080. Throughput: 0: 1653.8, 1: 1694.4. Samples: 6637832. 
Policy #0 lag: (min: 40.0, avg: 55.2, max: 56.0) +[2023-10-11 15:27:36,063][84230] Avg episode reward: [(0, '6.860'), (1, '7.370')] +[2023-10-11 15:27:37,228][85176] Updated weights for policy 0, policy_version 12872 (0.0007) +[2023-10-11 15:27:37,598][85176] Updated weights for policy 0, policy_version 12882 (0.0010) +[2023-10-11 15:27:37,976][85176] Updated weights for policy 0, policy_version 12892 (0.0010) +[2023-10-11 15:27:39,271][85175] Updated weights for policy 1, policy_version 13060 (0.0008) +[2023-10-11 15:27:39,640][85175] Updated weights for policy 1, policy_version 13070 (0.0007) +[2023-10-11 15:27:40,002][85175] Updated weights for policy 1, policy_version 13080 (0.0008) +[2023-10-11 15:27:41,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 26607616. Throughput: 0: 1654.0, 1: 1687.8. Samples: 6658406. Policy #0 lag: (min: 40.0, avg: 55.2, max: 56.0) +[2023-10-11 15:27:41,064][84230] Avg episode reward: [(0, '6.640'), (1, '7.450')] +[2023-10-11 15:27:42,172][85176] Updated weights for policy 0, policy_version 12902 (0.0010) +[2023-10-11 15:27:42,539][85176] Updated weights for policy 0, policy_version 12912 (0.0010) +[2023-10-11 15:27:42,911][85176] Updated weights for policy 0, policy_version 12922 (0.0008) +[2023-10-11 15:27:43,866][85175] Updated weights for policy 1, policy_version 13090 (0.0008) +[2023-10-11 15:27:44,232][85175] Updated weights for policy 1, policy_version 13100 (0.0010) +[2023-10-11 15:27:44,600][85175] Updated weights for policy 1, policy_version 13110 (0.0008) +[2023-10-11 15:27:44,971][85175] Updated weights for policy 1, policy_version 13120 (0.0008) +[2023-10-11 15:27:46,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 26673152. Throughput: 0: 1661.2, 1: 1676.7. Samples: 6678406. 
Policy #0 lag: (min: 40.0, avg: 55.2, max: 56.0) +[2023-10-11 15:27:46,064][84230] Avg episode reward: [(0, '6.750'), (1, '7.210')] +[2023-10-11 15:27:46,993][85176] Updated weights for policy 0, policy_version 12932 (0.0008) +[2023-10-11 15:27:47,365][85176] Updated weights for policy 0, policy_version 12942 (0.0009) +[2023-10-11 15:27:47,743][85176] Updated weights for policy 0, policy_version 12952 (0.0007) +[2023-10-11 15:27:48,867][85175] Updated weights for policy 1, policy_version 13130 (0.0008) +[2023-10-11 15:27:49,240][85175] Updated weights for policy 1, policy_version 13140 (0.0008) +[2023-10-11 15:27:49,596][85175] Updated weights for policy 1, policy_version 13150 (0.0009) +[2023-10-11 15:27:51,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 26738688. Throughput: 0: 1661.6, 1: 1700.1. Samples: 6688768. Policy #0 lag: (min: 1.0, avg: 10.7, max: 33.0) +[2023-10-11 15:27:51,064][84230] Avg episode reward: [(0, '6.820'), (1, '6.630')] +[2023-10-11 15:27:51,929][85176] Updated weights for policy 0, policy_version 12962 (0.0009) +[2023-10-11 15:27:52,309][85176] Updated weights for policy 0, policy_version 12972 (0.0008) +[2023-10-11 15:27:52,681][85176] Updated weights for policy 0, policy_version 12982 (0.0008) +[2023-10-11 15:27:53,062][85176] Updated weights for policy 0, policy_version 12992 (0.0009) +[2023-10-11 15:27:53,589][85175] Updated weights for policy 1, policy_version 13160 (0.0007) +[2023-10-11 15:27:53,964][85175] Updated weights for policy 1, policy_version 13170 (0.0009) +[2023-10-11 15:27:54,334][85175] Updated weights for policy 1, policy_version 13180 (0.0009) +[2023-10-11 15:27:56,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 26804224. Throughput: 0: 1661.2, 1: 1679.5. Samples: 6708212. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 33.0) +[2023-10-11 15:27:56,063][84230] Avg episode reward: [(0, '6.750'), (1, '6.860')] +[2023-10-11 15:27:57,259][85176] Updated weights for policy 0, policy_version 13002 (0.0010) +[2023-10-11 15:27:57,626][85176] Updated weights for policy 0, policy_version 13012 (0.0009) +[2023-10-11 15:27:58,006][85176] Updated weights for policy 0, policy_version 13022 (0.0007) +[2023-10-11 15:27:58,445][85175] Updated weights for policy 1, policy_version 13190 (0.0009) +[2023-10-11 15:27:58,820][85175] Updated weights for policy 1, policy_version 13200 (0.0009) +[2023-10-11 15:27:59,191][85175] Updated weights for policy 1, policy_version 13210 (0.0009) +[2023-10-11 15:28:01,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 26869760. Throughput: 0: 1660.8, 1: 1695.3. Samples: 6729028. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:28:01,063][84230] Avg episode reward: [(0, '7.080'), (1, '6.960')] +[2023-10-11 15:28:01,960][85176] Updated weights for policy 0, policy_version 13032 (0.0008) +[2023-10-11 15:28:02,340][85176] Updated weights for policy 0, policy_version 13042 (0.0008) +[2023-10-11 15:28:02,720][85176] Updated weights for policy 0, policy_version 13052 (0.0008) +[2023-10-11 15:28:03,089][85175] Updated weights for policy 1, policy_version 13220 (0.0009) +[2023-10-11 15:28:03,481][85175] Updated weights for policy 1, policy_version 13230 (0.0007) +[2023-10-11 15:28:03,843][85175] Updated weights for policy 1, policy_version 13240 (0.0008) +[2023-10-11 15:28:06,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 26935296. Throughput: 0: 1662.0, 1: 1699.7. Samples: 6738780. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:28:06,063][84230] Avg episode reward: [(0, '6.940'), (1, '7.680')] +[2023-10-11 15:28:06,064][85000] Saving new best policy, reward=7.680! 
+[2023-10-11 15:28:06,745][85176] Updated weights for policy 0, policy_version 13062 (0.0009) +[2023-10-11 15:28:07,130][85176] Updated weights for policy 0, policy_version 13072 (0.0007) +[2023-10-11 15:28:07,504][85176] Updated weights for policy 0, policy_version 13082 (0.0008) +[2023-10-11 15:28:07,810][85175] Updated weights for policy 1, policy_version 13250 (0.0009) +[2023-10-11 15:28:08,187][85175] Updated weights for policy 1, policy_version 13260 (0.0008) +[2023-10-11 15:28:08,560][85175] Updated weights for policy 1, policy_version 13270 (0.0007) +[2023-10-11 15:28:08,928][85175] Updated weights for policy 1, policy_version 13280 (0.0008) +[2023-10-11 15:28:11,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 27000832. Throughput: 0: 1666.4, 1: 1683.6. Samples: 6758810. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:28:11,063][84230] Avg episode reward: [(0, '6.930'), (1, '7.450')] +[2023-10-11 15:28:11,518][85176] Updated weights for policy 0, policy_version 13092 (0.0009) +[2023-10-11 15:28:11,905][85176] Updated weights for policy 0, policy_version 13102 (0.0009) +[2023-10-11 15:28:12,283][85176] Updated weights for policy 0, policy_version 13112 (0.0008) +[2023-10-11 15:28:13,006][85175] Updated weights for policy 1, policy_version 13290 (0.0007) +[2023-10-11 15:28:13,378][85175] Updated weights for policy 1, policy_version 13300 (0.0007) +[2023-10-11 15:28:13,742][85175] Updated weights for policy 1, policy_version 13310 (0.0007) +[2023-10-11 15:28:16,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 27066368. Throughput: 0: 1667.8, 1: 1702.8. Samples: 6779242. Policy #0 lag: (min: 31.0, avg: 32.8, max: 59.0) +[2023-10-11 15:28:16,063][84230] Avg episode reward: [(0, '6.850'), (1, '6.910')] +[2023-10-11 15:28:16,075][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000013312_13631488.pth... 
+[2023-10-11 15:28:16,075][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000013120_13434880.pth... +[2023-10-11 15:28:16,116][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000011744_12025856.pth +[2023-10-11 15:28:16,117][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000011584_11862016.pth +[2023-10-11 15:28:16,384][85176] Updated weights for policy 0, policy_version 13122 (0.0008) +[2023-10-11 15:28:16,755][85176] Updated weights for policy 0, policy_version 13132 (0.0009) +[2023-10-11 15:28:17,130][85176] Updated weights for policy 0, policy_version 13142 (0.0009) +[2023-10-11 15:28:17,513][85176] Updated weights for policy 0, policy_version 13152 (0.0009) +[2023-10-11 15:28:17,943][85175] Updated weights for policy 1, policy_version 13320 (0.0008) +[2023-10-11 15:28:18,304][85175] Updated weights for policy 1, policy_version 13330 (0.0008) +[2023-10-11 15:28:18,669][85175] Updated weights for policy 1, policy_version 13340 (0.0007) +[2023-10-11 15:28:21,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 27131904. Throughput: 0: 1669.9, 1: 1684.0. Samples: 6788760. 
Policy #0 lag: (min: 31.0, avg: 32.8, max: 59.0) +[2023-10-11 15:28:21,064][84230] Avg episode reward: [(0, '7.010'), (1, '7.090')] +[2023-10-11 15:28:21,531][85176] Updated weights for policy 0, policy_version 13162 (0.0009) +[2023-10-11 15:28:21,901][85176] Updated weights for policy 0, policy_version 13172 (0.0008) +[2023-10-11 15:28:22,273][85176] Updated weights for policy 0, policy_version 13182 (0.0010) +[2023-10-11 15:28:22,677][85175] Updated weights for policy 1, policy_version 13350 (0.0009) +[2023-10-11 15:28:23,048][85175] Updated weights for policy 1, policy_version 13360 (0.0008) +[2023-10-11 15:28:23,409][85175] Updated weights for policy 1, policy_version 13370 (0.0009) +[2023-10-11 15:28:26,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 27197440. Throughput: 0: 1669.2, 1: 1674.5. Samples: 6808870. Policy #0 lag: (min: 31.0, avg: 32.8, max: 59.0) +[2023-10-11 15:28:26,063][84230] Avg episode reward: [(0, '6.900'), (1, '7.370')] +[2023-10-11 15:28:26,310][85176] Updated weights for policy 0, policy_version 13192 (0.0009) +[2023-10-11 15:28:26,682][85176] Updated weights for policy 0, policy_version 13202 (0.0009) +[2023-10-11 15:28:27,065][85176] Updated weights for policy 0, policy_version 13212 (0.0009) +[2023-10-11 15:28:27,620][85175] Updated weights for policy 1, policy_version 13380 (0.0009) +[2023-10-11 15:28:27,999][85175] Updated weights for policy 1, policy_version 13390 (0.0008) +[2023-10-11 15:28:28,361][85175] Updated weights for policy 1, policy_version 13400 (0.0009) +[2023-10-11 15:28:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 27262976. Throughput: 0: 1667.9, 1: 1692.8. Samples: 6829640. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 12.0) +[2023-10-11 15:28:31,064][84230] Avg episode reward: [(0, '6.640'), (1, '7.490')] +[2023-10-11 15:28:31,172][85176] Updated weights for policy 0, policy_version 13222 (0.0011) +[2023-10-11 15:28:31,541][85176] Updated weights for policy 0, policy_version 13232 (0.0010) +[2023-10-11 15:28:31,914][85176] Updated weights for policy 0, policy_version 13242 (0.0010) +[2023-10-11 15:28:32,505][85175] Updated weights for policy 1, policy_version 13410 (0.0009) +[2023-10-11 15:28:32,869][85175] Updated weights for policy 1, policy_version 13420 (0.0009) +[2023-10-11 15:28:33,235][85175] Updated weights for policy 1, policy_version 13430 (0.0008) +[2023-10-11 15:28:33,603][85175] Updated weights for policy 1, policy_version 13440 (0.0007) +[2023-10-11 15:28:36,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 27328512. Throughput: 0: 1667.3, 1: 1668.9. Samples: 6838896. Policy #0 lag: (min: 11.0, avg: 11.0, max: 12.0) +[2023-10-11 15:28:36,063][84230] Avg episode reward: [(0, '6.640'), (1, '7.160')] +[2023-10-11 15:28:36,085][85176] Updated weights for policy 0, policy_version 13252 (0.0010) +[2023-10-11 15:28:36,457][85176] Updated weights for policy 0, policy_version 13262 (0.0007) +[2023-10-11 15:28:36,834][85176] Updated weights for policy 0, policy_version 13272 (0.0008) +[2023-10-11 15:28:37,538][85175] Updated weights for policy 1, policy_version 13450 (0.0009) +[2023-10-11 15:28:37,908][85175] Updated weights for policy 1, policy_version 13460 (0.0009) +[2023-10-11 15:28:38,286][85175] Updated weights for policy 1, policy_version 13470 (0.0009) +[2023-10-11 15:28:40,954][85176] Updated weights for policy 0, policy_version 13282 (0.0007) +[2023-10-11 15:28:41,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 27394048. Throughput: 0: 1668.0, 1: 1690.4. Samples: 6859342. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 12.0) +[2023-10-11 15:28:41,063][84230] Avg episode reward: [(0, '6.860'), (1, '7.140')] +[2023-10-11 15:28:41,324][85176] Updated weights for policy 0, policy_version 13292 (0.0012) +[2023-10-11 15:28:41,703][85176] Updated weights for policy 0, policy_version 13302 (0.0011) +[2023-10-11 15:28:42,081][85176] Updated weights for policy 0, policy_version 13312 (0.0009) +[2023-10-11 15:28:42,448][85175] Updated weights for policy 1, policy_version 13480 (0.0010) +[2023-10-11 15:28:42,817][85175] Updated weights for policy 1, policy_version 13490 (0.0007) +[2023-10-11 15:28:43,193][85175] Updated weights for policy 1, policy_version 13500 (0.0008) +[2023-10-11 15:28:46,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 27459584. Throughput: 0: 1664.6, 1: 1684.6. Samples: 6879740. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:28:46,063][84230] Avg episode reward: [(0, '6.750'), (1, '6.620')] +[2023-10-11 15:28:46,363][85176] Updated weights for policy 0, policy_version 13322 (0.0008) +[2023-10-11 15:28:46,738][85176] Updated weights for policy 0, policy_version 13332 (0.0008) +[2023-10-11 15:28:47,110][85176] Updated weights for policy 0, policy_version 13342 (0.0007) +[2023-10-11 15:28:47,289][85175] Updated weights for policy 1, policy_version 13510 (0.0008) +[2023-10-11 15:28:47,652][85175] Updated weights for policy 1, policy_version 13520 (0.0007) +[2023-10-11 15:28:48,014][85175] Updated weights for policy 1, policy_version 13530 (0.0007) +[2023-10-11 15:28:51,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 27525120. Throughput: 0: 1664.8, 1: 1672.2. Samples: 6888946. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:28:51,063][84230] Avg episode reward: [(0, '6.790'), (1, '6.720')] +[2023-10-11 15:28:51,350][85176] Updated weights for policy 0, policy_version 13352 (0.0008) +[2023-10-11 15:28:51,720][85176] Updated weights for policy 0, policy_version 13362 (0.0007) +[2023-10-11 15:28:52,020][85175] Updated weights for policy 1, policy_version 13540 (0.0008) +[2023-10-11 15:28:52,084][85176] Updated weights for policy 0, policy_version 13372 (0.0007) +[2023-10-11 15:28:52,383][85175] Updated weights for policy 1, policy_version 13550 (0.0009) +[2023-10-11 15:28:52,760][85175] Updated weights for policy 1, policy_version 13560 (0.0007) +[2023-10-11 15:28:55,957][85176] Updated weights for policy 0, policy_version 13382 (0.0007) +[2023-10-11 15:28:56,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 27590656. Throughput: 0: 1668.0, 1: 1686.4. Samples: 6909756. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:28:56,063][84230] Avg episode reward: [(0, '6.460'), (1, '7.510')] +[2023-10-11 15:28:56,336][85176] Updated weights for policy 0, policy_version 13392 (0.0007) +[2023-10-11 15:28:56,718][85176] Updated weights for policy 0, policy_version 13402 (0.0009) +[2023-10-11 15:28:56,856][85175] Updated weights for policy 1, policy_version 13570 (0.0009) +[2023-10-11 15:28:57,279][85175] Updated weights for policy 1, policy_version 13580 (0.0008) +[2023-10-11 15:28:57,639][85175] Updated weights for policy 1, policy_version 13590 (0.0008) +[2023-10-11 15:28:58,000][85175] Updated weights for policy 1, policy_version 13600 (0.0007) +[2023-10-11 15:29:00,740][85176] Updated weights for policy 0, policy_version 13412 (0.0009) +[2023-10-11 15:29:01,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 27656192. Throughput: 0: 1669.7, 1: 1691.1. Samples: 6930478. 
Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-11 15:29:01,064][84230] Avg episode reward: [(0, '6.750'), (1, '7.630')] +[2023-10-11 15:29:01,126][85176] Updated weights for policy 0, policy_version 13422 (0.0010) +[2023-10-11 15:29:01,498][85176] Updated weights for policy 0, policy_version 13432 (0.0009) +[2023-10-11 15:29:01,896][85175] Updated weights for policy 1, policy_version 13610 (0.0008) +[2023-10-11 15:29:02,269][85175] Updated weights for policy 1, policy_version 13620 (0.0009) +[2023-10-11 15:29:02,641][85175] Updated weights for policy 1, policy_version 13630 (0.0011) +[2023-10-11 15:29:05,685][85176] Updated weights for policy 0, policy_version 13442 (0.0009) +[2023-10-11 15:29:06,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 27721728. Throughput: 0: 1668.1, 1: 1686.1. Samples: 6939700. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-11 15:29:06,063][85176] Updated weights for policy 0, policy_version 13452 (0.0010) +[2023-10-11 15:29:06,064][84230] Avg episode reward: [(0, '8.690'), (1, '7.280')] +[2023-10-11 15:29:06,433][85176] Updated weights for policy 0, policy_version 13462 (0.0008) +[2023-10-11 15:29:06,646][85175] Updated weights for policy 1, policy_version 13640 (0.0008) +[2023-10-11 15:29:06,798][84801] Saving new best policy, reward=8.690! +[2023-10-11 15:29:06,803][85176] Updated weights for policy 0, policy_version 13472 (0.0009) +[2023-10-11 15:29:07,010][85175] Updated weights for policy 1, policy_version 13650 (0.0008) +[2023-10-11 15:29:07,375][85175] Updated weights for policy 1, policy_version 13660 (0.0007) +[2023-10-11 15:29:10,893][85176] Updated weights for policy 0, policy_version 13482 (0.0007) +[2023-10-11 15:29:11,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 27787264. Throughput: 0: 1665.7, 1: 1697.1. Samples: 6960196. 
Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-11 15:29:11,063][84230] Avg episode reward: [(0, '9.250'), (1, '7.020')] +[2023-10-11 15:29:11,268][85176] Updated weights for policy 0, policy_version 13492 (0.0008) +[2023-10-11 15:29:11,585][85175] Updated weights for policy 1, policy_version 13670 (0.0008) +[2023-10-11 15:29:11,634][85176] Updated weights for policy 0, policy_version 13502 (0.0009) +[2023-10-11 15:29:11,712][84801] Saving new best policy, reward=9.250! +[2023-10-11 15:29:11,953][85175] Updated weights for policy 1, policy_version 13680 (0.0008) +[2023-10-11 15:29:12,325][85175] Updated weights for policy 1, policy_version 13690 (0.0007) +[2023-10-11 15:29:15,697][85176] Updated weights for policy 0, policy_version 13512 (0.0010) +[2023-10-11 15:29:16,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 27852800. Throughput: 0: 1658.8, 1: 1693.8. Samples: 6980506. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-11 15:29:16,064][84230] Avg episode reward: [(0, '7.530'), (1, '7.290')] +[2023-10-11 15:29:16,072][85176] Updated weights for policy 0, policy_version 13522 (0.0009) +[2023-10-11 15:29:16,402][85175] Updated weights for policy 1, policy_version 13700 (0.0007) +[2023-10-11 15:29:16,447][85176] Updated weights for policy 0, policy_version 13532 (0.0008) +[2023-10-11 15:29:16,774][85175] Updated weights for policy 1, policy_version 13710 (0.0007) +[2023-10-11 15:29:17,144][85175] Updated weights for policy 1, policy_version 13720 (0.0008) +[2023-10-11 15:29:20,435][85176] Updated weights for policy 0, policy_version 13542 (0.0009) +[2023-10-11 15:29:20,808][85176] Updated weights for policy 0, policy_version 13552 (0.0009) +[2023-10-11 15:29:21,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 27918336. Throughput: 0: 1667.1, 1: 1688.0. Samples: 6989878. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-11 15:29:21,063][84230] Avg episode reward: [(0, '6.860'), (1, '7.620')] +[2023-10-11 15:29:21,164][85175] Updated weights for policy 1, policy_version 13730 (0.0007) +[2023-10-11 15:29:21,183][85176] Updated weights for policy 0, policy_version 13562 (0.0007) +[2023-10-11 15:29:21,537][85175] Updated weights for policy 1, policy_version 13740 (0.0008) +[2023-10-11 15:29:21,917][85175] Updated weights for policy 1, policy_version 13750 (0.0007) +[2023-10-11 15:29:22,280][85175] Updated weights for policy 1, policy_version 13760 (0.0009) +[2023-10-11 15:29:25,595][85176] Updated weights for policy 0, policy_version 13572 (0.0008) +[2023-10-11 15:29:25,968][85176] Updated weights for policy 0, policy_version 13582 (0.0009) +[2023-10-11 15:29:26,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 27983872. Throughput: 0: 1666.8, 1: 1693.6. Samples: 7010558. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-11 15:29:26,063][84230] Avg episode reward: [(0, '7.030'), (1, '7.490')] +[2023-10-11 15:29:26,183][85175] Updated weights for policy 1, policy_version 13770 (0.0007) +[2023-10-11 15:29:26,338][85176] Updated weights for policy 0, policy_version 13592 (0.0007) +[2023-10-11 15:29:26,547][85175] Updated weights for policy 1, policy_version 13780 (0.0008) +[2023-10-11 15:29:26,919][85175] Updated weights for policy 1, policy_version 13790 (0.0007) +[2023-10-11 15:29:30,293][85176] Updated weights for policy 0, policy_version 13602 (0.0008) +[2023-10-11 15:29:30,665][85176] Updated weights for policy 0, policy_version 13612 (0.0008) +[2023-10-11 15:29:30,946][85175] Updated weights for policy 1, policy_version 13800 (0.0007) +[2023-10-11 15:29:31,028][85176] Updated weights for policy 0, policy_version 13622 (0.0008) +[2023-10-11 15:29:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 28049408. 
Throughput: 0: 1662.3, 1: 1701.0. Samples: 7031090. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:29:31,063][84230] Avg episode reward: [(0, '7.070'), (1, '7.320')] +[2023-10-11 15:29:31,322][85175] Updated weights for policy 1, policy_version 13810 (0.0008) +[2023-10-11 15:29:31,409][85176] Updated weights for policy 0, policy_version 13632 (0.0009) +[2023-10-11 15:29:31,694][85175] Updated weights for policy 1, policy_version 13820 (0.0007) +[2023-10-11 15:29:35,583][85176] Updated weights for policy 0, policy_version 13642 (0.0007) +[2023-10-11 15:29:35,691][85175] Updated weights for policy 1, policy_version 13830 (0.0007) +[2023-10-11 15:29:35,952][85176] Updated weights for policy 0, policy_version 13652 (0.0007) +[2023-10-11 15:29:36,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 28114944. Throughput: 0: 1668.7, 1: 1702.5. Samples: 7040648. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:29:36,064][84230] Avg episode reward: [(0, '6.640'), (1, '7.070')] +[2023-10-11 15:29:36,065][85175] Updated weights for policy 1, policy_version 13840 (0.0010) +[2023-10-11 15:29:36,326][85176] Updated weights for policy 0, policy_version 13662 (0.0009) +[2023-10-11 15:29:36,430][85175] Updated weights for policy 1, policy_version 13850 (0.0009) +[2023-10-11 15:29:40,626][85175] Updated weights for policy 1, policy_version 13860 (0.0008) +[2023-10-11 15:29:40,634][85176] Updated weights for policy 0, policy_version 13672 (0.0009) +[2023-10-11 15:29:40,983][85175] Updated weights for policy 1, policy_version 13870 (0.0007) +[2023-10-11 15:29:41,011][85176] Updated weights for policy 0, policy_version 13682 (0.0009) +[2023-10-11 15:29:41,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 28180480. Throughput: 0: 1662.4, 1: 1701.0. Samples: 7061112. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:29:41,063][84230] Avg episode reward: [(0, '6.750'), (1, '6.850')] +[2023-10-11 15:29:41,347][85175] Updated weights for policy 1, policy_version 13880 (0.0008) +[2023-10-11 15:29:41,382][85176] Updated weights for policy 0, policy_version 13692 (0.0007) +[2023-10-11 15:29:45,497][85176] Updated weights for policy 0, policy_version 13702 (0.0008) +[2023-10-11 15:29:45,499][85175] Updated weights for policy 1, policy_version 13890 (0.0007) +[2023-10-11 15:29:45,876][85176] Updated weights for policy 0, policy_version 13712 (0.0007) +[2023-10-11 15:29:45,912][85175] Updated weights for policy 1, policy_version 13900 (0.0008) +[2023-10-11 15:29:46,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13218.3). Total num frames: 28246016. Throughput: 0: 1654.7, 1: 1690.1. Samples: 7080998. Policy #0 lag: (min: 21.0, avg: 22.3, max: 46.0) +[2023-10-11 15:29:46,064][84230] Avg episode reward: [(0, '6.970'), (1, '7.120')] +[2023-10-11 15:29:46,250][85176] Updated weights for policy 0, policy_version 13722 (0.0007) +[2023-10-11 15:29:46,289][85175] Updated weights for policy 1, policy_version 13910 (0.0009) +[2023-10-11 15:29:46,651][85175] Updated weights for policy 1, policy_version 13920 (0.0009) +[2023-10-11 15:29:50,261][85176] Updated weights for policy 0, policy_version 13732 (0.0007) +[2023-10-11 15:29:50,634][85176] Updated weights for policy 0, policy_version 13742 (0.0008) +[2023-10-11 15:29:50,745][85175] Updated weights for policy 1, policy_version 13930 (0.0009) +[2023-10-11 15:29:51,008][85176] Updated weights for policy 0, policy_version 13752 (0.0008) +[2023-10-11 15:29:51,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 28311552. Throughput: 0: 1660.0, 1: 1686.8. Samples: 7090304. 
Policy #0 lag: (min: 21.0, avg: 22.3, max: 46.0) +[2023-10-11 15:29:51,064][84230] Avg episode reward: [(0, '7.730'), (1, '7.100')] +[2023-10-11 15:29:51,102][85175] Updated weights for policy 1, policy_version 13940 (0.0008) +[2023-10-11 15:29:51,470][85175] Updated weights for policy 1, policy_version 13950 (0.0009) +[2023-10-11 15:29:55,308][85176] Updated weights for policy 0, policy_version 13762 (0.0009) +[2023-10-11 15:29:55,462][85175] Updated weights for policy 1, policy_version 13960 (0.0007) +[2023-10-11 15:29:55,681][85176] Updated weights for policy 0, policy_version 13772 (0.0007) +[2023-10-11 15:29:55,829][85175] Updated weights for policy 1, policy_version 13970 (0.0008) +[2023-10-11 15:29:56,046][85176] Updated weights for policy 0, policy_version 13782 (0.0010) +[2023-10-11 15:29:56,062][84230] Fps is (10 sec: 13107.8, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 28377088. Throughput: 0: 1656.4, 1: 1691.2. Samples: 7110838. Policy #0 lag: (min: 21.0, avg: 22.3, max: 46.0) +[2023-10-11 15:29:56,063][84230] Avg episode reward: [(0, '7.510'), (1, '7.090')] +[2023-10-11 15:29:56,196][85175] Updated weights for policy 1, policy_version 13980 (0.0007) +[2023-10-11 15:29:56,421][85176] Updated weights for policy 0, policy_version 13792 (0.0009) +[2023-10-11 15:30:00,219][85175] Updated weights for policy 1, policy_version 13990 (0.0010) +[2023-10-11 15:30:00,582][85175] Updated weights for policy 1, policy_version 14000 (0.0010) +[2023-10-11 15:30:00,584][85176] Updated weights for policy 0, policy_version 13802 (0.0008) +[2023-10-11 15:30:00,954][85175] Updated weights for policy 1, policy_version 14010 (0.0007) +[2023-10-11 15:30:00,957][85176] Updated weights for policy 0, policy_version 13812 (0.0008) +[2023-10-11 15:30:01,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13218.3). Total num frames: 28442624. Throughput: 0: 1647.7, 1: 1679.6. Samples: 7130234. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:30:01,064][84230] Avg episode reward: [(0, '6.530'), (1, '7.540')] +[2023-10-11 15:30:01,316][85176] Updated weights for policy 0, policy_version 13822 (0.0009) +[2023-10-11 15:30:04,880][85175] Updated weights for policy 1, policy_version 14020 (0.0007) +[2023-10-11 15:30:05,248][85175] Updated weights for policy 1, policy_version 14030 (0.0008) +[2023-10-11 15:30:05,478][85176] Updated weights for policy 0, policy_version 13832 (0.0008) +[2023-10-11 15:30:05,615][85175] Updated weights for policy 1, policy_version 14040 (0.0007) +[2023-10-11 15:30:05,860][85176] Updated weights for policy 0, policy_version 13842 (0.0009) +[2023-10-11 15:30:06,063][84230] Fps is (10 sec: 16383.5, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 28540928. Throughput: 0: 1649.1, 1: 1695.5. Samples: 7140386. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:30:06,064][84230] Avg episode reward: [(0, '6.350'), (1, '7.110')] +[2023-10-11 15:30:06,239][85176] Updated weights for policy 0, policy_version 13852 (0.0009) +[2023-10-11 15:30:09,648][85175] Updated weights for policy 1, policy_version 14050 (0.0007) +[2023-10-11 15:30:10,018][85175] Updated weights for policy 1, policy_version 14060 (0.0007) +[2023-10-11 15:30:10,278][85176] Updated weights for policy 0, policy_version 13862 (0.0009) +[2023-10-11 15:30:10,385][85175] Updated weights for policy 1, policy_version 14070 (0.0008) +[2023-10-11 15:30:10,654][85176] Updated weights for policy 0, policy_version 13872 (0.0009) +[2023-10-11 15:30:10,754][85175] Updated weights for policy 1, policy_version 14080 (0.0007) +[2023-10-11 15:30:11,023][85176] Updated weights for policy 0, policy_version 13882 (0.0009) +[2023-10-11 15:30:11,063][84230] Fps is (10 sec: 16384.5, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 28606464. Throughput: 0: 1650.0, 1: 1693.6. Samples: 7161020. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:30:11,064][84230] Avg episode reward: [(0, '6.520'), (1, '7.010')] +[2023-10-11 15:30:14,711][85175] Updated weights for policy 1, policy_version 14090 (0.0009) +[2023-10-11 15:30:15,079][85175] Updated weights for policy 1, policy_version 14100 (0.0009) +[2023-10-11 15:30:15,185][85176] Updated weights for policy 0, policy_version 13892 (0.0008) +[2023-10-11 15:30:15,445][85175] Updated weights for policy 1, policy_version 14110 (0.0009) +[2023-10-11 15:30:15,553][85176] Updated weights for policy 0, policy_version 13902 (0.0010) +[2023-10-11 15:30:15,934][85176] Updated weights for policy 0, policy_version 13912 (0.0007) +[2023-10-11 15:30:16,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 28672000. Throughput: 0: 1645.8, 1: 1665.4. Samples: 7180094. Policy #0 lag: (min: 18.0, avg: 25.7, max: 50.0) +[2023-10-11 15:30:16,064][84230] Avg episode reward: [(0, '6.850'), (1, '6.880')] +[2023-10-11 15:30:16,073][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000014112_14450688.pth... +[2023-10-11 15:30:16,104][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000012512_12812288.pth +[2023-10-11 15:30:16,220][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000013920_14254080.pth... 
+[2023-10-11 15:30:16,259][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000012352_12648448.pth +[2023-10-11 15:30:19,502][85175] Updated weights for policy 1, policy_version 14120 (0.0009) +[2023-10-11 15:30:19,872][85175] Updated weights for policy 1, policy_version 14130 (0.0007) +[2023-10-11 15:30:20,148][85176] Updated weights for policy 0, policy_version 13922 (0.0008) +[2023-10-11 15:30:20,228][85175] Updated weights for policy 1, policy_version 14140 (0.0008) +[2023-10-11 15:30:20,523][85176] Updated weights for policy 0, policy_version 13932 (0.0008) +[2023-10-11 15:30:20,897][85176] Updated weights for policy 0, policy_version 13942 (0.0007) +[2023-10-11 15:30:21,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 28737536. Throughput: 0: 1650.0, 1: 1688.4. Samples: 7190878. Policy #0 lag: (min: 18.0, avg: 25.7, max: 50.0) +[2023-10-11 15:30:21,064][84230] Avg episode reward: [(0, '7.560'), (1, '7.510')] +[2023-10-11 15:30:21,261][85176] Updated weights for policy 0, policy_version 13952 (0.0009) +[2023-10-11 15:30:24,244][85175] Updated weights for policy 1, policy_version 14150 (0.0010) +[2023-10-11 15:30:24,621][85175] Updated weights for policy 1, policy_version 14160 (0.0011) +[2023-10-11 15:30:24,987][85175] Updated weights for policy 1, policy_version 14170 (0.0010) +[2023-10-11 15:30:25,372][85176] Updated weights for policy 0, policy_version 13962 (0.0008) +[2023-10-11 15:30:25,746][85176] Updated weights for policy 0, policy_version 13972 (0.0010) +[2023-10-11 15:30:26,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 28803072. Throughput: 0: 1645.8, 1: 1685.4. Samples: 7211016. 
Policy #0 lag: (min: 18.0, avg: 25.7, max: 50.0) +[2023-10-11 15:30:26,064][84230] Avg episode reward: [(0, '7.900'), (1, '7.600')] +[2023-10-11 15:30:26,132][85176] Updated weights for policy 0, policy_version 13982 (0.0009) +[2023-10-11 15:30:29,053][85175] Updated weights for policy 1, policy_version 14180 (0.0010) +[2023-10-11 15:30:29,409][85175] Updated weights for policy 1, policy_version 14190 (0.0010) +[2023-10-11 15:30:29,776][85175] Updated weights for policy 1, policy_version 14200 (0.0008) +[2023-10-11 15:30:30,014][85176] Updated weights for policy 0, policy_version 13992 (0.0007) +[2023-10-11 15:30:30,386][85176] Updated weights for policy 0, policy_version 14002 (0.0007) +[2023-10-11 15:30:30,759][85176] Updated weights for policy 0, policy_version 14012 (0.0008) +[2023-10-11 15:30:31,063][84230] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 28901376. Throughput: 0: 1634.5, 1: 1676.1. Samples: 7229970. Policy #0 lag: (min: 40.0, avg: 55.2, max: 56.0) +[2023-10-11 15:30:31,064][84230] Avg episode reward: [(0, '7.080'), (1, '7.480')] +[2023-10-11 15:30:33,848][85175] Updated weights for policy 1, policy_version 14210 (0.0007) +[2023-10-11 15:30:34,223][85175] Updated weights for policy 1, policy_version 14220 (0.0010) +[2023-10-11 15:30:34,598][85175] Updated weights for policy 1, policy_version 14230 (0.0009) +[2023-10-11 15:30:34,784][85176] Updated weights for policy 0, policy_version 14022 (0.0010) +[2023-10-11 15:30:34,952][85175] Updated weights for policy 1, policy_version 14240 (0.0009) +[2023-10-11 15:30:35,169][85176] Updated weights for policy 0, policy_version 14032 (0.0009) +[2023-10-11 15:30:35,530][85176] Updated weights for policy 0, policy_version 14042 (0.0009) +[2023-10-11 15:30:36,063][84230] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 28966912. Throughput: 0: 1651.3, 1: 1704.6. Samples: 7241320. 
Policy #0 lag: (min: 40.0, avg: 55.2, max: 56.0) +[2023-10-11 15:30:36,063][84230] Avg episode reward: [(0, '6.310'), (1, '7.020')] +[2023-10-11 15:30:38,861][85175] Updated weights for policy 1, policy_version 14250 (0.0008) +[2023-10-11 15:30:39,240][85175] Updated weights for policy 1, policy_version 14260 (0.0008) +[2023-10-11 15:30:39,601][85175] Updated weights for policy 1, policy_version 14270 (0.0008) +[2023-10-11 15:30:39,689][85176] Updated weights for policy 0, policy_version 14052 (0.0007) +[2023-10-11 15:30:40,060][85176] Updated weights for policy 0, policy_version 14062 (0.0008) +[2023-10-11 15:30:40,432][85176] Updated weights for policy 0, policy_version 14072 (0.0010) +[2023-10-11 15:30:41,063][84230] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 29032448. Throughput: 0: 1655.9, 1: 1679.0. Samples: 7260908. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 15:30:41,063][84230] Avg episode reward: [(0, '6.640'), (1, '6.780')] +[2023-10-11 15:30:43,753][85175] Updated weights for policy 1, policy_version 14280 (0.0009) +[2023-10-11 15:30:44,121][85175] Updated weights for policy 1, policy_version 14290 (0.0008) +[2023-10-11 15:30:44,488][85175] Updated weights for policy 1, policy_version 14300 (0.0010) +[2023-10-11 15:30:44,513][85176] Updated weights for policy 0, policy_version 14082 (0.0008) +[2023-10-11 15:30:44,880][85176] Updated weights for policy 0, policy_version 14092 (0.0007) +[2023-10-11 15:30:45,253][85176] Updated weights for policy 0, policy_version 14102 (0.0007) +[2023-10-11 15:30:45,626][85176] Updated weights for policy 0, policy_version 14112 (0.0009) +[2023-10-11 15:30:46,063][84230] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 29097984. Throughput: 0: 1648.9, 1: 1690.0. Samples: 7280486. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 15:30:46,064][84230] Avg episode reward: [(0, '7.330'), (1, '7.140')] +[2023-10-11 15:30:48,576][85175] Updated weights for policy 1, policy_version 14310 (0.0007) +[2023-10-11 15:30:48,945][85175] Updated weights for policy 1, policy_version 14320 (0.0008) +[2023-10-11 15:30:49,308][85175] Updated weights for policy 1, policy_version 14330 (0.0010) +[2023-10-11 15:30:49,864][85176] Updated weights for policy 0, policy_version 14122 (0.0008) +[2023-10-11 15:30:50,242][85176] Updated weights for policy 0, policy_version 14132 (0.0008) +[2023-10-11 15:30:50,609][85176] Updated weights for policy 0, policy_version 14142 (0.0007) +[2023-10-11 15:30:51,062][84230] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 29163520. Throughput: 0: 1664.2, 1: 1694.5. Samples: 7291524. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 15:30:51,063][84230] Avg episode reward: [(0, '8.220'), (1, '7.440')] +[2023-10-11 15:30:53,417][85175] Updated weights for policy 1, policy_version 14340 (0.0007) +[2023-10-11 15:30:53,782][85175] Updated weights for policy 1, policy_version 14350 (0.0009) +[2023-10-11 15:30:54,158][85175] Updated weights for policy 1, policy_version 14360 (0.0008) +[2023-10-11 15:30:54,803][85176] Updated weights for policy 0, policy_version 14152 (0.0011) +[2023-10-11 15:30:55,181][85176] Updated weights for policy 0, policy_version 14162 (0.0008) +[2023-10-11 15:30:55,547][85176] Updated weights for policy 0, policy_version 14172 (0.0009) +[2023-10-11 15:30:56,063][84230] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 29229056. Throughput: 0: 1665.6, 1: 1668.1. Samples: 7311034. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 15:30:56,063][84230] Avg episode reward: [(0, '7.650'), (1, '7.560')] +[2023-10-11 15:30:57,992][85175] Updated weights for policy 1, policy_version 14370 (0.0008) +[2023-10-11 15:30:58,357][85175] Updated weights for policy 1, policy_version 14380 (0.0011) +[2023-10-11 15:30:58,726][85175] Updated weights for policy 1, policy_version 14390 (0.0009) +[2023-10-11 15:30:59,090][85175] Updated weights for policy 1, policy_version 14400 (0.0010) +[2023-10-11 15:30:59,453][85176] Updated weights for policy 0, policy_version 14182 (0.0009) +[2023-10-11 15:30:59,821][85176] Updated weights for policy 0, policy_version 14192 (0.0009) +[2023-10-11 15:31:00,196][85176] Updated weights for policy 0, policy_version 14202 (0.0008) +[2023-10-11 15:31:01,062][84230] Fps is (10 sec: 13107.2, 60 sec: 14199.6, 300 sec: 13440.4). Total num frames: 29294592. Throughput: 0: 1651.0, 1: 1695.3. Samples: 7330674. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 15:31:01,063][84230] Avg episode reward: [(0, '6.970'), (1, '7.630')] +[2023-10-11 15:31:03,216][85175] Updated weights for policy 1, policy_version 14410 (0.0008) +[2023-10-11 15:31:03,584][85175] Updated weights for policy 1, policy_version 14420 (0.0008) +[2023-10-11 15:31:03,950][85175] Updated weights for policy 1, policy_version 14430 (0.0010) +[2023-10-11 15:31:04,297][85176] Updated weights for policy 0, policy_version 14212 (0.0009) +[2023-10-11 15:31:04,675][85176] Updated weights for policy 0, policy_version 14222 (0.0009) +[2023-10-11 15:31:05,054][85176] Updated weights for policy 0, policy_version 14232 (0.0011) +[2023-10-11 15:31:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 29360128. Throughput: 0: 1667.9, 1: 1683.5. Samples: 7341688. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 15:31:06,063][84230] Avg episode reward: [(0, '6.530'), (1, '7.130')] +[2023-10-11 15:31:07,839][85175] Updated weights for policy 1, policy_version 14440 (0.0009) +[2023-10-11 15:31:08,215][85175] Updated weights for policy 1, policy_version 14450 (0.0009) +[2023-10-11 15:31:08,579][85175] Updated weights for policy 1, policy_version 14460 (0.0007) +[2023-10-11 15:31:09,086][85176] Updated weights for policy 0, policy_version 14242 (0.0010) +[2023-10-11 15:31:09,462][85176] Updated weights for policy 0, policy_version 14252 (0.0009) +[2023-10-11 15:31:09,832][85176] Updated weights for policy 0, policy_version 14262 (0.0009) +[2023-10-11 15:31:10,215][85176] Updated weights for policy 0, policy_version 14272 (0.0010) +[2023-10-11 15:31:11,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 29425664. Throughput: 0: 1659.7, 1: 1681.6. Samples: 7361376. Policy #0 lag: (min: 31.0, avg: 42.8, max: 63.0) +[2023-10-11 15:31:11,064][84230] Avg episode reward: [(0, '6.630'), (1, '7.490')] +[2023-10-11 15:31:12,575][85175] Updated weights for policy 1, policy_version 14470 (0.0008) +[2023-10-11 15:31:12,934][85175] Updated weights for policy 1, policy_version 14480 (0.0007) +[2023-10-11 15:31:13,306][85175] Updated weights for policy 1, policy_version 14490 (0.0007) +[2023-10-11 15:31:14,364][85176] Updated weights for policy 0, policy_version 14282 (0.0007) +[2023-10-11 15:31:14,734][85176] Updated weights for policy 0, policy_version 14292 (0.0007) +[2023-10-11 15:31:15,105][85176] Updated weights for policy 0, policy_version 14302 (0.0008) +[2023-10-11 15:31:16,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 29491200. Throughput: 0: 1664.7, 1: 1703.1. Samples: 7381520. 
Policy #0 lag: (min: 31.0, avg: 42.8, max: 63.0) +[2023-10-11 15:31:16,063][84230] Avg episode reward: [(0, '6.520'), (1, '7.150')] +[2023-10-11 15:31:17,477][85175] Updated weights for policy 1, policy_version 14500 (0.0009) +[2023-10-11 15:31:17,850][85175] Updated weights for policy 1, policy_version 14510 (0.0009) +[2023-10-11 15:31:18,216][85175] Updated weights for policy 1, policy_version 14520 (0.0008) +[2023-10-11 15:31:19,195][85176] Updated weights for policy 0, policy_version 14312 (0.0008) +[2023-10-11 15:31:19,566][85176] Updated weights for policy 0, policy_version 14322 (0.0009) +[2023-10-11 15:31:19,938][85176] Updated weights for policy 0, policy_version 14332 (0.0011) +[2023-10-11 15:31:21,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 29556736. Throughput: 0: 1674.1, 1: 1675.1. Samples: 7392034. Policy #0 lag: (min: 31.0, avg: 42.8, max: 63.0) +[2023-10-11 15:31:21,064][84230] Avg episode reward: [(0, '6.530'), (1, '7.140')] +[2023-10-11 15:31:22,073][85175] Updated weights for policy 1, policy_version 14530 (0.0008) +[2023-10-11 15:31:22,449][85175] Updated weights for policy 1, policy_version 14540 (0.0008) +[2023-10-11 15:31:22,816][85175] Updated weights for policy 1, policy_version 14550 (0.0008) +[2023-10-11 15:31:23,177][85175] Updated weights for policy 1, policy_version 14560 (0.0008) +[2023-10-11 15:31:24,254][85176] Updated weights for policy 0, policy_version 14342 (0.0010) +[2023-10-11 15:31:24,629][85176] Updated weights for policy 0, policy_version 14352 (0.0009) +[2023-10-11 15:31:25,006][85176] Updated weights for policy 0, policy_version 14362 (0.0008) +[2023-10-11 15:31:26,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 29622272. Throughput: 0: 1653.2, 1: 1699.3. Samples: 7411772. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-11 15:31:26,064][84230] Avg episode reward: [(0, '6.530'), (1, '7.140')] +[2023-10-11 15:31:27,334][85175] Updated weights for policy 1, policy_version 14570 (0.0008) +[2023-10-11 15:31:27,707][85175] Updated weights for policy 1, policy_version 14580 (0.0010) +[2023-10-11 15:31:28,080][85175] Updated weights for policy 1, policy_version 14590 (0.0008) +[2023-10-11 15:31:29,137][85176] Updated weights for policy 0, policy_version 14372 (0.0009) +[2023-10-11 15:31:29,506][85176] Updated weights for policy 0, policy_version 14382 (0.0009) +[2023-10-11 15:31:29,878][85176] Updated weights for policy 0, policy_version 14392 (0.0007) +[2023-10-11 15:31:31,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 29687808. Throughput: 0: 1656.4, 1: 1700.6. Samples: 7431550. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-11 15:31:31,064][84230] Avg episode reward: [(0, '6.850'), (1, '6.950')] +[2023-10-11 15:31:32,198][85175] Updated weights for policy 1, policy_version 14600 (0.0008) +[2023-10-11 15:31:32,562][85175] Updated weights for policy 1, policy_version 14610 (0.0007) +[2023-10-11 15:31:32,933][85175] Updated weights for policy 1, policy_version 14620 (0.0007) +[2023-10-11 15:31:33,880][85176] Updated weights for policy 0, policy_version 14402 (0.0007) +[2023-10-11 15:31:34,250][85176] Updated weights for policy 0, policy_version 14412 (0.0009) +[2023-10-11 15:31:34,613][85176] Updated weights for policy 0, policy_version 14422 (0.0008) +[2023-10-11 15:31:34,991][85176] Updated weights for policy 0, policy_version 14432 (0.0007) +[2023-10-11 15:31:36,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 29753344. Throughput: 0: 1665.1, 1: 1679.0. Samples: 7442008. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-11 15:31:36,064][84230] Avg episode reward: [(0, '6.850'), (1, '7.310')] +[2023-10-11 15:31:36,949][85175] Updated weights for policy 1, policy_version 14630 (0.0009) +[2023-10-11 15:31:37,318][85175] Updated weights for policy 1, policy_version 14640 (0.0007) +[2023-10-11 15:31:37,679][85175] Updated weights for policy 1, policy_version 14650 (0.0007) +[2023-10-11 15:31:39,176][85176] Updated weights for policy 0, policy_version 14442 (0.0009) +[2023-10-11 15:31:39,547][85176] Updated weights for policy 0, policy_version 14452 (0.0007) +[2023-10-11 15:31:39,921][85176] Updated weights for policy 0, policy_version 14462 (0.0009) +[2023-10-11 15:31:41,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 29818880. Throughput: 0: 1651.2, 1: 1704.0. Samples: 7462014. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:31:41,063][84230] Avg episode reward: [(0, '6.640'), (1, '7.220')] +[2023-10-11 15:31:41,681][85175] Updated weights for policy 1, policy_version 14660 (0.0007) +[2023-10-11 15:31:42,048][85175] Updated weights for policy 1, policy_version 14670 (0.0007) +[2023-10-11 15:31:42,426][85175] Updated weights for policy 1, policy_version 14680 (0.0008) +[2023-10-11 15:31:43,969][85176] Updated weights for policy 0, policy_version 14472 (0.0007) +[2023-10-11 15:31:44,344][85176] Updated weights for policy 0, policy_version 14482 (0.0009) +[2023-10-11 15:31:44,714][85176] Updated weights for policy 0, policy_version 14492 (0.0007) +[2023-10-11 15:31:46,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 29884416. Throughput: 0: 1668.3, 1: 1703.2. Samples: 7482392. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:31:46,064][84230] Avg episode reward: [(0, '7.120'), (1, '7.240')] +[2023-10-11 15:31:46,451][85175] Updated weights for policy 1, policy_version 14690 (0.0008) +[2023-10-11 15:31:46,818][85175] Updated weights for policy 1, policy_version 14700 (0.0009) +[2023-10-11 15:31:47,189][85175] Updated weights for policy 1, policy_version 14710 (0.0007) +[2023-10-11 15:31:47,549][85175] Updated weights for policy 1, policy_version 14720 (0.0009) +[2023-10-11 15:31:48,705][85176] Updated weights for policy 0, policy_version 14502 (0.0009) +[2023-10-11 15:31:49,090][85176] Updated weights for policy 0, policy_version 14512 (0.0010) +[2023-10-11 15:31:49,454][85176] Updated weights for policy 0, policy_version 14522 (0.0007) +[2023-10-11 15:31:51,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 29949952. Throughput: 0: 1665.3, 1: 1688.1. Samples: 7492592. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:31:51,063][84230] Avg episode reward: [(0, '7.080'), (1, '7.220')] +[2023-10-11 15:31:51,573][85175] Updated weights for policy 1, policy_version 14730 (0.0008) +[2023-10-11 15:31:51,943][85175] Updated weights for policy 1, policy_version 14740 (0.0008) +[2023-10-11 15:31:52,307][85175] Updated weights for policy 1, policy_version 14750 (0.0007) +[2023-10-11 15:31:53,625][85176] Updated weights for policy 0, policy_version 14532 (0.0009) +[2023-10-11 15:31:53,993][85176] Updated weights for policy 0, policy_version 14542 (0.0008) +[2023-10-11 15:31:54,371][85176] Updated weights for policy 0, policy_version 14552 (0.0009) +[2023-10-11 15:31:56,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 30015488. Throughput: 0: 1653.5, 1: 1702.4. Samples: 7512394. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:31:56,063][84230] Avg episode reward: [(0, '6.750'), (1, '7.640')] +[2023-10-11 15:31:56,283][85175] Updated weights for policy 1, policy_version 14760 (0.0008) +[2023-10-11 15:31:56,647][85175] Updated weights for policy 1, policy_version 14770 (0.0010) +[2023-10-11 15:31:57,026][85175] Updated weights for policy 1, policy_version 14780 (0.0008) +[2023-10-11 15:31:58,201][85176] Updated weights for policy 0, policy_version 14562 (0.0009) +[2023-10-11 15:31:58,581][85176] Updated weights for policy 0, policy_version 14572 (0.0008) +[2023-10-11 15:31:58,959][85176] Updated weights for policy 0, policy_version 14582 (0.0011) +[2023-10-11 15:31:59,323][85176] Updated weights for policy 0, policy_version 14592 (0.0010) +[2023-10-11 15:32:01,005][85175] Updated weights for policy 1, policy_version 14790 (0.0009) +[2023-10-11 15:32:01,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 30081024. Throughput: 0: 1669.9, 1: 1702.8. Samples: 7533292. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:32:01,063][84230] Avg episode reward: [(0, '6.750'), (1, '7.500')] +[2023-10-11 15:32:01,369][85175] Updated weights for policy 1, policy_version 14800 (0.0009) +[2023-10-11 15:32:01,748][85175] Updated weights for policy 1, policy_version 14810 (0.0008) +[2023-10-11 15:32:03,470][85176] Updated weights for policy 0, policy_version 14602 (0.0007) +[2023-10-11 15:32:03,841][85176] Updated weights for policy 0, policy_version 14612 (0.0009) +[2023-10-11 15:32:04,220][85176] Updated weights for policy 0, policy_version 14622 (0.0009) +[2023-10-11 15:32:05,901][85175] Updated weights for policy 1, policy_version 14820 (0.0007) +[2023-10-11 15:32:06,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 30146560. Throughput: 0: 1657.6, 1: 1699.6. Samples: 7543108. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:32:06,063][84230] Avg episode reward: [(0, '7.080'), (1, '7.350')] +[2023-10-11 15:32:06,270][85175] Updated weights for policy 1, policy_version 14830 (0.0009) +[2023-10-11 15:32:06,647][85175] Updated weights for policy 1, policy_version 14840 (0.0008) +[2023-10-11 15:32:08,403][85176] Updated weights for policy 0, policy_version 14632 (0.0009) +[2023-10-11 15:32:08,766][85176] Updated weights for policy 0, policy_version 14642 (0.0008) +[2023-10-11 15:32:09,150][85176] Updated weights for policy 0, policy_version 14652 (0.0010) +[2023-10-11 15:32:10,594][85175] Updated weights for policy 1, policy_version 14850 (0.0010) +[2023-10-11 15:32:10,962][85175] Updated weights for policy 1, policy_version 14860 (0.0009) +[2023-10-11 15:32:11,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 30212096. Throughput: 0: 1658.5, 1: 1694.9. Samples: 7562674. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 15:32:11,063][84230] Avg episode reward: [(0, '6.750'), (1, '7.350')] +[2023-10-11 15:32:11,328][85175] Updated weights for policy 1, policy_version 14870 (0.0010) +[2023-10-11 15:32:11,696][85175] Updated weights for policy 1, policy_version 14880 (0.0009) +[2023-10-11 15:32:13,316][85176] Updated weights for policy 0, policy_version 14662 (0.0010) +[2023-10-11 15:32:13,701][85176] Updated weights for policy 0, policy_version 14672 (0.0009) +[2023-10-11 15:32:14,080][85176] Updated weights for policy 0, policy_version 14682 (0.0008) +[2023-10-11 15:32:15,929][85175] Updated weights for policy 1, policy_version 14890 (0.0009) +[2023-10-11 15:32:16,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 30277632. Throughput: 0: 1670.9, 1: 1692.8. Samples: 7582914. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 15:32:16,064][84230] Avg episode reward: [(0, '6.640'), (1, '6.960')] +[2023-10-11 15:32:16,073][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000014688_15040512.pth... +[2023-10-11 15:32:16,104][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000013120_13434880.pth +[2023-10-11 15:32:16,302][85175] Updated weights for policy 1, policy_version 14900 (0.0009) +[2023-10-11 15:32:16,669][85175] Updated weights for policy 1, policy_version 14910 (0.0009) +[2023-10-11 15:32:16,741][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000014912_15269888.pth... +[2023-10-11 15:32:16,770][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000013312_13631488.pth +[2023-10-11 15:32:18,331][85176] Updated weights for policy 0, policy_version 14692 (0.0008) +[2023-10-11 15:32:18,700][85176] Updated weights for policy 0, policy_version 14702 (0.0009) +[2023-10-11 15:32:19,072][85176] Updated weights for policy 0, policy_version 14712 (0.0009) +[2023-10-11 15:32:20,586][85175] Updated weights for policy 1, policy_version 14920 (0.0009) +[2023-10-11 15:32:20,955][85175] Updated weights for policy 1, policy_version 14930 (0.0007) +[2023-10-11 15:32:21,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 30343168. Throughput: 0: 1658.1, 1: 1693.0. Samples: 7592808. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 15:32:21,063][84230] Avg episode reward: [(0, '6.530'), (1, '7.300')] +[2023-10-11 15:32:21,327][85175] Updated weights for policy 1, policy_version 14940 (0.0008) +[2023-10-11 15:32:23,292][85176] Updated weights for policy 0, policy_version 14722 (0.0010) +[2023-10-11 15:32:23,664][85176] Updated weights for policy 0, policy_version 14732 (0.0008) +[2023-10-11 15:32:24,039][85176] Updated weights for policy 0, policy_version 14742 (0.0007) +[2023-10-11 15:32:24,413][85176] Updated weights for policy 0, policy_version 14752 (0.0009) +[2023-10-11 15:32:25,424][85175] Updated weights for policy 1, policy_version 14950 (0.0009) +[2023-10-11 15:32:25,798][85175] Updated weights for policy 1, policy_version 14960 (0.0007) +[2023-10-11 15:32:26,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 30408704. Throughput: 0: 1649.9, 1: 1693.0. Samples: 7612444. Policy #0 lag: (min: 5.0, avg: 20.3, max: 37.0) +[2023-10-11 15:32:26,063][84230] Avg episode reward: [(0, '6.530'), (1, '8.020')] +[2023-10-11 15:32:26,171][85175] Updated weights for policy 1, policy_version 14970 (0.0010) +[2023-10-11 15:32:26,384][85000] Saving new best policy, reward=8.020! +[2023-10-11 15:32:28,475][85176] Updated weights for policy 0, policy_version 14762 (0.0010) +[2023-10-11 15:32:28,847][85176] Updated weights for policy 0, policy_version 14772 (0.0010) +[2023-10-11 15:32:29,218][85176] Updated weights for policy 0, policy_version 14782 (0.0011) +[2023-10-11 15:32:30,209][85175] Updated weights for policy 1, policy_version 14980 (0.0008) +[2023-10-11 15:32:30,572][85175] Updated weights for policy 1, policy_version 14990 (0.0007) +[2023-10-11 15:32:30,947][85175] Updated weights for policy 1, policy_version 15000 (0.0007) +[2023-10-11 15:32:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 30474240. Throughput: 0: 1656.3, 1: 1681.1. 
Samples: 7632574. Policy #0 lag: (min: 5.0, avg: 20.3, max: 37.0) +[2023-10-11 15:32:31,063][84230] Avg episode reward: [(0, '6.970'), (1, '7.830')] +[2023-10-11 15:32:33,488][85176] Updated weights for policy 0, policy_version 14792 (0.0008) +[2023-10-11 15:32:33,853][85176] Updated weights for policy 0, policy_version 14802 (0.0008) +[2023-10-11 15:32:34,228][85176] Updated weights for policy 0, policy_version 14812 (0.0009) +[2023-10-11 15:32:34,941][85175] Updated weights for policy 1, policy_version 15010 (0.0009) +[2023-10-11 15:32:35,319][85175] Updated weights for policy 1, policy_version 15020 (0.0010) +[2023-10-11 15:32:35,691][85175] Updated weights for policy 1, policy_version 15030 (0.0008) +[2023-10-11 15:32:36,060][85175] Updated weights for policy 1, policy_version 15040 (0.0011) +[2023-10-11 15:32:36,063][84230] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 30572544. Throughput: 0: 1650.5, 1: 1690.7. Samples: 7642944. Policy #0 lag: (min: 5.0, avg: 20.3, max: 37.0) +[2023-10-11 15:32:36,064][84230] Avg episode reward: [(0, '6.860'), (1, '7.270')] +[2023-10-11 15:32:38,074][85176] Updated weights for policy 0, policy_version 14822 (0.0007) +[2023-10-11 15:32:38,450][85176] Updated weights for policy 0, policy_version 14832 (0.0007) +[2023-10-11 15:32:38,824][85176] Updated weights for policy 0, policy_version 14842 (0.0008) +[2023-10-11 15:32:40,295][85175] Updated weights for policy 1, policy_version 15050 (0.0007) +[2023-10-11 15:32:40,664][85175] Updated weights for policy 1, policy_version 15060 (0.0008) +[2023-10-11 15:32:41,021][85175] Updated weights for policy 1, policy_version 15070 (0.0009) +[2023-10-11 15:32:41,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 30605312. Throughput: 0: 1656.8, 1: 1681.6. Samples: 7662624. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:32:41,063][84230] Avg episode reward: [(0, '6.640'), (1, '6.980')] +[2023-10-11 15:32:43,098][85176] Updated weights for policy 0, policy_version 14852 (0.0007) +[2023-10-11 15:32:43,471][85176] Updated weights for policy 0, policy_version 14862 (0.0007) +[2023-10-11 15:32:43,848][85176] Updated weights for policy 0, policy_version 14872 (0.0008) +[2023-10-11 15:32:45,174][85175] Updated weights for policy 1, policy_version 15080 (0.0009) +[2023-10-11 15:32:45,544][85175] Updated weights for policy 1, policy_version 15090 (0.0009) +[2023-10-11 15:32:45,912][85175] Updated weights for policy 1, policy_version 15100 (0.0008) +[2023-10-11 15:32:46,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 30703616. Throughput: 0: 1656.6, 1: 1661.4. Samples: 7682602. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:32:46,063][84230] Avg episode reward: [(0, '6.420'), (1, '6.980')] +[2023-10-11 15:32:48,059][85176] Updated weights for policy 0, policy_version 14882 (0.0009) +[2023-10-11 15:32:48,427][85176] Updated weights for policy 0, policy_version 14892 (0.0009) +[2023-10-11 15:32:48,802][85176] Updated weights for policy 0, policy_version 14902 (0.0008) +[2023-10-11 15:32:49,176][85176] Updated weights for policy 0, policy_version 14912 (0.0008) +[2023-10-11 15:32:49,889][85175] Updated weights for policy 1, policy_version 15110 (0.0009) +[2023-10-11 15:32:50,253][85175] Updated weights for policy 1, policy_version 15120 (0.0010) +[2023-10-11 15:32:50,634][85175] Updated weights for policy 1, policy_version 15130 (0.0010) +[2023-10-11 15:32:51,063][84230] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 30769152. Throughput: 0: 1651.8, 1: 1676.6. Samples: 7692886. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:32:51,063][84230] Avg episode reward: [(0, '6.530'), (1, '7.580')] +[2023-10-11 15:32:53,479][85176] Updated weights for policy 0, policy_version 14922 (0.0007) +[2023-10-11 15:32:53,850][85176] Updated weights for policy 0, policy_version 14932 (0.0010) +[2023-10-11 15:32:54,212][85176] Updated weights for policy 0, policy_version 14942 (0.0010) +[2023-10-11 15:32:54,865][85175] Updated weights for policy 1, policy_version 15140 (0.0009) +[2023-10-11 15:32:55,233][85175] Updated weights for policy 1, policy_version 15150 (0.0007) +[2023-10-11 15:32:55,591][85175] Updated weights for policy 1, policy_version 15160 (0.0007) +[2023-10-11 15:32:56,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 30834688. Throughput: 0: 1653.1, 1: 1677.3. Samples: 7712542. Policy #0 lag: (min: 5.0, avg: 5.1, max: 11.0) +[2023-10-11 15:32:56,063][84230] Avg episode reward: [(0, '6.970'), (1, '7.460')] +[2023-10-11 15:32:58,307][85176] Updated weights for policy 0, policy_version 14952 (0.0007) +[2023-10-11 15:32:58,682][85176] Updated weights for policy 0, policy_version 14962 (0.0009) +[2023-10-11 15:32:59,061][85176] Updated weights for policy 0, policy_version 14972 (0.0007) +[2023-10-11 15:32:59,798][85175] Updated weights for policy 1, policy_version 15170 (0.0008) +[2023-10-11 15:33:00,171][85175] Updated weights for policy 1, policy_version 15180 (0.0008) +[2023-10-11 15:33:00,539][85175] Updated weights for policy 1, policy_version 15190 (0.0009) +[2023-10-11 15:33:00,910][85175] Updated weights for policy 1, policy_version 15200 (0.0007) +[2023-10-11 15:33:01,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 30900224. Throughput: 0: 1659.5, 1: 1662.4. Samples: 7732398. 
Policy #0 lag: (min: 5.0, avg: 5.1, max: 11.0) +[2023-10-11 15:33:01,063][84230] Avg episode reward: [(0, '6.960'), (1, '7.290')] +[2023-10-11 15:33:02,970][85176] Updated weights for policy 0, policy_version 14982 (0.0008) +[2023-10-11 15:33:03,345][85176] Updated weights for policy 0, policy_version 14992 (0.0009) +[2023-10-11 15:33:03,712][85176] Updated weights for policy 0, policy_version 15002 (0.0007) +[2023-10-11 15:33:04,975][85175] Updated weights for policy 1, policy_version 15210 (0.0007) +[2023-10-11 15:33:05,346][85175] Updated weights for policy 1, policy_version 15220 (0.0007) +[2023-10-11 15:33:05,720][85175] Updated weights for policy 1, policy_version 15230 (0.0007) +[2023-10-11 15:33:06,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 30965760. Throughput: 0: 1648.8, 1: 1685.9. Samples: 7742870. Policy #0 lag: (min: 5.0, avg: 5.1, max: 11.0) +[2023-10-11 15:33:06,063][84230] Avg episode reward: [(0, '7.180'), (1, '7.070')] +[2023-10-11 15:33:07,795][85176] Updated weights for policy 0, policy_version 15012 (0.0009) +[2023-10-11 15:33:08,167][85176] Updated weights for policy 0, policy_version 15022 (0.0009) +[2023-10-11 15:33:08,538][85176] Updated weights for policy 0, policy_version 15032 (0.0007) +[2023-10-11 15:33:09,490][85175] Updated weights for policy 1, policy_version 15240 (0.0008) +[2023-10-11 15:33:09,858][85175] Updated weights for policy 1, policy_version 15250 (0.0009) +[2023-10-11 15:33:10,219][85175] Updated weights for policy 1, policy_version 15260 (0.0007) +[2023-10-11 15:33:11,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 31031296. Throughput: 0: 1660.4, 1: 1679.3. Samples: 7762732. 
Policy #0 lag: (min: 31.0, avg: 32.0, max: 53.0) +[2023-10-11 15:33:11,063][84230] Avg episode reward: [(0, '7.190'), (1, '6.920')] +[2023-10-11 15:33:12,655][85176] Updated weights for policy 0, policy_version 15042 (0.0008) +[2023-10-11 15:33:13,028][85176] Updated weights for policy 0, policy_version 15052 (0.0007) +[2023-10-11 15:33:13,400][85176] Updated weights for policy 0, policy_version 15062 (0.0008) +[2023-10-11 15:33:13,778][85176] Updated weights for policy 0, policy_version 15072 (0.0008) +[2023-10-11 15:33:14,283][85175] Updated weights for policy 1, policy_version 15270 (0.0008) +[2023-10-11 15:33:14,650][85175] Updated weights for policy 1, policy_version 15280 (0.0008) +[2023-10-11 15:33:15,026][85175] Updated weights for policy 1, policy_version 15290 (0.0008) +[2023-10-11 15:33:16,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 31096832. Throughput: 0: 1668.4, 1: 1664.5. Samples: 7782556. Policy #0 lag: (min: 31.0, avg: 32.0, max: 53.0) +[2023-10-11 15:33:16,064][84230] Avg episode reward: [(0, '6.970'), (1, '7.390')] +[2023-10-11 15:33:17,712][85176] Updated weights for policy 0, policy_version 15082 (0.0007) +[2023-10-11 15:33:18,095][85176] Updated weights for policy 0, policy_version 15092 (0.0007) +[2023-10-11 15:33:18,477][85176] Updated weights for policy 0, policy_version 15102 (0.0008) +[2023-10-11 15:33:19,171][85175] Updated weights for policy 1, policy_version 15300 (0.0008) +[2023-10-11 15:33:19,539][85175] Updated weights for policy 1, policy_version 15310 (0.0009) +[2023-10-11 15:33:19,911][85175] Updated weights for policy 1, policy_version 15320 (0.0007) +[2023-10-11 15:33:21,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 31162368. Throughput: 0: 1649.3, 1: 1681.7. Samples: 7792840. 
Policy #0 lag: (min: 31.0, avg: 32.0, max: 53.0) +[2023-10-11 15:33:21,063][84230] Avg episode reward: [(0, '6.530'), (1, '7.720')] +[2023-10-11 15:33:22,476][85176] Updated weights for policy 0, policy_version 15112 (0.0011) +[2023-10-11 15:33:22,853][85176] Updated weights for policy 0, policy_version 15122 (0.0008) +[2023-10-11 15:33:23,225][85176] Updated weights for policy 0, policy_version 15132 (0.0007) +[2023-10-11 15:33:23,807][85175] Updated weights for policy 1, policy_version 15330 (0.0008) +[2023-10-11 15:33:24,186][85175] Updated weights for policy 1, policy_version 15340 (0.0007) +[2023-10-11 15:33:24,548][85175] Updated weights for policy 1, policy_version 15350 (0.0009) +[2023-10-11 15:33:24,923][85175] Updated weights for policy 1, policy_version 15360 (0.0007) +[2023-10-11 15:33:26,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 31227904. Throughput: 0: 1664.4, 1: 1672.5. Samples: 7812786. Policy #0 lag: (min: 24.0, avg: 46.7, max: 56.0) +[2023-10-11 15:33:26,063][84230] Avg episode reward: [(0, '6.340'), (1, '7.540')] +[2023-10-11 15:33:27,420][85176] Updated weights for policy 0, policy_version 15142 (0.0008) +[2023-10-11 15:33:27,793][85176] Updated weights for policy 0, policy_version 15152 (0.0011) +[2023-10-11 15:33:28,164][85176] Updated weights for policy 0, policy_version 15162 (0.0010) +[2023-10-11 15:33:29,187][85175] Updated weights for policy 1, policy_version 15370 (0.0011) +[2023-10-11 15:33:29,558][85175] Updated weights for policy 1, policy_version 15380 (0.0011) +[2023-10-11 15:33:29,927][85175] Updated weights for policy 1, policy_version 15390 (0.0010) +[2023-10-11 15:33:31,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 31293440. Throughput: 0: 1650.9, 1: 1663.9. Samples: 7831772. 
Policy #0 lag: (min: 24.0, avg: 46.7, max: 56.0) +[2023-10-11 15:33:31,064][84230] Avg episode reward: [(0, '6.190'), (1, '7.580')] +[2023-10-11 15:33:32,722][85176] Updated weights for policy 0, policy_version 15172 (0.0010) +[2023-10-11 15:33:33,097][85176] Updated weights for policy 0, policy_version 15182 (0.0009) +[2023-10-11 15:33:33,466][85176] Updated weights for policy 0, policy_version 15192 (0.0007) +[2023-10-11 15:33:34,092][85175] Updated weights for policy 1, policy_version 15400 (0.0009) +[2023-10-11 15:33:34,462][85175] Updated weights for policy 1, policy_version 15410 (0.0009) +[2023-10-11 15:33:34,825][85175] Updated weights for policy 1, policy_version 15420 (0.0009) +[2023-10-11 15:33:36,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 31358976. Throughput: 0: 1638.0, 1: 1676.5. Samples: 7842038. Policy #0 lag: (min: 24.0, avg: 46.7, max: 56.0) +[2023-10-11 15:33:36,063][84230] Avg episode reward: [(0, '6.850'), (1, '7.230')] +[2023-10-11 15:33:37,754][85176] Updated weights for policy 0, policy_version 15202 (0.0009) +[2023-10-11 15:33:38,124][85176] Updated weights for policy 0, policy_version 15212 (0.0011) +[2023-10-11 15:33:38,493][85176] Updated weights for policy 0, policy_version 15222 (0.0009) +[2023-10-11 15:33:38,871][85176] Updated weights for policy 0, policy_version 15232 (0.0009) +[2023-10-11 15:33:39,112][85175] Updated weights for policy 1, policy_version 15430 (0.0010) +[2023-10-11 15:33:39,476][85175] Updated weights for policy 1, policy_version 15440 (0.0010) +[2023-10-11 15:33:39,847][85175] Updated weights for policy 1, policy_version 15450 (0.0009) +[2023-10-11 15:33:41,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 31424512. Throughput: 0: 1636.1, 1: 1655.0. Samples: 7860640. 
Policy #0 lag: (min: 31.0, avg: 33.0, max: 62.0) +[2023-10-11 15:33:41,063][84230] Avg episode reward: [(0, '7.070'), (1, '7.330')] +[2023-10-11 15:33:43,658][85176] Updated weights for policy 0, policy_version 15242 (0.0010) +[2023-10-11 15:33:44,017][85176] Updated weights for policy 0, policy_version 15252 (0.0011) +[2023-10-11 15:33:44,389][85176] Updated weights for policy 0, policy_version 15262 (0.0009) +[2023-10-11 15:33:44,457][85175] Updated weights for policy 1, policy_version 15460 (0.0011) +[2023-10-11 15:33:44,828][85175] Updated weights for policy 1, policy_version 15470 (0.0009) +[2023-10-11 15:33:45,202][85175] Updated weights for policy 1, policy_version 15480 (0.0009) +[2023-10-11 15:33:46,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 31490048. Throughput: 0: 1610.4, 1: 1638.1. Samples: 7878582. Policy #0 lag: (min: 31.0, avg: 33.0, max: 62.0) +[2023-10-11 15:33:46,063][84230] Avg episode reward: [(0, '6.740'), (1, '6.580')] +[2023-10-11 15:33:48,867][85176] Updated weights for policy 0, policy_version 15272 (0.0010) +[2023-10-11 15:33:49,232][85176] Updated weights for policy 0, policy_version 15282 (0.0010) +[2023-10-11 15:33:49,608][85176] Updated weights for policy 0, policy_version 15292 (0.0010) +[2023-10-11 15:33:49,636][85175] Updated weights for policy 1, policy_version 15490 (0.0009) +[2023-10-11 15:33:50,008][85175] Updated weights for policy 1, policy_version 15500 (0.0008) +[2023-10-11 15:33:50,371][85175] Updated weights for policy 1, policy_version 15510 (0.0009) +[2023-10-11 15:33:50,740][85175] Updated weights for policy 1, policy_version 15520 (0.0009) +[2023-10-11 15:33:51,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 31555584. Throughput: 0: 1617.6, 1: 1633.0. Samples: 7889146. 
Policy #0 lag: (min: 31.0, avg: 33.0, max: 62.0) +[2023-10-11 15:33:51,063][84230] Avg episode reward: [(0, '6.640'), (1, '6.900')] +[2023-10-11 15:33:54,015][85176] Updated weights for policy 0, policy_version 15302 (0.0010) +[2023-10-11 15:33:54,396][85176] Updated weights for policy 0, policy_version 15312 (0.0009) +[2023-10-11 15:33:54,766][85176] Updated weights for policy 0, policy_version 15322 (0.0008) +[2023-10-11 15:33:55,184][85175] Updated weights for policy 1, policy_version 15530 (0.0008) +[2023-10-11 15:33:55,555][85175] Updated weights for policy 1, policy_version 15540 (0.0009) +[2023-10-11 15:33:55,918][85175] Updated weights for policy 1, policy_version 15550 (0.0009) +[2023-10-11 15:33:56,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 31621120. Throughput: 0: 1597.7, 1: 1623.1. Samples: 7907670. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:33:56,063][84230] Avg episode reward: [(0, '6.420'), (1, '7.540')] +[2023-10-11 15:33:59,140][85176] Updated weights for policy 0, policy_version 15332 (0.0009) +[2023-10-11 15:33:59,517][85176] Updated weights for policy 0, policy_version 15342 (0.0010) +[2023-10-11 15:33:59,887][85176] Updated weights for policy 0, policy_version 15352 (0.0008) +[2023-10-11 15:34:00,291][85175] Updated weights for policy 1, policy_version 15560 (0.0008) +[2023-10-11 15:34:00,660][85175] Updated weights for policy 1, policy_version 15570 (0.0008) +[2023-10-11 15:34:01,033][85175] Updated weights for policy 1, policy_version 15580 (0.0011) +[2023-10-11 15:34:01,062][84230] Fps is (10 sec: 9830.4, 60 sec: 12561.1, 300 sec: 13329.4). Total num frames: 31653888. Throughput: 0: 1568.2, 1: 1615.3. Samples: 7925812. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:34:01,063][84230] Avg episode reward: [(0, '6.630'), (1, '7.380')] +[2023-10-11 15:34:04,306][85176] Updated weights for policy 0, policy_version 15362 (0.0009) +[2023-10-11 15:34:04,674][85176] Updated weights for policy 0, policy_version 15372 (0.0010) +[2023-10-11 15:34:05,047][85176] Updated weights for policy 0, policy_version 15382 (0.0007) +[2023-10-11 15:34:05,244][85175] Updated weights for policy 1, policy_version 15590 (0.0008) +[2023-10-11 15:34:05,420][85176] Updated weights for policy 0, policy_version 15392 (0.0009) +[2023-10-11 15:34:05,608][85175] Updated weights for policy 1, policy_version 15600 (0.0008) +[2023-10-11 15:34:05,976][85175] Updated weights for policy 1, policy_version 15610 (0.0010) +[2023-10-11 15:34:06,063][84230] Fps is (10 sec: 9830.5, 60 sec: 12561.1, 300 sec: 13329.4). Total num frames: 31719424. Throughput: 0: 1584.0, 1: 1590.9. Samples: 7935712. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:34:06,063][84230] Avg episode reward: [(0, '6.850'), (1, '7.200')] +[2023-10-11 15:34:09,561][85176] Updated weights for policy 0, policy_version 15402 (0.0007) +[2023-10-11 15:34:09,928][85176] Updated weights for policy 0, policy_version 15412 (0.0008) +[2023-10-11 15:34:10,010][85175] Updated weights for policy 1, policy_version 15620 (0.0007) +[2023-10-11 15:34:10,302][85176] Updated weights for policy 0, policy_version 15422 (0.0008) +[2023-10-11 15:34:10,375][85175] Updated weights for policy 1, policy_version 15630 (0.0008) +[2023-10-11 15:34:10,740][85175] Updated weights for policy 1, policy_version 15640 (0.0009) +[2023-10-11 15:34:11,062][84230] Fps is (10 sec: 16384.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 31817728. Throughput: 0: 1575.6, 1: 1605.9. Samples: 7955952. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:34:11,063][84230] Avg episode reward: [(0, '7.070'), (1, '6.760')] +[2023-10-11 15:34:14,448][85176] Updated weights for policy 0, policy_version 15432 (0.0008) +[2023-10-11 15:34:14,827][85176] Updated weights for policy 0, policy_version 15442 (0.0008) +[2023-10-11 15:34:14,871][85175] Updated weights for policy 1, policy_version 15650 (0.0007) +[2023-10-11 15:34:15,198][85176] Updated weights for policy 0, policy_version 15452 (0.0008) +[2023-10-11 15:34:15,236][85175] Updated weights for policy 1, policy_version 15660 (0.0008) +[2023-10-11 15:34:15,603][85175] Updated weights for policy 1, policy_version 15670 (0.0007) +[2023-10-11 15:34:15,976][85175] Updated weights for policy 1, policy_version 15680 (0.0007) +[2023-10-11 15:34:16,063][84230] Fps is (10 sec: 16383.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 31883264. Throughput: 0: 1572.0, 1: 1612.7. Samples: 7975086. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:34:16,064][84230] Avg episode reward: [(0, '6.740'), (1, '7.100')] +[2023-10-11 15:34:16,075][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000015456_15826944.pth... +[2023-10-11 15:34:16,075][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000015680_16056320.pth... 
+[2023-10-11 15:34:16,112][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000014112_14450688.pth +[2023-10-11 15:34:16,114][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000013920_14254080.pth +[2023-10-11 15:34:16,116][85000] Saving a milestone ./train_atari/atari_frostbite_APPO/checkpoint_p1/milestones/checkpoint_000015680_16056320.pth +[2023-10-11 15:34:16,119][84801] Saving a milestone ./train_atari/atari_frostbite_APPO/checkpoint_p0/milestones/checkpoint_000015456_15826944.pth +[2023-10-11 15:34:19,235][85176] Updated weights for policy 0, policy_version 15462 (0.0009) +[2023-10-11 15:34:19,612][85176] Updated weights for policy 0, policy_version 15472 (0.0008) +[2023-10-11 15:34:19,985][85176] Updated weights for policy 0, policy_version 15482 (0.0007) +[2023-10-11 15:34:20,025][85175] Updated weights for policy 1, policy_version 15690 (0.0007) +[2023-10-11 15:34:20,401][85175] Updated weights for policy 1, policy_version 15700 (0.0007) +[2023-10-11 15:34:20,766][85175] Updated weights for policy 1, policy_version 15710 (0.0008) +[2023-10-11 15:34:21,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 31948800. Throughput: 0: 1602.9, 1: 1601.4. Samples: 7986234. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:34:21,063][84230] Avg episode reward: [(0, '6.860'), (1, '6.670')] +[2023-10-11 15:34:24,111][85176] Updated weights for policy 0, policy_version 15492 (0.0008) +[2023-10-11 15:34:24,478][85176] Updated weights for policy 0, policy_version 15502 (0.0007) +[2023-10-11 15:34:24,774][85175] Updated weights for policy 1, policy_version 15720 (0.0007) +[2023-10-11 15:34:24,853][85176] Updated weights for policy 0, policy_version 15512 (0.0007) +[2023-10-11 15:34:25,132][85175] Updated weights for policy 1, policy_version 15730 (0.0007) +[2023-10-11 15:34:25,503][85175] Updated weights for policy 1, policy_version 15740 (0.0008) +[2023-10-11 15:34:26,063][84230] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 32014336. Throughput: 0: 1610.4, 1: 1623.9. Samples: 8006182. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:34:26,063][84230] Avg episode reward: [(0, '6.860'), (1, '6.460')] +[2023-10-11 15:34:28,913][85176] Updated weights for policy 0, policy_version 15522 (0.0009) +[2023-10-11 15:34:29,318][85176] Updated weights for policy 0, policy_version 15532 (0.0010) +[2023-10-11 15:34:29,606][85175] Updated weights for policy 1, policy_version 15750 (0.0008) +[2023-10-11 15:34:29,694][85176] Updated weights for policy 0, policy_version 15542 (0.0009) +[2023-10-11 15:34:29,970][85175] Updated weights for policy 1, policy_version 15760 (0.0007) +[2023-10-11 15:34:30,069][85176] Updated weights for policy 0, policy_version 15552 (0.0008) +[2023-10-11 15:34:30,335][85175] Updated weights for policy 1, policy_version 15770 (0.0008) +[2023-10-11 15:34:31,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 32079872. Throughput: 0: 1621.5, 1: 1637.5. Samples: 8025234. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:34:31,064][84230] Avg episode reward: [(0, '6.860'), (1, '6.740')] +[2023-10-11 15:34:34,055][85176] Updated weights for policy 0, policy_version 15562 (0.0009) +[2023-10-11 15:34:34,292][85175] Updated weights for policy 1, policy_version 15780 (0.0009) +[2023-10-11 15:34:34,431][85176] Updated weights for policy 0, policy_version 15572 (0.0008) +[2023-10-11 15:34:34,657][85175] Updated weights for policy 1, policy_version 15790 (0.0008) +[2023-10-11 15:34:34,807][85176] Updated weights for policy 0, policy_version 15582 (0.0007) +[2023-10-11 15:34:35,025][85175] Updated weights for policy 1, policy_version 15800 (0.0009) +[2023-10-11 15:34:36,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 32145408. Throughput: 0: 1638.0, 1: 1645.0. Samples: 8036878. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:34:36,063][84230] Avg episode reward: [(0, '6.750'), (1, '6.730')] +[2023-10-11 15:34:38,991][85176] Updated weights for policy 0, policy_version 15592 (0.0009) +[2023-10-11 15:34:39,205][85175] Updated weights for policy 1, policy_version 15810 (0.0008) +[2023-10-11 15:34:39,363][85176] Updated weights for policy 0, policy_version 15602 (0.0009) +[2023-10-11 15:34:39,573][85175] Updated weights for policy 1, policy_version 15820 (0.0008) +[2023-10-11 15:34:39,738][85176] Updated weights for policy 0, policy_version 15612 (0.0007) +[2023-10-11 15:34:39,949][85175] Updated weights for policy 1, policy_version 15830 (0.0008) +[2023-10-11 15:34:40,313][85175] Updated weights for policy 1, policy_version 15840 (0.0008) +[2023-10-11 15:34:41,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 32210944. Throughput: 0: 1645.5, 1: 1652.4. Samples: 8056076. 
Policy #0 lag: (min: 21.0, avg: 23.0, max: 44.0) +[2023-10-11 15:34:41,063][84230] Avg episode reward: [(0, '6.750'), (1, '6.870')] +[2023-10-11 15:34:43,866][85176] Updated weights for policy 0, policy_version 15622 (0.0007) +[2023-10-11 15:34:44,229][85176] Updated weights for policy 0, policy_version 15632 (0.0009) +[2023-10-11 15:34:44,597][85175] Updated weights for policy 1, policy_version 15850 (0.0009) +[2023-10-11 15:34:44,606][85176] Updated weights for policy 0, policy_version 15642 (0.0007) +[2023-10-11 15:34:44,964][85175] Updated weights for policy 1, policy_version 15860 (0.0008) +[2023-10-11 15:34:45,338][85175] Updated weights for policy 1, policy_version 15870 (0.0008) +[2023-10-11 15:34:46,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 32276480. Throughput: 0: 1660.0, 1: 1656.8. Samples: 8075068. Policy #0 lag: (min: 21.0, avg: 23.0, max: 44.0) +[2023-10-11 15:34:46,064][84230] Avg episode reward: [(0, '6.840'), (1, '6.710')] +[2023-10-11 15:34:48,708][85176] Updated weights for policy 0, policy_version 15652 (0.0008) +[2023-10-11 15:34:49,077][85176] Updated weights for policy 0, policy_version 15662 (0.0010) +[2023-10-11 15:34:49,443][85176] Updated weights for policy 0, policy_version 15672 (0.0010) +[2023-10-11 15:34:49,477][85175] Updated weights for policy 1, policy_version 15880 (0.0008) +[2023-10-11 15:34:49,851][85175] Updated weights for policy 1, policy_version 15890 (0.0010) +[2023-10-11 15:34:50,226][85175] Updated weights for policy 1, policy_version 15900 (0.0009) +[2023-10-11 15:34:51,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 32342016. Throughput: 0: 1671.8, 1: 1684.0. Samples: 8086720. 
Policy #0 lag: (min: 21.0, avg: 23.0, max: 44.0) +[2023-10-11 15:34:51,064][84230] Avg episode reward: [(0, '7.060'), (1, '6.750')] +[2023-10-11 15:34:53,626][85176] Updated weights for policy 0, policy_version 15682 (0.0008) +[2023-10-11 15:34:53,986][85176] Updated weights for policy 0, policy_version 15692 (0.0009) +[2023-10-11 15:34:54,308][85175] Updated weights for policy 1, policy_version 15910 (0.0010) +[2023-10-11 15:34:54,355][85176] Updated weights for policy 0, policy_version 15702 (0.0008) +[2023-10-11 15:34:54,677][85175] Updated weights for policy 1, policy_version 15920 (0.0008) +[2023-10-11 15:34:54,728][85176] Updated weights for policy 0, policy_version 15712 (0.0009) +[2023-10-11 15:34:55,062][85175] Updated weights for policy 1, policy_version 15930 (0.0008) +[2023-10-11 15:34:56,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 32407552. Throughput: 0: 1658.6, 1: 1670.2. Samples: 8105748. Policy #0 lag: (min: 1.0, avg: 9.0, max: 33.0) +[2023-10-11 15:34:56,064][84230] Avg episode reward: [(0, '7.080'), (1, '6.850')] +[2023-10-11 15:34:58,661][85176] Updated weights for policy 0, policy_version 15722 (0.0008) +[2023-10-11 15:34:59,017][85176] Updated weights for policy 0, policy_version 15732 (0.0008) +[2023-10-11 15:34:59,092][85175] Updated weights for policy 1, policy_version 15940 (0.0010) +[2023-10-11 15:34:59,388][85176] Updated weights for policy 0, policy_version 15742 (0.0008) +[2023-10-11 15:34:59,453][85175] Updated weights for policy 1, policy_version 15950 (0.0009) +[2023-10-11 15:34:59,828][85175] Updated weights for policy 1, policy_version 15960 (0.0011) +[2023-10-11 15:35:01,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 32473088. Throughput: 0: 1678.0, 1: 1666.2. Samples: 8125574. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 33.0) +[2023-10-11 15:35:01,063][84230] Avg episode reward: [(0, '6.750'), (1, '6.640')] +[2023-10-11 15:35:03,461][85176] Updated weights for policy 0, policy_version 15752 (0.0009) +[2023-10-11 15:35:03,837][85176] Updated weights for policy 0, policy_version 15762 (0.0008) +[2023-10-11 15:35:03,876][85175] Updated weights for policy 1, policy_version 15970 (0.0008) +[2023-10-11 15:35:04,205][85176] Updated weights for policy 0, policy_version 15772 (0.0007) +[2023-10-11 15:35:04,249][85175] Updated weights for policy 1, policy_version 15980 (0.0007) +[2023-10-11 15:35:04,609][85175] Updated weights for policy 1, policy_version 15990 (0.0010) +[2023-10-11 15:35:04,981][85175] Updated weights for policy 1, policy_version 16000 (0.0011) +[2023-10-11 15:35:06,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 32538624. Throughput: 0: 1662.0, 1: 1677.8. Samples: 8136526. Policy #0 lag: (min: 1.0, avg: 9.0, max: 33.0) +[2023-10-11 15:35:06,063][84230] Avg episode reward: [(0, '6.640'), (1, '7.140')] +[2023-10-11 15:35:08,273][85176] Updated weights for policy 0, policy_version 15782 (0.0007) +[2023-10-11 15:35:08,645][85176] Updated weights for policy 0, policy_version 15792 (0.0009) +[2023-10-11 15:35:09,016][85176] Updated weights for policy 0, policy_version 15802 (0.0007) +[2023-10-11 15:35:09,197][85175] Updated weights for policy 1, policy_version 16010 (0.0008) +[2023-10-11 15:35:09,563][85175] Updated weights for policy 1, policy_version 16020 (0.0010) +[2023-10-11 15:35:09,933][85175] Updated weights for policy 1, policy_version 16030 (0.0008) +[2023-10-11 15:35:11,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 32604160. Throughput: 0: 1658.4, 1: 1657.6. Samples: 8155404. 
Policy #0 lag: (min: 2.0, avg: 8.5, max: 34.0) +[2023-10-11 15:35:11,064][84230] Avg episode reward: [(0, '6.640'), (1, '7.180')] +[2023-10-11 15:35:12,910][85176] Updated weights for policy 0, policy_version 15812 (0.0009) +[2023-10-11 15:35:13,292][85176] Updated weights for policy 0, policy_version 15822 (0.0008) +[2023-10-11 15:35:13,656][85176] Updated weights for policy 0, policy_version 15832 (0.0010) +[2023-10-11 15:35:13,945][85175] Updated weights for policy 1, policy_version 16040 (0.0008) +[2023-10-11 15:35:14,309][85175] Updated weights for policy 1, policy_version 16050 (0.0010) +[2023-10-11 15:35:14,691][85175] Updated weights for policy 1, policy_version 16060 (0.0008) +[2023-10-11 15:35:16,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 32669696. Throughput: 0: 1671.0, 1: 1666.0. Samples: 8175398. Policy #0 lag: (min: 2.0, avg: 8.5, max: 34.0) +[2023-10-11 15:35:16,063][84230] Avg episode reward: [(0, '6.640'), (1, '7.010')] +[2023-10-11 15:35:17,865][85176] Updated weights for policy 0, policy_version 15842 (0.0010) +[2023-10-11 15:35:18,272][85176] Updated weights for policy 0, policy_version 15852 (0.0009) +[2023-10-11 15:35:18,566][85175] Updated weights for policy 1, policy_version 16070 (0.0009) +[2023-10-11 15:35:18,641][85176] Updated weights for policy 0, policy_version 15862 (0.0010) +[2023-10-11 15:35:18,933][85175] Updated weights for policy 1, policy_version 16080 (0.0008) +[2023-10-11 15:35:19,018][85176] Updated weights for policy 0, policy_version 15872 (0.0007) +[2023-10-11 15:35:19,296][85175] Updated weights for policy 1, policy_version 16090 (0.0011) +[2023-10-11 15:35:21,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 32735232. Throughput: 0: 1645.9, 1: 1669.8. Samples: 8186088. 
Policy #0 lag: (min: 2.0, avg: 8.5, max: 34.0) +[2023-10-11 15:35:21,064][84230] Avg episode reward: [(0, '6.420'), (1, '7.210')] +[2023-10-11 15:35:23,240][85176] Updated weights for policy 0, policy_version 15882 (0.0008) +[2023-10-11 15:35:23,466][85175] Updated weights for policy 1, policy_version 16100 (0.0010) +[2023-10-11 15:35:23,611][85176] Updated weights for policy 0, policy_version 15892 (0.0007) +[2023-10-11 15:35:23,823][85175] Updated weights for policy 1, policy_version 16110 (0.0008) +[2023-10-11 15:35:23,971][85176] Updated weights for policy 0, policy_version 15902 (0.0007) +[2023-10-11 15:35:24,193][85175] Updated weights for policy 1, policy_version 16120 (0.0009) +[2023-10-11 15:35:26,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 32800768. Throughput: 0: 1655.6, 1: 1659.2. Samples: 8205242. Policy #0 lag: (min: 0.0, avg: 16.3, max: 32.0) +[2023-10-11 15:35:26,064][84230] Avg episode reward: [(0, '6.530'), (1, '6.840')] +[2023-10-11 15:35:28,128][85176] Updated weights for policy 0, policy_version 15912 (0.0009) +[2023-10-11 15:35:28,499][85176] Updated weights for policy 0, policy_version 15922 (0.0008) +[2023-10-11 15:35:28,520][85175] Updated weights for policy 1, policy_version 16130 (0.0008) +[2023-10-11 15:35:28,865][85176] Updated weights for policy 0, policy_version 15932 (0.0009) +[2023-10-11 15:35:28,880][85175] Updated weights for policy 1, policy_version 16140 (0.0008) +[2023-10-11 15:35:29,250][85175] Updated weights for policy 1, policy_version 16150 (0.0009) +[2023-10-11 15:35:29,626][85175] Updated weights for policy 1, policy_version 16160 (0.0011) +[2023-10-11 15:35:31,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 32866304. Throughput: 0: 1665.6, 1: 1681.5. Samples: 8225688. 
Policy #0 lag: (min: 0.0, avg: 16.3, max: 32.0) +[2023-10-11 15:35:31,063][84230] Avg episode reward: [(0, '6.530'), (1, '7.120')] +[2023-10-11 15:35:33,050][85176] Updated weights for policy 0, policy_version 15942 (0.0008) +[2023-10-11 15:35:33,426][85176] Updated weights for policy 0, policy_version 15952 (0.0009) +[2023-10-11 15:35:33,801][85176] Updated weights for policy 0, policy_version 15962 (0.0008) +[2023-10-11 15:35:33,847][85175] Updated weights for policy 1, policy_version 16170 (0.0008) +[2023-10-11 15:35:34,214][85175] Updated weights for policy 1, policy_version 16180 (0.0008) +[2023-10-11 15:35:34,583][85175] Updated weights for policy 1, policy_version 16190 (0.0007) +[2023-10-11 15:35:36,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 32931840. Throughput: 0: 1647.2, 1: 1676.6. Samples: 8236294. Policy #0 lag: (min: 0.0, avg: 16.3, max: 32.0) +[2023-10-11 15:35:36,064][84230] Avg episode reward: [(0, '6.530'), (1, '7.610')] +[2023-10-11 15:35:37,928][85176] Updated weights for policy 0, policy_version 15972 (0.0007) +[2023-10-11 15:35:38,297][85176] Updated weights for policy 0, policy_version 15982 (0.0007) +[2023-10-11 15:35:38,496][85175] Updated weights for policy 1, policy_version 16200 (0.0008) +[2023-10-11 15:35:38,667][85176] Updated weights for policy 0, policy_version 15992 (0.0007) +[2023-10-11 15:35:38,871][85175] Updated weights for policy 1, policy_version 16210 (0.0008) +[2023-10-11 15:35:39,232][85175] Updated weights for policy 1, policy_version 16220 (0.0008) +[2023-10-11 15:35:41,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 32997376. Throughput: 0: 1660.6, 1: 1659.3. Samples: 8255144. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:35:41,063][84230] Avg episode reward: [(0, '7.080'), (1, '7.550')] +[2023-10-11 15:35:42,928][85176] Updated weights for policy 0, policy_version 16002 (0.0008) +[2023-10-11 15:35:43,306][85176] Updated weights for policy 0, policy_version 16012 (0.0008) +[2023-10-11 15:35:43,324][85175] Updated weights for policy 1, policy_version 16230 (0.0009) +[2023-10-11 15:35:43,679][85176] Updated weights for policy 0, policy_version 16022 (0.0009) +[2023-10-11 15:35:43,683][85175] Updated weights for policy 1, policy_version 16240 (0.0008) +[2023-10-11 15:35:44,046][85176] Updated weights for policy 0, policy_version 16032 (0.0011) +[2023-10-11 15:35:44,058][85175] Updated weights for policy 1, policy_version 16250 (0.0009) +[2023-10-11 15:35:46,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13218.3). Total num frames: 33062912. Throughput: 0: 1655.4, 1: 1681.2. Samples: 8275720. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:35:46,063][84230] Avg episode reward: [(0, '6.740'), (1, '7.110')] +[2023-10-11 15:35:48,030][85175] Updated weights for policy 1, policy_version 16260 (0.0008) +[2023-10-11 15:35:48,065][85176] Updated weights for policy 0, policy_version 16042 (0.0009) +[2023-10-11 15:35:48,405][85175] Updated weights for policy 1, policy_version 16270 (0.0007) +[2023-10-11 15:35:48,438][85176] Updated weights for policy 0, policy_version 16052 (0.0008) +[2023-10-11 15:35:48,771][85175] Updated weights for policy 1, policy_version 16280 (0.0008) +[2023-10-11 15:35:48,803][85176] Updated weights for policy 0, policy_version 16062 (0.0007) +[2023-10-11 15:35:51,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 33128448. Throughput: 0: 1646.6, 1: 1668.9. Samples: 8285724. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:35:51,063][84230] Avg episode reward: [(0, '6.960'), (1, '6.520')] +[2023-10-11 15:35:52,917][85175] Updated weights for policy 1, policy_version 16290 (0.0008) +[2023-10-11 15:35:53,001][85176] Updated weights for policy 0, policy_version 16072 (0.0008) +[2023-10-11 15:35:53,300][85175] Updated weights for policy 1, policy_version 16300 (0.0008) +[2023-10-11 15:35:53,370][85176] Updated weights for policy 0, policy_version 16082 (0.0008) +[2023-10-11 15:35:53,664][85175] Updated weights for policy 1, policy_version 16310 (0.0008) +[2023-10-11 15:35:53,745][85176] Updated weights for policy 0, policy_version 16092 (0.0007) +[2023-10-11 15:35:54,025][85175] Updated weights for policy 1, policy_version 16320 (0.0009) +[2023-10-11 15:35:56,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 33193984. Throughput: 0: 1654.6, 1: 1673.3. Samples: 8305158. Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) +[2023-10-11 15:35:56,064][84230] Avg episode reward: [(0, '6.530'), (1, '6.590')] +[2023-10-11 15:35:57,940][85176] Updated weights for policy 0, policy_version 16102 (0.0008) +[2023-10-11 15:35:58,004][85175] Updated weights for policy 1, policy_version 16330 (0.0008) +[2023-10-11 15:35:58,310][85176] Updated weights for policy 0, policy_version 16112 (0.0007) +[2023-10-11 15:35:58,384][85175] Updated weights for policy 1, policy_version 16340 (0.0007) +[2023-10-11 15:35:58,683][85176] Updated weights for policy 0, policy_version 16122 (0.0008) +[2023-10-11 15:35:58,754][85175] Updated weights for policy 1, policy_version 16350 (0.0008) +[2023-10-11 15:36:01,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 33259520. Throughput: 0: 1652.1, 1: 1694.0. Samples: 8325972. 
Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) +[2023-10-11 15:36:01,064][84230] Avg episode reward: [(0, '6.640'), (1, '7.000')] +[2023-10-11 15:36:02,756][85175] Updated weights for policy 1, policy_version 16360 (0.0010) +[2023-10-11 15:36:02,971][85176] Updated weights for policy 0, policy_version 16132 (0.0009) +[2023-10-11 15:36:03,134][85175] Updated weights for policy 1, policy_version 16370 (0.0009) +[2023-10-11 15:36:03,360][85176] Updated weights for policy 0, policy_version 16142 (0.0009) +[2023-10-11 15:36:03,495][85175] Updated weights for policy 1, policy_version 16380 (0.0008) +[2023-10-11 15:36:03,722][85176] Updated weights for policy 0, policy_version 16152 (0.0007) +[2023-10-11 15:36:06,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 33325056. Throughput: 0: 1653.3, 1: 1669.6. Samples: 8335616. Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) +[2023-10-11 15:36:06,063][84230] Avg episode reward: [(0, '6.930'), (1, '7.660')] +[2023-10-11 15:36:07,528][85175] Updated weights for policy 1, policy_version 16390 (0.0008) +[2023-10-11 15:36:07,823][85176] Updated weights for policy 0, policy_version 16162 (0.0007) +[2023-10-11 15:36:07,888][85175] Updated weights for policy 1, policy_version 16400 (0.0007) +[2023-10-11 15:36:08,192][85176] Updated weights for policy 0, policy_version 16172 (0.0007) +[2023-10-11 15:36:08,256][85175] Updated weights for policy 1, policy_version 16410 (0.0007) +[2023-10-11 15:36:08,560][85176] Updated weights for policy 0, policy_version 16182 (0.0009) +[2023-10-11 15:36:08,932][85176] Updated weights for policy 0, policy_version 16192 (0.0010) +[2023-10-11 15:36:11,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 33390592. Throughput: 0: 1657.8, 1: 1685.0. Samples: 8355668. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:36:11,063][84230] Avg episode reward: [(0, '6.920'), (1, '7.620')] +[2023-10-11 15:36:12,263][85175] Updated weights for policy 1, policy_version 16420 (0.0009) +[2023-10-11 15:36:12,634][85175] Updated weights for policy 1, policy_version 16430 (0.0009) +[2023-10-11 15:36:12,994][85175] Updated weights for policy 1, policy_version 16440 (0.0007) +[2023-10-11 15:36:13,009][85176] Updated weights for policy 0, policy_version 16202 (0.0010) +[2023-10-11 15:36:13,387][85176] Updated weights for policy 0, policy_version 16212 (0.0008) +[2023-10-11 15:36:13,759][85176] Updated weights for policy 0, policy_version 16222 (0.0008) +[2023-10-11 15:36:16,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 33456128. Throughput: 0: 1658.3, 1: 1690.8. Samples: 8376400. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:36:16,063][84230] Avg episode reward: [(0, '7.180'), (1, '7.500')] +[2023-10-11 15:36:16,075][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000016224_16613376.pth... +[2023-10-11 15:36:16,075][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000016448_16842752.pth... 
+[2023-10-11 15:36:16,110][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000014912_15269888.pth +[2023-10-11 15:36:16,111][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000014688_15040512.pth +[2023-10-11 15:36:17,027][85175] Updated weights for policy 1, policy_version 16450 (0.0007) +[2023-10-11 15:36:17,399][85175] Updated weights for policy 1, policy_version 16460 (0.0008) +[2023-10-11 15:36:17,759][85175] Updated weights for policy 1, policy_version 16470 (0.0009) +[2023-10-11 15:36:17,799][85176] Updated weights for policy 0, policy_version 16232 (0.0009) +[2023-10-11 15:36:18,127][85175] Updated weights for policy 1, policy_version 16480 (0.0007) +[2023-10-11 15:36:18,178][85176] Updated weights for policy 0, policy_version 16242 (0.0008) +[2023-10-11 15:36:18,548][85176] Updated weights for policy 0, policy_version 16252 (0.0009) +[2023-10-11 15:36:21,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 33521664. Throughput: 0: 1653.5, 1: 1665.9. Samples: 8385668. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:36:21,063][84230] Avg episode reward: [(0, '6.640'), (1, '7.280')] +[2023-10-11 15:36:22,310][85175] Updated weights for policy 1, policy_version 16490 (0.0009) +[2023-10-11 15:36:22,672][85175] Updated weights for policy 1, policy_version 16500 (0.0007) +[2023-10-11 15:36:22,701][85176] Updated weights for policy 0, policy_version 16262 (0.0008) +[2023-10-11 15:36:23,037][85175] Updated weights for policy 1, policy_version 16510 (0.0007) +[2023-10-11 15:36:23,069][85176] Updated weights for policy 0, policy_version 16272 (0.0007) +[2023-10-11 15:36:23,434][85176] Updated weights for policy 0, policy_version 16282 (0.0010) +[2023-10-11 15:36:26,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 33587200. Throughput: 0: 1659.3, 1: 1692.6. Samples: 8405978. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-11 15:36:26,064][84230] Avg episode reward: [(0, '6.640'), (1, '7.350')] +[2023-10-11 15:36:26,896][85175] Updated weights for policy 1, policy_version 16520 (0.0009) +[2023-10-11 15:36:27,276][85175] Updated weights for policy 1, policy_version 16530 (0.0007) +[2023-10-11 15:36:27,397][85176] Updated weights for policy 0, policy_version 16292 (0.0007) +[2023-10-11 15:36:27,639][85175] Updated weights for policy 1, policy_version 16540 (0.0009) +[2023-10-11 15:36:27,763][85176] Updated weights for policy 0, policy_version 16302 (0.0007) +[2023-10-11 15:36:28,141][85176] Updated weights for policy 0, policy_version 16312 (0.0009) +[2023-10-11 15:36:31,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 33652736. Throughput: 0: 1663.4, 1: 1692.0. Samples: 8426714. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-11 15:36:31,063][84230] Avg episode reward: [(0, '6.410'), (1, '7.420')] +[2023-10-11 15:36:31,831][85175] Updated weights for policy 1, policy_version 16550 (0.0010) +[2023-10-11 15:36:32,204][85175] Updated weights for policy 1, policy_version 16560 (0.0007) +[2023-10-11 15:36:32,283][85176] Updated weights for policy 0, policy_version 16322 (0.0009) +[2023-10-11 15:36:32,567][85175] Updated weights for policy 1, policy_version 16570 (0.0010) +[2023-10-11 15:36:32,643][85176] Updated weights for policy 0, policy_version 16332 (0.0007) +[2023-10-11 15:36:33,017][85176] Updated weights for policy 0, policy_version 16342 (0.0008) +[2023-10-11 15:36:33,391][85176] Updated weights for policy 0, policy_version 16352 (0.0010) +[2023-10-11 15:36:36,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 33718272. Throughput: 0: 1655.4, 1: 1682.8. Samples: 8435944. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-11 15:36:36,063][84230] Avg episode reward: [(0, '6.420'), (1, '7.050')] +[2023-10-11 15:36:36,400][85175] Updated weights for policy 1, policy_version 16580 (0.0007) +[2023-10-11 15:36:36,773][85175] Updated weights for policy 1, policy_version 16590 (0.0008) +[2023-10-11 15:36:37,141][85175] Updated weights for policy 1, policy_version 16600 (0.0007) +[2023-10-11 15:36:37,433][85176] Updated weights for policy 0, policy_version 16362 (0.0008) +[2023-10-11 15:36:37,805][85176] Updated weights for policy 0, policy_version 16372 (0.0008) +[2023-10-11 15:36:38,183][85176] Updated weights for policy 0, policy_version 16382 (0.0007) +[2023-10-11 15:36:40,959][85175] Updated weights for policy 1, policy_version 16610 (0.0007) +[2023-10-11 15:36:41,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 33783808. Throughput: 0: 1664.3, 1: 1709.7. Samples: 8456986. Policy #0 lag: (min: 8.0, avg: 32.5, max: 40.0) +[2023-10-11 15:36:41,064][84230] Avg episode reward: [(0, '6.900'), (1, '7.140')] +[2023-10-11 15:36:41,326][85175] Updated weights for policy 1, policy_version 16620 (0.0007) +[2023-10-11 15:36:41,704][85175] Updated weights for policy 1, policy_version 16630 (0.0007) +[2023-10-11 15:36:42,070][85175] Updated weights for policy 1, policy_version 16640 (0.0007) +[2023-10-11 15:36:42,198][85176] Updated weights for policy 0, policy_version 16392 (0.0008) +[2023-10-11 15:36:42,561][85176] Updated weights for policy 0, policy_version 16402 (0.0007) +[2023-10-11 15:36:42,935][85176] Updated weights for policy 0, policy_version 16412 (0.0007) +[2023-10-11 15:36:46,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13218.3). Total num frames: 33849344. Throughput: 0: 1666.6, 1: 1700.5. Samples: 8477490. 
Policy #0 lag: (min: 8.0, avg: 32.5, max: 40.0) +[2023-10-11 15:36:46,064][84230] Avg episode reward: [(0, '7.010'), (1, '7.440')] +[2023-10-11 15:36:46,280][85175] Updated weights for policy 1, policy_version 16650 (0.0007) +[2023-10-11 15:36:46,650][85175] Updated weights for policy 1, policy_version 16660 (0.0007) +[2023-10-11 15:36:47,028][85175] Updated weights for policy 1, policy_version 16670 (0.0007) +[2023-10-11 15:36:47,276][85176] Updated weights for policy 0, policy_version 16422 (0.0009) +[2023-10-11 15:36:47,651][85176] Updated weights for policy 0, policy_version 16432 (0.0009) +[2023-10-11 15:36:48,028][85176] Updated weights for policy 0, policy_version 16442 (0.0009) +[2023-10-11 15:36:51,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 33914880. Throughput: 0: 1654.6, 1: 1696.9. Samples: 8486432. Policy #0 lag: (min: 8.0, avg: 32.5, max: 40.0) +[2023-10-11 15:36:51,063][84230] Avg episode reward: [(0, '6.640'), (1, '7.250')] +[2023-10-11 15:36:51,183][85175] Updated weights for policy 1, policy_version 16680 (0.0007) +[2023-10-11 15:36:51,545][85175] Updated weights for policy 1, policy_version 16690 (0.0007) +[2023-10-11 15:36:51,911][85175] Updated weights for policy 1, policy_version 16700 (0.0007) +[2023-10-11 15:36:52,247][85176] Updated weights for policy 0, policy_version 16452 (0.0009) +[2023-10-11 15:36:52,628][85176] Updated weights for policy 0, policy_version 16462 (0.0010) +[2023-10-11 15:36:52,999][85176] Updated weights for policy 0, policy_version 16472 (0.0007) +[2023-10-11 15:36:55,834][85175] Updated weights for policy 1, policy_version 16710 (0.0007) +[2023-10-11 15:36:56,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 33980416. Throughput: 0: 1659.3, 1: 1700.6. Samples: 8506864. 
Policy #0 lag: (min: 6.0, avg: 14.7, max: 38.0) +[2023-10-11 15:36:56,064][84230] Avg episode reward: [(0, '6.750'), (1, '7.030')] +[2023-10-11 15:36:56,201][85175] Updated weights for policy 1, policy_version 16720 (0.0007) +[2023-10-11 15:36:56,567][85175] Updated weights for policy 1, policy_version 16730 (0.0010) +[2023-10-11 15:36:57,103][85176] Updated weights for policy 0, policy_version 16482 (0.0007) +[2023-10-11 15:36:57,475][85176] Updated weights for policy 0, policy_version 16492 (0.0008) +[2023-10-11 15:36:57,846][85176] Updated weights for policy 0, policy_version 16502 (0.0011) +[2023-10-11 15:36:58,224][85176] Updated weights for policy 0, policy_version 16512 (0.0007) +[2023-10-11 15:37:00,687][85175] Updated weights for policy 1, policy_version 16740 (0.0010) +[2023-10-11 15:37:01,053][85175] Updated weights for policy 1, policy_version 16750 (0.0008) +[2023-10-11 15:37:01,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 34045952. Throughput: 0: 1658.1, 1: 1701.5. Samples: 8527580. Policy #0 lag: (min: 6.0, avg: 14.7, max: 38.0) +[2023-10-11 15:37:01,064][84230] Avg episode reward: [(0, '6.850'), (1, '7.520')] +[2023-10-11 15:37:01,421][85175] Updated weights for policy 1, policy_version 16760 (0.0009) +[2023-10-11 15:37:02,405][85176] Updated weights for policy 0, policy_version 16522 (0.0007) +[2023-10-11 15:37:02,779][85176] Updated weights for policy 0, policy_version 16532 (0.0007) +[2023-10-11 15:37:03,154][85176] Updated weights for policy 0, policy_version 16542 (0.0008) +[2023-10-11 15:37:05,340][85175] Updated weights for policy 1, policy_version 16770 (0.0007) +[2023-10-11 15:37:05,705][85175] Updated weights for policy 1, policy_version 16780 (0.0008) +[2023-10-11 15:37:06,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 34111488. Throughput: 0: 1652.4, 1: 1704.7. Samples: 8536734. 
Policy #0 lag: (min: 6.0, avg: 14.7, max: 38.0) +[2023-10-11 15:37:06,063][84230] Avg episode reward: [(0, '7.070'), (1, '7.940')] +[2023-10-11 15:37:06,077][85175] Updated weights for policy 1, policy_version 16790 (0.0010) +[2023-10-11 15:37:06,440][85175] Updated weights for policy 1, policy_version 16800 (0.0007) +[2023-10-11 15:37:07,297][85176] Updated weights for policy 0, policy_version 16552 (0.0008) +[2023-10-11 15:37:07,661][85176] Updated weights for policy 0, policy_version 16562 (0.0008) +[2023-10-11 15:37:08,038][85176] Updated weights for policy 0, policy_version 16572 (0.0009) +[2023-10-11 15:37:10,428][85175] Updated weights for policy 1, policy_version 16810 (0.0011) +[2023-10-11 15:37:10,795][85175] Updated weights for policy 1, policy_version 16820 (0.0007) +[2023-10-11 15:37:11,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 34177024. Throughput: 0: 1660.5, 1: 1707.5. Samples: 8557534. Policy #0 lag: (min: 9.0, avg: 22.2, max: 41.0) +[2023-10-11 15:37:11,063][84230] Avg episode reward: [(0, '6.640'), (1, '7.010')] +[2023-10-11 15:37:11,157][85175] Updated weights for policy 1, policy_version 16830 (0.0009) +[2023-10-11 15:37:12,124][85176] Updated weights for policy 0, policy_version 16582 (0.0008) +[2023-10-11 15:37:12,500][85176] Updated weights for policy 0, policy_version 16592 (0.0008) +[2023-10-11 15:37:12,874][85176] Updated weights for policy 0, policy_version 16602 (0.0008) +[2023-10-11 15:37:15,384][85175] Updated weights for policy 1, policy_version 16840 (0.0008) +[2023-10-11 15:37:15,768][85175] Updated weights for policy 1, policy_version 16850 (0.0008) +[2023-10-11 15:37:16,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 34242560. Throughput: 0: 1660.8, 1: 1690.5. Samples: 8577526. 
Policy #0 lag: (min: 9.0, avg: 22.2, max: 41.0) +[2023-10-11 15:37:16,064][84230] Avg episode reward: [(0, '6.420'), (1, '6.910')] +[2023-10-11 15:37:16,136][85175] Updated weights for policy 1, policy_version 16860 (0.0009) +[2023-10-11 15:37:16,974][85176] Updated weights for policy 0, policy_version 16612 (0.0009) +[2023-10-11 15:37:17,335][85176] Updated weights for policy 0, policy_version 16622 (0.0010) +[2023-10-11 15:37:17,701][85176] Updated weights for policy 0, policy_version 16632 (0.0009) +[2023-10-11 15:37:20,258][85175] Updated weights for policy 1, policy_version 16870 (0.0009) +[2023-10-11 15:37:20,626][85175] Updated weights for policy 1, policy_version 16880 (0.0009) +[2023-10-11 15:37:20,992][85175] Updated weights for policy 1, policy_version 16890 (0.0009) +[2023-10-11 15:37:21,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 34308096. Throughput: 0: 1661.5, 1: 1695.7. Samples: 8587016. Policy #0 lag: (min: 9.0, avg: 22.2, max: 41.0) +[2023-10-11 15:37:21,063][84230] Avg episode reward: [(0, '6.590'), (1, '7.440')] +[2023-10-11 15:37:21,771][85176] Updated weights for policy 0, policy_version 16642 (0.0008) +[2023-10-11 15:37:22,137][85176] Updated weights for policy 0, policy_version 16652 (0.0007) +[2023-10-11 15:37:22,508][85176] Updated weights for policy 0, policy_version 16662 (0.0007) +[2023-10-11 15:37:22,889][85176] Updated weights for policy 0, policy_version 16672 (0.0009) +[2023-10-11 15:37:25,055][85175] Updated weights for policy 1, policy_version 16900 (0.0008) +[2023-10-11 15:37:25,423][85175] Updated weights for policy 1, policy_version 16910 (0.0010) +[2023-10-11 15:37:25,803][85175] Updated weights for policy 1, policy_version 16920 (0.0011) +[2023-10-11 15:37:26,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 34373632. Throughput: 0: 1663.4, 1: 1687.7. Samples: 8607782. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-11 15:37:26,063][84230] Avg episode reward: [(0, '6.480'), (1, '7.010')] +[2023-10-11 15:37:26,599][85176] Updated weights for policy 0, policy_version 16682 (0.0009) +[2023-10-11 15:37:26,959][85176] Updated weights for policy 0, policy_version 16692 (0.0007) +[2023-10-11 15:37:27,334][85176] Updated weights for policy 0, policy_version 16702 (0.0009) +[2023-10-11 15:37:29,921][85175] Updated weights for policy 1, policy_version 16930 (0.0010) +[2023-10-11 15:37:30,299][85175] Updated weights for policy 1, policy_version 16940 (0.0008) +[2023-10-11 15:37:30,663][85175] Updated weights for policy 1, policy_version 16950 (0.0009) +[2023-10-11 15:37:31,030][85175] Updated weights for policy 1, policy_version 16960 (0.0008) +[2023-10-11 15:37:31,063][84230] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13218.3). Total num frames: 34471936. Throughput: 0: 1661.8, 1: 1675.3. Samples: 8627660. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-11 15:37:31,064][84230] Avg episode reward: [(0, '6.750'), (1, '7.040')] +[2023-10-11 15:37:31,482][85176] Updated weights for policy 0, policy_version 16712 (0.0008) +[2023-10-11 15:37:31,862][85176] Updated weights for policy 0, policy_version 16722 (0.0008) +[2023-10-11 15:37:32,224][85176] Updated weights for policy 0, policy_version 16732 (0.0008) +[2023-10-11 15:37:34,989][85175] Updated weights for policy 1, policy_version 16970 (0.0009) +[2023-10-11 15:37:35,367][85175] Updated weights for policy 1, policy_version 16980 (0.0009) +[2023-10-11 15:37:35,732][85175] Updated weights for policy 1, policy_version 16990 (0.0009) +[2023-10-11 15:37:36,063][84230] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 34537472. Throughput: 0: 1663.7, 1: 1694.6. Samples: 8637556. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-11 15:37:36,063][84230] Avg episode reward: [(0, '6.530'), (1, '7.390')] +[2023-10-11 15:37:36,450][85176] Updated weights for policy 0, policy_version 16742 (0.0007) +[2023-10-11 15:37:36,818][85176] Updated weights for policy 0, policy_version 16752 (0.0009) +[2023-10-11 15:37:37,195][85176] Updated weights for policy 0, policy_version 16762 (0.0010) +[2023-10-11 15:37:39,827][85175] Updated weights for policy 1, policy_version 17000 (0.0009) +[2023-10-11 15:37:40,196][85175] Updated weights for policy 1, policy_version 17010 (0.0009) +[2023-10-11 15:37:40,572][85175] Updated weights for policy 1, policy_version 17020 (0.0007) +[2023-10-11 15:37:41,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13218.3). Total num frames: 34603008. Throughput: 0: 1666.5, 1: 1691.6. Samples: 8657974. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:37:41,063][84230] Avg episode reward: [(0, '6.860'), (1, '7.260')] +[2023-10-11 15:37:41,556][85176] Updated weights for policy 0, policy_version 16772 (0.0008) +[2023-10-11 15:37:41,939][85176] Updated weights for policy 0, policy_version 16782 (0.0007) +[2023-10-11 15:37:42,306][85176] Updated weights for policy 0, policy_version 16792 (0.0008) +[2023-10-11 15:37:44,461][85175] Updated weights for policy 1, policy_version 17030 (0.0008) +[2023-10-11 15:37:44,826][85175] Updated weights for policy 1, policy_version 17040 (0.0008) +[2023-10-11 15:37:45,193][85175] Updated weights for policy 1, policy_version 17050 (0.0007) +[2023-10-11 15:37:46,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13218.3). Total num frames: 34668544. Throughput: 0: 1670.7, 1: 1663.4. Samples: 8677614. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:37:46,064][84230] Avg episode reward: [(0, '6.630'), (1, '7.750')] +[2023-10-11 15:37:46,168][85176] Updated weights for policy 0, policy_version 16802 (0.0008) +[2023-10-11 15:37:46,537][85176] Updated weights for policy 0, policy_version 16812 (0.0008) +[2023-10-11 15:37:46,924][85176] Updated weights for policy 0, policy_version 16822 (0.0011) +[2023-10-11 15:37:47,290][85176] Updated weights for policy 0, policy_version 16832 (0.0010) +[2023-10-11 15:37:49,398][85175] Updated weights for policy 1, policy_version 17060 (0.0010) +[2023-10-11 15:37:49,778][85175] Updated weights for policy 1, policy_version 17070 (0.0010) +[2023-10-11 15:37:50,138][85175] Updated weights for policy 1, policy_version 17080 (0.0008) +[2023-10-11 15:37:51,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13218.3). Total num frames: 34734080. Throughput: 0: 1669.6, 1: 1689.2. Samples: 8687878. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:37:51,063][84230] Avg episode reward: [(0, '6.890'), (1, '7.220')] +[2023-10-11 15:37:51,574][85176] Updated weights for policy 0, policy_version 16842 (0.0008) +[2023-10-11 15:37:51,958][85176] Updated weights for policy 0, policy_version 16852 (0.0009) +[2023-10-11 15:37:52,335][85176] Updated weights for policy 0, policy_version 16862 (0.0008) +[2023-10-11 15:37:54,012][85175] Updated weights for policy 1, policy_version 17090 (0.0007) +[2023-10-11 15:37:54,385][85175] Updated weights for policy 1, policy_version 17100 (0.0007) +[2023-10-11 15:37:54,749][85175] Updated weights for policy 1, policy_version 17110 (0.0007) +[2023-10-11 15:37:55,116][85175] Updated weights for policy 1, policy_version 17120 (0.0009) +[2023-10-11 15:37:56,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13218.3). Total num frames: 34799616. Throughput: 0: 1666.7, 1: 1675.9. Samples: 8707952. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:37:56,063][84230] Avg episode reward: [(0, '7.120'), (1, '7.160')] +[2023-10-11 15:37:56,322][85176] Updated weights for policy 0, policy_version 16872 (0.0008) +[2023-10-11 15:37:56,702][85176] Updated weights for policy 0, policy_version 16882 (0.0010) +[2023-10-11 15:37:57,075][85176] Updated weights for policy 0, policy_version 16892 (0.0010) +[2023-10-11 15:37:59,268][85175] Updated weights for policy 1, policy_version 17130 (0.0012) +[2023-10-11 15:37:59,641][85175] Updated weights for policy 1, policy_version 17140 (0.0008) +[2023-10-11 15:38:00,014][85175] Updated weights for policy 1, policy_version 17150 (0.0008) +[2023-10-11 15:38:01,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13218.3). Total num frames: 34865152. Throughput: 0: 1667.2, 1: 1677.7. Samples: 8728046. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:38:01,063][84230] Avg episode reward: [(0, '7.220'), (1, '7.090')] +[2023-10-11 15:38:01,151][85176] Updated weights for policy 0, policy_version 16902 (0.0008) +[2023-10-11 15:38:01,535][85176] Updated weights for policy 0, policy_version 16912 (0.0007) +[2023-10-11 15:38:01,903][85176] Updated weights for policy 0, policy_version 16922 (0.0008) +[2023-10-11 15:38:04,216][85175] Updated weights for policy 1, policy_version 17160 (0.0007) +[2023-10-11 15:38:04,583][85175] Updated weights for policy 1, policy_version 17170 (0.0008) +[2023-10-11 15:38:04,957][85175] Updated weights for policy 1, policy_version 17180 (0.0008) +[2023-10-11 15:38:05,964][85176] Updated weights for policy 0, policy_version 16932 (0.0007) +[2023-10-11 15:38:06,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13218.3). Total num frames: 34930688. Throughput: 0: 1665.3, 1: 1695.8. Samples: 8738266. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:38:06,063][84230] Avg episode reward: [(0, '7.000'), (1, '7.390')] +[2023-10-11 15:38:06,343][85176] Updated weights for policy 0, policy_version 16942 (0.0009) +[2023-10-11 15:38:06,713][85176] Updated weights for policy 0, policy_version 16952 (0.0009) +[2023-10-11 15:38:08,866][85175] Updated weights for policy 1, policy_version 17190 (0.0008) +[2023-10-11 15:38:09,238][85175] Updated weights for policy 1, policy_version 17200 (0.0011) +[2023-10-11 15:38:09,613][85175] Updated weights for policy 1, policy_version 17210 (0.0008) +[2023-10-11 15:38:10,802][85176] Updated weights for policy 0, policy_version 16962 (0.0009) +[2023-10-11 15:38:11,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13218.3). Total num frames: 34996224. Throughput: 0: 1665.2, 1: 1674.9. Samples: 8758086. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-11 15:38:11,063][84230] Avg episode reward: [(0, '6.640'), (1, '7.520')] +[2023-10-11 15:38:11,174][85176] Updated weights for policy 0, policy_version 16972 (0.0008) +[2023-10-11 15:38:11,546][85176] Updated weights for policy 0, policy_version 16982 (0.0008) +[2023-10-11 15:38:11,917][85176] Updated weights for policy 0, policy_version 16992 (0.0010) +[2023-10-11 15:38:13,645][85175] Updated weights for policy 1, policy_version 17220 (0.0008) +[2023-10-11 15:38:14,011][85175] Updated weights for policy 1, policy_version 17230 (0.0009) +[2023-10-11 15:38:14,372][85175] Updated weights for policy 1, policy_version 17240 (0.0009) +[2023-10-11 15:38:16,025][85176] Updated weights for policy 0, policy_version 17002 (0.0010) +[2023-10-11 15:38:16,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13218.3). Total num frames: 35061760. Throughput: 0: 1668.8, 1: 1683.8. Samples: 8778528. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-11 15:38:16,064][84230] Avg episode reward: [(0, '6.530'), (1, '7.230')] +[2023-10-11 15:38:16,073][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000017248_17661952.pth... +[2023-10-11 15:38:16,102][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000015680_16056320.pth +[2023-10-11 15:38:16,407][85176] Updated weights for policy 0, policy_version 17012 (0.0008) +[2023-10-11 15:38:16,782][85176] Updated weights for policy 0, policy_version 17022 (0.0008) +[2023-10-11 15:38:16,850][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000017024_17432576.pth... +[2023-10-11 15:38:16,889][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000015456_15826944.pth +[2023-10-11 15:38:18,349][85175] Updated weights for policy 1, policy_version 17250 (0.0007) +[2023-10-11 15:38:18,724][85175] Updated weights for policy 1, policy_version 17260 (0.0009) +[2023-10-11 15:38:19,089][85175] Updated weights for policy 1, policy_version 17270 (0.0008) +[2023-10-11 15:38:19,451][85175] Updated weights for policy 1, policy_version 17280 (0.0007) +[2023-10-11 15:38:21,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13218.3). Total num frames: 35127296. Throughput: 0: 1668.1, 1: 1686.0. Samples: 8788490. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-11 15:38:21,063][84230] Avg episode reward: [(0, '6.750'), (1, '7.190')] +[2023-10-11 15:38:21,123][85176] Updated weights for policy 0, policy_version 17032 (0.0009) +[2023-10-11 15:38:21,507][85176] Updated weights for policy 0, policy_version 17042 (0.0009) +[2023-10-11 15:38:21,879][85176] Updated weights for policy 0, policy_version 17052 (0.0010) +[2023-10-11 15:38:23,404][85175] Updated weights for policy 1, policy_version 17290 (0.0009) +[2023-10-11 15:38:23,778][85175] Updated weights for policy 1, policy_version 17300 (0.0010) +[2023-10-11 15:38:24,151][85175] Updated weights for policy 1, policy_version 17310 (0.0009) +[2023-10-11 15:38:25,920][85176] Updated weights for policy 0, policy_version 17062 (0.0010) +[2023-10-11 15:38:26,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13218.3). Total num frames: 35192832. Throughput: 0: 1667.9, 1: 1667.3. Samples: 8808062. Policy #0 lag: (min: 23.0, avg: 27.6, max: 55.0) +[2023-10-11 15:38:26,064][84230] Avg episode reward: [(0, '6.640'), (1, '7.160')] +[2023-10-11 15:38:26,302][85176] Updated weights for policy 0, policy_version 17072 (0.0010) +[2023-10-11 15:38:26,673][85176] Updated weights for policy 0, policy_version 17082 (0.0009) +[2023-10-11 15:38:28,056][85175] Updated weights for policy 1, policy_version 17320 (0.0009) +[2023-10-11 15:38:28,428][85175] Updated weights for policy 1, policy_version 17330 (0.0008) +[2023-10-11 15:38:28,799][85175] Updated weights for policy 1, policy_version 17340 (0.0008) +[2023-10-11 15:38:30,820][85176] Updated weights for policy 0, policy_version 17092 (0.0009) +[2023-10-11 15:38:31,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 35258368. Throughput: 0: 1663.0, 1: 1696.0. Samples: 8828766. 
Policy #0 lag: (min: 23.0, avg: 27.6, max: 55.0) +[2023-10-11 15:38:31,064][84230] Avg episode reward: [(0, '6.530'), (1, '6.990')] +[2023-10-11 15:38:31,190][85176] Updated weights for policy 0, policy_version 17102 (0.0008) +[2023-10-11 15:38:31,564][85176] Updated weights for policy 0, policy_version 17112 (0.0008) +[2023-10-11 15:38:32,916][85175] Updated weights for policy 1, policy_version 17350 (0.0008) +[2023-10-11 15:38:33,293][85175] Updated weights for policy 1, policy_version 17360 (0.0008) +[2023-10-11 15:38:33,659][85175] Updated weights for policy 1, policy_version 17370 (0.0007) +[2023-10-11 15:38:35,618][85176] Updated weights for policy 0, policy_version 17122 (0.0008) +[2023-10-11 15:38:35,985][85176] Updated weights for policy 0, policy_version 17132 (0.0010) +[2023-10-11 15:38:36,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 35323904. Throughput: 0: 1666.9, 1: 1677.8. Samples: 8838390. Policy #0 lag: (min: 23.0, avg: 27.6, max: 55.0) +[2023-10-11 15:38:36,064][84230] Avg episode reward: [(0, '6.750'), (1, '7.050')] +[2023-10-11 15:38:36,360][85176] Updated weights for policy 0, policy_version 17142 (0.0011) +[2023-10-11 15:38:36,730][85176] Updated weights for policy 0, policy_version 17152 (0.0010) +[2023-10-11 15:38:37,635][85175] Updated weights for policy 1, policy_version 17380 (0.0008) +[2023-10-11 15:38:37,999][85175] Updated weights for policy 1, policy_version 17390 (0.0009) +[2023-10-11 15:38:38,378][85175] Updated weights for policy 1, policy_version 17400 (0.0008) +[2023-10-11 15:38:40,831][85176] Updated weights for policy 0, policy_version 17162 (0.0009) +[2023-10-11 15:38:41,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 35389440. Throughput: 0: 1667.3, 1: 1681.5. Samples: 8858650. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) +[2023-10-11 15:38:41,063][84230] Avg episode reward: [(0, '7.050'), (1, '7.080')] +[2023-10-11 15:38:41,210][85176] Updated weights for policy 0, policy_version 17172 (0.0008) +[2023-10-11 15:38:41,580][85176] Updated weights for policy 0, policy_version 17182 (0.0007) +[2023-10-11 15:38:42,329][85175] Updated weights for policy 1, policy_version 17410 (0.0009) +[2023-10-11 15:38:42,696][85175] Updated weights for policy 1, policy_version 17420 (0.0009) +[2023-10-11 15:38:43,068][85175] Updated weights for policy 1, policy_version 17430 (0.0007) +[2023-10-11 15:38:43,427][85175] Updated weights for policy 1, policy_version 17440 (0.0011) +[2023-10-11 15:38:45,619][85176] Updated weights for policy 0, policy_version 17192 (0.0007) +[2023-10-11 15:38:45,986][85176] Updated weights for policy 0, policy_version 17202 (0.0007) +[2023-10-11 15:38:46,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13218.3). Total num frames: 35454976. Throughput: 0: 1659.8, 1: 1695.1. Samples: 8879018. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) +[2023-10-11 15:38:46,063][84230] Avg episode reward: [(0, '7.120'), (1, '7.700')] +[2023-10-11 15:38:46,370][85176] Updated weights for policy 0, policy_version 17212 (0.0008) +[2023-10-11 15:38:47,508][85175] Updated weights for policy 1, policy_version 17450 (0.0008) +[2023-10-11 15:38:47,870][85175] Updated weights for policy 1, policy_version 17460 (0.0010) +[2023-10-11 15:38:48,247][85175] Updated weights for policy 1, policy_version 17470 (0.0010) +[2023-10-11 15:38:50,433][85176] Updated weights for policy 0, policy_version 17222 (0.0009) +[2023-10-11 15:38:50,805][85176] Updated weights for policy 0, policy_version 17232 (0.0007) +[2023-10-11 15:38:51,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 35520512. Throughput: 0: 1667.5, 1: 1665.2. Samples: 8888238. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) +[2023-10-11 15:38:51,063][84230] Avg episode reward: [(0, '6.970'), (1, '7.000')] +[2023-10-11 15:38:51,181][85176] Updated weights for policy 0, policy_version 17242 (0.0007) +[2023-10-11 15:38:52,357][85175] Updated weights for policy 1, policy_version 17480 (0.0008) +[2023-10-11 15:38:52,731][85175] Updated weights for policy 1, policy_version 17490 (0.0010) +[2023-10-11 15:38:53,102][85175] Updated weights for policy 1, policy_version 17500 (0.0009) +[2023-10-11 15:38:55,291][85176] Updated weights for policy 0, policy_version 17252 (0.0009) +[2023-10-11 15:38:55,657][85176] Updated weights for policy 0, policy_version 17262 (0.0007) +[2023-10-11 15:38:56,031][85176] Updated weights for policy 0, policy_version 17272 (0.0007) +[2023-10-11 15:38:56,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 35586048. Throughput: 0: 1666.3, 1: 1688.5. Samples: 8909056. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) +[2023-10-11 15:38:56,064][84230] Avg episode reward: [(0, '6.640'), (1, '6.980')] +[2023-10-11 15:38:57,286][85175] Updated weights for policy 1, policy_version 17510 (0.0008) +[2023-10-11 15:38:57,676][85175] Updated weights for policy 1, policy_version 17520 (0.0007) +[2023-10-11 15:38:58,041][85175] Updated weights for policy 1, policy_version 17530 (0.0008) +[2023-10-11 15:39:00,145][85176] Updated weights for policy 0, policy_version 17282 (0.0009) +[2023-10-11 15:39:00,516][85176] Updated weights for policy 0, policy_version 17292 (0.0010) +[2023-10-11 15:39:00,890][85176] Updated weights for policy 0, policy_version 17302 (0.0007) +[2023-10-11 15:39:01,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 35651584. Throughput: 0: 1653.0, 1: 1690.3. Samples: 8928978. 
Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) +[2023-10-11 15:39:01,064][84230] Avg episode reward: [(0, '6.640'), (1, '7.870')] +[2023-10-11 15:39:01,259][85176] Updated weights for policy 0, policy_version 17312 (0.0009) +[2023-10-11 15:39:01,916][85175] Updated weights for policy 1, policy_version 17540 (0.0007) +[2023-10-11 15:39:02,294][85175] Updated weights for policy 1, policy_version 17550 (0.0007) +[2023-10-11 15:39:02,667][85175] Updated weights for policy 1, policy_version 17560 (0.0008) +[2023-10-11 15:39:05,458][85176] Updated weights for policy 0, policy_version 17322 (0.0009) +[2023-10-11 15:39:05,845][85176] Updated weights for policy 0, policy_version 17332 (0.0008) +[2023-10-11 15:39:06,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 35717120. Throughput: 0: 1669.3, 1: 1667.9. Samples: 8938662. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) +[2023-10-11 15:39:06,063][84230] Avg episode reward: [(0, '6.100'), (1, '8.120')] +[2023-10-11 15:39:06,064][85000] Saving new best policy, reward=8.120! +[2023-10-11 15:39:06,218][85176] Updated weights for policy 0, policy_version 17342 (0.0007) +[2023-10-11 15:39:06,711][85175] Updated weights for policy 1, policy_version 17570 (0.0008) +[2023-10-11 15:39:07,072][85175] Updated weights for policy 1, policy_version 17580 (0.0008) +[2023-10-11 15:39:07,444][85175] Updated weights for policy 1, policy_version 17590 (0.0010) +[2023-10-11 15:39:07,798][85175] Updated weights for policy 1, policy_version 17600 (0.0009) +[2023-10-11 15:39:10,348][85176] Updated weights for policy 0, policy_version 17352 (0.0009) +[2023-10-11 15:39:10,715][85176] Updated weights for policy 0, policy_version 17362 (0.0008) +[2023-10-11 15:39:11,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 35782656. Throughput: 0: 1668.5, 1: 1693.0. Samples: 8959330. 
Policy #0 lag: (min: 15.0, avg: 28.1, max: 47.0) +[2023-10-11 15:39:11,063][84230] Avg episode reward: [(0, '6.000'), (1, '7.310')] +[2023-10-11 15:39:11,092][85176] Updated weights for policy 0, policy_version 17372 (0.0007) +[2023-10-11 15:39:11,809][85175] Updated weights for policy 1, policy_version 17610 (0.0009) +[2023-10-11 15:39:12,171][85175] Updated weights for policy 1, policy_version 17620 (0.0009) +[2023-10-11 15:39:12,534][85175] Updated weights for policy 1, policy_version 17630 (0.0008) +[2023-10-11 15:39:15,291][85176] Updated weights for policy 0, policy_version 17382 (0.0007) +[2023-10-11 15:39:15,666][85176] Updated weights for policy 0, policy_version 17392 (0.0008) +[2023-10-11 15:39:16,046][85176] Updated weights for policy 0, policy_version 17402 (0.0008) +[2023-10-11 15:39:16,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13218.3). Total num frames: 35848192. Throughput: 0: 1649.9, 1: 1692.3. Samples: 8979162. Policy #0 lag: (min: 15.0, avg: 28.1, max: 47.0) +[2023-10-11 15:39:16,063][84230] Avg episode reward: [(0, '6.230'), (1, '7.120')] +[2023-10-11 15:39:16,709][85175] Updated weights for policy 1, policy_version 17640 (0.0010) +[2023-10-11 15:39:17,077][85175] Updated weights for policy 1, policy_version 17650 (0.0010) +[2023-10-11 15:39:17,452][85175] Updated weights for policy 1, policy_version 17660 (0.0008) +[2023-10-11 15:39:20,035][85176] Updated weights for policy 0, policy_version 17412 (0.0009) +[2023-10-11 15:39:20,412][85176] Updated weights for policy 0, policy_version 17422 (0.0009) +[2023-10-11 15:39:20,784][85176] Updated weights for policy 0, policy_version 17432 (0.0008) +[2023-10-11 15:39:21,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 35913728. Throughput: 0: 1659.8, 1: 1679.5. Samples: 8988658. 
Policy #0 lag: (min: 15.0, avg: 28.1, max: 47.0) +[2023-10-11 15:39:21,064][84230] Avg episode reward: [(0, '6.860'), (1, '7.440')] +[2023-10-11 15:39:21,411][85175] Updated weights for policy 1, policy_version 17670 (0.0008) +[2023-10-11 15:39:21,778][85175] Updated weights for policy 1, policy_version 17680 (0.0007) +[2023-10-11 15:39:22,145][85175] Updated weights for policy 1, policy_version 17690 (0.0007) +[2023-10-11 15:39:24,977][85176] Updated weights for policy 0, policy_version 17442 (0.0008) +[2023-10-11 15:39:25,340][85176] Updated weights for policy 0, policy_version 17452 (0.0008) +[2023-10-11 15:39:25,710][85176] Updated weights for policy 0, policy_version 17462 (0.0008) +[2023-10-11 15:39:26,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 35979264. Throughput: 0: 1661.3, 1: 1689.0. Samples: 9009412. Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) +[2023-10-11 15:39:26,063][84230] Avg episode reward: [(0, '7.080'), (1, '7.130')] +[2023-10-11 15:39:26,081][85176] Updated weights for policy 0, policy_version 17472 (0.0009) +[2023-10-11 15:39:26,133][85175] Updated weights for policy 1, policy_version 17700 (0.0007) +[2023-10-11 15:39:26,511][85175] Updated weights for policy 1, policy_version 17710 (0.0009) +[2023-10-11 15:39:26,871][85175] Updated weights for policy 1, policy_version 17720 (0.0010) +[2023-10-11 15:39:30,234][85176] Updated weights for policy 0, policy_version 17482 (0.0008) +[2023-10-11 15:39:30,607][85176] Updated weights for policy 0, policy_version 17492 (0.0011) +[2023-10-11 15:39:30,973][85176] Updated weights for policy 0, policy_version 17502 (0.0011) +[2023-10-11 15:39:30,999][85175] Updated weights for policy 1, policy_version 17730 (0.0010) +[2023-10-11 15:39:31,063][84230] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13329.3). Total num frames: 36077568. Throughput: 0: 1648.8, 1: 1693.2. Samples: 9029408. 
Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) +[2023-10-11 15:39:31,064][84230] Avg episode reward: [(0, '7.040'), (1, '6.850')] +[2023-10-11 15:39:31,370][85175] Updated weights for policy 1, policy_version 17740 (0.0007) +[2023-10-11 15:39:31,742][85175] Updated weights for policy 1, policy_version 17750 (0.0010) +[2023-10-11 15:39:32,110][85175] Updated weights for policy 1, policy_version 17760 (0.0010) +[2023-10-11 15:39:35,215][85176] Updated weights for policy 0, policy_version 17512 (0.0009) +[2023-10-11 15:39:35,592][85176] Updated weights for policy 0, policy_version 17522 (0.0008) +[2023-10-11 15:39:35,957][85176] Updated weights for policy 0, policy_version 17532 (0.0009) +[2023-10-11 15:39:36,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 36110336. Throughput: 0: 1659.2, 1: 1693.3. Samples: 9039102. Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) +[2023-10-11 15:39:36,064][84230] Avg episode reward: [(0, '6.970'), (1, '7.160')] +[2023-10-11 15:39:36,133][85175] Updated weights for policy 1, policy_version 17770 (0.0010) +[2023-10-11 15:39:36,502][85175] Updated weights for policy 1, policy_version 17780 (0.0008) +[2023-10-11 15:39:36,877][85175] Updated weights for policy 1, policy_version 17790 (0.0008) +[2023-10-11 15:39:40,052][85176] Updated weights for policy 0, policy_version 17542 (0.0007) +[2023-10-11 15:39:40,432][85176] Updated weights for policy 0, policy_version 17552 (0.0010) +[2023-10-11 15:39:40,803][85176] Updated weights for policy 0, policy_version 17562 (0.0009) +[2023-10-11 15:39:40,891][85175] Updated weights for policy 1, policy_version 17800 (0.0009) +[2023-10-11 15:39:41,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 36208640. Throughput: 0: 1656.3, 1: 1694.5. Samples: 9059840. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:39:41,063][84230] Avg episode reward: [(0, '6.710'), (1, '7.220')] +[2023-10-11 15:39:41,255][85175] Updated weights for policy 1, policy_version 17810 (0.0008) +[2023-10-11 15:39:41,630][85175] Updated weights for policy 1, policy_version 17820 (0.0008) +[2023-10-11 15:39:44,449][85176] Updated weights for policy 0, policy_version 17572 (0.0008) +[2023-10-11 15:39:44,825][85176] Updated weights for policy 0, policy_version 17582 (0.0010) +[2023-10-11 15:39:45,211][85176] Updated weights for policy 0, policy_version 17592 (0.0009) +[2023-10-11 15:39:45,797][85175] Updated weights for policy 1, policy_version 17830 (0.0008) +[2023-10-11 15:39:46,062][84230] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 36274176. Throughput: 0: 1643.6, 1: 1700.5. Samples: 9079464. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:39:46,063][84230] Avg episode reward: [(0, '6.490'), (1, '7.230')] +[2023-10-11 15:39:46,188][85175] Updated weights for policy 1, policy_version 17840 (0.0007) +[2023-10-11 15:39:46,552][85175] Updated weights for policy 1, policy_version 17850 (0.0009) +[2023-10-11 15:39:49,467][85176] Updated weights for policy 0, policy_version 17602 (0.0009) +[2023-10-11 15:39:49,827][85176] Updated weights for policy 0, policy_version 17612 (0.0009) +[2023-10-11 15:39:50,207][85176] Updated weights for policy 0, policy_version 17622 (0.0007) +[2023-10-11 15:39:50,473][85175] Updated weights for policy 1, policy_version 17860 (0.0010) +[2023-10-11 15:39:50,576][85176] Updated weights for policy 0, policy_version 17632 (0.0007) +[2023-10-11 15:39:50,832][85175] Updated weights for policy 1, policy_version 17870 (0.0007) +[2023-10-11 15:39:51,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 36339712. Throughput: 0: 1654.3, 1: 1698.9. Samples: 9089556. 
Policy #0 lag: (min: 24.0, avg: 45.5, max: 56.0) +[2023-10-11 15:39:51,063][84230] Avg episode reward: [(0, '6.570'), (1, '7.090')] +[2023-10-11 15:39:51,206][85175] Updated weights for policy 1, policy_version 17880 (0.0007) +[2023-10-11 15:39:54,618][85176] Updated weights for policy 0, policy_version 17642 (0.0007) +[2023-10-11 15:39:54,995][85176] Updated weights for policy 0, policy_version 17652 (0.0007) +[2023-10-11 15:39:55,135][85175] Updated weights for policy 1, policy_version 17890 (0.0007) +[2023-10-11 15:39:55,359][85176] Updated weights for policy 0, policy_version 17662 (0.0007) +[2023-10-11 15:39:55,507][85175] Updated weights for policy 1, policy_version 17900 (0.0007) +[2023-10-11 15:39:55,876][85175] Updated weights for policy 1, policy_version 17910 (0.0008) +[2023-10-11 15:39:56,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 36405248. Throughput: 0: 1652.4, 1: 1696.1. Samples: 9110016. Policy #0 lag: (min: 24.0, avg: 45.5, max: 56.0) +[2023-10-11 15:39:56,063][84230] Avg episode reward: [(0, '6.680'), (1, '7.040')] +[2023-10-11 15:39:56,249][85175] Updated weights for policy 1, policy_version 17920 (0.0008) +[2023-10-11 15:39:59,593][85176] Updated weights for policy 0, policy_version 17672 (0.0007) +[2023-10-11 15:39:59,967][85176] Updated weights for policy 0, policy_version 17682 (0.0008) +[2023-10-11 15:40:00,333][85176] Updated weights for policy 0, policy_version 17692 (0.0008) +[2023-10-11 15:40:00,356][85175] Updated weights for policy 1, policy_version 17930 (0.0007) +[2023-10-11 15:40:00,729][85175] Updated weights for policy 1, policy_version 17940 (0.0009) +[2023-10-11 15:40:01,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 36470784. Throughput: 0: 1654.4, 1: 1681.5. Samples: 9129278. 
Policy #0 lag: (min: 24.0, avg: 45.5, max: 56.0) +[2023-10-11 15:40:01,063][84230] Avg episode reward: [(0, '6.640'), (1, '7.140')] +[2023-10-11 15:40:01,094][85175] Updated weights for policy 1, policy_version 17950 (0.0008) +[2023-10-11 15:40:04,516][85176] Updated weights for policy 0, policy_version 17702 (0.0009) +[2023-10-11 15:40:04,898][85176] Updated weights for policy 0, policy_version 17712 (0.0009) +[2023-10-11 15:40:05,231][85175] Updated weights for policy 1, policy_version 17960 (0.0009) +[2023-10-11 15:40:05,268][85176] Updated weights for policy 0, policy_version 17722 (0.0008) +[2023-10-11 15:40:05,591][85175] Updated weights for policy 1, policy_version 17970 (0.0009) +[2023-10-11 15:40:05,967][85175] Updated weights for policy 1, policy_version 17980 (0.0007) +[2023-10-11 15:40:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 36536320. Throughput: 0: 1673.8, 1: 1693.9. Samples: 9140206. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:40:06,063][84230] Avg episode reward: [(0, '6.570'), (1, '6.940')] +[2023-10-11 15:40:09,413][85176] Updated weights for policy 0, policy_version 17732 (0.0009) +[2023-10-11 15:40:09,782][85176] Updated weights for policy 0, policy_version 17742 (0.0009) +[2023-10-11 15:40:09,966][85175] Updated weights for policy 1, policy_version 17990 (0.0009) +[2023-10-11 15:40:10,156][85176] Updated weights for policy 0, policy_version 17752 (0.0007) +[2023-10-11 15:40:10,329][85175] Updated weights for policy 1, policy_version 18000 (0.0007) +[2023-10-11 15:40:10,695][85175] Updated weights for policy 1, policy_version 18010 (0.0009) +[2023-10-11 15:40:11,063][84230] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 36634624. Throughput: 0: 1662.2, 1: 1695.2. Samples: 9160496. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:40:11,064][84230] Avg episode reward: [(0, '6.530'), (1, '7.040')] +[2023-10-11 15:40:14,348][85176] Updated weights for policy 0, policy_version 17762 (0.0008) +[2023-10-11 15:40:14,720][85176] Updated weights for policy 0, policy_version 17772 (0.0011) +[2023-10-11 15:40:14,981][85175] Updated weights for policy 1, policy_version 18020 (0.0009) +[2023-10-11 15:40:15,091][85176] Updated weights for policy 0, policy_version 17782 (0.0010) +[2023-10-11 15:40:15,355][85175] Updated weights for policy 1, policy_version 18030 (0.0009) +[2023-10-11 15:40:15,469][85176] Updated weights for policy 0, policy_version 17792 (0.0009) +[2023-10-11 15:40:15,721][85175] Updated weights for policy 1, policy_version 18040 (0.0007) +[2023-10-11 15:40:16,062][84230] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 36700160. Throughput: 0: 1652.7, 1: 1673.1. Samples: 9179066. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:40:16,063][84230] Avg episode reward: [(0, '6.510'), (1, '7.060')] +[2023-10-11 15:40:16,071][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000017792_18219008.pth... +[2023-10-11 15:40:16,071][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000018048_18481152.pth... 
+[2023-10-11 15:40:16,112][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000016448_16842752.pth +[2023-10-11 15:40:16,112][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000016224_16613376.pth +[2023-10-11 15:40:19,449][85176] Updated weights for policy 0, policy_version 17802 (0.0007) +[2023-10-11 15:40:19,645][85175] Updated weights for policy 1, policy_version 18050 (0.0009) +[2023-10-11 15:40:19,822][85176] Updated weights for policy 0, policy_version 17812 (0.0007) +[2023-10-11 15:40:20,012][85175] Updated weights for policy 1, policy_version 18060 (0.0008) +[2023-10-11 15:40:20,191][85176] Updated weights for policy 0, policy_version 17822 (0.0008) +[2023-10-11 15:40:20,388][85175] Updated weights for policy 1, policy_version 18070 (0.0007) +[2023-10-11 15:40:20,751][85175] Updated weights for policy 1, policy_version 18080 (0.0009) +[2023-10-11 15:40:21,063][84230] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 36765696. Throughput: 0: 1663.7, 1: 1695.4. Samples: 9190258. Policy #0 lag: (min: 18.0, avg: 21.6, max: 44.0) +[2023-10-11 15:40:21,063][84230] Avg episode reward: [(0, '6.530'), (1, '6.950')] +[2023-10-11 15:40:24,385][85176] Updated weights for policy 0, policy_version 17832 (0.0011) +[2023-10-11 15:40:24,765][85176] Updated weights for policy 0, policy_version 17842 (0.0009) +[2023-10-11 15:40:24,902][85175] Updated weights for policy 1, policy_version 18090 (0.0007) +[2023-10-11 15:40:25,142][85176] Updated weights for policy 0, policy_version 17852 (0.0008) +[2023-10-11 15:40:25,275][85175] Updated weights for policy 1, policy_version 18100 (0.0007) +[2023-10-11 15:40:25,652][85175] Updated weights for policy 1, policy_version 18110 (0.0008) +[2023-10-11 15:40:26,063][84230] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 36831232. Throughput: 0: 1653.6, 1: 1688.3. Samples: 9210224. 
Policy #0 lag: (min: 18.0, avg: 21.6, max: 44.0) +[2023-10-11 15:40:26,063][84230] Avg episode reward: [(0, '6.690'), (1, '6.630')] +[2023-10-11 15:40:29,091][85176] Updated weights for policy 0, policy_version 17862 (0.0008) +[2023-10-11 15:40:29,470][85176] Updated weights for policy 0, policy_version 17872 (0.0008) +[2023-10-11 15:40:29,722][85175] Updated weights for policy 1, policy_version 18120 (0.0007) +[2023-10-11 15:40:29,833][85176] Updated weights for policy 0, policy_version 17882 (0.0008) +[2023-10-11 15:40:30,076][85175] Updated weights for policy 1, policy_version 18130 (0.0007) +[2023-10-11 15:40:30,448][85175] Updated weights for policy 1, policy_version 18140 (0.0008) +[2023-10-11 15:40:31,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 36896768. Throughput: 0: 1663.4, 1: 1660.1. Samples: 9229022. Policy #0 lag: (min: 18.0, avg: 21.6, max: 44.0) +[2023-10-11 15:40:31,063][84230] Avg episode reward: [(0, '6.750'), (1, '7.560')] +[2023-10-11 15:40:33,848][85176] Updated weights for policy 0, policy_version 17892 (0.0008) +[2023-10-11 15:40:34,214][85176] Updated weights for policy 0, policy_version 17902 (0.0007) +[2023-10-11 15:40:34,582][85175] Updated weights for policy 1, policy_version 18150 (0.0008) +[2023-10-11 15:40:34,596][85176] Updated weights for policy 0, policy_version 17912 (0.0008) +[2023-10-11 15:40:34,951][85175] Updated weights for policy 1, policy_version 18160 (0.0007) +[2023-10-11 15:40:35,315][85175] Updated weights for policy 1, policy_version 18170 (0.0007) +[2023-10-11 15:40:36,063][84230] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 36962304. Throughput: 0: 1668.6, 1: 1693.8. Samples: 9240864. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-11 15:40:36,063][84230] Avg episode reward: [(0, '6.660'), (1, '7.580')] +[2023-10-11 15:40:38,837][85176] Updated weights for policy 0, policy_version 17922 (0.0009) +[2023-10-11 15:40:39,209][85176] Updated weights for policy 0, policy_version 17932 (0.0009) +[2023-10-11 15:40:39,249][85175] Updated weights for policy 1, policy_version 18180 (0.0009) +[2023-10-11 15:40:39,580][85176] Updated weights for policy 0, policy_version 17942 (0.0008) +[2023-10-11 15:40:39,617][85175] Updated weights for policy 1, policy_version 18190 (0.0009) +[2023-10-11 15:40:39,948][85176] Updated weights for policy 0, policy_version 17952 (0.0007) +[2023-10-11 15:40:39,987][85175] Updated weights for policy 1, policy_version 18200 (0.0007) +[2023-10-11 15:40:41,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 37027840. Throughput: 0: 1654.6, 1: 1684.6. Samples: 9260282. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-11 15:40:41,064][84230] Avg episode reward: [(0, '6.760'), (1, '7.260')] +[2023-10-11 15:40:43,969][85175] Updated weights for policy 1, policy_version 18210 (0.0008) +[2023-10-11 15:40:44,024][85176] Updated weights for policy 0, policy_version 17962 (0.0009) +[2023-10-11 15:40:44,335][85175] Updated weights for policy 1, policy_version 18220 (0.0010) +[2023-10-11 15:40:44,393][85176] Updated weights for policy 0, policy_version 17972 (0.0009) +[2023-10-11 15:40:44,698][85175] Updated weights for policy 1, policy_version 18230 (0.0008) +[2023-10-11 15:40:44,771][85176] Updated weights for policy 0, policy_version 17982 (0.0008) +[2023-10-11 15:40:45,077][85175] Updated weights for policy 1, policy_version 18240 (0.0008) +[2023-10-11 15:40:46,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 37093376. Throughput: 0: 1666.4, 1: 1680.1. Samples: 9279870. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-11 15:40:46,063][84230] Avg episode reward: [(0, '6.660'), (1, '6.880')] +[2023-10-11 15:40:48,947][85176] Updated weights for policy 0, policy_version 17992 (0.0008) +[2023-10-11 15:40:49,059][85175] Updated weights for policy 1, policy_version 18250 (0.0008) +[2023-10-11 15:40:49,321][85176] Updated weights for policy 0, policy_version 18002 (0.0007) +[2023-10-11 15:40:49,426][85175] Updated weights for policy 1, policy_version 18260 (0.0009) +[2023-10-11 15:40:49,694][85176] Updated weights for policy 0, policy_version 18012 (0.0008) +[2023-10-11 15:40:49,805][85175] Updated weights for policy 1, policy_version 18270 (0.0007) +[2023-10-11 15:40:51,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 37158912. Throughput: 0: 1663.6, 1: 1700.8. Samples: 9291602. Policy #0 lag: (min: 24.0, avg: 45.9, max: 56.0) +[2023-10-11 15:40:51,063][84230] Avg episode reward: [(0, '6.980'), (1, '7.340')] +[2023-10-11 15:40:53,788][85176] Updated weights for policy 0, policy_version 18022 (0.0009) +[2023-10-11 15:40:54,052][85175] Updated weights for policy 1, policy_version 18280 (0.0009) +[2023-10-11 15:40:54,157][85176] Updated weights for policy 0, policy_version 18032 (0.0008) +[2023-10-11 15:40:54,422][85175] Updated weights for policy 1, policy_version 18290 (0.0008) +[2023-10-11 15:40:54,533][85176] Updated weights for policy 0, policy_version 18042 (0.0007) +[2023-10-11 15:40:54,784][85175] Updated weights for policy 1, policy_version 18300 (0.0008) +[2023-10-11 15:40:56,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 37224448. Throughput: 0: 1646.7, 1: 1677.8. Samples: 9310098. 
Policy #0 lag: (min: 24.0, avg: 45.9, max: 56.0) +[2023-10-11 15:40:56,063][84230] Avg episode reward: [(0, '6.800'), (1, '7.480')] +[2023-10-11 15:40:58,434][85176] Updated weights for policy 0, policy_version 18052 (0.0010) +[2023-10-11 15:40:58,794][85175] Updated weights for policy 1, policy_version 18310 (0.0007) +[2023-10-11 15:40:58,810][85176] Updated weights for policy 0, policy_version 18062 (0.0008) +[2023-10-11 15:40:59,161][85175] Updated weights for policy 1, policy_version 18320 (0.0008) +[2023-10-11 15:40:59,176][85176] Updated weights for policy 0, policy_version 18072 (0.0009) +[2023-10-11 15:40:59,531][85175] Updated weights for policy 1, policy_version 18330 (0.0008) +[2023-10-11 15:41:01,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 37289984. Throughput: 0: 1671.9, 1: 1685.3. Samples: 9330140. Policy #0 lag: (min: 24.0, avg: 45.9, max: 56.0) +[2023-10-11 15:41:01,063][84230] Avg episode reward: [(0, '6.490'), (1, '7.540')] +[2023-10-11 15:41:03,228][85176] Updated weights for policy 0, policy_version 18082 (0.0009) +[2023-10-11 15:41:03,461][85175] Updated weights for policy 1, policy_version 18340 (0.0009) +[2023-10-11 15:41:03,603][85176] Updated weights for policy 0, policy_version 18092 (0.0008) +[2023-10-11 15:41:03,834][85175] Updated weights for policy 1, policy_version 18350 (0.0009) +[2023-10-11 15:41:03,969][85176] Updated weights for policy 0, policy_version 18102 (0.0007) +[2023-10-11 15:41:04,192][85175] Updated weights for policy 1, policy_version 18360 (0.0009) +[2023-10-11 15:41:04,342][85176] Updated weights for policy 0, policy_version 18112 (0.0007) +[2023-10-11 15:41:06,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 37355520. Throughput: 0: 1666.2, 1: 1688.6. Samples: 9341226. 
Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) +[2023-10-11 15:41:06,063][84230] Avg episode reward: [(0, '6.640'), (1, '7.620')] +[2023-10-11 15:41:08,244][85175] Updated weights for policy 1, policy_version 18370 (0.0010) +[2023-10-11 15:41:08,503][85176] Updated weights for policy 0, policy_version 18122 (0.0007) +[2023-10-11 15:41:08,608][85175] Updated weights for policy 1, policy_version 18380 (0.0008) +[2023-10-11 15:41:08,873][85176] Updated weights for policy 0, policy_version 18132 (0.0009) +[2023-10-11 15:41:08,979][85175] Updated weights for policy 1, policy_version 18390 (0.0007) +[2023-10-11 15:41:09,248][85176] Updated weights for policy 0, policy_version 18142 (0.0008) +[2023-10-11 15:41:09,349][85175] Updated weights for policy 1, policy_version 18400 (0.0008) +[2023-10-11 15:41:11,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 37421056. Throughput: 0: 1658.5, 1: 1670.2. Samples: 9360016. Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) +[2023-10-11 15:41:11,063][84230] Avg episode reward: [(0, '6.430'), (1, '7.260')] +[2023-10-11 15:41:13,298][85175] Updated weights for policy 1, policy_version 18410 (0.0008) +[2023-10-11 15:41:13,472][85176] Updated weights for policy 0, policy_version 18152 (0.0007) +[2023-10-11 15:41:13,665][85175] Updated weights for policy 1, policy_version 18420 (0.0009) +[2023-10-11 15:41:13,847][85176] Updated weights for policy 0, policy_version 18162 (0.0009) +[2023-10-11 15:41:14,024][85175] Updated weights for policy 1, policy_version 18430 (0.0009) +[2023-10-11 15:41:14,214][85176] Updated weights for policy 0, policy_version 18172 (0.0008) +[2023-10-11 15:41:16,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 37486592. Throughput: 0: 1668.7, 1: 1699.5. Samples: 9380590. 
Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) +[2023-10-11 15:41:16,063][84230] Avg episode reward: [(0, '6.770'), (1, '7.180')] +[2023-10-11 15:41:18,000][85175] Updated weights for policy 1, policy_version 18440 (0.0008) +[2023-10-11 15:41:18,366][85176] Updated weights for policy 0, policy_version 18182 (0.0007) +[2023-10-11 15:41:18,373][85175] Updated weights for policy 1, policy_version 18450 (0.0009) +[2023-10-11 15:41:18,739][85176] Updated weights for policy 0, policy_version 18192 (0.0007) +[2023-10-11 15:41:18,742][85175] Updated weights for policy 1, policy_version 18460 (0.0008) +[2023-10-11 15:41:19,113][85176] Updated weights for policy 0, policy_version 18202 (0.0007) +[2023-10-11 15:41:21,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 37552128. Throughput: 0: 1656.8, 1: 1678.5. Samples: 9390952. Policy #0 lag: (min: 16.0, avg: 43.3, max: 48.0) +[2023-10-11 15:41:21,064][84230] Avg episode reward: [(0, '6.540'), (1, '7.540')] +[2023-10-11 15:41:22,733][85175] Updated weights for policy 1, policy_version 18470 (0.0009) +[2023-10-11 15:41:23,100][85175] Updated weights for policy 1, policy_version 18480 (0.0009) +[2023-10-11 15:41:23,235][85176] Updated weights for policy 0, policy_version 18212 (0.0008) +[2023-10-11 15:41:23,474][85175] Updated weights for policy 1, policy_version 18490 (0.0007) +[2023-10-11 15:41:23,591][85176] Updated weights for policy 0, policy_version 18222 (0.0008) +[2023-10-11 15:41:23,969][85176] Updated weights for policy 0, policy_version 18232 (0.0008) +[2023-10-11 15:41:26,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 37617664. Throughput: 0: 1657.8, 1: 1679.2. Samples: 9410448. 
Policy #0 lag: (min: 16.0, avg: 43.3, max: 48.0) +[2023-10-11 15:41:26,063][84230] Avg episode reward: [(0, '6.690'), (1, '7.540')] +[2023-10-11 15:41:27,694][85175] Updated weights for policy 1, policy_version 18500 (0.0010) +[2023-10-11 15:41:27,970][85176] Updated weights for policy 0, policy_version 18242 (0.0010) +[2023-10-11 15:41:28,090][85175] Updated weights for policy 1, policy_version 18510 (0.0007) +[2023-10-11 15:41:28,350][85176] Updated weights for policy 0, policy_version 18252 (0.0008) +[2023-10-11 15:41:28,458][85175] Updated weights for policy 1, policy_version 18520 (0.0008) +[2023-10-11 15:41:28,714][85176] Updated weights for policy 0, policy_version 18262 (0.0008) +[2023-10-11 15:41:29,090][85176] Updated weights for policy 0, policy_version 18272 (0.0009) +[2023-10-11 15:41:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 37683200. Throughput: 0: 1669.4, 1: 1698.3. Samples: 9431416. Policy #0 lag: (min: 16.0, avg: 43.3, max: 48.0) +[2023-10-11 15:41:31,064][84230] Avg episode reward: [(0, '6.600'), (1, '7.200')] +[2023-10-11 15:41:32,394][85175] Updated weights for policy 1, policy_version 18530 (0.0008) +[2023-10-11 15:41:32,758][85175] Updated weights for policy 1, policy_version 18540 (0.0009) +[2023-10-11 15:41:33,132][85175] Updated weights for policy 1, policy_version 18550 (0.0009) +[2023-10-11 15:41:33,257][85176] Updated weights for policy 0, policy_version 18282 (0.0007) +[2023-10-11 15:41:33,499][85175] Updated weights for policy 1, policy_version 18560 (0.0008) +[2023-10-11 15:41:33,641][85176] Updated weights for policy 0, policy_version 18292 (0.0007) +[2023-10-11 15:41:34,019][85176] Updated weights for policy 0, policy_version 18302 (0.0008) +[2023-10-11 15:41:36,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 37748736. Throughput: 0: 1650.5, 1: 1668.2. Samples: 9440944. 
Policy #0 lag: (min: 31.0, avg: 38.4, max: 63.0) +[2023-10-11 15:41:36,064][84230] Avg episode reward: [(0, '6.580'), (1, '7.250')] +[2023-10-11 15:41:37,468][85175] Updated weights for policy 1, policy_version 18570 (0.0008) +[2023-10-11 15:41:37,838][85175] Updated weights for policy 1, policy_version 18580 (0.0008) +[2023-10-11 15:41:38,205][85175] Updated weights for policy 1, policy_version 18590 (0.0008) +[2023-10-11 15:41:38,251][85176] Updated weights for policy 0, policy_version 18312 (0.0007) +[2023-10-11 15:41:38,633][85176] Updated weights for policy 0, policy_version 18322 (0.0009) +[2023-10-11 15:41:39,003][85176] Updated weights for policy 0, policy_version 18332 (0.0011) +[2023-10-11 15:41:41,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 37814272. Throughput: 0: 1657.7, 1: 1691.8. Samples: 9460826. Policy #0 lag: (min: 31.0, avg: 38.4, max: 63.0) +[2023-10-11 15:41:41,063][84230] Avg episode reward: [(0, '6.780'), (1, '7.410')] +[2023-10-11 15:41:42,290][85175] Updated weights for policy 1, policy_version 18600 (0.0008) +[2023-10-11 15:41:42,669][85175] Updated weights for policy 1, policy_version 18610 (0.0009) +[2023-10-11 15:41:43,032][85175] Updated weights for policy 1, policy_version 18620 (0.0009) +[2023-10-11 15:41:43,188][85176] Updated weights for policy 0, policy_version 18342 (0.0009) +[2023-10-11 15:41:43,567][85176] Updated weights for policy 0, policy_version 18352 (0.0011) +[2023-10-11 15:41:43,940][85176] Updated weights for policy 0, policy_version 18362 (0.0010) +[2023-10-11 15:41:46,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 37879808. Throughput: 0: 1663.6, 1: 1700.3. Samples: 9481518. 
Policy #0 lag: (min: 31.0, avg: 38.4, max: 63.0) +[2023-10-11 15:41:46,063][84230] Avg episode reward: [(0, '7.020'), (1, '7.160')] +[2023-10-11 15:41:47,151][85175] Updated weights for policy 1, policy_version 18630 (0.0008) +[2023-10-11 15:41:47,514][85175] Updated weights for policy 1, policy_version 18640 (0.0008) +[2023-10-11 15:41:47,883][85175] Updated weights for policy 1, policy_version 18650 (0.0008) +[2023-10-11 15:41:47,894][85176] Updated weights for policy 0, policy_version 18372 (0.0010) +[2023-10-11 15:41:48,273][85176] Updated weights for policy 0, policy_version 18382 (0.0010) +[2023-10-11 15:41:48,643][85176] Updated weights for policy 0, policy_version 18392 (0.0010) +[2023-10-11 15:41:51,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 37945344. Throughput: 0: 1649.6, 1: 1676.6. Samples: 9490908. Policy #0 lag: (min: 31.0, avg: 33.3, max: 63.0) +[2023-10-11 15:41:51,064][84230] Avg episode reward: [(0, '7.060'), (1, '6.850')] +[2023-10-11 15:41:51,957][85175] Updated weights for policy 1, policy_version 18660 (0.0007) +[2023-10-11 15:41:52,329][85175] Updated weights for policy 1, policy_version 18670 (0.0008) +[2023-10-11 15:41:52,692][85175] Updated weights for policy 1, policy_version 18680 (0.0010) +[2023-10-11 15:41:52,796][85176] Updated weights for policy 0, policy_version 18402 (0.0010) +[2023-10-11 15:41:53,167][85176] Updated weights for policy 0, policy_version 18412 (0.0009) +[2023-10-11 15:41:53,549][85176] Updated weights for policy 0, policy_version 18422 (0.0009) +[2023-10-11 15:41:53,920][85176] Updated weights for policy 0, policy_version 18432 (0.0008) +[2023-10-11 15:41:56,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 38010880. Throughput: 0: 1658.3, 1: 1694.1. Samples: 9510876. 
Policy #0 lag: (min: 31.0, avg: 33.3, max: 63.0) +[2023-10-11 15:41:56,063][84230] Avg episode reward: [(0, '7.230'), (1, '7.070')] +[2023-10-11 15:41:56,766][85175] Updated weights for policy 1, policy_version 18690 (0.0010) +[2023-10-11 15:41:57,135][85175] Updated weights for policy 1, policy_version 18700 (0.0008) +[2023-10-11 15:41:57,504][85175] Updated weights for policy 1, policy_version 18710 (0.0007) +[2023-10-11 15:41:57,871][85175] Updated weights for policy 1, policy_version 18720 (0.0009) +[2023-10-11 15:41:58,070][85176] Updated weights for policy 0, policy_version 18442 (0.0009) +[2023-10-11 15:41:58,447][85176] Updated weights for policy 0, policy_version 18452 (0.0008) +[2023-10-11 15:41:58,825][85176] Updated weights for policy 0, policy_version 18462 (0.0009) +[2023-10-11 15:42:01,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 38076416. Throughput: 0: 1658.2, 1: 1689.8. Samples: 9531252. Policy #0 lag: (min: 31.0, avg: 33.3, max: 63.0) +[2023-10-11 15:42:01,063][84230] Avg episode reward: [(0, '6.890'), (1, '7.600')] +[2023-10-11 15:42:01,964][85175] Updated weights for policy 1, policy_version 18730 (0.0008) +[2023-10-11 15:42:02,321][85175] Updated weights for policy 1, policy_version 18740 (0.0008) +[2023-10-11 15:42:02,688][85175] Updated weights for policy 1, policy_version 18750 (0.0008) +[2023-10-11 15:42:02,983][85176] Updated weights for policy 0, policy_version 18472 (0.0009) +[2023-10-11 15:42:03,363][85176] Updated weights for policy 0, policy_version 18482 (0.0007) +[2023-10-11 15:42:03,734][85176] Updated weights for policy 0, policy_version 18492 (0.0008) +[2023-10-11 15:42:06,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 38141952. Throughput: 0: 1650.8, 1: 1675.2. Samples: 9540624. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:42:06,064][84230] Avg episode reward: [(0, '6.770'), (1, '7.260')] +[2023-10-11 15:42:06,738][85175] Updated weights for policy 1, policy_version 18760 (0.0008) +[2023-10-11 15:42:07,112][85175] Updated weights for policy 1, policy_version 18770 (0.0009) +[2023-10-11 15:42:07,477][85175] Updated weights for policy 1, policy_version 18780 (0.0007) +[2023-10-11 15:42:07,919][85176] Updated weights for policy 0, policy_version 18502 (0.0010) +[2023-10-11 15:42:08,298][85176] Updated weights for policy 0, policy_version 18512 (0.0008) +[2023-10-11 15:42:08,667][85176] Updated weights for policy 0, policy_version 18522 (0.0009) +[2023-10-11 15:42:11,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 38207488. Throughput: 0: 1656.6, 1: 1691.9. Samples: 9561128. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:42:11,064][84230] Avg episode reward: [(0, '6.710'), (1, '7.160')] +[2023-10-11 15:42:11,411][85175] Updated weights for policy 1, policy_version 18790 (0.0007) +[2023-10-11 15:42:11,777][85175] Updated weights for policy 1, policy_version 18800 (0.0007) +[2023-10-11 15:42:12,148][85175] Updated weights for policy 1, policy_version 18810 (0.0007) +[2023-10-11 15:42:12,621][85176] Updated weights for policy 0, policy_version 18532 (0.0008) +[2023-10-11 15:42:12,991][85176] Updated weights for policy 0, policy_version 18542 (0.0010) +[2023-10-11 15:42:13,374][85176] Updated weights for policy 0, policy_version 18552 (0.0008) +[2023-10-11 15:42:16,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 38273024. Throughput: 0: 1654.8, 1: 1690.8. Samples: 9581968. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:42:16,063][84230] Avg episode reward: [(0, '6.480'), (1, '6.970')] +[2023-10-11 15:42:16,071][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000018560_19005440.pth... +[2023-10-11 15:42:16,111][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000017024_17432576.pth +[2023-10-11 15:42:16,171][85175] Updated weights for policy 1, policy_version 18820 (0.0009) +[2023-10-11 15:42:16,559][85175] Updated weights for policy 1, policy_version 18830 (0.0007) +[2023-10-11 15:42:16,932][85175] Updated weights for policy 1, policy_version 18840 (0.0010) +[2023-10-11 15:42:17,225][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000018848_19300352.pth... +[2023-10-11 15:42:17,253][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000017248_17661952.pth +[2023-10-11 15:42:17,475][85176] Updated weights for policy 0, policy_version 18562 (0.0007) +[2023-10-11 15:42:17,854][85176] Updated weights for policy 0, policy_version 18572 (0.0007) +[2023-10-11 15:42:18,224][85176] Updated weights for policy 0, policy_version 18582 (0.0009) +[2023-10-11 15:42:18,588][85176] Updated weights for policy 0, policy_version 18592 (0.0009) +[2023-10-11 15:42:20,929][85175] Updated weights for policy 1, policy_version 18850 (0.0008) +[2023-10-11 15:42:21,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 38338560. Throughput: 0: 1649.0, 1: 1691.2. Samples: 9591252. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:42:21,063][84230] Avg episode reward: [(0, '6.610'), (1, '7.150')] +[2023-10-11 15:42:21,308][85175] Updated weights for policy 1, policy_version 18860 (0.0007) +[2023-10-11 15:42:21,677][85175] Updated weights for policy 1, policy_version 18870 (0.0007) +[2023-10-11 15:42:22,038][85175] Updated weights for policy 1, policy_version 18880 (0.0009) +[2023-10-11 15:42:22,526][85176] Updated weights for policy 0, policy_version 18602 (0.0010) +[2023-10-11 15:42:22,895][85176] Updated weights for policy 0, policy_version 18612 (0.0010) +[2023-10-11 15:42:23,277][85176] Updated weights for policy 0, policy_version 18622 (0.0008) +[2023-10-11 15:42:26,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 38404096. Throughput: 0: 1664.2, 1: 1689.1. Samples: 9611724. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:42:26,063][84230] Avg episode reward: [(0, '6.450'), (1, '7.700')] +[2023-10-11 15:42:26,141][85175] Updated weights for policy 1, policy_version 18890 (0.0009) +[2023-10-11 15:42:26,508][85175] Updated weights for policy 1, policy_version 18900 (0.0009) +[2023-10-11 15:42:26,882][85175] Updated weights for policy 1, policy_version 18910 (0.0011) +[2023-10-11 15:42:27,551][85176] Updated weights for policy 0, policy_version 18632 (0.0010) +[2023-10-11 15:42:27,937][85176] Updated weights for policy 0, policy_version 18642 (0.0009) +[2023-10-11 15:42:28,309][85176] Updated weights for policy 0, policy_version 18652 (0.0010) +[2023-10-11 15:42:30,899][85175] Updated weights for policy 1, policy_version 18920 (0.0007) +[2023-10-11 15:42:31,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 38469632. Throughput: 0: 1658.2, 1: 1696.4. Samples: 9632476. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:42:31,063][84230] Avg episode reward: [(0, '6.570'), (1, '7.140')] +[2023-10-11 15:42:31,265][85175] Updated weights for policy 1, policy_version 18930 (0.0007) +[2023-10-11 15:42:31,624][85175] Updated weights for policy 1, policy_version 18940 (0.0011) +[2023-10-11 15:42:32,516][85176] Updated weights for policy 0, policy_version 18662 (0.0009) +[2023-10-11 15:42:32,886][85176] Updated weights for policy 0, policy_version 18672 (0.0009) +[2023-10-11 15:42:33,258][85176] Updated weights for policy 0, policy_version 18682 (0.0008) +[2023-10-11 15:42:35,755][85175] Updated weights for policy 1, policy_version 18950 (0.0009) +[2023-10-11 15:42:36,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 38535168. Throughput: 0: 1649.1, 1: 1696.3. Samples: 9641450. Policy #0 lag: (min: 11.0, avg: 21.3, max: 43.0) +[2023-10-11 15:42:36,063][84230] Avg episode reward: [(0, '6.620'), (1, '6.900')] +[2023-10-11 15:42:36,129][85175] Updated weights for policy 1, policy_version 18960 (0.0010) +[2023-10-11 15:42:36,495][85175] Updated weights for policy 1, policy_version 18970 (0.0009) +[2023-10-11 15:42:37,266][85176] Updated weights for policy 0, policy_version 18692 (0.0008) +[2023-10-11 15:42:37,634][85176] Updated weights for policy 0, policy_version 18702 (0.0008) +[2023-10-11 15:42:38,014][85176] Updated weights for policy 0, policy_version 18712 (0.0009) +[2023-10-11 15:42:40,372][85175] Updated weights for policy 1, policy_version 18980 (0.0008) +[2023-10-11 15:42:40,733][85175] Updated weights for policy 1, policy_version 18990 (0.0007) +[2023-10-11 15:42:41,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 38600704. Throughput: 0: 1663.2, 1: 1700.5. Samples: 9662242. 
Policy #0 lag: (min: 11.0, avg: 21.3, max: 43.0) +[2023-10-11 15:42:41,063][84230] Avg episode reward: [(0, '6.620'), (1, '7.000')] +[2023-10-11 15:42:41,093][85175] Updated weights for policy 1, policy_version 19000 (0.0007) +[2023-10-11 15:42:42,176][85176] Updated weights for policy 0, policy_version 18722 (0.0007) +[2023-10-11 15:42:42,541][85176] Updated weights for policy 0, policy_version 18732 (0.0007) +[2023-10-11 15:42:42,924][85176] Updated weights for policy 0, policy_version 18742 (0.0007) +[2023-10-11 15:42:43,296][85176] Updated weights for policy 0, policy_version 18752 (0.0007) +[2023-10-11 15:42:45,217][85175] Updated weights for policy 1, policy_version 19010 (0.0008) +[2023-10-11 15:42:45,578][85175] Updated weights for policy 1, policy_version 19020 (0.0009) +[2023-10-11 15:42:45,948][85175] Updated weights for policy 1, policy_version 19030 (0.0007) +[2023-10-11 15:42:46,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 38666240. Throughput: 0: 1670.8, 1: 1697.2. Samples: 9682808. Policy #0 lag: (min: 11.0, avg: 21.3, max: 43.0) +[2023-10-11 15:42:46,063][84230] Avg episode reward: [(0, '6.700'), (1, '7.100')] +[2023-10-11 15:42:46,318][85175] Updated weights for policy 1, policy_version 19040 (0.0009) +[2023-10-11 15:42:47,339][85176] Updated weights for policy 0, policy_version 18762 (0.0008) +[2023-10-11 15:42:47,715][85176] Updated weights for policy 0, policy_version 18772 (0.0010) +[2023-10-11 15:42:48,083][85176] Updated weights for policy 0, policy_version 18782 (0.0008) +[2023-10-11 15:42:50,335][85175] Updated weights for policy 1, policy_version 19050 (0.0008) +[2023-10-11 15:42:50,705][85175] Updated weights for policy 1, policy_version 19060 (0.0007) +[2023-10-11 15:42:51,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 38731776. Throughput: 0: 1663.3, 1: 1710.7. Samples: 9692456. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:42:51,063][84230] Avg episode reward: [(0, '6.850'), (1, '7.160')] +[2023-10-11 15:42:51,087][85175] Updated weights for policy 1, policy_version 19070 (0.0008) +[2023-10-11 15:42:52,168][85176] Updated weights for policy 0, policy_version 18792 (0.0008) +[2023-10-11 15:42:52,537][85176] Updated weights for policy 0, policy_version 18802 (0.0009) +[2023-10-11 15:42:52,913][85176] Updated weights for policy 0, policy_version 18812 (0.0008) +[2023-10-11 15:42:54,999][85175] Updated weights for policy 1, policy_version 19080 (0.0007) +[2023-10-11 15:42:55,373][85175] Updated weights for policy 1, policy_version 19090 (0.0007) +[2023-10-11 15:42:55,746][85175] Updated weights for policy 1, policy_version 19100 (0.0010) +[2023-10-11 15:42:56,062][84230] Fps is (10 sec: 16384.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 38830080. Throughput: 0: 1674.9, 1: 1702.9. Samples: 9713126. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:42:56,063][84230] Avg episode reward: [(0, '6.960'), (1, '7.420')] +[2023-10-11 15:42:57,132][85176] Updated weights for policy 0, policy_version 18822 (0.0010) +[2023-10-11 15:42:57,506][85176] Updated weights for policy 0, policy_version 18832 (0.0008) +[2023-10-11 15:42:57,888][85176] Updated weights for policy 0, policy_version 18842 (0.0007) +[2023-10-11 15:42:59,676][85175] Updated weights for policy 1, policy_version 19110 (0.0007) +[2023-10-11 15:43:00,046][85175] Updated weights for policy 1, policy_version 19120 (0.0008) +[2023-10-11 15:43:00,414][85175] Updated weights for policy 1, policy_version 19130 (0.0010) +[2023-10-11 15:43:01,063][84230] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 38895616. Throughput: 0: 1675.9, 1: 1677.4. Samples: 9732864. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:43:01,064][84230] Avg episode reward: [(0, '6.870'), (1, '7.360')] +[2023-10-11 15:43:01,870][85176] Updated weights for policy 0, policy_version 18852 (0.0009) +[2023-10-11 15:43:02,250][85176] Updated weights for policy 0, policy_version 18862 (0.0009) +[2023-10-11 15:43:02,620][85176] Updated weights for policy 0, policy_version 18872 (0.0010) +[2023-10-11 15:43:04,721][85175] Updated weights for policy 1, policy_version 19140 (0.0010) +[2023-10-11 15:43:05,125][85175] Updated weights for policy 1, policy_version 19150 (0.0009) +[2023-10-11 15:43:05,498][85175] Updated weights for policy 1, policy_version 19160 (0.0007) +[2023-10-11 15:43:06,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 38961152. Throughput: 0: 1670.2, 1: 1700.7. Samples: 9742942. Policy #0 lag: (min: 23.0, avg: 25.5, max: 55.0) +[2023-10-11 15:43:06,064][84230] Avg episode reward: [(0, '7.100'), (1, '6.920')] +[2023-10-11 15:43:06,600][85176] Updated weights for policy 0, policy_version 18882 (0.0010) +[2023-10-11 15:43:06,975][85176] Updated weights for policy 0, policy_version 18892 (0.0010) +[2023-10-11 15:43:07,355][85176] Updated weights for policy 0, policy_version 18902 (0.0009) +[2023-10-11 15:43:07,728][85176] Updated weights for policy 0, policy_version 18912 (0.0010) +[2023-10-11 15:43:09,309][85175] Updated weights for policy 1, policy_version 19170 (0.0007) +[2023-10-11 15:43:09,680][85175] Updated weights for policy 1, policy_version 19180 (0.0009) +[2023-10-11 15:43:10,046][85175] Updated weights for policy 1, policy_version 19190 (0.0007) +[2023-10-11 15:43:10,414][85175] Updated weights for policy 1, policy_version 19200 (0.0009) +[2023-10-11 15:43:11,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 39026688. Throughput: 0: 1672.8, 1: 1697.7. Samples: 9763398. 
Policy #0 lag: (min: 23.0, avg: 25.5, max: 55.0) +[2023-10-11 15:43:11,064][84230] Avg episode reward: [(0, '7.180'), (1, '6.980')] +[2023-10-11 15:43:12,002][85176] Updated weights for policy 0, policy_version 18922 (0.0009) +[2023-10-11 15:43:12,370][85176] Updated weights for policy 0, policy_version 18932 (0.0007) +[2023-10-11 15:43:12,751][85176] Updated weights for policy 0, policy_version 18942 (0.0010) +[2023-10-11 15:43:14,447][85175] Updated weights for policy 1, policy_version 19210 (0.0010) +[2023-10-11 15:43:14,811][85175] Updated weights for policy 1, policy_version 19220 (0.0010) +[2023-10-11 15:43:15,184][85175] Updated weights for policy 1, policy_version 19230 (0.0010) +[2023-10-11 15:43:16,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 39092224. Throughput: 0: 1677.3, 1: 1672.4. Samples: 9783212. Policy #0 lag: (min: 23.0, avg: 25.5, max: 55.0) +[2023-10-11 15:43:16,063][84230] Avg episode reward: [(0, '7.190'), (1, '7.410')] +[2023-10-11 15:43:16,554][85176] Updated weights for policy 0, policy_version 18952 (0.0009) +[2023-10-11 15:43:16,930][85176] Updated weights for policy 0, policy_version 18962 (0.0011) +[2023-10-11 15:43:17,294][85176] Updated weights for policy 0, policy_version 18972 (0.0010) +[2023-10-11 15:43:19,279][85175] Updated weights for policy 1, policy_version 19240 (0.0009) +[2023-10-11 15:43:19,652][85175] Updated weights for policy 1, policy_version 19250 (0.0011) +[2023-10-11 15:43:20,025][85175] Updated weights for policy 1, policy_version 19260 (0.0010) +[2023-10-11 15:43:21,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 39157760. Throughput: 0: 1675.1, 1: 1703.4. Samples: 9793484. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:43:21,064][84230] Avg episode reward: [(0, '7.070'), (1, '7.080')] +[2023-10-11 15:43:21,445][85176] Updated weights for policy 0, policy_version 18982 (0.0009) +[2023-10-11 15:43:21,814][85176] Updated weights for policy 0, policy_version 18992 (0.0007) +[2023-10-11 15:43:22,187][85176] Updated weights for policy 0, policy_version 19002 (0.0007) +[2023-10-11 15:43:24,165][85175] Updated weights for policy 1, policy_version 19270 (0.0007) +[2023-10-11 15:43:24,533][85175] Updated weights for policy 1, policy_version 19280 (0.0007) +[2023-10-11 15:43:24,897][85175] Updated weights for policy 1, policy_version 19290 (0.0008) +[2023-10-11 15:43:26,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 39223296. Throughput: 0: 1675.1, 1: 1686.9. Samples: 9813534. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:43:26,063][84230] Avg episode reward: [(0, '7.170'), (1, '6.930')] +[2023-10-11 15:43:26,367][85176] Updated weights for policy 0, policy_version 19012 (0.0008) +[2023-10-11 15:43:26,733][85176] Updated weights for policy 0, policy_version 19022 (0.0008) +[2023-10-11 15:43:27,114][85176] Updated weights for policy 0, policy_version 19032 (0.0009) +[2023-10-11 15:43:29,103][85175] Updated weights for policy 1, policy_version 19300 (0.0007) +[2023-10-11 15:43:29,464][85175] Updated weights for policy 1, policy_version 19310 (0.0008) +[2023-10-11 15:43:29,837][85175] Updated weights for policy 1, policy_version 19320 (0.0008) +[2023-10-11 15:43:31,063][84230] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 39288832. Throughput: 0: 1670.3, 1: 1673.8. Samples: 9833294. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:43:31,063][84230] Avg episode reward: [(0, '7.090'), (1, '6.950')] +[2023-10-11 15:43:31,333][85176] Updated weights for policy 0, policy_version 19042 (0.0009) +[2023-10-11 15:43:31,701][85176] Updated weights for policy 0, policy_version 19052 (0.0007) +[2023-10-11 15:43:32,080][85176] Updated weights for policy 0, policy_version 19062 (0.0009) +[2023-10-11 15:43:32,447][85176] Updated weights for policy 0, policy_version 19072 (0.0010) +[2023-10-11 15:43:33,943][85175] Updated weights for policy 1, policy_version 19330 (0.0007) +[2023-10-11 15:43:34,310][85175] Updated weights for policy 1, policy_version 19340 (0.0009) +[2023-10-11 15:43:34,689][85175] Updated weights for policy 1, policy_version 19350 (0.0009) +[2023-10-11 15:43:35,048][85175] Updated weights for policy 1, policy_version 19360 (0.0007) +[2023-10-11 15:43:36,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 39354368. Throughput: 0: 1663.9, 1: 1694.0. Samples: 9843560. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:43:36,063][84230] Avg episode reward: [(0, '6.990'), (1, '7.270')] +[2023-10-11 15:43:36,803][85176] Updated weights for policy 0, policy_version 19082 (0.0011) +[2023-10-11 15:43:37,177][85176] Updated weights for policy 0, policy_version 19092 (0.0011) +[2023-10-11 15:43:37,555][85176] Updated weights for policy 0, policy_version 19102 (0.0008) +[2023-10-11 15:43:39,018][85175] Updated weights for policy 1, policy_version 19370 (0.0008) +[2023-10-11 15:43:39,383][85175] Updated weights for policy 1, policy_version 19380 (0.0008) +[2023-10-11 15:43:39,754][85175] Updated weights for policy 1, policy_version 19390 (0.0010) +[2023-10-11 15:43:41,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 39419904. Throughput: 0: 1662.3, 1: 1674.5. Samples: 9863284. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:43:41,063][84230] Avg episode reward: [(0, '6.870'), (1, '7.070')] +[2023-10-11 15:43:41,646][85176] Updated weights for policy 0, policy_version 19112 (0.0007) +[2023-10-11 15:43:42,030][85176] Updated weights for policy 0, policy_version 19122 (0.0007) +[2023-10-11 15:43:42,397][85176] Updated weights for policy 0, policy_version 19132 (0.0007) +[2023-10-11 15:43:43,708][85175] Updated weights for policy 1, policy_version 19400 (0.0007) +[2023-10-11 15:43:44,073][85175] Updated weights for policy 1, policy_version 19410 (0.0007) +[2023-10-11 15:43:44,447][85175] Updated weights for policy 1, policy_version 19420 (0.0008) +[2023-10-11 15:43:46,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 39485440. Throughput: 0: 1657.7, 1: 1694.1. Samples: 9883696. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:43:46,063][84230] Avg episode reward: [(0, '6.980'), (1, '7.020')] +[2023-10-11 15:43:46,411][85176] Updated weights for policy 0, policy_version 19142 (0.0008) +[2023-10-11 15:43:46,795][85176] Updated weights for policy 0, policy_version 19152 (0.0008) +[2023-10-11 15:43:47,176][85176] Updated weights for policy 0, policy_version 19162 (0.0009) +[2023-10-11 15:43:48,558][85175] Updated weights for policy 1, policy_version 19430 (0.0008) +[2023-10-11 15:43:48,917][85175] Updated weights for policy 1, policy_version 19440 (0.0009) +[2023-10-11 15:43:49,295][85175] Updated weights for policy 1, policy_version 19450 (0.0009) +[2023-10-11 15:43:51,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 39550976. Throughput: 0: 1659.5, 1: 1691.9. Samples: 9893756. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:43:51,064][84230] Avg episode reward: [(0, '6.890'), (1, '6.920')] +[2023-10-11 15:43:51,175][85176] Updated weights for policy 0, policy_version 19172 (0.0010) +[2023-10-11 15:43:51,555][85176] Updated weights for policy 0, policy_version 19182 (0.0008) +[2023-10-11 15:43:51,941][85176] Updated weights for policy 0, policy_version 19192 (0.0010) +[2023-10-11 15:43:53,272][85175] Updated weights for policy 1, policy_version 19460 (0.0007) +[2023-10-11 15:43:53,628][85175] Updated weights for policy 1, policy_version 19470 (0.0007) +[2023-10-11 15:43:53,999][85175] Updated weights for policy 1, policy_version 19480 (0.0009) +[2023-10-11 15:43:55,937][85176] Updated weights for policy 0, policy_version 19202 (0.0011) +[2023-10-11 15:43:56,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 39616512. Throughput: 0: 1662.8, 1: 1672.1. Samples: 9913468. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:43:56,063][84230] Avg episode reward: [(0, '6.930'), (1, '7.500')] +[2023-10-11 15:43:56,303][85176] Updated weights for policy 0, policy_version 19212 (0.0008) +[2023-10-11 15:43:56,677][85176] Updated weights for policy 0, policy_version 19222 (0.0007) +[2023-10-11 15:43:57,047][85176] Updated weights for policy 0, policy_version 19232 (0.0007) +[2023-10-11 15:43:58,069][85175] Updated weights for policy 1, policy_version 19490 (0.0009) +[2023-10-11 15:43:58,482][85175] Updated weights for policy 1, policy_version 19500 (0.0008) +[2023-10-11 15:43:58,844][85175] Updated weights for policy 1, policy_version 19510 (0.0007) +[2023-10-11 15:43:59,217][85175] Updated weights for policy 1, policy_version 19520 (0.0009) +[2023-10-11 15:44:01,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 39682048. Throughput: 0: 1665.0, 1: 1693.2. Samples: 9934332. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:44:01,063][84230] Avg episode reward: [(0, '7.040'), (1, '7.140')] +[2023-10-11 15:44:01,178][85176] Updated weights for policy 0, policy_version 19242 (0.0009) +[2023-10-11 15:44:01,557][85176] Updated weights for policy 0, policy_version 19252 (0.0007) +[2023-10-11 15:44:01,929][85176] Updated weights for policy 0, policy_version 19262 (0.0007) +[2023-10-11 15:44:03,214][85175] Updated weights for policy 1, policy_version 19530 (0.0008) +[2023-10-11 15:44:03,582][85175] Updated weights for policy 1, policy_version 19540 (0.0008) +[2023-10-11 15:44:03,950][85175] Updated weights for policy 1, policy_version 19550 (0.0009) +[2023-10-11 15:44:05,966][85176] Updated weights for policy 0, policy_version 19272 (0.0007) +[2023-10-11 15:44:06,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 39747584. Throughput: 0: 1664.5, 1: 1674.9. Samples: 9943754. Policy #0 lag: (min: 31.0, avg: 46.2, max: 63.0) +[2023-10-11 15:44:06,063][84230] Avg episode reward: [(0, '7.040'), (1, '7.240')] +[2023-10-11 15:44:06,343][85176] Updated weights for policy 0, policy_version 19282 (0.0007) +[2023-10-11 15:44:06,711][85176] Updated weights for policy 0, policy_version 19292 (0.0009) +[2023-10-11 15:44:07,963][85175] Updated weights for policy 1, policy_version 19560 (0.0009) +[2023-10-11 15:44:08,339][85175] Updated weights for policy 1, policy_version 19570 (0.0009) +[2023-10-11 15:44:08,693][85175] Updated weights for policy 1, policy_version 19580 (0.0007) +[2023-10-11 15:44:10,790][85176] Updated weights for policy 0, policy_version 19302 (0.0009) +[2023-10-11 15:44:11,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 39813120. Throughput: 0: 1667.9, 1: 1676.0. Samples: 9964014. 
Policy #0 lag: (min: 31.0, avg: 46.2, max: 63.0) +[2023-10-11 15:44:11,064][84230] Avg episode reward: [(0, '7.310'), (1, '7.500')] +[2023-10-11 15:44:11,162][85176] Updated weights for policy 0, policy_version 19312 (0.0009) +[2023-10-11 15:44:11,539][85176] Updated weights for policy 0, policy_version 19322 (0.0007) +[2023-10-11 15:44:12,626][85175] Updated weights for policy 1, policy_version 19590 (0.0009) +[2023-10-11 15:44:12,999][85175] Updated weights for policy 1, policy_version 19600 (0.0009) +[2023-10-11 15:44:13,367][85175] Updated weights for policy 1, policy_version 19610 (0.0007) +[2023-10-11 15:44:15,625][85176] Updated weights for policy 0, policy_version 19332 (0.0008) +[2023-10-11 15:44:16,004][85176] Updated weights for policy 0, policy_version 19342 (0.0008) +[2023-10-11 15:44:16,063][84230] Fps is (10 sec: 13106.6, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 39878656. Throughput: 0: 1670.7, 1: 1695.7. Samples: 9984780. Policy #0 lag: (min: 31.0, avg: 46.2, max: 63.0) +[2023-10-11 15:44:16,064][84230] Avg episode reward: [(0, '6.880'), (1, '7.280')] +[2023-10-11 15:44:16,076][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000019616_20086784.pth... +[2023-10-11 15:44:16,111][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000018048_18481152.pth +[2023-10-11 15:44:16,375][85176] Updated weights for policy 0, policy_version 19352 (0.0009) +[2023-10-11 15:44:16,668][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000019360_19824640.pth... 
+[2023-10-11 15:44:16,706][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000017792_18219008.pth +[2023-10-11 15:44:17,278][85175] Updated weights for policy 1, policy_version 19620 (0.0007) +[2023-10-11 15:44:17,648][85175] Updated weights for policy 1, policy_version 19630 (0.0008) +[2023-10-11 15:44:18,011][85175] Updated weights for policy 1, policy_version 19640 (0.0008) +[2023-10-11 15:44:20,439][85176] Updated weights for policy 0, policy_version 19362 (0.0009) +[2023-10-11 15:44:20,815][85176] Updated weights for policy 0, policy_version 19372 (0.0010) +[2023-10-11 15:44:21,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 39944192. Throughput: 0: 1678.2, 1: 1667.5. Samples: 9994118. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:44:21,063][84230] Avg episode reward: [(0, '6.770'), (1, '7.120')] +[2023-10-11 15:44:21,196][85176] Updated weights for policy 0, policy_version 19382 (0.0009) +[2023-10-11 15:44:21,574][85176] Updated weights for policy 0, policy_version 19392 (0.0008) +[2023-10-11 15:44:22,097][85175] Updated weights for policy 1, policy_version 19650 (0.0008) +[2023-10-11 15:44:22,471][85175] Updated weights for policy 1, policy_version 19660 (0.0010) +[2023-10-11 15:44:22,834][85175] Updated weights for policy 1, policy_version 19670 (0.0007) +[2023-10-11 15:44:23,202][85175] Updated weights for policy 1, policy_version 19680 (0.0007) +[2023-10-11 15:44:25,634][85176] Updated weights for policy 0, policy_version 19402 (0.0008) +[2023-10-11 15:44:26,011][85176] Updated weights for policy 0, policy_version 19412 (0.0008) +[2023-10-11 15:44:26,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.1, 300 sec: 13329.4). Total num frames: 40009728. Throughput: 0: 1680.2, 1: 1687.8. Samples: 10014846. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:44:26,064][84230] Avg episode reward: [(0, '6.870'), (1, '6.940')] +[2023-10-11 15:44:26,389][85176] Updated weights for policy 0, policy_version 19422 (0.0007) +[2023-10-11 15:44:27,249][85175] Updated weights for policy 1, policy_version 19690 (0.0007) +[2023-10-11 15:44:27,619][85175] Updated weights for policy 1, policy_version 19700 (0.0009) +[2023-10-11 15:44:27,983][85175] Updated weights for policy 1, policy_version 19710 (0.0009) +[2023-10-11 15:44:30,320][85176] Updated weights for policy 0, policy_version 19432 (0.0009) +[2023-10-11 15:44:30,694][85176] Updated weights for policy 0, policy_version 19442 (0.0008) +[2023-10-11 15:44:31,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 40075264. Throughput: 0: 1668.0, 1: 1696.1. Samples: 10035078. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:44:31,063][84230] Avg episode reward: [(0, '7.220'), (1, '7.200')] +[2023-10-11 15:44:31,075][85176] Updated weights for policy 0, policy_version 19452 (0.0007) +[2023-10-11 15:44:32,059][85175] Updated weights for policy 1, policy_version 19720 (0.0008) +[2023-10-11 15:44:32,429][85175] Updated weights for policy 1, policy_version 19730 (0.0007) +[2023-10-11 15:44:32,793][85175] Updated weights for policy 1, policy_version 19740 (0.0007) +[2023-10-11 15:44:35,190][85176] Updated weights for policy 0, policy_version 19462 (0.0009) +[2023-10-11 15:44:35,560][85176] Updated weights for policy 0, policy_version 19472 (0.0010) +[2023-10-11 15:44:35,929][85176] Updated weights for policy 0, policy_version 19482 (0.0010) +[2023-10-11 15:44:36,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 40140800. Throughput: 0: 1678.6, 1: 1673.7. Samples: 10044612. 
Policy #0 lag: (min: 31.0, avg: 31.5, max: 46.0) +[2023-10-11 15:44:36,063][84230] Avg episode reward: [(0, '7.200'), (1, '7.320')] +[2023-10-11 15:44:36,760][85175] Updated weights for policy 1, policy_version 19750 (0.0009) +[2023-10-11 15:44:37,125][85175] Updated weights for policy 1, policy_version 19760 (0.0007) +[2023-10-11 15:44:37,485][85175] Updated weights for policy 1, policy_version 19770 (0.0009) +[2023-10-11 15:44:39,992][85176] Updated weights for policy 0, policy_version 19492 (0.0010) +[2023-10-11 15:44:40,370][85176] Updated weights for policy 0, policy_version 19502 (0.0010) +[2023-10-11 15:44:40,746][85176] Updated weights for policy 0, policy_version 19512 (0.0010) +[2023-10-11 15:44:41,063][84230] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 40239104. Throughput: 0: 1676.2, 1: 1694.9. Samples: 10065168. Policy #0 lag: (min: 31.0, avg: 31.5, max: 46.0) +[2023-10-11 15:44:41,063][84230] Avg episode reward: [(0, '6.910'), (1, '7.480')] +[2023-10-11 15:44:41,584][85175] Updated weights for policy 1, policy_version 19780 (0.0009) +[2023-10-11 15:44:41,952][85175] Updated weights for policy 1, policy_version 19790 (0.0007) +[2023-10-11 15:44:42,322][85175] Updated weights for policy 1, policy_version 19800 (0.0007) +[2023-10-11 15:44:44,908][85176] Updated weights for policy 0, policy_version 19522 (0.0009) +[2023-10-11 15:44:45,277][85176] Updated weights for policy 0, policy_version 19532 (0.0009) +[2023-10-11 15:44:45,657][85176] Updated weights for policy 0, policy_version 19542 (0.0007) +[2023-10-11 15:44:46,023][85176] Updated weights for policy 0, policy_version 19552 (0.0007) +[2023-10-11 15:44:46,063][84230] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 40304640. Throughput: 0: 1651.9, 1: 1704.3. Samples: 10085358. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:44:46,063][84230] Avg episode reward: [(0, '6.900'), (1, '6.890')] +[2023-10-11 15:44:46,249][85175] Updated weights for policy 1, policy_version 19810 (0.0009) +[2023-10-11 15:44:46,662][85175] Updated weights for policy 1, policy_version 19820 (0.0009) +[2023-10-11 15:44:47,041][85175] Updated weights for policy 1, policy_version 19830 (0.0010) +[2023-10-11 15:44:47,396][85175] Updated weights for policy 1, policy_version 19840 (0.0007) +[2023-10-11 15:44:50,136][85176] Updated weights for policy 0, policy_version 19562 (0.0009) +[2023-10-11 15:44:50,516][85176] Updated weights for policy 0, policy_version 19572 (0.0009) +[2023-10-11 15:44:50,879][85176] Updated weights for policy 0, policy_version 19582 (0.0008) +[2023-10-11 15:44:51,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 40370176. Throughput: 0: 1676.0, 1: 1689.1. Samples: 10095182. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:44:51,063][84230] Avg episode reward: [(0, '6.840'), (1, '6.490')] +[2023-10-11 15:44:51,392][85175] Updated weights for policy 1, policy_version 19850 (0.0009) +[2023-10-11 15:44:51,752][85175] Updated weights for policy 1, policy_version 19860 (0.0007) +[2023-10-11 15:44:52,120][85175] Updated weights for policy 1, policy_version 19870 (0.0008) +[2023-10-11 15:44:55,118][85176] Updated weights for policy 0, policy_version 19592 (0.0007) +[2023-10-11 15:44:55,495][85176] Updated weights for policy 0, policy_version 19602 (0.0010) +[2023-10-11 15:44:55,871][85176] Updated weights for policy 0, policy_version 19612 (0.0010) +[2023-10-11 15:44:56,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 40435712. Throughput: 0: 1667.9, 1: 1705.0. Samples: 10115792. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:44:56,063][84230] Avg episode reward: [(0, '6.640'), (1, '7.460')] +[2023-10-11 15:44:56,086][85175] Updated weights for policy 1, policy_version 19880 (0.0008) +[2023-10-11 15:44:56,458][85175] Updated weights for policy 1, policy_version 19890 (0.0009) +[2023-10-11 15:44:56,830][85175] Updated weights for policy 1, policy_version 19900 (0.0009) +[2023-10-11 15:45:00,059][85176] Updated weights for policy 0, policy_version 19622 (0.0008) +[2023-10-11 15:45:00,433][85176] Updated weights for policy 0, policy_version 19632 (0.0011) +[2023-10-11 15:45:00,814][85176] Updated weights for policy 0, policy_version 19642 (0.0010) +[2023-10-11 15:45:00,840][85175] Updated weights for policy 1, policy_version 19910 (0.0009) +[2023-10-11 15:45:01,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 40501248. Throughput: 0: 1646.9, 1: 1708.3. Samples: 10135766. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:45:01,064][84230] Avg episode reward: [(0, '6.870'), (1, '7.970')] +[2023-10-11 15:45:01,210][85175] Updated weights for policy 1, policy_version 19920 (0.0009) +[2023-10-11 15:45:01,583][85175] Updated weights for policy 1, policy_version 19930 (0.0010) +[2023-10-11 15:45:05,009][85176] Updated weights for policy 0, policy_version 19652 (0.0008) +[2023-10-11 15:45:05,377][85176] Updated weights for policy 0, policy_version 19662 (0.0008) +[2023-10-11 15:45:05,614][85175] Updated weights for policy 1, policy_version 19940 (0.0007) +[2023-10-11 15:45:05,759][85176] Updated weights for policy 0, policy_version 19672 (0.0007) +[2023-10-11 15:45:05,977][85175] Updated weights for policy 1, policy_version 19950 (0.0008) +[2023-10-11 15:45:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 40566784. Throughput: 0: 1659.4, 1: 1708.6. Samples: 10145678. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:45:06,063][84230] Avg episode reward: [(0, '6.980'), (1, '7.250')] +[2023-10-11 15:45:06,340][85175] Updated weights for policy 1, policy_version 19960 (0.0008) +[2023-10-11 15:45:09,997][85176] Updated weights for policy 0, policy_version 19682 (0.0007) +[2023-10-11 15:45:10,369][85176] Updated weights for policy 0, policy_version 19692 (0.0009) +[2023-10-11 15:45:10,507][85175] Updated weights for policy 1, policy_version 19970 (0.0007) +[2023-10-11 15:45:10,738][85176] Updated weights for policy 0, policy_version 19702 (0.0007) +[2023-10-11 15:45:10,862][85175] Updated weights for policy 1, policy_version 19980 (0.0007) +[2023-10-11 15:45:11,063][84230] Fps is (10 sec: 9830.5, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 40599552. Throughput: 0: 1657.5, 1: 1710.0. Samples: 10166382. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:45:11,065][84230] Avg episode reward: [(0, '6.970'), (1, '7.150')] +[2023-10-11 15:45:11,115][85176] Updated weights for policy 0, policy_version 19712 (0.0007) +[2023-10-11 15:45:11,234][85175] Updated weights for policy 1, policy_version 19990 (0.0008) +[2023-10-11 15:45:11,603][85175] Updated weights for policy 1, policy_version 20000 (0.0009) +[2023-10-11 15:45:15,052][85176] Updated weights for policy 0, policy_version 19722 (0.0007) +[2023-10-11 15:45:15,435][85176] Updated weights for policy 0, policy_version 19732 (0.0007) +[2023-10-11 15:45:15,798][85175] Updated weights for policy 1, policy_version 20010 (0.0008) +[2023-10-11 15:45:15,802][85176] Updated weights for policy 0, policy_version 19742 (0.0007) +[2023-10-11 15:45:16,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13653.4, 300 sec: 13329.3). Total num frames: 40697856. Throughput: 0: 1652.9, 1: 1699.6. Samples: 10185942. 
Policy #0 lag: (min: 31.0, avg: 34.7, max: 63.0) +[2023-10-11 15:45:16,064][84230] Avg episode reward: [(0, '7.230'), (1, '7.320')] +[2023-10-11 15:45:16,161][85175] Updated weights for policy 1, policy_version 20020 (0.0008) +[2023-10-11 15:45:16,532][85175] Updated weights for policy 1, policy_version 20030 (0.0009) +[2023-10-11 15:45:19,844][85176] Updated weights for policy 0, policy_version 19752 (0.0008) +[2023-10-11 15:45:20,225][85176] Updated weights for policy 0, policy_version 19762 (0.0008) +[2023-10-11 15:45:20,370][85175] Updated weights for policy 1, policy_version 20040 (0.0008) +[2023-10-11 15:45:20,597][85176] Updated weights for policy 0, policy_version 19772 (0.0008) +[2023-10-11 15:45:20,740][85175] Updated weights for policy 1, policy_version 20050 (0.0009) +[2023-10-11 15:45:21,062][84230] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 40763392. Throughput: 0: 1664.4, 1: 1703.9. Samples: 10196184. Policy #0 lag: (min: 31.0, avg: 34.7, max: 63.0) +[2023-10-11 15:45:21,063][84230] Avg episode reward: [(0, '7.230'), (1, '7.510')] +[2023-10-11 15:45:21,113][85175] Updated weights for policy 1, policy_version 20060 (0.0007) +[2023-10-11 15:45:24,678][85176] Updated weights for policy 0, policy_version 19782 (0.0009) +[2023-10-11 15:45:25,041][85176] Updated weights for policy 0, policy_version 19792 (0.0009) +[2023-10-11 15:45:25,087][85175] Updated weights for policy 1, policy_version 20070 (0.0008) +[2023-10-11 15:45:25,419][85176] Updated weights for policy 0, policy_version 19802 (0.0009) +[2023-10-11 15:45:25,456][85175] Updated weights for policy 1, policy_version 20080 (0.0008) +[2023-10-11 15:45:25,829][85175] Updated weights for policy 1, policy_version 20090 (0.0009) +[2023-10-11 15:45:26,062][84230] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 40861696. Throughput: 0: 1660.8, 1: 1707.5. Samples: 10216738. 
Policy #0 lag: (min: 31.0, avg: 34.7, max: 63.0) +[2023-10-11 15:45:26,063][84230] Avg episode reward: [(0, '7.120'), (1, '7.120')] +[2023-10-11 15:45:29,520][85176] Updated weights for policy 0, policy_version 19812 (0.0008) +[2023-10-11 15:45:29,801][85175] Updated weights for policy 1, policy_version 20100 (0.0009) +[2023-10-11 15:45:29,901][85176] Updated weights for policy 0, policy_version 19822 (0.0008) +[2023-10-11 15:45:30,169][85175] Updated weights for policy 1, policy_version 20110 (0.0008) +[2023-10-11 15:45:30,279][85176] Updated weights for policy 0, policy_version 19832 (0.0010) +[2023-10-11 15:45:30,537][85175] Updated weights for policy 1, policy_version 20120 (0.0008) +[2023-10-11 15:45:31,063][84230] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 40927232. Throughput: 0: 1652.8, 1: 1676.0. Samples: 10235154. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:45:31,063][84230] Avg episode reward: [(0, '7.230'), (1, '7.270')] +[2023-10-11 15:45:34,485][85176] Updated weights for policy 0, policy_version 19842 (0.0008) +[2023-10-11 15:45:34,649][85175] Updated weights for policy 1, policy_version 20130 (0.0008) +[2023-10-11 15:45:34,870][85176] Updated weights for policy 0, policy_version 19852 (0.0008) +[2023-10-11 15:45:35,064][85175] Updated weights for policy 1, policy_version 20140 (0.0009) +[2023-10-11 15:45:35,241][85176] Updated weights for policy 0, policy_version 19862 (0.0008) +[2023-10-11 15:45:35,437][85175] Updated weights for policy 1, policy_version 20150 (0.0007) +[2023-10-11 15:45:35,616][85176] Updated weights for policy 0, policy_version 19872 (0.0008) +[2023-10-11 15:45:35,806][85175] Updated weights for policy 1, policy_version 20160 (0.0007) +[2023-10-11 15:45:36,063][84230] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 40992768. Throughput: 0: 1658.7, 1: 1702.1. Samples: 10246420. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:45:36,064][84230] Avg episode reward: [(0, '7.180'), (1, '7.350')] +[2023-10-11 15:45:39,606][85176] Updated weights for policy 0, policy_version 19882 (0.0010) +[2023-10-11 15:45:39,756][85175] Updated weights for policy 1, policy_version 20170 (0.0009) +[2023-10-11 15:45:39,972][85176] Updated weights for policy 0, policy_version 19892 (0.0009) +[2023-10-11 15:45:40,121][85175] Updated weights for policy 1, policy_version 20180 (0.0009) +[2023-10-11 15:45:40,350][85176] Updated weights for policy 0, policy_version 19902 (0.0009) +[2023-10-11 15:45:40,497][85175] Updated weights for policy 1, policy_version 20190 (0.0007) +[2023-10-11 15:45:41,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 41058304. Throughput: 0: 1648.8, 1: 1696.3. Samples: 10266324. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:45:41,064][84230] Avg episode reward: [(0, '7.030'), (1, '7.020')] +[2023-10-11 15:45:44,451][85176] Updated weights for policy 0, policy_version 19912 (0.0009) +[2023-10-11 15:45:44,675][85175] Updated weights for policy 1, policy_version 20200 (0.0007) +[2023-10-11 15:45:44,820][85176] Updated weights for policy 0, policy_version 19922 (0.0007) +[2023-10-11 15:45:45,048][85175] Updated weights for policy 1, policy_version 20210 (0.0008) +[2023-10-11 15:45:45,199][85176] Updated weights for policy 0, policy_version 19932 (0.0007) +[2023-10-11 15:45:45,417][85175] Updated weights for policy 1, policy_version 20220 (0.0009) +[2023-10-11 15:45:46,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 41123840. Throughput: 0: 1650.9, 1: 1667.4. Samples: 10285086. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:45:46,063][84230] Avg episode reward: [(0, '6.940'), (1, '6.960')] +[2023-10-11 15:45:49,481][85176] Updated weights for policy 0, policy_version 19942 (0.0009) +[2023-10-11 15:45:49,516][85175] Updated weights for policy 1, policy_version 20230 (0.0007) +[2023-10-11 15:45:49,852][85176] Updated weights for policy 0, policy_version 19952 (0.0009) +[2023-10-11 15:45:49,876][85175] Updated weights for policy 1, policy_version 20240 (0.0009) +[2023-10-11 15:45:50,220][85176] Updated weights for policy 0, policy_version 19962 (0.0011) +[2023-10-11 15:45:50,246][85175] Updated weights for policy 1, policy_version 20250 (0.0008) +[2023-10-11 15:45:51,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 41189376. Throughput: 0: 1663.0, 1: 1691.1. Samples: 10296612. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:45:51,063][84230] Avg episode reward: [(0, '7.040'), (1, '6.840')] +[2023-10-11 15:45:54,344][85175] Updated weights for policy 1, policy_version 20260 (0.0007) +[2023-10-11 15:45:54,438][85176] Updated weights for policy 0, policy_version 19972 (0.0007) +[2023-10-11 15:45:54,712][85175] Updated weights for policy 1, policy_version 20270 (0.0009) +[2023-10-11 15:45:54,806][85176] Updated weights for policy 0, policy_version 19982 (0.0007) +[2023-10-11 15:45:55,077][85175] Updated weights for policy 1, policy_version 20280 (0.0007) +[2023-10-11 15:45:55,176][85176] Updated weights for policy 0, policy_version 19992 (0.0007) +[2023-10-11 15:45:56,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 41254912. Throughput: 0: 1652.4, 1: 1677.9. Samples: 10316244. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:45:56,063][84230] Avg episode reward: [(0, '7.170'), (1, '6.710')] +[2023-10-11 15:45:59,085][85175] Updated weights for policy 1, policy_version 20290 (0.0008) +[2023-10-11 15:45:59,365][85176] Updated weights for policy 0, policy_version 20002 (0.0008) +[2023-10-11 15:45:59,456][85175] Updated weights for policy 1, policy_version 20300 (0.0008) +[2023-10-11 15:45:59,739][85176] Updated weights for policy 0, policy_version 20012 (0.0008) +[2023-10-11 15:45:59,820][85175] Updated weights for policy 1, policy_version 20310 (0.0008) +[2023-10-11 15:46:00,113][85176] Updated weights for policy 0, policy_version 20022 (0.0007) +[2023-10-11 15:46:00,182][85175] Updated weights for policy 1, policy_version 20320 (0.0009) +[2023-10-11 15:46:00,486][85176] Updated weights for policy 0, policy_version 20032 (0.0007) +[2023-10-11 15:46:01,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 41320448. Throughput: 0: 1646.8, 1: 1664.9. Samples: 10334970. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:46:01,064][84230] Avg episode reward: [(0, '6.960'), (1, '7.160')] +[2023-10-11 15:46:04,108][85175] Updated weights for policy 1, policy_version 20330 (0.0010) +[2023-10-11 15:46:04,337][85176] Updated weights for policy 0, policy_version 20042 (0.0009) +[2023-10-11 15:46:04,480][85175] Updated weights for policy 1, policy_version 20340 (0.0007) +[2023-10-11 15:46:04,706][85176] Updated weights for policy 0, policy_version 20052 (0.0008) +[2023-10-11 15:46:04,847][85175] Updated weights for policy 1, policy_version 20350 (0.0007) +[2023-10-11 15:46:05,076][85176] Updated weights for policy 0, policy_version 20062 (0.0009) +[2023-10-11 15:46:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 41385984. Throughput: 0: 1655.5, 1: 1692.7. Samples: 10346850. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:46:06,063][84230] Avg episode reward: [(0, '6.760'), (1, '7.200')] +[2023-10-11 15:46:08,892][85175] Updated weights for policy 1, policy_version 20360 (0.0007) +[2023-10-11 15:46:09,069][85176] Updated weights for policy 0, policy_version 20072 (0.0007) +[2023-10-11 15:46:09,249][85175] Updated weights for policy 1, policy_version 20370 (0.0008) +[2023-10-11 15:46:09,441][85176] Updated weights for policy 0, policy_version 20082 (0.0008) +[2023-10-11 15:46:09,620][85175] Updated weights for policy 1, policy_version 20380 (0.0009) +[2023-10-11 15:46:09,810][85176] Updated weights for policy 0, policy_version 20092 (0.0009) +[2023-10-11 15:46:11,062][84230] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 41451520. Throughput: 0: 1644.9, 1: 1670.0. Samples: 10365908. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:46:11,063][84230] Avg episode reward: [(0, '6.760'), (1, '7.480')] +[2023-10-11 15:46:13,789][85175] Updated weights for policy 1, policy_version 20390 (0.0009) +[2023-10-11 15:46:13,903][85176] Updated weights for policy 0, policy_version 20102 (0.0008) +[2023-10-11 15:46:14,163][85175] Updated weights for policy 1, policy_version 20400 (0.0008) +[2023-10-11 15:46:14,279][85176] Updated weights for policy 0, policy_version 20112 (0.0009) +[2023-10-11 15:46:14,530][85175] Updated weights for policy 1, policy_version 20410 (0.0007) +[2023-10-11 15:46:14,652][85176] Updated weights for policy 0, policy_version 20122 (0.0009) +[2023-10-11 15:46:16,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 41517056. Throughput: 0: 1665.1, 1: 1680.8. Samples: 10385722. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:46:16,064][84230] Avg episode reward: [(0, '6.750'), (1, '7.240')] +[2023-10-11 15:46:16,073][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000020416_20905984.pth... +[2023-10-11 15:46:16,074][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000020128_20611072.pth... +[2023-10-11 15:46:16,111][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000018560_19005440.pth +[2023-10-11 15:46:16,113][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000018848_19300352.pth +[2023-10-11 15:46:18,621][85175] Updated weights for policy 1, policy_version 20420 (0.0009) +[2023-10-11 15:46:18,890][85176] Updated weights for policy 0, policy_version 20132 (0.0008) +[2023-10-11 15:46:18,979][85175] Updated weights for policy 1, policy_version 20430 (0.0008) +[2023-10-11 15:46:19,257][85176] Updated weights for policy 0, policy_version 20142 (0.0008) +[2023-10-11 15:46:19,339][85175] Updated weights for policy 1, policy_version 20440 (0.0009) +[2023-10-11 15:46:19,628][85176] Updated weights for policy 0, policy_version 20152 (0.0008) +[2023-10-11 15:46:21,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 41582592. Throughput: 0: 1666.1, 1: 1682.0. Samples: 10397084. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:46:21,063][84230] Avg episode reward: [(0, '7.230'), (1, '6.660')] +[2023-10-11 15:46:23,695][85176] Updated weights for policy 0, policy_version 20162 (0.0010) +[2023-10-11 15:46:23,732][85175] Updated weights for policy 1, policy_version 20450 (0.0008) +[2023-10-11 15:46:24,091][85176] Updated weights for policy 0, policy_version 20172 (0.0008) +[2023-10-11 15:46:24,146][85175] Updated weights for policy 1, policy_version 20460 (0.0009) +[2023-10-11 15:46:24,464][85176] Updated weights for policy 0, policy_version 20182 (0.0008) +[2023-10-11 15:46:24,501][85175] Updated weights for policy 1, policy_version 20470 (0.0009) +[2023-10-11 15:46:24,834][85176] Updated weights for policy 0, policy_version 20192 (0.0007) +[2023-10-11 15:46:24,872][85175] Updated weights for policy 1, policy_version 20480 (0.0007) +[2023-10-11 15:46:26,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 41648128. Throughput: 0: 1658.4, 1: 1666.3. Samples: 10415934. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:46:26,063][84230] Avg episode reward: [(0, '7.340'), (1, '6.910')] +[2023-10-11 15:46:28,787][85175] Updated weights for policy 1, policy_version 20490 (0.0008) +[2023-10-11 15:46:28,808][85176] Updated weights for policy 0, policy_version 20202 (0.0009) +[2023-10-11 15:46:29,156][85175] Updated weights for policy 1, policy_version 20500 (0.0007) +[2023-10-11 15:46:29,183][85176] Updated weights for policy 0, policy_version 20212 (0.0007) +[2023-10-11 15:46:29,522][85175] Updated weights for policy 1, policy_version 20510 (0.0008) +[2023-10-11 15:46:29,553][85176] Updated weights for policy 0, policy_version 20222 (0.0007) +[2023-10-11 15:46:31,063][84230] Fps is (10 sec: 13106.6, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 41713664. Throughput: 0: 1668.1, 1: 1679.7. Samples: 10435738. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:46:31,064][84230] Avg episode reward: [(0, '7.200'), (1, '7.310')] +[2023-10-11 15:46:33,536][85175] Updated weights for policy 1, policy_version 20520 (0.0009) +[2023-10-11 15:46:33,766][85176] Updated weights for policy 0, policy_version 20232 (0.0007) +[2023-10-11 15:46:33,904][85175] Updated weights for policy 1, policy_version 20530 (0.0007) +[2023-10-11 15:46:34,136][85176] Updated weights for policy 0, policy_version 20242 (0.0008) +[2023-10-11 15:46:34,272][85175] Updated weights for policy 1, policy_version 20540 (0.0008) +[2023-10-11 15:46:34,515][85176] Updated weights for policy 0, policy_version 20252 (0.0009) +[2023-10-11 15:46:36,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 41779200. Throughput: 0: 1665.0, 1: 1675.6. Samples: 10446942. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:46:36,063][84230] Avg episode reward: [(0, '7.470'), (1, '7.770')] +[2023-10-11 15:46:38,273][85175] Updated weights for policy 1, policy_version 20550 (0.0008) +[2023-10-11 15:46:38,576][85176] Updated weights for policy 0, policy_version 20262 (0.0009) +[2023-10-11 15:46:38,639][85175] Updated weights for policy 1, policy_version 20560 (0.0008) +[2023-10-11 15:46:38,952][85176] Updated weights for policy 0, policy_version 20272 (0.0009) +[2023-10-11 15:46:39,013][85175] Updated weights for policy 1, policy_version 20570 (0.0008) +[2023-10-11 15:46:39,330][85176] Updated weights for policy 0, policy_version 20282 (0.0007) +[2023-10-11 15:46:41,063][84230] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 41844736. Throughput: 0: 1653.0, 1: 1667.9. Samples: 10465684. 
Policy #0 lag: (min: 7.0, avg: 15.0, max: 39.0) +[2023-10-11 15:46:41,063][84230] Avg episode reward: [(0, '7.250'), (1, '7.820')] +[2023-10-11 15:46:43,075][85175] Updated weights for policy 1, policy_version 20580 (0.0009) +[2023-10-11 15:46:43,442][85175] Updated weights for policy 1, policy_version 20590 (0.0007) +[2023-10-11 15:46:43,456][85176] Updated weights for policy 0, policy_version 20292 (0.0007) +[2023-10-11 15:46:43,810][85175] Updated weights for policy 1, policy_version 20600 (0.0009) +[2023-10-11 15:46:43,829][85176] Updated weights for policy 0, policy_version 20302 (0.0009) +[2023-10-11 15:46:44,197][85176] Updated weights for policy 0, policy_version 20312 (0.0009) +[2023-10-11 15:46:46,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 41910272. Throughput: 0: 1677.3, 1: 1692.0. Samples: 10486588. Policy #0 lag: (min: 7.0, avg: 15.0, max: 39.0) +[2023-10-11 15:46:46,064][84230] Avg episode reward: [(0, '7.130'), (1, '7.850')] +[2023-10-11 15:46:47,729][85175] Updated weights for policy 1, policy_version 20610 (0.0007) +[2023-10-11 15:46:48,092][85175] Updated weights for policy 1, policy_version 20620 (0.0009) +[2023-10-11 15:46:48,294][85176] Updated weights for policy 0, policy_version 20322 (0.0007) +[2023-10-11 15:46:48,472][85175] Updated weights for policy 1, policy_version 20630 (0.0008) +[2023-10-11 15:46:48,660][85176] Updated weights for policy 0, policy_version 20332 (0.0008) +[2023-10-11 15:46:48,831][85175] Updated weights for policy 1, policy_version 20640 (0.0008) +[2023-10-11 15:46:49,034][85176] Updated weights for policy 0, policy_version 20342 (0.0008) +[2023-10-11 15:46:49,402][85176] Updated weights for policy 0, policy_version 20352 (0.0007) +[2023-10-11 15:46:51,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 41975808. Throughput: 0: 1664.9, 1: 1672.5. Samples: 10497034. 
Policy #0 lag: (min: 7.0, avg: 15.0, max: 39.0) +[2023-10-11 15:46:51,064][84230] Avg episode reward: [(0, '6.750'), (1, '7.030')] +[2023-10-11 15:46:52,838][85175] Updated weights for policy 1, policy_version 20650 (0.0008) +[2023-10-11 15:46:53,209][85175] Updated weights for policy 1, policy_version 20660 (0.0009) +[2023-10-11 15:46:53,546][85176] Updated weights for policy 0, policy_version 20362 (0.0008) +[2023-10-11 15:46:53,583][85175] Updated weights for policy 1, policy_version 20670 (0.0009) +[2023-10-11 15:46:53,921][85176] Updated weights for policy 0, policy_version 20372 (0.0007) +[2023-10-11 15:46:54,288][85176] Updated weights for policy 0, policy_version 20382 (0.0008) +[2023-10-11 15:46:56,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 42041344. Throughput: 0: 1661.3, 1: 1685.6. Samples: 10516518. Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) +[2023-10-11 15:46:56,063][84230] Avg episode reward: [(0, '6.750'), (1, '6.680')] +[2023-10-11 15:46:57,545][85175] Updated weights for policy 1, policy_version 20680 (0.0007) +[2023-10-11 15:46:57,912][85175] Updated weights for policy 1, policy_version 20690 (0.0007) +[2023-10-11 15:46:58,282][85175] Updated weights for policy 1, policy_version 20700 (0.0008) +[2023-10-11 15:46:58,513][85176] Updated weights for policy 0, policy_version 20392 (0.0008) +[2023-10-11 15:46:58,888][85176] Updated weights for policy 0, policy_version 20402 (0.0008) +[2023-10-11 15:46:59,253][85176] Updated weights for policy 0, policy_version 20412 (0.0008) +[2023-10-11 15:47:01,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 42106880. Throughput: 0: 1667.9, 1: 1698.8. Samples: 10537224. 
Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) +[2023-10-11 15:47:01,064][84230] Avg episode reward: [(0, '6.680'), (1, '6.560')] +[2023-10-11 15:47:02,418][85175] Updated weights for policy 1, policy_version 20710 (0.0009) +[2023-10-11 15:47:02,783][85175] Updated weights for policy 1, policy_version 20720 (0.0010) +[2023-10-11 15:47:03,153][85175] Updated weights for policy 1, policy_version 20730 (0.0008) +[2023-10-11 15:47:03,250][85176] Updated weights for policy 0, policy_version 20422 (0.0008) +[2023-10-11 15:47:03,627][85176] Updated weights for policy 0, policy_version 20432 (0.0007) +[2023-10-11 15:47:03,993][85176] Updated weights for policy 0, policy_version 20442 (0.0007) +[2023-10-11 15:47:06,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 42172416. Throughput: 0: 1657.4, 1: 1673.8. Samples: 10546988. Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) +[2023-10-11 15:47:06,063][84230] Avg episode reward: [(0, '6.880'), (1, '6.880')] +[2023-10-11 15:47:07,051][85175] Updated weights for policy 1, policy_version 20740 (0.0010) +[2023-10-11 15:47:07,423][85175] Updated weights for policy 1, policy_version 20750 (0.0009) +[2023-10-11 15:47:07,783][85175] Updated weights for policy 1, policy_version 20760 (0.0011) +[2023-10-11 15:47:08,075][85176] Updated weights for policy 0, policy_version 20452 (0.0009) +[2023-10-11 15:47:08,450][85176] Updated weights for policy 0, policy_version 20462 (0.0009) +[2023-10-11 15:47:08,825][85176] Updated weights for policy 0, policy_version 20472 (0.0010) +[2023-10-11 15:47:11,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 42237952. Throughput: 0: 1660.1, 1: 1697.1. Samples: 10567006. 
Policy #0 lag: (min: 31.0, avg: 38.2, max: 63.0) +[2023-10-11 15:47:11,064][84230] Avg episode reward: [(0, '7.070'), (1, '7.700')] +[2023-10-11 15:47:11,980][85175] Updated weights for policy 1, policy_version 20770 (0.0009) +[2023-10-11 15:47:12,377][85175] Updated weights for policy 1, policy_version 20780 (0.0009) +[2023-10-11 15:47:12,740][85175] Updated weights for policy 1, policy_version 20790 (0.0009) +[2023-10-11 15:47:12,951][85176] Updated weights for policy 0, policy_version 20482 (0.0010) +[2023-10-11 15:47:13,106][85175] Updated weights for policy 1, policy_version 20800 (0.0009) +[2023-10-11 15:47:13,331][85176] Updated weights for policy 0, policy_version 20492 (0.0008) +[2023-10-11 15:47:13,703][85176] Updated weights for policy 0, policy_version 20502 (0.0009) +[2023-10-11 15:47:14,072][85176] Updated weights for policy 0, policy_version 20512 (0.0008) +[2023-10-11 15:47:16,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 42303488. Throughput: 0: 1673.2, 1: 1707.3. Samples: 10587860. Policy #0 lag: (min: 31.0, avg: 38.2, max: 63.0) +[2023-10-11 15:47:16,064][84230] Avg episode reward: [(0, '6.880'), (1, '8.230')] +[2023-10-11 15:47:16,076][85000] Saving new best policy, reward=8.230! +[2023-10-11 15:47:16,995][85175] Updated weights for policy 1, policy_version 20810 (0.0008) +[2023-10-11 15:47:17,365][85175] Updated weights for policy 1, policy_version 20820 (0.0010) +[2023-10-11 15:47:17,746][85175] Updated weights for policy 1, policy_version 20830 (0.0007) +[2023-10-11 15:47:18,211][85176] Updated weights for policy 0, policy_version 20522 (0.0007) +[2023-10-11 15:47:18,587][85176] Updated weights for policy 0, policy_version 20532 (0.0008) +[2023-10-11 15:47:18,958][85176] Updated weights for policy 0, policy_version 20542 (0.0008) +[2023-10-11 15:47:21,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 42369024. Throughput: 0: 1660.0, 1: 1684.7. 
Samples: 10597458. Policy #0 lag: (min: 31.0, avg: 38.2, max: 63.0) +[2023-10-11 15:47:21,064][84230] Avg episode reward: [(0, '6.520'), (1, '8.300')] +[2023-10-11 15:47:21,066][85000] Saving new best policy, reward=8.300! +[2023-10-11 15:47:21,819][85175] Updated weights for policy 1, policy_version 20840 (0.0008) +[2023-10-11 15:47:22,186][85175] Updated weights for policy 1, policy_version 20850 (0.0009) +[2023-10-11 15:47:22,548][85175] Updated weights for policy 1, policy_version 20860 (0.0008) +[2023-10-11 15:47:23,071][85176] Updated weights for policy 0, policy_version 20552 (0.0009) +[2023-10-11 15:47:23,444][85176] Updated weights for policy 0, policy_version 20562 (0.0009) +[2023-10-11 15:47:23,836][85176] Updated weights for policy 0, policy_version 20572 (0.0009) +[2023-10-11 15:47:26,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 42434560. Throughput: 0: 1668.8, 1: 1706.8. Samples: 10617586. Policy #0 lag: (min: 31.0, avg: 38.2, max: 63.0) +[2023-10-11 15:47:26,064][84230] Avg episode reward: [(0, '6.660'), (1, '7.650')] +[2023-10-11 15:47:26,632][85175] Updated weights for policy 1, policy_version 20870 (0.0007) +[2023-10-11 15:47:26,999][85175] Updated weights for policy 1, policy_version 20880 (0.0007) +[2023-10-11 15:47:27,368][85175] Updated weights for policy 1, policy_version 20890 (0.0009) +[2023-10-11 15:47:27,881][85176] Updated weights for policy 0, policy_version 20582 (0.0007) +[2023-10-11 15:47:28,249][85176] Updated weights for policy 0, policy_version 20592 (0.0007) +[2023-10-11 15:47:28,631][85176] Updated weights for policy 0, policy_version 20602 (0.0010) +[2023-10-11 15:47:31,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 42500096. Throughput: 0: 1669.9, 1: 1703.2. Samples: 10638376. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:47:31,063][84230] Avg episode reward: [(0, '7.000'), (1, '7.340')] +[2023-10-11 15:47:31,363][85175] Updated weights for policy 1, policy_version 20900 (0.0008) +[2023-10-11 15:47:31,730][85175] Updated weights for policy 1, policy_version 20910 (0.0009) +[2023-10-11 15:47:32,096][85175] Updated weights for policy 1, policy_version 20920 (0.0007) +[2023-10-11 15:47:32,636][85176] Updated weights for policy 0, policy_version 20612 (0.0010) +[2023-10-11 15:47:33,011][85176] Updated weights for policy 0, policy_version 20622 (0.0009) +[2023-10-11 15:47:33,387][85176] Updated weights for policy 0, policy_version 20632 (0.0007) +[2023-10-11 15:47:36,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 42565632. Throughput: 0: 1653.3, 1: 1696.3. Samples: 10647762. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:47:36,063][84230] Avg episode reward: [(0, '7.220'), (1, '6.640')] +[2023-10-11 15:47:36,107][85175] Updated weights for policy 1, policy_version 20930 (0.0008) +[2023-10-11 15:47:36,480][85175] Updated weights for policy 1, policy_version 20940 (0.0008) +[2023-10-11 15:47:36,843][85175] Updated weights for policy 1, policy_version 20950 (0.0010) +[2023-10-11 15:47:37,211][85175] Updated weights for policy 1, policy_version 20960 (0.0010) +[2023-10-11 15:47:37,370][85176] Updated weights for policy 0, policy_version 20642 (0.0009) +[2023-10-11 15:47:37,746][85176] Updated weights for policy 0, policy_version 20652 (0.0007) +[2023-10-11 15:47:38,117][85176] Updated weights for policy 0, policy_version 20662 (0.0009) +[2023-10-11 15:47:38,481][85176] Updated weights for policy 0, policy_version 20672 (0.0009) +[2023-10-11 15:47:41,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 42631168. Throughput: 0: 1671.0, 1: 1703.7. Samples: 10668380. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:47:41,064][84230] Avg episode reward: [(0, '7.370'), (1, '7.330')] +[2023-10-11 15:47:41,312][85175] Updated weights for policy 1, policy_version 20970 (0.0008) +[2023-10-11 15:47:41,685][85175] Updated weights for policy 1, policy_version 20980 (0.0008) +[2023-10-11 15:47:42,051][85175] Updated weights for policy 1, policy_version 20990 (0.0009) +[2023-10-11 15:47:42,664][85176] Updated weights for policy 0, policy_version 20682 (0.0010) +[2023-10-11 15:47:43,032][85176] Updated weights for policy 0, policy_version 20692 (0.0011) +[2023-10-11 15:47:43,403][85176] Updated weights for policy 0, policy_version 20702 (0.0007) +[2023-10-11 15:47:45,908][85175] Updated weights for policy 1, policy_version 21000 (0.0008) +[2023-10-11 15:47:46,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 42696704. Throughput: 0: 1669.2, 1: 1703.5. Samples: 10688994. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:47:46,064][84230] Avg episode reward: [(0, '6.900'), (1, '8.330')] +[2023-10-11 15:47:46,281][85175] Updated weights for policy 1, policy_version 21010 (0.0007) +[2023-10-11 15:47:46,642][85175] Updated weights for policy 1, policy_version 21020 (0.0007) +[2023-10-11 15:47:46,789][85000] Saving new best policy, reward=8.330! +[2023-10-11 15:47:47,522][85176] Updated weights for policy 0, policy_version 20712 (0.0008) +[2023-10-11 15:47:47,888][85176] Updated weights for policy 0, policy_version 20722 (0.0008) +[2023-10-11 15:47:48,260][85176] Updated weights for policy 0, policy_version 20732 (0.0008) +[2023-10-11 15:47:50,606][85175] Updated weights for policy 1, policy_version 21030 (0.0007) +[2023-10-11 15:47:50,980][85175] Updated weights for policy 1, policy_version 21040 (0.0008) +[2023-10-11 15:47:51,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 42762240. Throughput: 0: 1651.2, 1: 1706.5. 
Samples: 10698084. Policy #0 lag: (min: 0.0, avg: 24.2, max: 32.0) +[2023-10-11 15:47:51,063][84230] Avg episode reward: [(0, '6.780'), (1, '8.050')] +[2023-10-11 15:47:51,347][85175] Updated weights for policy 1, policy_version 21050 (0.0009) +[2023-10-11 15:47:52,413][85176] Updated weights for policy 0, policy_version 20742 (0.0009) +[2023-10-11 15:47:52,793][85176] Updated weights for policy 0, policy_version 20752 (0.0009) +[2023-10-11 15:47:53,160][85176] Updated weights for policy 0, policy_version 20762 (0.0009) +[2023-10-11 15:47:55,657][85175] Updated weights for policy 1, policy_version 21060 (0.0008) +[2023-10-11 15:47:56,035][85175] Updated weights for policy 1, policy_version 21070 (0.0008) +[2023-10-11 15:47:56,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 42827776. Throughput: 0: 1669.6, 1: 1697.6. Samples: 10718526. Policy #0 lag: (min: 0.0, avg: 24.2, max: 32.0) +[2023-10-11 15:47:56,063][84230] Avg episode reward: [(0, '7.220'), (1, '7.670')] +[2023-10-11 15:47:56,409][85175] Updated weights for policy 1, policy_version 21080 (0.0007) +[2023-10-11 15:47:57,238][85176] Updated weights for policy 0, policy_version 20772 (0.0008) +[2023-10-11 15:47:57,614][85176] Updated weights for policy 0, policy_version 20782 (0.0008) +[2023-10-11 15:47:57,983][85176] Updated weights for policy 0, policy_version 20792 (0.0009) +[2023-10-11 15:48:00,569][85175] Updated weights for policy 1, policy_version 21090 (0.0010) +[2023-10-11 15:48:00,999][85175] Updated weights for policy 1, policy_version 21100 (0.0010) +[2023-10-11 15:48:01,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 42893312. Throughput: 0: 1666.9, 1: 1696.3. Samples: 10739200. 
Policy #0 lag: (min: 0.0, avg: 24.2, max: 32.0) +[2023-10-11 15:48:01,063][84230] Avg episode reward: [(0, '7.100'), (1, '6.900')] +[2023-10-11 15:48:01,367][85175] Updated weights for policy 1, policy_version 21110 (0.0008) +[2023-10-11 15:48:01,730][85175] Updated weights for policy 1, policy_version 21120 (0.0008) +[2023-10-11 15:48:02,092][85176] Updated weights for policy 0, policy_version 20802 (0.0007) +[2023-10-11 15:48:02,501][85176] Updated weights for policy 0, policy_version 20812 (0.0009) +[2023-10-11 15:48:02,871][85176] Updated weights for policy 0, policy_version 20822 (0.0007) +[2023-10-11 15:48:03,249][85176] Updated weights for policy 0, policy_version 20832 (0.0007) +[2023-10-11 15:48:05,757][85175] Updated weights for policy 1, policy_version 21130 (0.0009) +[2023-10-11 15:48:06,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 42958848. Throughput: 0: 1653.9, 1: 1693.3. Samples: 10748082. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-11 15:48:06,063][84230] Avg episode reward: [(0, '7.010'), (1, '7.610')] +[2023-10-11 15:48:06,139][85175] Updated weights for policy 1, policy_version 21140 (0.0009) +[2023-10-11 15:48:06,504][85175] Updated weights for policy 1, policy_version 21150 (0.0008) +[2023-10-11 15:48:07,219][85176] Updated weights for policy 0, policy_version 20842 (0.0008) +[2023-10-11 15:48:07,588][85176] Updated weights for policy 0, policy_version 20852 (0.0010) +[2023-10-11 15:48:07,962][85176] Updated weights for policy 0, policy_version 20862 (0.0007) +[2023-10-11 15:48:10,530][85175] Updated weights for policy 1, policy_version 21160 (0.0009) +[2023-10-11 15:48:10,908][85175] Updated weights for policy 1, policy_version 21170 (0.0010) +[2023-10-11 15:48:11,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 43024384. Throughput: 0: 1668.9, 1: 1686.8. Samples: 10768594. 
Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-11 15:48:11,064][84230] Avg episode reward: [(0, '6.880'), (1, '7.870')] +[2023-10-11 15:48:11,265][85175] Updated weights for policy 1, policy_version 21180 (0.0008) +[2023-10-11 15:48:12,053][85176] Updated weights for policy 0, policy_version 20872 (0.0007) +[2023-10-11 15:48:12,423][85176] Updated weights for policy 0, policy_version 20882 (0.0009) +[2023-10-11 15:48:12,794][85176] Updated weights for policy 0, policy_version 20892 (0.0010) +[2023-10-11 15:48:15,526][85175] Updated weights for policy 1, policy_version 21190 (0.0008) +[2023-10-11 15:48:15,894][85175] Updated weights for policy 1, policy_version 21200 (0.0007) +[2023-10-11 15:48:16,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 43089920. Throughput: 0: 1667.5, 1: 1684.0. Samples: 10789196. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-11 15:48:16,063][84230] Avg episode reward: [(0, '7.040'), (1, '7.780')] +[2023-10-11 15:48:16,072][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000020896_21397504.pth... +[2023-10-11 15:48:16,107][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000019360_19824640.pth +[2023-10-11 15:48:16,264][85175] Updated weights for policy 1, policy_version 21210 (0.0007) +[2023-10-11 15:48:16,482][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000021216_21725184.pth... 
+[2023-10-11 15:48:16,513][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000019616_20086784.pth +[2023-10-11 15:48:17,015][85176] Updated weights for policy 0, policy_version 20902 (0.0008) +[2023-10-11 15:48:17,392][85176] Updated weights for policy 0, policy_version 20912 (0.0008) +[2023-10-11 15:48:17,763][85176] Updated weights for policy 0, policy_version 20922 (0.0009) +[2023-10-11 15:48:20,090][85175] Updated weights for policy 1, policy_version 21220 (0.0007) +[2023-10-11 15:48:20,455][85175] Updated weights for policy 1, policy_version 21230 (0.0009) +[2023-10-11 15:48:20,823][85175] Updated weights for policy 1, policy_version 21240 (0.0007) +[2023-10-11 15:48:21,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 43155456. Throughput: 0: 1664.3, 1: 1687.7. Samples: 10798604. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-11 15:48:21,063][84230] Avg episode reward: [(0, '7.420'), (1, '7.450')] +[2023-10-11 15:48:21,895][85176] Updated weights for policy 0, policy_version 20932 (0.0009) +[2023-10-11 15:48:22,276][85176] Updated weights for policy 0, policy_version 20942 (0.0008) +[2023-10-11 15:48:22,648][85176] Updated weights for policy 0, policy_version 20952 (0.0007) +[2023-10-11 15:48:24,682][85175] Updated weights for policy 1, policy_version 21250 (0.0007) +[2023-10-11 15:48:25,044][85175] Updated weights for policy 1, policy_version 21260 (0.0007) +[2023-10-11 15:48:25,409][85175] Updated weights for policy 1, policy_version 21270 (0.0007) +[2023-10-11 15:48:25,788][85175] Updated weights for policy 1, policy_version 21280 (0.0009) +[2023-10-11 15:48:26,063][84230] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 43253760. Throughput: 0: 1668.5, 1: 1693.6. Samples: 10819674. 
Policy #0 lag: (min: 31.0, avg: 36.0, max: 63.0) +[2023-10-11 15:48:26,064][84230] Avg episode reward: [(0, '7.260'), (1, '7.160')] +[2023-10-11 15:48:26,573][85176] Updated weights for policy 0, policy_version 20962 (0.0010) +[2023-10-11 15:48:26,955][85176] Updated weights for policy 0, policy_version 20972 (0.0008) +[2023-10-11 15:48:27,328][85176] Updated weights for policy 0, policy_version 20982 (0.0009) +[2023-10-11 15:48:27,704][85176] Updated weights for policy 0, policy_version 20992 (0.0008) +[2023-10-11 15:48:29,799][85175] Updated weights for policy 1, policy_version 21290 (0.0008) +[2023-10-11 15:48:30,173][85175] Updated weights for policy 1, policy_version 21300 (0.0010) +[2023-10-11 15:48:30,541][85175] Updated weights for policy 1, policy_version 21310 (0.0007) +[2023-10-11 15:48:31,062][84230] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 43319296. Throughput: 0: 1675.9, 1: 1670.0. Samples: 10839556. Policy #0 lag: (min: 31.0, avg: 36.0, max: 63.0) +[2023-10-11 15:48:31,063][84230] Avg episode reward: [(0, '6.750'), (1, '7.740')] +[2023-10-11 15:48:31,863][85176] Updated weights for policy 0, policy_version 21002 (0.0007) +[2023-10-11 15:48:32,235][85176] Updated weights for policy 0, policy_version 21012 (0.0007) +[2023-10-11 15:48:32,600][85176] Updated weights for policy 0, policy_version 21022 (0.0010) +[2023-10-11 15:48:34,534][85175] Updated weights for policy 1, policy_version 21320 (0.0008) +[2023-10-11 15:48:34,912][85175] Updated weights for policy 1, policy_version 21330 (0.0009) +[2023-10-11 15:48:35,284][85175] Updated weights for policy 1, policy_version 21340 (0.0007) +[2023-10-11 15:48:36,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 43384832. Throughput: 0: 1678.9, 1: 1694.9. Samples: 10849908. 
Policy #0 lag: (min: 31.0, avg: 36.0, max: 63.0) +[2023-10-11 15:48:36,064][84230] Avg episode reward: [(0, '6.660'), (1, '7.770')] +[2023-10-11 15:48:36,595][85176] Updated weights for policy 0, policy_version 21032 (0.0009) +[2023-10-11 15:48:36,974][85176] Updated weights for policy 0, policy_version 21042 (0.0009) +[2023-10-11 15:48:37,347][85176] Updated weights for policy 0, policy_version 21052 (0.0009) +[2023-10-11 15:48:39,402][85175] Updated weights for policy 1, policy_version 21350 (0.0009) +[2023-10-11 15:48:39,771][85175] Updated weights for policy 1, policy_version 21360 (0.0009) +[2023-10-11 15:48:40,147][85175] Updated weights for policy 1, policy_version 21370 (0.0009) +[2023-10-11 15:48:41,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 43450368. Throughput: 0: 1679.8, 1: 1689.4. Samples: 10870140. Policy #0 lag: (min: 31.0, avg: 36.0, max: 63.0) +[2023-10-11 15:48:41,064][84230] Avg episode reward: [(0, '6.700'), (1, '7.710')] +[2023-10-11 15:48:41,436][85176] Updated weights for policy 0, policy_version 21062 (0.0007) +[2023-10-11 15:48:41,812][85176] Updated weights for policy 0, policy_version 21072 (0.0008) +[2023-10-11 15:48:42,183][85176] Updated weights for policy 0, policy_version 21082 (0.0007) +[2023-10-11 15:48:44,266][85175] Updated weights for policy 1, policy_version 21380 (0.0008) +[2023-10-11 15:48:44,637][85175] Updated weights for policy 1, policy_version 21390 (0.0007) +[2023-10-11 15:48:45,003][85175] Updated weights for policy 1, policy_version 21400 (0.0008) +[2023-10-11 15:48:46,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 43515904. Throughput: 0: 1683.5, 1: 1671.9. Samples: 10890190. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:48:46,063][84230] Avg episode reward: [(0, '6.930'), (1, '7.260')] +[2023-10-11 15:48:46,135][85176] Updated weights for policy 0, policy_version 21092 (0.0008) +[2023-10-11 15:48:46,524][85176] Updated weights for policy 0, policy_version 21102 (0.0009) +[2023-10-11 15:48:46,897][85176] Updated weights for policy 0, policy_version 21112 (0.0007) +[2023-10-11 15:48:48,874][85175] Updated weights for policy 1, policy_version 21410 (0.0008) +[2023-10-11 15:48:49,290][85175] Updated weights for policy 1, policy_version 21420 (0.0009) +[2023-10-11 15:48:49,659][85175] Updated weights for policy 1, policy_version 21430 (0.0009) +[2023-10-11 15:48:50,029][85175] Updated weights for policy 1, policy_version 21440 (0.0009) +[2023-10-11 15:48:51,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 43581440. Throughput: 0: 1681.7, 1: 1708.6. Samples: 10900648. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:48:51,063][84230] Avg episode reward: [(0, '7.550'), (1, '7.200')] +[2023-10-11 15:48:51,140][85176] Updated weights for policy 0, policy_version 21122 (0.0008) +[2023-10-11 15:48:51,505][85176] Updated weights for policy 0, policy_version 21132 (0.0008) +[2023-10-11 15:48:51,889][85176] Updated weights for policy 0, policy_version 21142 (0.0009) +[2023-10-11 15:48:52,265][85176] Updated weights for policy 0, policy_version 21152 (0.0009) +[2023-10-11 15:48:53,838][85175] Updated weights for policy 1, policy_version 21450 (0.0010) +[2023-10-11 15:48:54,206][85175] Updated weights for policy 1, policy_version 21460 (0.0009) +[2023-10-11 15:48:54,580][85175] Updated weights for policy 1, policy_version 21470 (0.0007) +[2023-10-11 15:48:56,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 43646976. Throughput: 0: 1680.3, 1: 1686.9. Samples: 10920120. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:48:56,063][84230] Avg episode reward: [(0, '7.300'), (1, '7.540')] +[2023-10-11 15:48:56,295][85176] Updated weights for policy 0, policy_version 21162 (0.0008) +[2023-10-11 15:48:56,666][85176] Updated weights for policy 0, policy_version 21172 (0.0007) +[2023-10-11 15:48:57,043][85176] Updated weights for policy 0, policy_version 21182 (0.0007) +[2023-10-11 15:48:58,689][85175] Updated weights for policy 1, policy_version 21480 (0.0009) +[2023-10-11 15:48:59,056][85175] Updated weights for policy 1, policy_version 21490 (0.0010) +[2023-10-11 15:48:59,412][85175] Updated weights for policy 1, policy_version 21500 (0.0009) +[2023-10-11 15:49:01,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 43712512. Throughput: 0: 1678.3, 1: 1688.1. Samples: 10940686. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:49:01,063][84230] Avg episode reward: [(0, '6.910'), (1, '7.960')] +[2023-10-11 15:49:01,073][85176] Updated weights for policy 0, policy_version 21192 (0.0007) +[2023-10-11 15:49:01,451][85176] Updated weights for policy 0, policy_version 21202 (0.0010) +[2023-10-11 15:49:01,819][85176] Updated weights for policy 0, policy_version 21212 (0.0007) +[2023-10-11 15:49:03,417][85175] Updated weights for policy 1, policy_version 21510 (0.0008) +[2023-10-11 15:49:03,778][85175] Updated weights for policy 1, policy_version 21520 (0.0008) +[2023-10-11 15:49:04,146][85175] Updated weights for policy 1, policy_version 21530 (0.0009) +[2023-10-11 15:49:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 43778048. Throughput: 0: 1679.6, 1: 1697.9. Samples: 10950590. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-10-11 15:49:06,063][84230] Avg episode reward: [(0, '6.810'), (1, '7.390')] +[2023-10-11 15:49:06,065][85176] Updated weights for policy 0, policy_version 21222 (0.0010) +[2023-10-11 15:49:06,436][85176] Updated weights for policy 0, policy_version 21232 (0.0011) +[2023-10-11 15:49:06,810][85176] Updated weights for policy 0, policy_version 21242 (0.0010) +[2023-10-11 15:49:08,258][85175] Updated weights for policy 1, policy_version 21540 (0.0008) +[2023-10-11 15:49:08,622][85175] Updated weights for policy 1, policy_version 21550 (0.0009) +[2023-10-11 15:49:08,991][85175] Updated weights for policy 1, policy_version 21560 (0.0009) +[2023-10-11 15:49:10,930][85176] Updated weights for policy 0, policy_version 21252 (0.0010) +[2023-10-11 15:49:11,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.5). Total num frames: 43843584. Throughput: 0: 1676.5, 1: 1672.3. Samples: 10970370. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-10-11 15:49:11,063][84230] Avg episode reward: [(0, '6.520'), (1, '7.320')] +[2023-10-11 15:49:11,294][85176] Updated weights for policy 0, policy_version 21262 (0.0010) +[2023-10-11 15:49:11,660][85176] Updated weights for policy 0, policy_version 21272 (0.0008) +[2023-10-11 15:49:12,861][85175] Updated weights for policy 1, policy_version 21570 (0.0008) +[2023-10-11 15:49:13,246][85175] Updated weights for policy 1, policy_version 21580 (0.0007) +[2023-10-11 15:49:13,621][85175] Updated weights for policy 1, policy_version 21590 (0.0007) +[2023-10-11 15:49:13,992][85175] Updated weights for policy 1, policy_version 21600 (0.0007) +[2023-10-11 15:49:15,715][85176] Updated weights for policy 0, policy_version 21282 (0.0007) +[2023-10-11 15:49:16,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 43909120. Throughput: 0: 1676.2, 1: 1702.3. Samples: 10991588. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-10-11 15:49:16,064][84230] Avg episode reward: [(0, '6.400'), (1, '7.460')] +[2023-10-11 15:49:16,095][85176] Updated weights for policy 0, policy_version 21292 (0.0008) +[2023-10-11 15:49:16,481][85176] Updated weights for policy 0, policy_version 21302 (0.0009) +[2023-10-11 15:49:16,848][85176] Updated weights for policy 0, policy_version 21312 (0.0008) +[2023-10-11 15:49:17,894][85175] Updated weights for policy 1, policy_version 21610 (0.0008) +[2023-10-11 15:49:18,263][85175] Updated weights for policy 1, policy_version 21620 (0.0008) +[2023-10-11 15:49:18,633][85175] Updated weights for policy 1, policy_version 21630 (0.0007) +[2023-10-11 15:49:20,847][85176] Updated weights for policy 0, policy_version 21322 (0.0009) +[2023-10-11 15:49:21,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 43974656. Throughput: 0: 1675.2, 1: 1682.5. Samples: 11001006. Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) +[2023-10-11 15:49:21,064][84230] Avg episode reward: [(0, '6.680'), (1, '8.070')] +[2023-10-11 15:49:21,225][85176] Updated weights for policy 0, policy_version 21332 (0.0011) +[2023-10-11 15:49:21,607][85176] Updated weights for policy 0, policy_version 21342 (0.0011) +[2023-10-11 15:49:22,635][85175] Updated weights for policy 1, policy_version 21640 (0.0010) +[2023-10-11 15:49:23,002][85175] Updated weights for policy 1, policy_version 21650 (0.0008) +[2023-10-11 15:49:23,380][85175] Updated weights for policy 1, policy_version 21660 (0.0008) +[2023-10-11 15:49:25,814][85176] Updated weights for policy 0, policy_version 21352 (0.0007) +[2023-10-11 15:49:26,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 44040192. Throughput: 0: 1671.6, 1: 1684.8. Samples: 11021178. 
Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) +[2023-10-11 15:49:26,063][84230] Avg episode reward: [(0, '7.160'), (1, '7.630')] +[2023-10-11 15:49:26,194][85176] Updated weights for policy 0, policy_version 21362 (0.0010) +[2023-10-11 15:49:26,573][85176] Updated weights for policy 0, policy_version 21372 (0.0008) +[2023-10-11 15:49:27,549][85175] Updated weights for policy 1, policy_version 21670 (0.0010) +[2023-10-11 15:49:27,919][85175] Updated weights for policy 1, policy_version 21680 (0.0008) +[2023-10-11 15:49:28,293][85175] Updated weights for policy 1, policy_version 21690 (0.0008) +[2023-10-11 15:49:30,635][85176] Updated weights for policy 0, policy_version 21382 (0.0009) +[2023-10-11 15:49:31,007][85176] Updated weights for policy 0, policy_version 21392 (0.0007) +[2023-10-11 15:49:31,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 44105728. Throughput: 0: 1658.9, 1: 1702.4. Samples: 11041448. Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) +[2023-10-11 15:49:31,063][84230] Avg episode reward: [(0, '7.160'), (1, '7.090')] +[2023-10-11 15:49:31,383][85176] Updated weights for policy 0, policy_version 21402 (0.0008) +[2023-10-11 15:49:32,318][85175] Updated weights for policy 1, policy_version 21700 (0.0008) +[2023-10-11 15:49:32,678][85175] Updated weights for policy 1, policy_version 21710 (0.0009) +[2023-10-11 15:49:33,055][85175] Updated weights for policy 1, policy_version 21720 (0.0009) +[2023-10-11 15:49:35,432][85176] Updated weights for policy 0, policy_version 21412 (0.0008) +[2023-10-11 15:49:35,820][85176] Updated weights for policy 0, policy_version 21422 (0.0010) +[2023-10-11 15:49:36,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 44171264. Throughput: 0: 1669.0, 1: 1669.6. Samples: 11050884. 
Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) +[2023-10-11 15:49:36,063][84230] Avg episode reward: [(0, '6.860'), (1, '7.580')] +[2023-10-11 15:49:36,189][85176] Updated weights for policy 0, policy_version 21432 (0.0008) +[2023-10-11 15:49:37,252][85175] Updated weights for policy 1, policy_version 21730 (0.0009) +[2023-10-11 15:49:37,676][85175] Updated weights for policy 1, policy_version 21740 (0.0010) +[2023-10-11 15:49:38,037][85175] Updated weights for policy 1, policy_version 21750 (0.0009) +[2023-10-11 15:49:38,402][85175] Updated weights for policy 1, policy_version 21760 (0.0008) +[2023-10-11 15:49:40,228][85176] Updated weights for policy 0, policy_version 21442 (0.0007) +[2023-10-11 15:49:40,603][85176] Updated weights for policy 0, policy_version 21452 (0.0007) +[2023-10-11 15:49:40,979][85176] Updated weights for policy 0, policy_version 21462 (0.0010) +[2023-10-11 15:49:41,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 44236800. Throughput: 0: 1667.0, 1: 1690.4. Samples: 11071202. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:49:41,063][84230] Avg episode reward: [(0, '6.910'), (1, '7.870')] +[2023-10-11 15:49:41,345][85176] Updated weights for policy 0, policy_version 21472 (0.0010) +[2023-10-11 15:49:42,366][85175] Updated weights for policy 1, policy_version 21770 (0.0008) +[2023-10-11 15:49:42,736][85175] Updated weights for policy 1, policy_version 21780 (0.0009) +[2023-10-11 15:49:43,102][85175] Updated weights for policy 1, policy_version 21790 (0.0010) +[2023-10-11 15:49:45,606][85176] Updated weights for policy 0, policy_version 21482 (0.0008) +[2023-10-11 15:49:45,981][85176] Updated weights for policy 0, policy_version 21492 (0.0011) +[2023-10-11 15:49:46,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 44302336. Throughput: 0: 1656.8, 1: 1690.6. Samples: 11091322. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:49:46,063][84230] Avg episode reward: [(0, '6.940'), (1, '7.980')] +[2023-10-11 15:49:46,359][85176] Updated weights for policy 0, policy_version 21502 (0.0008) +[2023-10-11 15:49:47,281][85175] Updated weights for policy 1, policy_version 21800 (0.0007) +[2023-10-11 15:49:47,647][85175] Updated weights for policy 1, policy_version 21810 (0.0010) +[2023-10-11 15:49:48,021][85175] Updated weights for policy 1, policy_version 21820 (0.0008) +[2023-10-11 15:49:50,672][85176] Updated weights for policy 0, policy_version 21512 (0.0007) +[2023-10-11 15:49:51,046][85176] Updated weights for policy 0, policy_version 21522 (0.0009) +[2023-10-11 15:49:51,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 44367872. Throughput: 0: 1664.7, 1: 1671.9. Samples: 11100736. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:49:51,064][84230] Avg episode reward: [(0, '7.150'), (1, '7.430')] +[2023-10-11 15:49:51,418][85176] Updated weights for policy 0, policy_version 21532 (0.0009) +[2023-10-11 15:49:52,118][85175] Updated weights for policy 1, policy_version 21830 (0.0009) +[2023-10-11 15:49:52,487][85175] Updated weights for policy 1, policy_version 21840 (0.0011) +[2023-10-11 15:49:52,861][85175] Updated weights for policy 1, policy_version 21850 (0.0010) +[2023-10-11 15:49:55,735][85176] Updated weights for policy 0, policy_version 21542 (0.0007) +[2023-10-11 15:49:56,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 44433408. Throughput: 0: 1658.0, 1: 1689.9. Samples: 11121026. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:49:56,063][84230] Avg episode reward: [(0, '7.150'), (1, '7.440')] +[2023-10-11 15:49:56,105][85176] Updated weights for policy 0, policy_version 21552 (0.0008) +[2023-10-11 15:49:56,483][85176] Updated weights for policy 0, policy_version 21562 (0.0008) +[2023-10-11 15:49:56,868][85175] Updated weights for policy 1, policy_version 21860 (0.0009) +[2023-10-11 15:49:57,243][85175] Updated weights for policy 1, policy_version 21870 (0.0009) +[2023-10-11 15:49:57,613][85175] Updated weights for policy 1, policy_version 21880 (0.0010) +[2023-10-11 15:50:00,545][85176] Updated weights for policy 0, policy_version 21572 (0.0009) +[2023-10-11 15:50:00,918][85176] Updated weights for policy 0, policy_version 21582 (0.0009) +[2023-10-11 15:50:01,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 44498944. Throughput: 0: 1646.3, 1: 1679.7. Samples: 11141256. Policy #0 lag: (min: 3.0, avg: 3.3, max: 12.0) +[2023-10-11 15:50:01,063][84230] Avg episode reward: [(0, '6.950'), (1, '7.020')] +[2023-10-11 15:50:01,293][85176] Updated weights for policy 0, policy_version 21592 (0.0009) +[2023-10-11 15:50:01,694][85175] Updated weights for policy 1, policy_version 21890 (0.0009) +[2023-10-11 15:50:02,064][85175] Updated weights for policy 1, policy_version 21900 (0.0008) +[2023-10-11 15:50:02,425][85175] Updated weights for policy 1, policy_version 21910 (0.0008) +[2023-10-11 15:50:02,799][85175] Updated weights for policy 1, policy_version 21920 (0.0010) +[2023-10-11 15:50:05,240][85176] Updated weights for policy 0, policy_version 21602 (0.0008) +[2023-10-11 15:50:05,623][85176] Updated weights for policy 0, policy_version 21612 (0.0007) +[2023-10-11 15:50:05,994][85176] Updated weights for policy 0, policy_version 21622 (0.0007) +[2023-10-11 15:50:06,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 44564480. 
Throughput: 0: 1651.2, 1: 1670.3. Samples: 11150472. Policy #0 lag: (min: 3.0, avg: 3.3, max: 12.0) +[2023-10-11 15:50:06,064][84230] Avg episode reward: [(0, '6.950'), (1, '7.440')] +[2023-10-11 15:50:06,363][85176] Updated weights for policy 0, policy_version 21632 (0.0007) +[2023-10-11 15:50:06,880][85175] Updated weights for policy 1, policy_version 21930 (0.0007) +[2023-10-11 15:50:07,241][85175] Updated weights for policy 1, policy_version 21940 (0.0008) +[2023-10-11 15:50:07,618][85175] Updated weights for policy 1, policy_version 21950 (0.0007) +[2023-10-11 15:50:10,448][85176] Updated weights for policy 0, policy_version 21642 (0.0007) +[2023-10-11 15:50:10,829][85176] Updated weights for policy 0, policy_version 21652 (0.0008) +[2023-10-11 15:50:11,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 44630016. Throughput: 0: 1654.1, 1: 1679.9. Samples: 11171206. Policy #0 lag: (min: 3.0, avg: 3.3, max: 12.0) +[2023-10-11 15:50:11,064][84230] Avg episode reward: [(0, '6.930'), (1, '7.920')] +[2023-10-11 15:50:11,199][85176] Updated weights for policy 0, policy_version 21662 (0.0008) +[2023-10-11 15:50:11,556][85175] Updated weights for policy 1, policy_version 21960 (0.0007) +[2023-10-11 15:50:11,922][85175] Updated weights for policy 1, policy_version 21970 (0.0008) +[2023-10-11 15:50:12,288][85175] Updated weights for policy 1, policy_version 21980 (0.0008) +[2023-10-11 15:50:15,244][85176] Updated weights for policy 0, policy_version 21672 (0.0009) +[2023-10-11 15:50:15,621][85176] Updated weights for policy 0, policy_version 21682 (0.0007) +[2023-10-11 15:50:15,998][85176] Updated weights for policy 0, policy_version 21692 (0.0007) +[2023-10-11 15:50:16,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 44695552. Throughput: 0: 1644.8, 1: 1684.4. Samples: 11191262. 
Policy #0 lag: (min: 3.0, avg: 3.3, max: 12.0) +[2023-10-11 15:50:16,063][84230] Avg episode reward: [(0, '6.910'), (1, '7.950')] +[2023-10-11 15:50:16,073][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000021984_22511616.pth... +[2023-10-11 15:50:16,109][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000020416_20905984.pth +[2023-10-11 15:50:16,146][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000021696_22216704.pth... +[2023-10-11 15:50:16,178][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000020128_20611072.pth +[2023-10-11 15:50:16,437][85175] Updated weights for policy 1, policy_version 21990 (0.0007) +[2023-10-11 15:50:16,803][85175] Updated weights for policy 1, policy_version 22000 (0.0008) +[2023-10-11 15:50:17,165][85175] Updated weights for policy 1, policy_version 22010 (0.0008) +[2023-10-11 15:50:20,048][85176] Updated weights for policy 0, policy_version 21702 (0.0007) +[2023-10-11 15:50:20,416][85176] Updated weights for policy 0, policy_version 21712 (0.0009) +[2023-10-11 15:50:20,789][85176] Updated weights for policy 0, policy_version 21722 (0.0011) +[2023-10-11 15:50:21,062][84230] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 44793856. Throughput: 0: 1652.0, 1: 1683.1. Samples: 11200964. 
Policy #0 lag: (min: 31.0, avg: 35.2, max: 63.0) +[2023-10-11 15:50:21,063][84230] Avg episode reward: [(0, '6.910'), (1, '7.930')] +[2023-10-11 15:50:21,316][85175] Updated weights for policy 1, policy_version 22020 (0.0008) +[2023-10-11 15:50:21,678][85175] Updated weights for policy 1, policy_version 22030 (0.0009) +[2023-10-11 15:50:22,040][85175] Updated weights for policy 1, policy_version 22040 (0.0010) +[2023-10-11 15:50:24,871][85176] Updated weights for policy 0, policy_version 21732 (0.0011) +[2023-10-11 15:50:25,239][85176] Updated weights for policy 0, policy_version 21742 (0.0010) +[2023-10-11 15:50:25,611][85176] Updated weights for policy 0, policy_version 21752 (0.0007) +[2023-10-11 15:50:26,063][84230] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13329.3). Total num frames: 44859392. Throughput: 0: 1655.7, 1: 1683.9. Samples: 11221486. Policy #0 lag: (min: 31.0, avg: 35.2, max: 63.0) +[2023-10-11 15:50:26,064][84230] Avg episode reward: [(0, '6.970'), (1, '7.000')] +[2023-10-11 15:50:26,228][85175] Updated weights for policy 1, policy_version 22050 (0.0010) +[2023-10-11 15:50:26,644][85175] Updated weights for policy 1, policy_version 22060 (0.0007) +[2023-10-11 15:50:27,020][85175] Updated weights for policy 1, policy_version 22070 (0.0011) +[2023-10-11 15:50:27,390][85175] Updated weights for policy 1, policy_version 22080 (0.0010) +[2023-10-11 15:50:29,824][85176] Updated weights for policy 0, policy_version 21762 (0.0008) +[2023-10-11 15:50:30,198][85176] Updated weights for policy 0, policy_version 21772 (0.0007) +[2023-10-11 15:50:30,567][85176] Updated weights for policy 0, policy_version 21782 (0.0007) +[2023-10-11 15:50:30,937][85176] Updated weights for policy 0, policy_version 21792 (0.0007) +[2023-10-11 15:50:31,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 44924928. Throughput: 0: 1640.8, 1: 1683.8. Samples: 11240932. 
Policy #0 lag: (min: 31.0, avg: 35.2, max: 63.0) +[2023-10-11 15:50:31,063][84230] Avg episode reward: [(0, '7.000'), (1, '7.270')] +[2023-10-11 15:50:31,271][85175] Updated weights for policy 1, policy_version 22090 (0.0010) +[2023-10-11 15:50:31,648][85175] Updated weights for policy 1, policy_version 22100 (0.0008) +[2023-10-11 15:50:32,011][85175] Updated weights for policy 1, policy_version 22110 (0.0008) +[2023-10-11 15:50:35,097][85176] Updated weights for policy 0, policy_version 21802 (0.0007) +[2023-10-11 15:50:35,471][85176] Updated weights for policy 0, policy_version 21812 (0.0008) +[2023-10-11 15:50:35,854][85176] Updated weights for policy 0, policy_version 21822 (0.0008) +[2023-10-11 15:50:36,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 44990464. Throughput: 0: 1653.2, 1: 1683.4. Samples: 11250884. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:50:36,063][84230] Avg episode reward: [(0, '7.080'), (1, '7.910')] +[2023-10-11 15:50:36,069][85175] Updated weights for policy 1, policy_version 22120 (0.0010) +[2023-10-11 15:50:36,434][85175] Updated weights for policy 1, policy_version 22130 (0.0008) +[2023-10-11 15:50:36,797][85175] Updated weights for policy 1, policy_version 22140 (0.0008) +[2023-10-11 15:50:39,960][85176] Updated weights for policy 0, policy_version 21832 (0.0007) +[2023-10-11 15:50:40,326][85176] Updated weights for policy 0, policy_version 21842 (0.0010) +[2023-10-11 15:50:40,702][85176] Updated weights for policy 0, policy_version 21852 (0.0008) +[2023-10-11 15:50:40,902][85175] Updated weights for policy 1, policy_version 22150 (0.0009) +[2023-10-11 15:50:41,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13329.3). Total num frames: 45056000. Throughput: 0: 1660.6, 1: 1683.1. Samples: 11271490. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:50:41,063][84230] Avg episode reward: [(0, '7.110'), (1, '8.120')] +[2023-10-11 15:50:41,267][85175] Updated weights for policy 1, policy_version 22160 (0.0009) +[2023-10-11 15:50:41,642][85175] Updated weights for policy 1, policy_version 22170 (0.0007) +[2023-10-11 15:50:44,718][85176] Updated weights for policy 0, policy_version 21862 (0.0007) +[2023-10-11 15:50:45,094][85176] Updated weights for policy 0, policy_version 21872 (0.0008) +[2023-10-11 15:50:45,473][85176] Updated weights for policy 0, policy_version 21882 (0.0009) +[2023-10-11 15:50:45,687][85175] Updated weights for policy 1, policy_version 22180 (0.0008) +[2023-10-11 15:50:46,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 45121536. Throughput: 0: 1646.0, 1: 1689.5. Samples: 11291354. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:50:46,063][84230] Avg episode reward: [(0, '7.440'), (1, '7.520')] +[2023-10-11 15:50:46,069][85175] Updated weights for policy 1, policy_version 22190 (0.0010) +[2023-10-11 15:50:46,436][85175] Updated weights for policy 1, policy_version 22200 (0.0011) +[2023-10-11 15:50:49,708][85176] Updated weights for policy 0, policy_version 21892 (0.0008) +[2023-10-11 15:50:50,078][85176] Updated weights for policy 0, policy_version 21902 (0.0009) +[2023-10-11 15:50:50,462][85176] Updated weights for policy 0, policy_version 21912 (0.0009) +[2023-10-11 15:50:50,526][85175] Updated weights for policy 1, policy_version 22210 (0.0011) +[2023-10-11 15:50:50,893][85175] Updated weights for policy 1, policy_version 22220 (0.0009) +[2023-10-11 15:50:51,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 45187072. Throughput: 0: 1665.3, 1: 1688.8. Samples: 11301408. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:50:51,063][84230] Avg episode reward: [(0, '7.410'), (1, '7.420')] +[2023-10-11 15:50:51,268][85175] Updated weights for policy 1, policy_version 22230 (0.0010) +[2023-10-11 15:50:51,632][85175] Updated weights for policy 1, policy_version 22240 (0.0008) +[2023-10-11 15:50:54,719][85176] Updated weights for policy 0, policy_version 21922 (0.0009) +[2023-10-11 15:50:55,089][85176] Updated weights for policy 0, policy_version 21932 (0.0008) +[2023-10-11 15:50:55,461][85176] Updated weights for policy 0, policy_version 21942 (0.0009) +[2023-10-11 15:50:55,833][85176] Updated weights for policy 0, policy_version 21952 (0.0010) +[2023-10-11 15:50:55,936][85175] Updated weights for policy 1, policy_version 22250 (0.0007) +[2023-10-11 15:50:56,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 45252608. Throughput: 0: 1657.3, 1: 1683.4. Samples: 11321538. Policy #0 lag: (min: 1.0, avg: 15.0, max: 33.0) +[2023-10-11 15:50:56,064][84230] Avg episode reward: [(0, '7.400'), (1, '7.420')] +[2023-10-11 15:50:56,296][85175] Updated weights for policy 1, policy_version 22260 (0.0008) +[2023-10-11 15:50:56,667][85175] Updated weights for policy 1, policy_version 22270 (0.0009) +[2023-10-11 15:50:59,925][85176] Updated weights for policy 0, policy_version 21962 (0.0007) +[2023-10-11 15:51:00,310][85176] Updated weights for policy 0, policy_version 21972 (0.0009) +[2023-10-11 15:51:00,610][85175] Updated weights for policy 1, policy_version 22280 (0.0008) +[2023-10-11 15:51:00,674][85176] Updated weights for policy 0, policy_version 21982 (0.0009) +[2023-10-11 15:51:00,971][85175] Updated weights for policy 1, policy_version 22290 (0.0007) +[2023-10-11 15:51:01,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13329.3). Total num frames: 45318144. Throughput: 0: 1650.7, 1: 1674.1. Samples: 11340878. 
Policy #0 lag: (min: 1.0, avg: 15.0, max: 33.0) +[2023-10-11 15:51:01,064][84230] Avg episode reward: [(0, '6.790'), (1, '7.900')] +[2023-10-11 15:51:01,341][85175] Updated weights for policy 1, policy_version 22300 (0.0008) +[2023-10-11 15:51:04,719][85176] Updated weights for policy 0, policy_version 21992 (0.0010) +[2023-10-11 15:51:05,092][85176] Updated weights for policy 0, policy_version 22002 (0.0009) +[2023-10-11 15:51:05,435][85175] Updated weights for policy 1, policy_version 22310 (0.0007) +[2023-10-11 15:51:05,463][85176] Updated weights for policy 0, policy_version 22012 (0.0009) +[2023-10-11 15:51:05,796][85175] Updated weights for policy 1, policy_version 22320 (0.0009) +[2023-10-11 15:51:06,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 45383680. Throughput: 0: 1658.9, 1: 1677.7. Samples: 11351112. Policy #0 lag: (min: 1.0, avg: 15.0, max: 33.0) +[2023-10-11 15:51:06,063][84230] Avg episode reward: [(0, '6.420'), (1, '7.870')] +[2023-10-11 15:51:06,171][85175] Updated weights for policy 1, policy_version 22330 (0.0008) +[2023-10-11 15:51:09,796][85176] Updated weights for policy 0, policy_version 22022 (0.0008) +[2023-10-11 15:51:10,171][85176] Updated weights for policy 0, policy_version 22032 (0.0007) +[2023-10-11 15:51:10,329][85175] Updated weights for policy 1, policy_version 22340 (0.0008) +[2023-10-11 15:51:10,550][85176] Updated weights for policy 0, policy_version 22042 (0.0008) +[2023-10-11 15:51:10,696][85175] Updated weights for policy 1, policy_version 22350 (0.0008) +[2023-10-11 15:51:11,056][85175] Updated weights for policy 1, policy_version 22360 (0.0010) +[2023-10-11 15:51:11,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 45449216. Throughput: 0: 1654.1, 1: 1679.5. Samples: 11371498. 
Policy #0 lag: (min: 21.0, avg: 25.2, max: 53.0) +[2023-10-11 15:51:11,063][84230] Avg episode reward: [(0, '6.250'), (1, '7.490')] +[2023-10-11 15:51:14,747][85176] Updated weights for policy 0, policy_version 22052 (0.0009) +[2023-10-11 15:51:15,139][85176] Updated weights for policy 0, policy_version 22062 (0.0010) +[2023-10-11 15:51:15,198][85175] Updated weights for policy 1, policy_version 22370 (0.0010) +[2023-10-11 15:51:15,510][85176] Updated weights for policy 0, policy_version 22072 (0.0008) +[2023-10-11 15:51:15,602][85175] Updated weights for policy 1, policy_version 22380 (0.0009) +[2023-10-11 15:51:15,958][85175] Updated weights for policy 1, policy_version 22390 (0.0008) +[2023-10-11 15:51:16,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 45514752. Throughput: 0: 1656.0, 1: 1665.0. Samples: 11390374. Policy #0 lag: (min: 21.0, avg: 25.2, max: 53.0) +[2023-10-11 15:51:16,063][84230] Avg episode reward: [(0, '6.630'), (1, '6.930')] +[2023-10-11 15:51:16,323][85175] Updated weights for policy 1, policy_version 22400 (0.0007) +[2023-10-11 15:51:19,562][85176] Updated weights for policy 0, policy_version 22082 (0.0008) +[2023-10-11 15:51:19,946][85176] Updated weights for policy 0, policy_version 22092 (0.0007) +[2023-10-11 15:51:20,309][85176] Updated weights for policy 0, policy_version 22102 (0.0009) +[2023-10-11 15:51:20,377][85175] Updated weights for policy 1, policy_version 22410 (0.0009) +[2023-10-11 15:51:20,684][85176] Updated weights for policy 0, policy_version 22112 (0.0010) +[2023-10-11 15:51:20,747][85175] Updated weights for policy 1, policy_version 22420 (0.0009) +[2023-10-11 15:51:21,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 45580288. Throughput: 0: 1657.4, 1: 1678.3. Samples: 11400990. 
Policy #0 lag: (min: 21.0, avg: 25.2, max: 53.0) +[2023-10-11 15:51:21,063][84230] Avg episode reward: [(0, '6.750'), (1, '7.730')] +[2023-10-11 15:51:21,115][85175] Updated weights for policy 1, policy_version 22430 (0.0009) +[2023-10-11 15:51:24,717][85176] Updated weights for policy 0, policy_version 22122 (0.0007) +[2023-10-11 15:51:25,098][85176] Updated weights for policy 0, policy_version 22132 (0.0007) +[2023-10-11 15:51:25,202][85175] Updated weights for policy 1, policy_version 22440 (0.0008) +[2023-10-11 15:51:25,472][85176] Updated weights for policy 0, policy_version 22142 (0.0009) +[2023-10-11 15:51:25,585][85175] Updated weights for policy 1, policy_version 22450 (0.0008) +[2023-10-11 15:51:25,947][85175] Updated weights for policy 1, policy_version 22460 (0.0007) +[2023-10-11 15:51:26,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 45645824. Throughput: 0: 1650.4, 1: 1680.6. Samples: 11421384. Policy #0 lag: (min: 21.0, avg: 25.2, max: 53.0) +[2023-10-11 15:51:26,063][84230] Avg episode reward: [(0, '6.910'), (1, '7.810')] +[2023-10-11 15:51:29,521][85176] Updated weights for policy 0, policy_version 22152 (0.0007) +[2023-10-11 15:51:29,886][85176] Updated weights for policy 0, policy_version 22162 (0.0010) +[2023-10-11 15:51:30,055][85175] Updated weights for policy 1, policy_version 22470 (0.0008) +[2023-10-11 15:51:30,261][85176] Updated weights for policy 0, policy_version 22172 (0.0008) +[2023-10-11 15:51:30,421][85175] Updated weights for policy 1, policy_version 22480 (0.0008) +[2023-10-11 15:51:30,792][85175] Updated weights for policy 1, policy_version 22490 (0.0009) +[2023-10-11 15:51:31,062][84230] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 45744128. Throughput: 0: 1648.5, 1: 1660.6. Samples: 11440264. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:51:31,063][84230] Avg episode reward: [(0, '7.100'), (1, '7.900')] +[2023-10-11 15:51:34,199][85176] Updated weights for policy 0, policy_version 22182 (0.0009) +[2023-10-11 15:51:34,572][85176] Updated weights for policy 0, policy_version 22192 (0.0011) +[2023-10-11 15:51:34,863][85175] Updated weights for policy 1, policy_version 22500 (0.0007) +[2023-10-11 15:51:34,944][85176] Updated weights for policy 0, policy_version 22202 (0.0009) +[2023-10-11 15:51:35,228][85175] Updated weights for policy 1, policy_version 22510 (0.0008) +[2023-10-11 15:51:35,597][85175] Updated weights for policy 1, policy_version 22520 (0.0010) +[2023-10-11 15:51:36,063][84230] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 45809664. Throughput: 0: 1657.5, 1: 1681.9. Samples: 11451678. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:51:36,064][84230] Avg episode reward: [(0, '7.130'), (1, '7.960')] +[2023-10-11 15:51:39,288][85176] Updated weights for policy 0, policy_version 22212 (0.0008) +[2023-10-11 15:51:39,662][85176] Updated weights for policy 0, policy_version 22222 (0.0007) +[2023-10-11 15:51:39,842][85175] Updated weights for policy 1, policy_version 22530 (0.0008) +[2023-10-11 15:51:40,033][85176] Updated weights for policy 0, policy_version 22232 (0.0010) +[2023-10-11 15:51:40,211][85175] Updated weights for policy 1, policy_version 22540 (0.0009) +[2023-10-11 15:51:40,589][85175] Updated weights for policy 1, policy_version 22550 (0.0009) +[2023-10-11 15:51:40,947][85175] Updated weights for policy 1, policy_version 22560 (0.0009) +[2023-10-11 15:51:41,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 45875200. Throughput: 0: 1652.9, 1: 1680.1. Samples: 11471524. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:51:41,063][84230] Avg episode reward: [(0, '7.410'), (1, '6.940')] +[2023-10-11 15:51:44,182][85176] Updated weights for policy 0, policy_version 22242 (0.0009) +[2023-10-11 15:51:44,553][85176] Updated weights for policy 0, policy_version 22252 (0.0010) +[2023-10-11 15:51:44,920][85176] Updated weights for policy 0, policy_version 22262 (0.0009) +[2023-10-11 15:51:44,921][85175] Updated weights for policy 1, policy_version 22570 (0.0009) +[2023-10-11 15:51:45,278][85175] Updated weights for policy 1, policy_version 22580 (0.0009) +[2023-10-11 15:51:45,292][85176] Updated weights for policy 0, policy_version 22272 (0.0008) +[2023-10-11 15:51:45,650][85175] Updated weights for policy 1, policy_version 22590 (0.0009) +[2023-10-11 15:51:46,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 45940736. Throughput: 0: 1656.6, 1: 1666.7. Samples: 11490428. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:51:46,063][84230] Avg episode reward: [(0, '7.380'), (1, '7.130')] +[2023-10-11 15:51:49,231][85176] Updated weights for policy 0, policy_version 22282 (0.0011) +[2023-10-11 15:51:49,600][85176] Updated weights for policy 0, policy_version 22292 (0.0008) +[2023-10-11 15:51:49,712][85175] Updated weights for policy 1, policy_version 22600 (0.0007) +[2023-10-11 15:51:49,972][85176] Updated weights for policy 0, policy_version 22302 (0.0009) +[2023-10-11 15:51:50,073][85175] Updated weights for policy 1, policy_version 22610 (0.0007) +[2023-10-11 15:51:50,445][85175] Updated weights for policy 1, policy_version 22620 (0.0007) +[2023-10-11 15:51:51,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 46006272. Throughput: 0: 1662.4, 1: 1684.9. Samples: 11501740. 
Policy #0 lag: (min: 31.0, avg: 31.9, max: 52.0) +[2023-10-11 15:51:51,063][84230] Avg episode reward: [(0, '7.380'), (1, '7.650')] +[2023-10-11 15:51:54,039][85176] Updated weights for policy 0, policy_version 22312 (0.0008) +[2023-10-11 15:51:54,412][85176] Updated weights for policy 0, policy_version 22322 (0.0008) +[2023-10-11 15:51:54,547][85175] Updated weights for policy 1, policy_version 22630 (0.0007) +[2023-10-11 15:51:54,789][85176] Updated weights for policy 0, policy_version 22332 (0.0007) +[2023-10-11 15:51:54,915][85175] Updated weights for policy 1, policy_version 22640 (0.0008) +[2023-10-11 15:51:55,286][85175] Updated weights for policy 1, policy_version 22650 (0.0009) +[2023-10-11 15:51:56,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 46071808. Throughput: 0: 1647.0, 1: 1677.2. Samples: 11521088. Policy #0 lag: (min: 31.0, avg: 31.9, max: 52.0) +[2023-10-11 15:51:56,063][84230] Avg episode reward: [(0, '7.090'), (1, '7.940')] +[2023-10-11 15:51:58,821][85176] Updated weights for policy 0, policy_version 22342 (0.0009) +[2023-10-11 15:51:59,186][85176] Updated weights for policy 0, policy_version 22352 (0.0009) +[2023-10-11 15:51:59,256][85175] Updated weights for policy 1, policy_version 22660 (0.0007) +[2023-10-11 15:51:59,565][85176] Updated weights for policy 0, policy_version 22362 (0.0010) +[2023-10-11 15:51:59,622][85175] Updated weights for policy 1, policy_version 22670 (0.0007) +[2023-10-11 15:51:59,993][85175] Updated weights for policy 1, policy_version 22680 (0.0007) +[2023-10-11 15:52:01,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 46137344. Throughput: 0: 1662.5, 1: 1668.2. Samples: 11540254. 
Policy #0 lag: (min: 31.0, avg: 31.9, max: 52.0) +[2023-10-11 15:52:01,064][84230] Avg episode reward: [(0, '6.990'), (1, '7.850')] +[2023-10-11 15:52:03,781][85176] Updated weights for policy 0, policy_version 22372 (0.0009) +[2023-10-11 15:52:03,951][85175] Updated weights for policy 1, policy_version 22690 (0.0009) +[2023-10-11 15:52:04,144][85176] Updated weights for policy 0, policy_version 22382 (0.0008) +[2023-10-11 15:52:04,331][85175] Updated weights for policy 1, policy_version 22700 (0.0007) +[2023-10-11 15:52:04,525][85176] Updated weights for policy 0, policy_version 22392 (0.0007) +[2023-10-11 15:52:04,694][85175] Updated weights for policy 1, policy_version 22710 (0.0008) +[2023-10-11 15:52:05,069][85175] Updated weights for policy 1, policy_version 22720 (0.0010) +[2023-10-11 15:52:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 46202880. Throughput: 0: 1665.6, 1: 1687.7. Samples: 11551890. Policy #0 lag: (min: 31.0, avg: 31.9, max: 52.0) +[2023-10-11 15:52:06,063][84230] Avg episode reward: [(0, '6.610'), (1, '7.560')] +[2023-10-11 15:52:08,519][85176] Updated weights for policy 0, policy_version 22402 (0.0007) +[2023-10-11 15:52:08,892][85176] Updated weights for policy 0, policy_version 22412 (0.0007) +[2023-10-11 15:52:09,233][85175] Updated weights for policy 1, policy_version 22730 (0.0008) +[2023-10-11 15:52:09,265][85176] Updated weights for policy 0, policy_version 22422 (0.0008) +[2023-10-11 15:52:09,596][85175] Updated weights for policy 1, policy_version 22740 (0.0010) +[2023-10-11 15:52:09,631][85176] Updated weights for policy 0, policy_version 22432 (0.0009) +[2023-10-11 15:52:09,966][85175] Updated weights for policy 1, policy_version 22750 (0.0008) +[2023-10-11 15:52:11,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 46268416. Throughput: 0: 1649.8, 1: 1666.7. Samples: 11570628. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-11 15:52:11,063][84230] Avg episode reward: [(0, '6.730'), (1, '7.030')] +[2023-10-11 15:52:13,619][85176] Updated weights for policy 0, policy_version 22442 (0.0010) +[2023-10-11 15:52:13,992][85176] Updated weights for policy 0, policy_version 22452 (0.0007) +[2023-10-11 15:52:14,036][85175] Updated weights for policy 1, policy_version 22760 (0.0009) +[2023-10-11 15:52:14,366][85176] Updated weights for policy 0, policy_version 22462 (0.0007) +[2023-10-11 15:52:14,406][85175] Updated weights for policy 1, policy_version 22770 (0.0009) +[2023-10-11 15:52:14,774][85175] Updated weights for policy 1, policy_version 22780 (0.0009) +[2023-10-11 15:52:16,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 46333952. Throughput: 0: 1668.3, 1: 1671.3. Samples: 11590548. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-11 15:52:16,064][84230] Avg episode reward: [(0, '6.610'), (1, '7.640')] +[2023-10-11 15:52:16,077][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000022464_23003136.pth... +[2023-10-11 15:52:16,077][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000022784_23330816.pth... 
+[2023-10-11 15:52:16,118][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000021216_21725184.pth +[2023-10-11 15:52:16,118][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000020896_21397504.pth +[2023-10-11 15:52:18,690][85176] Updated weights for policy 0, policy_version 22472 (0.0007) +[2023-10-11 15:52:18,727][85175] Updated weights for policy 1, policy_version 22790 (0.0008) +[2023-10-11 15:52:19,055][85176] Updated weights for policy 0, policy_version 22482 (0.0010) +[2023-10-11 15:52:19,094][85175] Updated weights for policy 1, policy_version 22800 (0.0009) +[2023-10-11 15:52:19,426][85176] Updated weights for policy 0, policy_version 22492 (0.0008) +[2023-10-11 15:52:19,446][85175] Updated weights for policy 1, policy_version 22810 (0.0007) +[2023-10-11 15:52:21,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 46399488. Throughput: 0: 1657.3, 1: 1681.3. Samples: 11601912. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-11 15:52:21,063][84230] Avg episode reward: [(0, '7.000'), (1, '7.610')] +[2023-10-11 15:52:23,526][85175] Updated weights for policy 1, policy_version 22820 (0.0010) +[2023-10-11 15:52:23,588][85176] Updated weights for policy 0, policy_version 22502 (0.0009) +[2023-10-11 15:52:23,884][85175] Updated weights for policy 1, policy_version 22830 (0.0009) +[2023-10-11 15:52:23,959][85176] Updated weights for policy 0, policy_version 22512 (0.0008) +[2023-10-11 15:52:24,258][85175] Updated weights for policy 1, policy_version 22840 (0.0009) +[2023-10-11 15:52:24,332][85176] Updated weights for policy 0, policy_version 22522 (0.0008) +[2023-10-11 15:52:26,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 46465024. Throughput: 0: 1647.8, 1: 1659.4. Samples: 11620350. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-11 15:52:26,064][84230] Avg episode reward: [(0, '6.970'), (1, '7.540')] +[2023-10-11 15:52:28,306][85175] Updated weights for policy 1, policy_version 22850 (0.0009) +[2023-10-11 15:52:28,327][85176] Updated weights for policy 0, policy_version 22532 (0.0008) +[2023-10-11 15:52:28,670][85175] Updated weights for policy 1, policy_version 22860 (0.0008) +[2023-10-11 15:52:28,706][85176] Updated weights for policy 0, policy_version 22542 (0.0008) +[2023-10-11 15:52:29,046][85175] Updated weights for policy 1, policy_version 22870 (0.0010) +[2023-10-11 15:52:29,067][85176] Updated weights for policy 0, policy_version 22552 (0.0008) +[2023-10-11 15:52:29,422][85175] Updated weights for policy 1, policy_version 22880 (0.0009) +[2023-10-11 15:52:31,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 46530560. Throughput: 0: 1667.5, 1: 1671.0. Samples: 11640660. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:52:31,064][84230] Avg episode reward: [(0, '7.320'), (1, '7.470')] +[2023-10-11 15:52:33,124][85176] Updated weights for policy 0, policy_version 22562 (0.0008) +[2023-10-11 15:52:33,498][85176] Updated weights for policy 0, policy_version 22572 (0.0007) +[2023-10-11 15:52:33,548][85175] Updated weights for policy 1, policy_version 22890 (0.0008) +[2023-10-11 15:52:33,870][85176] Updated weights for policy 0, policy_version 22582 (0.0008) +[2023-10-11 15:52:33,914][85175] Updated weights for policy 1, policy_version 22900 (0.0007) +[2023-10-11 15:52:34,241][85176] Updated weights for policy 0, policy_version 22592 (0.0009) +[2023-10-11 15:52:34,283][85175] Updated weights for policy 1, policy_version 22910 (0.0008) +[2023-10-11 15:52:36,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 46596096. Throughput: 0: 1658.9, 1: 1672.0. Samples: 11651634. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:52:36,064][84230] Avg episode reward: [(0, '7.250'), (1, '7.510')] +[2023-10-11 15:52:38,398][85176] Updated weights for policy 0, policy_version 22602 (0.0009) +[2023-10-11 15:52:38,430][85175] Updated weights for policy 1, policy_version 22920 (0.0008) +[2023-10-11 15:52:38,759][85176] Updated weights for policy 0, policy_version 22612 (0.0008) +[2023-10-11 15:52:38,796][85175] Updated weights for policy 1, policy_version 22930 (0.0009) +[2023-10-11 15:52:39,137][85176] Updated weights for policy 0, policy_version 22622 (0.0010) +[2023-10-11 15:52:39,167][85175] Updated weights for policy 1, policy_version 22940 (0.0010) +[2023-10-11 15:52:41,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 46661632. Throughput: 0: 1663.4, 1: 1659.8. Samples: 11670632. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:52:41,063][84230] Avg episode reward: [(0, '7.160'), (1, '7.610')] +[2023-10-11 15:52:43,241][85175] Updated weights for policy 1, policy_version 22950 (0.0008) +[2023-10-11 15:52:43,268][85176] Updated weights for policy 0, policy_version 22632 (0.0008) +[2023-10-11 15:52:43,615][85175] Updated weights for policy 1, policy_version 22960 (0.0008) +[2023-10-11 15:52:43,633][85176] Updated weights for policy 0, policy_version 22642 (0.0007) +[2023-10-11 15:52:43,980][85175] Updated weights for policy 1, policy_version 22970 (0.0008) +[2023-10-11 15:52:44,015][85176] Updated weights for policy 0, policy_version 22652 (0.0009) +[2023-10-11 15:52:46,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 46727168. Throughput: 0: 1679.0, 1: 1681.1. Samples: 11691458. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:52:46,063][84230] Avg episode reward: [(0, '7.050'), (1, '7.650')] +[2023-10-11 15:52:48,050][85176] Updated weights for policy 0, policy_version 22662 (0.0009) +[2023-10-11 15:52:48,211][85175] Updated weights for policy 1, policy_version 22980 (0.0009) +[2023-10-11 15:52:48,421][85176] Updated weights for policy 0, policy_version 22672 (0.0009) +[2023-10-11 15:52:48,569][85175] Updated weights for policy 1, policy_version 22990 (0.0007) +[2023-10-11 15:52:48,791][85176] Updated weights for policy 0, policy_version 22682 (0.0008) +[2023-10-11 15:52:48,939][85175] Updated weights for policy 1, policy_version 23000 (0.0008) +[2023-10-11 15:52:51,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 46792704. Throughput: 0: 1664.8, 1: 1664.0. Samples: 11701690. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:52:51,063][84230] Avg episode reward: [(0, '6.820'), (1, '7.340')] +[2023-10-11 15:52:52,753][85175] Updated weights for policy 1, policy_version 23010 (0.0008) +[2023-10-11 15:52:52,910][85176] Updated weights for policy 0, policy_version 22692 (0.0010) +[2023-10-11 15:52:53,121][85175] Updated weights for policy 1, policy_version 23020 (0.0008) +[2023-10-11 15:52:53,284][85176] Updated weights for policy 0, policy_version 22702 (0.0007) +[2023-10-11 15:52:53,487][85175] Updated weights for policy 1, policy_version 23030 (0.0008) +[2023-10-11 15:52:53,656][85176] Updated weights for policy 0, policy_version 22712 (0.0009) +[2023-10-11 15:52:53,855][85175] Updated weights for policy 1, policy_version 23040 (0.0007) +[2023-10-11 15:52:56,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 46858240. Throughput: 0: 1675.2, 1: 1671.5. Samples: 11721228. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:52:56,063][84230] Avg episode reward: [(0, '6.950'), (1, '7.300')] +[2023-10-11 15:52:57,788][85176] Updated weights for policy 0, policy_version 22722 (0.0007) +[2023-10-11 15:52:57,966][85175] Updated weights for policy 1, policy_version 23050 (0.0007) +[2023-10-11 15:52:58,185][85176] Updated weights for policy 0, policy_version 22732 (0.0009) +[2023-10-11 15:52:58,331][85175] Updated weights for policy 1, policy_version 23060 (0.0009) +[2023-10-11 15:52:58,559][85176] Updated weights for policy 0, policy_version 22742 (0.0010) +[2023-10-11 15:52:58,698][85175] Updated weights for policy 1, policy_version 23070 (0.0008) +[2023-10-11 15:52:58,934][85176] Updated weights for policy 0, policy_version 22752 (0.0010) +[2023-10-11 15:53:01,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 46923776. Throughput: 0: 1680.5, 1: 1686.8. Samples: 11742076. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:53:01,063][84230] Avg episode reward: [(0, '7.260'), (1, '7.620')] +[2023-10-11 15:53:02,672][85175] Updated weights for policy 1, policy_version 23080 (0.0009) +[2023-10-11 15:53:02,939][85176] Updated weights for policy 0, policy_version 22762 (0.0007) +[2023-10-11 15:53:03,031][85175] Updated weights for policy 1, policy_version 23090 (0.0007) +[2023-10-11 15:53:03,306][85176] Updated weights for policy 0, policy_version 22772 (0.0007) +[2023-10-11 15:53:03,400][85175] Updated weights for policy 1, policy_version 23100 (0.0009) +[2023-10-11 15:53:03,680][85176] Updated weights for policy 0, policy_version 22782 (0.0007) +[2023-10-11 15:53:06,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 46989312. Throughput: 0: 1663.1, 1: 1659.1. Samples: 11751410. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:53:06,064][84230] Avg episode reward: [(0, '7.200'), (1, '7.230')] +[2023-10-11 15:53:07,396][85175] Updated weights for policy 1, policy_version 23110 (0.0009) +[2023-10-11 15:53:07,679][85176] Updated weights for policy 0, policy_version 22792 (0.0007) +[2023-10-11 15:53:07,755][85175] Updated weights for policy 1, policy_version 23120 (0.0010) +[2023-10-11 15:53:08,057][85176] Updated weights for policy 0, policy_version 22802 (0.0007) +[2023-10-11 15:53:08,122][85175] Updated weights for policy 1, policy_version 23130 (0.0008) +[2023-10-11 15:53:08,434][85176] Updated weights for policy 0, policy_version 22812 (0.0009) +[2023-10-11 15:53:11,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 47054848. Throughput: 0: 1678.2, 1: 1688.5. Samples: 11771852. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:53:11,063][84230] Avg episode reward: [(0, '7.110'), (1, '7.360')] +[2023-10-11 15:53:12,110][85175] Updated weights for policy 1, policy_version 23140 (0.0007) +[2023-10-11 15:53:12,480][85175] Updated weights for policy 1, policy_version 23150 (0.0007) +[2023-10-11 15:53:12,673][85176] Updated weights for policy 0, policy_version 22822 (0.0009) +[2023-10-11 15:53:12,848][85175] Updated weights for policy 1, policy_version 23160 (0.0008) +[2023-10-11 15:53:13,035][85176] Updated weights for policy 0, policy_version 22832 (0.0009) +[2023-10-11 15:53:13,407][85176] Updated weights for policy 0, policy_version 22842 (0.0009) +[2023-10-11 15:53:16,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 47120384. Throughput: 0: 1677.8, 1: 1699.5. Samples: 11792640. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:53:16,064][84230] Avg episode reward: [(0, '7.300'), (1, '7.610')] +[2023-10-11 15:53:16,925][85175] Updated weights for policy 1, policy_version 23170 (0.0008) +[2023-10-11 15:53:17,289][85175] Updated weights for policy 1, policy_version 23180 (0.0008) +[2023-10-11 15:53:17,432][85176] Updated weights for policy 0, policy_version 22852 (0.0009) +[2023-10-11 15:53:17,651][85175] Updated weights for policy 1, policy_version 23190 (0.0009) +[2023-10-11 15:53:17,796][85176] Updated weights for policy 0, policy_version 22862 (0.0009) +[2023-10-11 15:53:18,017][85175] Updated weights for policy 1, policy_version 23200 (0.0010) +[2023-10-11 15:53:18,171][85176] Updated weights for policy 0, policy_version 22872 (0.0007) +[2023-10-11 15:53:21,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 47185920. Throughput: 0: 1658.3, 1: 1676.7. Samples: 11801708. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:53:21,063][84230] Avg episode reward: [(0, '7.390'), (1, '7.800')] +[2023-10-11 15:53:22,063][85175] Updated weights for policy 1, policy_version 23210 (0.0009) +[2023-10-11 15:53:22,374][85176] Updated weights for policy 0, policy_version 22882 (0.0008) +[2023-10-11 15:53:22,421][85175] Updated weights for policy 1, policy_version 23220 (0.0009) +[2023-10-11 15:53:22,742][85176] Updated weights for policy 0, policy_version 22892 (0.0007) +[2023-10-11 15:53:22,781][85175] Updated weights for policy 1, policy_version 23230 (0.0007) +[2023-10-11 15:53:23,111][85176] Updated weights for policy 0, policy_version 22902 (0.0009) +[2023-10-11 15:53:23,487][85176] Updated weights for policy 0, policy_version 22912 (0.0009) +[2023-10-11 15:53:26,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 47251456. Throughput: 0: 1673.0, 1: 1696.8. Samples: 11822272. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:53:26,064][84230] Avg episode reward: [(0, '7.580'), (1, '7.390')] +[2023-10-11 15:53:26,939][85175] Updated weights for policy 1, policy_version 23240 (0.0010) +[2023-10-11 15:53:27,298][85175] Updated weights for policy 1, policy_version 23250 (0.0008) +[2023-10-11 15:53:27,497][85176] Updated weights for policy 0, policy_version 22922 (0.0009) +[2023-10-11 15:53:27,663][85175] Updated weights for policy 1, policy_version 23260 (0.0007) +[2023-10-11 15:53:27,879][85176] Updated weights for policy 0, policy_version 22932 (0.0008) +[2023-10-11 15:53:28,254][85176] Updated weights for policy 0, policy_version 22942 (0.0009) +[2023-10-11 15:53:31,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 47316992. Throughput: 0: 1670.8, 1: 1699.0. Samples: 11843100. Policy #0 lag: (min: 13.0, avg: 15.8, max: 45.0) +[2023-10-11 15:53:31,064][84230] Avg episode reward: [(0, '7.120'), (1, '7.450')] +[2023-10-11 15:53:31,565][85175] Updated weights for policy 1, policy_version 23270 (0.0008) +[2023-10-11 15:53:31,945][85175] Updated weights for policy 1, policy_version 23280 (0.0007) +[2023-10-11 15:53:32,192][85176] Updated weights for policy 0, policy_version 22952 (0.0008) +[2023-10-11 15:53:32,323][85175] Updated weights for policy 1, policy_version 23290 (0.0008) +[2023-10-11 15:53:32,560][85176] Updated weights for policy 0, policy_version 22962 (0.0010) +[2023-10-11 15:53:32,934][85176] Updated weights for policy 0, policy_version 22972 (0.0008) +[2023-10-11 15:53:36,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 47382528. Throughput: 0: 1660.3, 1: 1680.9. Samples: 11852044. 
Policy #0 lag: (min: 13.0, avg: 15.8, max: 45.0) +[2023-10-11 15:53:36,063][84230] Avg episode reward: [(0, '7.020'), (1, '7.630')] +[2023-10-11 15:53:36,497][85175] Updated weights for policy 1, policy_version 23300 (0.0008) +[2023-10-11 15:53:36,864][85175] Updated weights for policy 1, policy_version 23310 (0.0010) +[2023-10-11 15:53:37,149][85176] Updated weights for policy 0, policy_version 22982 (0.0008) +[2023-10-11 15:53:37,233][85175] Updated weights for policy 1, policy_version 23320 (0.0008) +[2023-10-11 15:53:37,519][85176] Updated weights for policy 0, policy_version 22992 (0.0009) +[2023-10-11 15:53:37,886][85176] Updated weights for policy 0, policy_version 23002 (0.0010) +[2023-10-11 15:53:41,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 47448064. Throughput: 0: 1668.9, 1: 1692.5. Samples: 11872492. Policy #0 lag: (min: 13.0, avg: 15.8, max: 45.0) +[2023-10-11 15:53:41,064][84230] Avg episode reward: [(0, '6.940'), (1, '7.890')] +[2023-10-11 15:53:41,362][85175] Updated weights for policy 1, policy_version 23330 (0.0008) +[2023-10-11 15:53:41,738][85175] Updated weights for policy 1, policy_version 23340 (0.0008) +[2023-10-11 15:53:41,989][85176] Updated weights for policy 0, policy_version 23012 (0.0007) +[2023-10-11 15:53:42,108][85175] Updated weights for policy 1, policy_version 23350 (0.0007) +[2023-10-11 15:53:42,361][85176] Updated weights for policy 0, policy_version 23022 (0.0009) +[2023-10-11 15:53:42,472][85175] Updated weights for policy 1, policy_version 23360 (0.0010) +[2023-10-11 15:53:42,729][85176] Updated weights for policy 0, policy_version 23032 (0.0008) +[2023-10-11 15:53:46,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 47513600. Throughput: 0: 1669.1, 1: 1687.3. Samples: 11893114. 
Policy #0 lag: (min: 13.0, avg: 15.8, max: 45.0) +[2023-10-11 15:53:46,064][84230] Avg episode reward: [(0, '7.310'), (1, '7.620')] +[2023-10-11 15:53:46,710][85175] Updated weights for policy 1, policy_version 23370 (0.0009) +[2023-10-11 15:53:46,790][85176] Updated weights for policy 0, policy_version 23042 (0.0007) +[2023-10-11 15:53:47,082][85175] Updated weights for policy 1, policy_version 23380 (0.0007) +[2023-10-11 15:53:47,186][85176] Updated weights for policy 0, policy_version 23052 (0.0007) +[2023-10-11 15:53:47,445][85175] Updated weights for policy 1, policy_version 23390 (0.0008) +[2023-10-11 15:53:47,563][85176] Updated weights for policy 0, policy_version 23062 (0.0009) +[2023-10-11 15:53:47,932][85176] Updated weights for policy 0, policy_version 23072 (0.0008) +[2023-10-11 15:53:51,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 47579136. Throughput: 0: 1659.6, 1: 1686.7. Samples: 11901994. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:53:51,064][84230] Avg episode reward: [(0, '7.140'), (1, '7.300')] +[2023-10-11 15:53:51,579][85175] Updated weights for policy 1, policy_version 23400 (0.0009) +[2023-10-11 15:53:51,954][85175] Updated weights for policy 1, policy_version 23410 (0.0008) +[2023-10-11 15:53:52,065][85176] Updated weights for policy 0, policy_version 23082 (0.0009) +[2023-10-11 15:53:52,311][85175] Updated weights for policy 1, policy_version 23420 (0.0008) +[2023-10-11 15:53:52,433][85176] Updated weights for policy 0, policy_version 23092 (0.0010) +[2023-10-11 15:53:52,809][85176] Updated weights for policy 0, policy_version 23102 (0.0007) +[2023-10-11 15:53:56,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 47644672. Throughput: 0: 1663.8, 1: 1684.5. Samples: 11922524. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:53:56,064][84230] Avg episode reward: [(0, '7.460'), (1, '7.350')] +[2023-10-11 15:53:56,225][85175] Updated weights for policy 1, policy_version 23430 (0.0007) +[2023-10-11 15:53:56,588][85175] Updated weights for policy 1, policy_version 23440 (0.0007) +[2023-10-11 15:53:56,940][85176] Updated weights for policy 0, policy_version 23112 (0.0007) +[2023-10-11 15:53:56,957][85175] Updated weights for policy 1, policy_version 23450 (0.0007) +[2023-10-11 15:53:57,320][85176] Updated weights for policy 0, policy_version 23122 (0.0008) +[2023-10-11 15:53:57,695][85176] Updated weights for policy 0, policy_version 23132 (0.0008) +[2023-10-11 15:54:01,049][85175] Updated weights for policy 1, policy_version 23460 (0.0007) +[2023-10-11 15:54:01,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 47710208. Throughput: 0: 1662.4, 1: 1681.8. Samples: 11943132. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:54:01,064][84230] Avg episode reward: [(0, '7.340'), (1, '7.830')] +[2023-10-11 15:54:01,417][85175] Updated weights for policy 1, policy_version 23470 (0.0009) +[2023-10-11 15:54:01,784][85175] Updated weights for policy 1, policy_version 23480 (0.0007) +[2023-10-11 15:54:01,824][85176] Updated weights for policy 0, policy_version 23142 (0.0008) +[2023-10-11 15:54:02,200][85176] Updated weights for policy 0, policy_version 23152 (0.0007) +[2023-10-11 15:54:02,571][85176] Updated weights for policy 0, policy_version 23162 (0.0008) +[2023-10-11 15:54:05,852][85175] Updated weights for policy 1, policy_version 23490 (0.0009) +[2023-10-11 15:54:06,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 47775744. Throughput: 0: 1663.9, 1: 1681.2. Samples: 11952238. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:54:06,063][84230] Avg episode reward: [(0, '7.290'), (1, '8.310')] +[2023-10-11 15:54:06,223][85175] Updated weights for policy 1, policy_version 23500 (0.0008) +[2023-10-11 15:54:06,589][85175] Updated weights for policy 1, policy_version 23510 (0.0008) +[2023-10-11 15:54:06,636][85176] Updated weights for policy 0, policy_version 23172 (0.0009) +[2023-10-11 15:54:06,954][85175] Updated weights for policy 1, policy_version 23520 (0.0007) +[2023-10-11 15:54:07,005][85176] Updated weights for policy 0, policy_version 23182 (0.0010) +[2023-10-11 15:54:07,373][85176] Updated weights for policy 0, policy_version 23192 (0.0010) +[2023-10-11 15:54:10,979][85175] Updated weights for policy 1, policy_version 23530 (0.0010) +[2023-10-11 15:54:11,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 47841280. Throughput: 0: 1663.5, 1: 1683.1. Samples: 11972868. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-11 15:54:11,064][84230] Avg episode reward: [(0, '7.180'), (1, '7.800')] +[2023-10-11 15:54:11,347][85175] Updated weights for policy 1, policy_version 23540 (0.0009) +[2023-10-11 15:54:11,720][85175] Updated weights for policy 1, policy_version 23550 (0.0009) +[2023-10-11 15:54:11,777][85176] Updated weights for policy 0, policy_version 23202 (0.0010) +[2023-10-11 15:54:12,148][85176] Updated weights for policy 0, policy_version 23212 (0.0009) +[2023-10-11 15:54:12,518][85176] Updated weights for policy 0, policy_version 23222 (0.0010) +[2023-10-11 15:54:12,896][85176] Updated weights for policy 0, policy_version 23232 (0.0011) +[2023-10-11 15:54:15,860][85175] Updated weights for policy 1, policy_version 23560 (0.0009) +[2023-10-11 15:54:16,063][84230] Fps is (10 sec: 13106.7, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 47906816. Throughput: 0: 1657.2, 1: 1682.9. Samples: 11993404. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-11 15:54:16,064][84230] Avg episode reward: [(0, '7.540'), (1, '7.220')] +[2023-10-11 15:54:16,075][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000023232_23789568.pth... +[2023-10-11 15:54:16,115][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000021696_22216704.pth +[2023-10-11 15:54:16,121][84801] Saving a milestone ./train_atari/atari_frostbite_APPO/checkpoint_p0/milestones/checkpoint_000023232_23789568.pth +[2023-10-11 15:54:16,222][85175] Updated weights for policy 1, policy_version 23570 (0.0007) +[2023-10-11 15:54:16,592][85175] Updated weights for policy 1, policy_version 23580 (0.0008) +[2023-10-11 15:54:16,734][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000023584_24150016.pth... +[2023-10-11 15:54:16,765][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000021984_22511616.pth +[2023-10-11 15:54:16,769][85000] Saving a milestone ./train_atari/atari_frostbite_APPO/checkpoint_p1/milestones/checkpoint_000023584_24150016.pth +[2023-10-11 15:54:16,954][85176] Updated weights for policy 0, policy_version 23242 (0.0008) +[2023-10-11 15:54:17,324][85176] Updated weights for policy 0, policy_version 23252 (0.0009) +[2023-10-11 15:54:17,691][85176] Updated weights for policy 0, policy_version 23262 (0.0012) +[2023-10-11 15:54:20,546][85175] Updated weights for policy 1, policy_version 23590 (0.0009) +[2023-10-11 15:54:20,916][85175] Updated weights for policy 1, policy_version 23600 (0.0008) +[2023-10-11 15:54:21,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 47972352. Throughput: 0: 1656.0, 1: 1687.6. Samples: 12002506. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-11 15:54:21,064][84230] Avg episode reward: [(0, '7.620'), (1, '7.150')] +[2023-10-11 15:54:21,292][85175] Updated weights for policy 1, policy_version 23610 (0.0009) +[2023-10-11 15:54:21,738][85176] Updated weights for policy 0, policy_version 23272 (0.0010) +[2023-10-11 15:54:22,120][85176] Updated weights for policy 0, policy_version 23282 (0.0008) +[2023-10-11 15:54:22,493][85176] Updated weights for policy 0, policy_version 23292 (0.0009) +[2023-10-11 15:54:25,259][85175] Updated weights for policy 1, policy_version 23620 (0.0008) +[2023-10-11 15:54:25,633][85175] Updated weights for policy 1, policy_version 23630 (0.0008) +[2023-10-11 15:54:25,992][85175] Updated weights for policy 1, policy_version 23640 (0.0007) +[2023-10-11 15:54:26,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 48037888. Throughput: 0: 1659.1, 1: 1690.2. Samples: 12023208. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-11 15:54:26,063][84230] Avg episode reward: [(0, '7.570'), (1, '7.890')] +[2023-10-11 15:54:26,474][85176] Updated weights for policy 0, policy_version 23302 (0.0009) +[2023-10-11 15:54:26,841][85176] Updated weights for policy 0, policy_version 23312 (0.0007) +[2023-10-11 15:54:27,204][85176] Updated weights for policy 0, policy_version 23322 (0.0008) +[2023-10-11 15:54:30,157][85175] Updated weights for policy 1, policy_version 23650 (0.0009) +[2023-10-11 15:54:30,524][85175] Updated weights for policy 1, policy_version 23660 (0.0008) +[2023-10-11 15:54:30,891][85175] Updated weights for policy 1, policy_version 23670 (0.0007) +[2023-10-11 15:54:31,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 48103424. Throughput: 0: 1662.1, 1: 1679.6. Samples: 12043492. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-11 15:54:31,063][84230] Avg episode reward: [(0, '7.300'), (1, '8.250')] +[2023-10-11 15:54:31,249][85175] Updated weights for policy 1, policy_version 23680 (0.0007) +[2023-10-11 15:54:31,319][85176] Updated weights for policy 0, policy_version 23332 (0.0010) +[2023-10-11 15:54:31,698][85176] Updated weights for policy 0, policy_version 23342 (0.0009) +[2023-10-11 15:54:32,074][85176] Updated weights for policy 0, policy_version 23352 (0.0010) +[2023-10-11 15:54:35,207][85175] Updated weights for policy 1, policy_version 23690 (0.0008) +[2023-10-11 15:54:35,578][85175] Updated weights for policy 1, policy_version 23700 (0.0007) +[2023-10-11 15:54:35,952][85175] Updated weights for policy 1, policy_version 23710 (0.0010) +[2023-10-11 15:54:36,062][84230] Fps is (10 sec: 16384.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 48201728. Throughput: 0: 1667.8, 1: 1691.7. Samples: 12053172. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-11 15:54:36,063][84230] Avg episode reward: [(0, '7.050'), (1, '7.450')] +[2023-10-11 15:54:36,088][85176] Updated weights for policy 0, policy_version 23362 (0.0009) +[2023-10-11 15:54:36,477][85176] Updated weights for policy 0, policy_version 23372 (0.0007) +[2023-10-11 15:54:36,846][85176] Updated weights for policy 0, policy_version 23382 (0.0010) +[2023-10-11 15:54:37,216][85176] Updated weights for policy 0, policy_version 23392 (0.0010) +[2023-10-11 15:54:40,025][85175] Updated weights for policy 1, policy_version 23720 (0.0009) +[2023-10-11 15:54:40,393][85175] Updated weights for policy 1, policy_version 23730 (0.0007) +[2023-10-11 15:54:40,757][85175] Updated weights for policy 1, policy_version 23740 (0.0008) +[2023-10-11 15:54:41,062][84230] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 48267264. Throughput: 0: 1665.3, 1: 1688.6. Samples: 12073448. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-11 15:54:41,063][84230] Avg episode reward: [(0, '7.110'), (1, '7.320')] +[2023-10-11 15:54:41,368][85176] Updated weights for policy 0, policy_version 23402 (0.0007) +[2023-10-11 15:54:41,748][85176] Updated weights for policy 0, policy_version 23412 (0.0007) +[2023-10-11 15:54:42,128][85176] Updated weights for policy 0, policy_version 23422 (0.0009) +[2023-10-11 15:54:44,886][85175] Updated weights for policy 1, policy_version 23750 (0.0007) +[2023-10-11 15:54:45,246][85175] Updated weights for policy 1, policy_version 23760 (0.0008) +[2023-10-11 15:54:45,613][85175] Updated weights for policy 1, policy_version 23770 (0.0010) +[2023-10-11 15:54:46,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 48332800. Throughput: 0: 1667.1, 1: 1667.0. Samples: 12093166. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-11 15:54:46,063][84230] Avg episode reward: [(0, '7.470'), (1, '7.510')] +[2023-10-11 15:54:46,195][85176] Updated weights for policy 0, policy_version 23432 (0.0009) +[2023-10-11 15:54:46,578][85176] Updated weights for policy 0, policy_version 23442 (0.0009) +[2023-10-11 15:54:46,967][85176] Updated weights for policy 0, policy_version 23452 (0.0010) +[2023-10-11 15:54:49,717][85175] Updated weights for policy 1, policy_version 23780 (0.0009) +[2023-10-11 15:54:50,073][85175] Updated weights for policy 1, policy_version 23790 (0.0011) +[2023-10-11 15:54:50,448][85175] Updated weights for policy 1, policy_version 23800 (0.0011) +[2023-10-11 15:54:51,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 48398336. Throughput: 0: 1661.6, 1: 1685.7. Samples: 12102870. 
Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-11 15:54:51,063][84230] Avg episode reward: [(0, '7.660'), (1, '7.670')] +[2023-10-11 15:54:51,246][85176] Updated weights for policy 0, policy_version 23462 (0.0009) +[2023-10-11 15:54:51,616][85176] Updated weights for policy 0, policy_version 23472 (0.0008) +[2023-10-11 15:54:51,994][85176] Updated weights for policy 0, policy_version 23482 (0.0008) +[2023-10-11 15:54:54,624][85175] Updated weights for policy 1, policy_version 23810 (0.0007) +[2023-10-11 15:54:55,002][85175] Updated weights for policy 1, policy_version 23820 (0.0010) +[2023-10-11 15:54:55,362][85175] Updated weights for policy 1, policy_version 23830 (0.0007) +[2023-10-11 15:54:55,735][85175] Updated weights for policy 1, policy_version 23840 (0.0008) +[2023-10-11 15:54:56,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 48463872. Throughput: 0: 1659.7, 1: 1684.4. Samples: 12123350. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-11 15:54:56,063][84230] Avg episode reward: [(0, '7.750'), (1, '7.640')] +[2023-10-11 15:54:56,204][85176] Updated weights for policy 0, policy_version 23492 (0.0009) +[2023-10-11 15:54:56,582][85176] Updated weights for policy 0, policy_version 23502 (0.0007) +[2023-10-11 15:54:56,961][85176] Updated weights for policy 0, policy_version 23512 (0.0008) +[2023-10-11 15:54:59,544][85175] Updated weights for policy 1, policy_version 23850 (0.0008) +[2023-10-11 15:54:59,911][85175] Updated weights for policy 1, policy_version 23860 (0.0008) +[2023-10-11 15:55:00,279][85175] Updated weights for policy 1, policy_version 23870 (0.0009) +[2023-10-11 15:55:01,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 48529408. Throughput: 0: 1659.6, 1: 1662.9. Samples: 12142920. 
Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-11 15:55:01,064][84230] Avg episode reward: [(0, '7.440'), (1, '7.540')] +[2023-10-11 15:55:01,188][85176] Updated weights for policy 0, policy_version 23522 (0.0009) +[2023-10-11 15:55:01,561][85176] Updated weights for policy 0, policy_version 23532 (0.0009) +[2023-10-11 15:55:01,940][85176] Updated weights for policy 0, policy_version 23542 (0.0008) +[2023-10-11 15:55:02,313][85176] Updated weights for policy 0, policy_version 23552 (0.0009) +[2023-10-11 15:55:04,361][85175] Updated weights for policy 1, policy_version 23880 (0.0009) +[2023-10-11 15:55:04,729][85175] Updated weights for policy 1, policy_version 23890 (0.0010) +[2023-10-11 15:55:05,099][85175] Updated weights for policy 1, policy_version 23900 (0.0009) +[2023-10-11 15:55:06,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 48594944. Throughput: 0: 1661.0, 1: 1690.4. Samples: 12153316. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-11 15:55:06,063][84230] Avg episode reward: [(0, '7.040'), (1, '8.150')] +[2023-10-11 15:55:06,209][85176] Updated weights for policy 0, policy_version 23562 (0.0009) +[2023-10-11 15:55:06,586][85176] Updated weights for policy 0, policy_version 23572 (0.0008) +[2023-10-11 15:55:06,967][85176] Updated weights for policy 0, policy_version 23582 (0.0008) +[2023-10-11 15:55:09,260][85175] Updated weights for policy 1, policy_version 23910 (0.0008) +[2023-10-11 15:55:09,624][85175] Updated weights for policy 1, policy_version 23920 (0.0008) +[2023-10-11 15:55:09,982][85175] Updated weights for policy 1, policy_version 23930 (0.0009) +[2023-10-11 15:55:11,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 48660480. Throughput: 0: 1664.7, 1: 1678.4. Samples: 12173644. 
Policy #0 lag: (min: 26.0, avg: 29.1, max: 51.0) +[2023-10-11 15:55:11,064][84230] Avg episode reward: [(0, '6.970'), (1, '7.440')] +[2023-10-11 15:55:11,177][85176] Updated weights for policy 0, policy_version 23592 (0.0007) +[2023-10-11 15:55:11,558][85176] Updated weights for policy 0, policy_version 23602 (0.0008) +[2023-10-11 15:55:11,940][85176] Updated weights for policy 0, policy_version 23612 (0.0007) +[2023-10-11 15:55:14,039][85175] Updated weights for policy 1, policy_version 23940 (0.0008) +[2023-10-11 15:55:14,398][85175] Updated weights for policy 1, policy_version 23950 (0.0008) +[2023-10-11 15:55:14,772][85175] Updated weights for policy 1, policy_version 23960 (0.0010) +[2023-10-11 15:55:15,970][85176] Updated weights for policy 0, policy_version 23622 (0.0007) +[2023-10-11 15:55:16,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13329.3). Total num frames: 48726016. Throughput: 0: 1665.2, 1: 1670.0. Samples: 12193580. Policy #0 lag: (min: 26.0, avg: 29.1, max: 51.0) +[2023-10-11 15:55:16,064][84230] Avg episode reward: [(0, '7.190'), (1, '7.280')] +[2023-10-11 15:55:16,330][85176] Updated weights for policy 0, policy_version 23632 (0.0008) +[2023-10-11 15:55:16,704][85176] Updated weights for policy 0, policy_version 23642 (0.0010) +[2023-10-11 15:55:18,911][85175] Updated weights for policy 1, policy_version 23970 (0.0010) +[2023-10-11 15:55:19,275][85175] Updated weights for policy 1, policy_version 23980 (0.0009) +[2023-10-11 15:55:19,660][85175] Updated weights for policy 1, policy_version 23990 (0.0009) +[2023-10-11 15:55:20,021][85175] Updated weights for policy 1, policy_version 24000 (0.0008) +[2023-10-11 15:55:20,770][85176] Updated weights for policy 0, policy_version 23652 (0.0007) +[2023-10-11 15:55:21,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 48791552. Throughput: 0: 1663.6, 1: 1687.4. Samples: 12203968. 
Policy #0 lag: (min: 26.0, avg: 29.1, max: 51.0) +[2023-10-11 15:55:21,063][84230] Avg episode reward: [(0, '7.270'), (1, '7.890')] +[2023-10-11 15:55:21,166][85176] Updated weights for policy 0, policy_version 23662 (0.0008) +[2023-10-11 15:55:21,525][85176] Updated weights for policy 0, policy_version 23672 (0.0007) +[2023-10-11 15:55:24,089][85175] Updated weights for policy 1, policy_version 24010 (0.0009) +[2023-10-11 15:55:24,463][85175] Updated weights for policy 1, policy_version 24020 (0.0007) +[2023-10-11 15:55:24,841][85175] Updated weights for policy 1, policy_version 24030 (0.0007) +[2023-10-11 15:55:25,577][85176] Updated weights for policy 0, policy_version 23682 (0.0007) +[2023-10-11 15:55:25,947][85176] Updated weights for policy 0, policy_version 23692 (0.0009) +[2023-10-11 15:55:26,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13329.3). Total num frames: 48857088. Throughput: 0: 1665.9, 1: 1668.0. Samples: 12223472. Policy #0 lag: (min: 26.0, avg: 29.1, max: 51.0) +[2023-10-11 15:55:26,064][84230] Avg episode reward: [(0, '7.630'), (1, '7.990')] +[2023-10-11 15:55:26,320][85176] Updated weights for policy 0, policy_version 23702 (0.0007) +[2023-10-11 15:55:26,694][85176] Updated weights for policy 0, policy_version 23712 (0.0008) +[2023-10-11 15:55:28,761][85175] Updated weights for policy 1, policy_version 24040 (0.0007) +[2023-10-11 15:55:29,128][85175] Updated weights for policy 1, policy_version 24050 (0.0007) +[2023-10-11 15:55:29,507][85175] Updated weights for policy 1, policy_version 24060 (0.0009) +[2023-10-11 15:55:30,848][85176] Updated weights for policy 0, policy_version 23722 (0.0011) +[2023-10-11 15:55:31,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 48922624. Throughput: 0: 1663.0, 1: 1684.5. Samples: 12243806. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 15:55:31,063][84230] Avg episode reward: [(0, '7.440'), (1, '7.830')] +[2023-10-11 15:55:31,211][85176] Updated weights for policy 0, policy_version 23732 (0.0011) +[2023-10-11 15:55:31,587][85176] Updated weights for policy 0, policy_version 23742 (0.0009) +[2023-10-11 15:55:33,445][85175] Updated weights for policy 1, policy_version 24070 (0.0009) +[2023-10-11 15:55:33,814][85175] Updated weights for policy 1, policy_version 24080 (0.0009) +[2023-10-11 15:55:34,185][85175] Updated weights for policy 1, policy_version 24090 (0.0009) +[2023-10-11 15:55:35,823][85176] Updated weights for policy 0, policy_version 23752 (0.0010) +[2023-10-11 15:55:36,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 48988160. Throughput: 0: 1668.0, 1: 1692.6. Samples: 12254098. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 15:55:36,063][84230] Avg episode reward: [(0, '7.170'), (1, '7.800')] +[2023-10-11 15:55:36,187][85176] Updated weights for policy 0, policy_version 23762 (0.0010) +[2023-10-11 15:55:36,571][85176] Updated weights for policy 0, policy_version 23772 (0.0008) +[2023-10-11 15:55:38,188][85175] Updated weights for policy 1, policy_version 24100 (0.0009) +[2023-10-11 15:55:38,551][85175] Updated weights for policy 1, policy_version 24110 (0.0007) +[2023-10-11 15:55:38,928][85175] Updated weights for policy 1, policy_version 24120 (0.0008) +[2023-10-11 15:55:40,582][85176] Updated weights for policy 0, policy_version 23782 (0.0008) +[2023-10-11 15:55:40,958][85176] Updated weights for policy 0, policy_version 23792 (0.0008) +[2023-10-11 15:55:41,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 49053696. Throughput: 0: 1668.0, 1: 1673.2. Samples: 12273706. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 15:55:41,063][84230] Avg episode reward: [(0, '6.680'), (1, '7.640')] +[2023-10-11 15:55:41,333][85176] Updated weights for policy 0, policy_version 23802 (0.0009) +[2023-10-11 15:55:43,034][85175] Updated weights for policy 1, policy_version 24130 (0.0008) +[2023-10-11 15:55:43,408][85175] Updated weights for policy 1, policy_version 24140 (0.0008) +[2023-10-11 15:55:43,777][85175] Updated weights for policy 1, policy_version 24150 (0.0008) +[2023-10-11 15:55:44,140][85175] Updated weights for policy 1, policy_version 24160 (0.0008) +[2023-10-11 15:55:45,424][85176] Updated weights for policy 0, policy_version 23812 (0.0008) +[2023-10-11 15:55:45,794][85176] Updated weights for policy 0, policy_version 23822 (0.0008) +[2023-10-11 15:55:46,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 49119232. Throughput: 0: 1660.8, 1: 1702.9. Samples: 12294284. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 15:55:46,064][84230] Avg episode reward: [(0, '7.080'), (1, '7.630')] +[2023-10-11 15:55:46,177][85176] Updated weights for policy 0, policy_version 23832 (0.0010) +[2023-10-11 15:55:48,073][85175] Updated weights for policy 1, policy_version 24170 (0.0010) +[2023-10-11 15:55:48,431][85175] Updated weights for policy 1, policy_version 24180 (0.0008) +[2023-10-11 15:55:48,803][85175] Updated weights for policy 1, policy_version 24190 (0.0009) +[2023-10-11 15:55:50,169][85176] Updated weights for policy 0, policy_version 23842 (0.0010) +[2023-10-11 15:55:50,539][85176] Updated weights for policy 0, policy_version 23852 (0.0008) +[2023-10-11 15:55:50,900][85176] Updated weights for policy 0, policy_version 23862 (0.0007) +[2023-10-11 15:55:51,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 49184768. Throughput: 0: 1667.9, 1: 1685.1. Samples: 12304200. 
Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) +[2023-10-11 15:55:51,063][84230] Avg episode reward: [(0, '7.150'), (1, '7.520')] +[2023-10-11 15:55:51,277][85176] Updated weights for policy 0, policy_version 23872 (0.0008) +[2023-10-11 15:55:52,742][85175] Updated weights for policy 1, policy_version 24200 (0.0008) +[2023-10-11 15:55:53,108][85175] Updated weights for policy 1, policy_version 24210 (0.0010) +[2023-10-11 15:55:53,481][85175] Updated weights for policy 1, policy_version 24220 (0.0008) +[2023-10-11 15:55:55,406][85176] Updated weights for policy 0, policy_version 23882 (0.0009) +[2023-10-11 15:55:55,785][85176] Updated weights for policy 0, policy_version 23892 (0.0008) +[2023-10-11 15:55:56,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 49250304. Throughput: 0: 1663.0, 1: 1691.7. Samples: 12324604. Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) +[2023-10-11 15:55:56,063][84230] Avg episode reward: [(0, '7.190'), (1, '7.560')] +[2023-10-11 15:55:56,154][85176] Updated weights for policy 0, policy_version 23902 (0.0009) +[2023-10-11 15:55:57,415][85175] Updated weights for policy 1, policy_version 24230 (0.0007) +[2023-10-11 15:55:57,789][85175] Updated weights for policy 1, policy_version 24240 (0.0010) +[2023-10-11 15:55:58,148][85175] Updated weights for policy 1, policy_version 24250 (0.0007) +[2023-10-11 15:56:00,183][85176] Updated weights for policy 0, policy_version 23912 (0.0007) +[2023-10-11 15:56:00,546][85176] Updated weights for policy 0, policy_version 23922 (0.0007) +[2023-10-11 15:56:00,923][85176] Updated weights for policy 0, policy_version 23932 (0.0007) +[2023-10-11 15:56:01,063][84230] Fps is (10 sec: 16383.8, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 49348608. Throughput: 0: 1649.8, 1: 1710.4. Samples: 12344788. 
Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) +[2023-10-11 15:56:01,064][84230] Avg episode reward: [(0, '7.890'), (1, '7.700')] +[2023-10-11 15:56:02,149][85175] Updated weights for policy 1, policy_version 24260 (0.0008) +[2023-10-11 15:56:02,525][85175] Updated weights for policy 1, policy_version 24270 (0.0007) +[2023-10-11 15:56:02,886][85175] Updated weights for policy 1, policy_version 24280 (0.0008) +[2023-10-11 15:56:05,152][85176] Updated weights for policy 0, policy_version 23942 (0.0007) +[2023-10-11 15:56:05,531][85176] Updated weights for policy 0, policy_version 23952 (0.0007) +[2023-10-11 15:56:05,902][85176] Updated weights for policy 0, policy_version 23962 (0.0007) +[2023-10-11 15:56:06,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 49381376. Throughput: 0: 1666.2, 1: 1682.7. Samples: 12354668. Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) +[2023-10-11 15:56:06,063][84230] Avg episode reward: [(0, '7.520'), (1, '7.460')] +[2023-10-11 15:56:06,831][85175] Updated weights for policy 1, policy_version 24290 (0.0008) +[2023-10-11 15:56:07,201][85175] Updated weights for policy 1, policy_version 24300 (0.0007) +[2023-10-11 15:56:07,565][85175] Updated weights for policy 1, policy_version 24310 (0.0007) +[2023-10-11 15:56:07,934][85175] Updated weights for policy 1, policy_version 24320 (0.0007) +[2023-10-11 15:56:10,084][85176] Updated weights for policy 0, policy_version 23972 (0.0009) +[2023-10-11 15:56:10,467][85176] Updated weights for policy 0, policy_version 23982 (0.0010) +[2023-10-11 15:56:10,842][85176] Updated weights for policy 0, policy_version 23992 (0.0010) +[2023-10-11 15:56:11,062][84230] Fps is (10 sec: 9830.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 49446912. Throughput: 0: 1664.6, 1: 1708.9. Samples: 12375282. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 15:56:11,063][84230] Avg episode reward: [(0, '7.290'), (1, '7.670')] +[2023-10-11 15:56:12,045][85175] Updated weights for policy 1, policy_version 24330 (0.0010) +[2023-10-11 15:56:12,405][85175] Updated weights for policy 1, policy_version 24340 (0.0008) +[2023-10-11 15:56:12,781][85175] Updated weights for policy 1, policy_version 24350 (0.0010) +[2023-10-11 15:56:14,901][85176] Updated weights for policy 0, policy_version 24002 (0.0009) +[2023-10-11 15:56:15,268][85176] Updated weights for policy 0, policy_version 24012 (0.0010) +[2023-10-11 15:56:15,638][85176] Updated weights for policy 0, policy_version 24022 (0.0009) +[2023-10-11 15:56:16,009][85176] Updated weights for policy 0, policy_version 24032 (0.0009) +[2023-10-11 15:56:16,063][84230] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 49545216. Throughput: 0: 1647.5, 1: 1713.2. Samples: 12395040. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 15:56:16,064][84230] Avg episode reward: [(0, '7.310'), (1, '7.420')] +[2023-10-11 15:56:16,073][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000024032_24608768.pth... +[2023-10-11 15:56:16,074][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000024352_24936448.pth... 
+[2023-10-11 15:56:16,103][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000022464_23003136.pth +[2023-10-11 15:56:16,109][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000022784_23330816.pth +[2023-10-11 15:56:16,924][85175] Updated weights for policy 1, policy_version 24360 (0.0008) +[2023-10-11 15:56:17,293][85175] Updated weights for policy 1, policy_version 24370 (0.0008) +[2023-10-11 15:56:17,669][85175] Updated weights for policy 1, policy_version 24380 (0.0009) +[2023-10-11 15:56:20,298][85176] Updated weights for policy 0, policy_version 24042 (0.0007) +[2023-10-11 15:56:20,675][85176] Updated weights for policy 0, policy_version 24052 (0.0009) +[2023-10-11 15:56:21,049][85176] Updated weights for policy 0, policy_version 24062 (0.0009) +[2023-10-11 15:56:21,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 49577984. Throughput: 0: 1660.4, 1: 1688.4. Samples: 12404796. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 15:56:21,063][84230] Avg episode reward: [(0, '6.960'), (1, '7.550')] +[2023-10-11 15:56:21,718][85175] Updated weights for policy 1, policy_version 24390 (0.0008) +[2023-10-11 15:56:22,085][85175] Updated weights for policy 1, policy_version 24400 (0.0009) +[2023-10-11 15:56:22,453][85175] Updated weights for policy 1, policy_version 24410 (0.0009) +[2023-10-11 15:56:25,294][85176] Updated weights for policy 0, policy_version 24072 (0.0007) +[2023-10-11 15:56:25,666][85176] Updated weights for policy 0, policy_version 24082 (0.0008) +[2023-10-11 15:56:26,053][85176] Updated weights for policy 0, policy_version 24092 (0.0008) +[2023-10-11 15:56:26,063][84230] Fps is (10 sec: 9830.5, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 49643520. Throughput: 0: 1660.1, 1: 1711.8. Samples: 12425440. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 15:56:26,064][84230] Avg episode reward: [(0, '7.170'), (1, '7.280')] +[2023-10-11 15:56:26,435][85175] Updated weights for policy 1, policy_version 24420 (0.0008) +[2023-10-11 15:56:26,799][85175] Updated weights for policy 1, policy_version 24430 (0.0008) +[2023-10-11 15:56:27,166][85175] Updated weights for policy 1, policy_version 24440 (0.0008) +[2023-10-11 15:56:30,010][85176] Updated weights for policy 0, policy_version 24102 (0.0009) +[2023-10-11 15:56:30,382][85176] Updated weights for policy 0, policy_version 24112 (0.0010) +[2023-10-11 15:56:30,769][85176] Updated weights for policy 0, policy_version 24122 (0.0009) +[2023-10-11 15:56:31,063][84230] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 49741824. Throughput: 0: 1654.8, 1: 1708.7. Samples: 12445640. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:56:31,064][84230] Avg episode reward: [(0, '7.660'), (1, '7.550')] +[2023-10-11 15:56:31,207][85175] Updated weights for policy 1, policy_version 24450 (0.0010) +[2023-10-11 15:56:31,568][85175] Updated weights for policy 1, policy_version 24460 (0.0010) +[2023-10-11 15:56:31,941][85175] Updated weights for policy 1, policy_version 24470 (0.0007) +[2023-10-11 15:56:32,307][85175] Updated weights for policy 1, policy_version 24480 (0.0008) +[2023-10-11 15:56:34,900][85176] Updated weights for policy 0, policy_version 24132 (0.0007) +[2023-10-11 15:56:35,273][85176] Updated weights for policy 0, policy_version 24142 (0.0007) +[2023-10-11 15:56:35,648][85176] Updated weights for policy 0, policy_version 24152 (0.0007) +[2023-10-11 15:56:36,062][84230] Fps is (10 sec: 16384.4, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 49807360. Throughput: 0: 1661.9, 1: 1696.1. Samples: 12455310. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:56:36,063][84230] Avg episode reward: [(0, '7.340'), (1, '7.460')] +[2023-10-11 15:56:36,336][85175] Updated weights for policy 1, policy_version 24490 (0.0007) +[2023-10-11 15:56:36,706][85175] Updated weights for policy 1, policy_version 24500 (0.0008) +[2023-10-11 15:56:37,072][85175] Updated weights for policy 1, policy_version 24510 (0.0007) +[2023-10-11 15:56:39,656][85176] Updated weights for policy 0, policy_version 24162 (0.0007) +[2023-10-11 15:56:40,034][85176] Updated weights for policy 0, policy_version 24172 (0.0008) +[2023-10-11 15:56:40,407][85176] Updated weights for policy 0, policy_version 24182 (0.0009) +[2023-10-11 15:56:40,778][85176] Updated weights for policy 0, policy_version 24192 (0.0008) +[2023-10-11 15:56:41,023][85175] Updated weights for policy 1, policy_version 24520 (0.0008) +[2023-10-11 15:56:41,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 49872896. Throughput: 0: 1660.4, 1: 1698.0. Samples: 12475732. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:56:41,063][84230] Avg episode reward: [(0, '7.350'), (1, '7.830')] +[2023-10-11 15:56:41,393][85175] Updated weights for policy 1, policy_version 24530 (0.0007) +[2023-10-11 15:56:41,759][85175] Updated weights for policy 1, policy_version 24540 (0.0007) +[2023-10-11 15:56:44,729][85176] Updated weights for policy 0, policy_version 24202 (0.0007) +[2023-10-11 15:56:45,098][85176] Updated weights for policy 0, policy_version 24212 (0.0007) +[2023-10-11 15:56:45,473][85176] Updated weights for policy 0, policy_version 24222 (0.0008) +[2023-10-11 15:56:45,771][85175] Updated weights for policy 1, policy_version 24550 (0.0008) +[2023-10-11 15:56:46,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13329.3). Total num frames: 49938432. Throughput: 0: 1644.4, 1: 1698.3. Samples: 12495212. 
Policy #0 lag: (min: 6.0, avg: 28.1, max: 32.0) +[2023-10-11 15:56:46,063][84230] Avg episode reward: [(0, '7.570'), (1, '7.220')] +[2023-10-11 15:56:46,134][85175] Updated weights for policy 1, policy_version 24560 (0.0009) +[2023-10-11 15:56:46,496][85175] Updated weights for policy 1, policy_version 24570 (0.0007) +[2023-10-11 15:56:49,682][85176] Updated weights for policy 0, policy_version 24232 (0.0008) +[2023-10-11 15:56:50,054][85176] Updated weights for policy 0, policy_version 24242 (0.0007) +[2023-10-11 15:56:50,427][85176] Updated weights for policy 0, policy_version 24252 (0.0008) +[2023-10-11 15:56:50,668][85175] Updated weights for policy 1, policy_version 24580 (0.0009) +[2023-10-11 15:56:51,047][85175] Updated weights for policy 1, policy_version 24590 (0.0010) +[2023-10-11 15:56:51,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 50003968. Throughput: 0: 1656.5, 1: 1691.3. Samples: 12505320. Policy #0 lag: (min: 6.0, avg: 28.1, max: 32.0) +[2023-10-11 15:56:51,063][84230] Avg episode reward: [(0, '7.500'), (1, '7.390')] +[2023-10-11 15:56:51,417][85175] Updated weights for policy 1, policy_version 24600 (0.0008) +[2023-10-11 15:56:54,532][85176] Updated weights for policy 0, policy_version 24262 (0.0008) +[2023-10-11 15:56:54,909][85176] Updated weights for policy 0, policy_version 24272 (0.0009) +[2023-10-11 15:56:55,294][85176] Updated weights for policy 0, policy_version 24282 (0.0008) +[2023-10-11 15:56:55,532][85175] Updated weights for policy 1, policy_version 24610 (0.0008) +[2023-10-11 15:56:55,892][85175] Updated weights for policy 1, policy_version 24620 (0.0008) +[2023-10-11 15:56:56,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 50069504. Throughput: 0: 1656.0, 1: 1686.1. Samples: 12525678. 
Policy #0 lag: (min: 6.0, avg: 28.1, max: 32.0) +[2023-10-11 15:56:56,063][84230] Avg episode reward: [(0, '8.150'), (1, '8.210')] +[2023-10-11 15:56:56,262][85175] Updated weights for policy 1, policy_version 24630 (0.0007) +[2023-10-11 15:56:56,628][85175] Updated weights for policy 1, policy_version 24640 (0.0009) +[2023-10-11 15:56:59,516][85176] Updated weights for policy 0, policy_version 24292 (0.0009) +[2023-10-11 15:56:59,888][85176] Updated weights for policy 0, policy_version 24302 (0.0010) +[2023-10-11 15:57:00,260][85176] Updated weights for policy 0, policy_version 24312 (0.0010) +[2023-10-11 15:57:00,748][85175] Updated weights for policy 1, policy_version 24650 (0.0008) +[2023-10-11 15:57:01,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 50135040. Throughput: 0: 1650.8, 1: 1683.1. Samples: 12545064. Policy #0 lag: (min: 6.0, avg: 28.1, max: 32.0) +[2023-10-11 15:57:01,063][84230] Avg episode reward: [(0, '7.760'), (1, '7.950')] +[2023-10-11 15:57:01,114][85175] Updated weights for policy 1, policy_version 24660 (0.0010) +[2023-10-11 15:57:01,488][85175] Updated weights for policy 1, policy_version 24670 (0.0007) +[2023-10-11 15:57:04,488][85176] Updated weights for policy 0, policy_version 24322 (0.0009) +[2023-10-11 15:57:04,864][85176] Updated weights for policy 0, policy_version 24332 (0.0009) +[2023-10-11 15:57:05,236][85176] Updated weights for policy 0, policy_version 24342 (0.0007) +[2023-10-11 15:57:05,514][85175] Updated weights for policy 1, policy_version 24680 (0.0008) +[2023-10-11 15:57:05,612][85176] Updated weights for policy 0, policy_version 24352 (0.0007) +[2023-10-11 15:57:05,898][85175] Updated weights for policy 1, policy_version 24690 (0.0009) +[2023-10-11 15:57:06,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 50200576. Throughput: 0: 1662.7, 1: 1685.6. Samples: 12555468. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:57:06,063][84230] Avg episode reward: [(0, '7.440'), (1, '7.360')] +[2023-10-11 15:57:06,257][85175] Updated weights for policy 1, policy_version 24700 (0.0008) +[2023-10-11 15:57:09,783][85176] Updated weights for policy 0, policy_version 24362 (0.0008) +[2023-10-11 15:57:10,164][85176] Updated weights for policy 0, policy_version 24372 (0.0009) +[2023-10-11 15:57:10,241][85175] Updated weights for policy 1, policy_version 24710 (0.0007) +[2023-10-11 15:57:10,528][85176] Updated weights for policy 0, policy_version 24382 (0.0008) +[2023-10-11 15:57:10,612][85175] Updated weights for policy 1, policy_version 24720 (0.0008) +[2023-10-11 15:57:10,985][85175] Updated weights for policy 1, policy_version 24730 (0.0009) +[2023-10-11 15:57:11,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 50266112. Throughput: 0: 1658.2, 1: 1685.2. Samples: 12575892. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:57:11,064][84230] Avg episode reward: [(0, '6.910'), (1, '6.910')] +[2023-10-11 15:57:14,620][85176] Updated weights for policy 0, policy_version 24392 (0.0008) +[2023-10-11 15:57:14,935][85175] Updated weights for policy 1, policy_version 24740 (0.0008) +[2023-10-11 15:57:14,991][85176] Updated weights for policy 0, policy_version 24402 (0.0007) +[2023-10-11 15:57:15,300][85175] Updated weights for policy 1, policy_version 24750 (0.0008) +[2023-10-11 15:57:15,363][85176] Updated weights for policy 0, policy_version 24412 (0.0008) +[2023-10-11 15:57:15,681][85175] Updated weights for policy 1, policy_version 24760 (0.0009) +[2023-10-11 15:57:16,063][84230] Fps is (10 sec: 16383.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 50364416. Throughput: 0: 1648.2, 1: 1669.4. Samples: 12594930. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:57:16,064][84230] Avg episode reward: [(0, '7.080'), (1, '7.250')] +[2023-10-11 15:57:19,427][85176] Updated weights for policy 0, policy_version 24422 (0.0009) +[2023-10-11 15:57:19,650][85175] Updated weights for policy 1, policy_version 24770 (0.0008) +[2023-10-11 15:57:19,790][85176] Updated weights for policy 0, policy_version 24432 (0.0008) +[2023-10-11 15:57:20,009][85175] Updated weights for policy 1, policy_version 24780 (0.0007) +[2023-10-11 15:57:20,169][85176] Updated weights for policy 0, policy_version 24442 (0.0008) +[2023-10-11 15:57:20,379][85175] Updated weights for policy 1, policy_version 24790 (0.0007) +[2023-10-11 15:57:20,747][85175] Updated weights for policy 1, policy_version 24800 (0.0007) +[2023-10-11 15:57:21,063][84230] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 50429952. Throughput: 0: 1662.3, 1: 1691.5. Samples: 12606230. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:57:21,064][84230] Avg episode reward: [(0, '7.880'), (1, '7.860')] +[2023-10-11 15:57:24,498][85176] Updated weights for policy 0, policy_version 24452 (0.0007) +[2023-10-11 15:57:24,807][85175] Updated weights for policy 1, policy_version 24810 (0.0008) +[2023-10-11 15:57:24,867][85176] Updated weights for policy 0, policy_version 24462 (0.0007) +[2023-10-11 15:57:25,177][85175] Updated weights for policy 1, policy_version 24820 (0.0008) +[2023-10-11 15:57:25,232][85176] Updated weights for policy 0, policy_version 24472 (0.0008) +[2023-10-11 15:57:25,542][85175] Updated weights for policy 1, policy_version 24830 (0.0007) +[2023-10-11 15:57:26,062][84230] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 50495488. Throughput: 0: 1653.9, 1: 1694.7. Samples: 12626420. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:57:26,063][84230] Avg episode reward: [(0, '8.030'), (1, '8.100')] +[2023-10-11 15:57:29,353][85176] Updated weights for policy 0, policy_version 24482 (0.0007) +[2023-10-11 15:57:29,652][85175] Updated weights for policy 1, policy_version 24840 (0.0008) +[2023-10-11 15:57:29,727][85176] Updated weights for policy 0, policy_version 24492 (0.0007) +[2023-10-11 15:57:30,023][85175] Updated weights for policy 1, policy_version 24850 (0.0008) +[2023-10-11 15:57:30,108][85176] Updated weights for policy 0, policy_version 24502 (0.0007) +[2023-10-11 15:57:30,400][85175] Updated weights for policy 1, policy_version 24860 (0.0008) +[2023-10-11 15:57:30,471][85176] Updated weights for policy 0, policy_version 24512 (0.0009) +[2023-10-11 15:57:31,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 50561024. Throughput: 0: 1657.3, 1: 1670.9. Samples: 12644984. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:57:31,064][84230] Avg episode reward: [(0, '7.810'), (1, '7.750')] +[2023-10-11 15:57:34,338][85175] Updated weights for policy 1, policy_version 24870 (0.0009) +[2023-10-11 15:57:34,587][85176] Updated weights for policy 0, policy_version 24522 (0.0007) +[2023-10-11 15:57:34,713][85175] Updated weights for policy 1, policy_version 24880 (0.0008) +[2023-10-11 15:57:34,964][85176] Updated weights for policy 0, policy_version 24532 (0.0010) +[2023-10-11 15:57:35,074][85175] Updated weights for policy 1, policy_version 24890 (0.0008) +[2023-10-11 15:57:35,329][85176] Updated weights for policy 0, policy_version 24542 (0.0008) +[2023-10-11 15:57:36,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 50626560. Throughput: 0: 1656.0, 1: 1705.0. Samples: 12656566. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:57:36,064][84230] Avg episode reward: [(0, '7.420'), (1, '7.520')] +[2023-10-11 15:57:39,188][85175] Updated weights for policy 1, policy_version 24900 (0.0007) +[2023-10-11 15:57:39,483][85176] Updated weights for policy 0, policy_version 24552 (0.0009) +[2023-10-11 15:57:39,555][85175] Updated weights for policy 1, policy_version 24910 (0.0007) +[2023-10-11 15:57:39,849][85176] Updated weights for policy 0, policy_version 24562 (0.0009) +[2023-10-11 15:57:39,920][85175] Updated weights for policy 1, policy_version 24920 (0.0008) +[2023-10-11 15:57:40,219][85176] Updated weights for policy 0, policy_version 24572 (0.0008) +[2023-10-11 15:57:41,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 50692096. Throughput: 0: 1648.1, 1: 1693.3. Samples: 12676042. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 15:57:41,063][84230] Avg episode reward: [(0, '7.120'), (1, '7.220')] +[2023-10-11 15:57:43,939][85175] Updated weights for policy 1, policy_version 24930 (0.0008) +[2023-10-11 15:57:44,298][85175] Updated weights for policy 1, policy_version 24940 (0.0008) +[2023-10-11 15:57:44,419][85176] Updated weights for policy 0, policy_version 24582 (0.0009) +[2023-10-11 15:57:44,666][85175] Updated weights for policy 1, policy_version 24950 (0.0007) +[2023-10-11 15:57:44,795][85176] Updated weights for policy 0, policy_version 24592 (0.0008) +[2023-10-11 15:57:45,030][85175] Updated weights for policy 1, policy_version 24960 (0.0007) +[2023-10-11 15:57:45,165][85176] Updated weights for policy 0, policy_version 24602 (0.0008) +[2023-10-11 15:57:46,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 50757632. Throughput: 0: 1650.0, 1: 1678.6. Samples: 12694852. 
Policy #0 lag: (min: 19.0, avg: 25.8, max: 51.0) +[2023-10-11 15:57:46,064][84230] Avg episode reward: [(0, '6.960'), (1, '7.210')] +[2023-10-11 15:57:49,267][85175] Updated weights for policy 1, policy_version 24970 (0.0008) +[2023-10-11 15:57:49,281][85176] Updated weights for policy 0, policy_version 24612 (0.0009) +[2023-10-11 15:57:49,635][85175] Updated weights for policy 1, policy_version 24980 (0.0010) +[2023-10-11 15:57:49,657][85176] Updated weights for policy 0, policy_version 24622 (0.0007) +[2023-10-11 15:57:50,005][85175] Updated weights for policy 1, policy_version 24990 (0.0008) +[2023-10-11 15:57:50,018][85176] Updated weights for policy 0, policy_version 24632 (0.0008) +[2023-10-11 15:57:51,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 50823168. Throughput: 0: 1651.7, 1: 1704.3. Samples: 12706490. Policy #0 lag: (min: 19.0, avg: 25.8, max: 51.0) +[2023-10-11 15:57:51,063][84230] Avg episode reward: [(0, '7.330'), (1, '7.750')] +[2023-10-11 15:57:53,927][85176] Updated weights for policy 0, policy_version 24642 (0.0008) +[2023-10-11 15:57:54,063][85175] Updated weights for policy 1, policy_version 25000 (0.0008) +[2023-10-11 15:57:54,290][85176] Updated weights for policy 0, policy_version 24652 (0.0007) +[2023-10-11 15:57:54,435][85175] Updated weights for policy 1, policy_version 25010 (0.0008) +[2023-10-11 15:57:54,667][85176] Updated weights for policy 0, policy_version 24662 (0.0007) +[2023-10-11 15:57:54,797][85175] Updated weights for policy 1, policy_version 25020 (0.0007) +[2023-10-11 15:57:55,029][85176] Updated weights for policy 0, policy_version 24672 (0.0009) +[2023-10-11 15:57:56,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 50888704. Throughput: 0: 1646.1, 1: 1678.5. Samples: 12725498. 
Policy #0 lag: (min: 19.0, avg: 25.8, max: 51.0) +[2023-10-11 15:57:56,063][84230] Avg episode reward: [(0, '7.400'), (1, '7.560')] +[2023-10-11 15:57:58,714][85175] Updated weights for policy 1, policy_version 25030 (0.0008) +[2023-10-11 15:57:59,091][85175] Updated weights for policy 1, policy_version 25040 (0.0008) +[2023-10-11 15:57:59,103][85176] Updated weights for policy 0, policy_version 24682 (0.0009) +[2023-10-11 15:57:59,454][85175] Updated weights for policy 1, policy_version 25050 (0.0008) +[2023-10-11 15:57:59,468][85176] Updated weights for policy 0, policy_version 24692 (0.0008) +[2023-10-11 15:57:59,847][85176] Updated weights for policy 0, policy_version 24702 (0.0007) +[2023-10-11 15:58:01,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 50954240. Throughput: 0: 1656.9, 1: 1686.9. Samples: 12745402. Policy #0 lag: (min: 19.0, avg: 25.8, max: 51.0) +[2023-10-11 15:58:01,064][84230] Avg episode reward: [(0, '7.980'), (1, '7.530')] +[2023-10-11 15:58:03,570][85175] Updated weights for policy 1, policy_version 25060 (0.0008) +[2023-10-11 15:58:03,941][85175] Updated weights for policy 1, policy_version 25070 (0.0008) +[2023-10-11 15:58:03,960][85176] Updated weights for policy 0, policy_version 24712 (0.0007) +[2023-10-11 15:58:04,303][85175] Updated weights for policy 1, policy_version 25080 (0.0009) +[2023-10-11 15:58:04,325][85176] Updated weights for policy 0, policy_version 24722 (0.0009) +[2023-10-11 15:58:04,696][85176] Updated weights for policy 0, policy_version 24732 (0.0008) +[2023-10-11 15:58:06,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 51019776. Throughput: 0: 1656.1, 1: 1692.2. Samples: 12756902. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 15:58:06,063][84230] Avg episode reward: [(0, '7.480'), (1, '7.760')] +[2023-10-11 15:58:08,451][85175] Updated weights for policy 1, policy_version 25090 (0.0008) +[2023-10-11 15:58:08,737][85176] Updated weights for policy 0, policy_version 24742 (0.0009) +[2023-10-11 15:58:08,824][85175] Updated weights for policy 1, policy_version 25100 (0.0007) +[2023-10-11 15:58:09,103][85176] Updated weights for policy 0, policy_version 24752 (0.0008) +[2023-10-11 15:58:09,193][85175] Updated weights for policy 1, policy_version 25110 (0.0009) +[2023-10-11 15:58:09,475][85176] Updated weights for policy 0, policy_version 24762 (0.0008) +[2023-10-11 15:58:09,560][85175] Updated weights for policy 1, policy_version 25120 (0.0007) +[2023-10-11 15:58:11,063][84230] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 51085312. Throughput: 0: 1642.0, 1: 1666.0. Samples: 12775280. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 15:58:11,063][84230] Avg episode reward: [(0, '8.110'), (1, '7.520')] +[2023-10-11 15:58:13,394][85175] Updated weights for policy 1, policy_version 25130 (0.0008) +[2023-10-11 15:58:13,630][85176] Updated weights for policy 0, policy_version 24772 (0.0008) +[2023-10-11 15:58:13,773][85175] Updated weights for policy 1, policy_version 25140 (0.0008) +[2023-10-11 15:58:14,007][85176] Updated weights for policy 0, policy_version 24782 (0.0010) +[2023-10-11 15:58:14,135][85175] Updated weights for policy 1, policy_version 25150 (0.0008) +[2023-10-11 15:58:14,377][85176] Updated weights for policy 0, policy_version 24792 (0.0008) +[2023-10-11 15:58:16,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 51150848. Throughput: 0: 1661.8, 1: 1688.9. Samples: 12795766. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 15:58:16,063][84230] Avg episode reward: [(0, '8.430'), (1, '7.350')] +[2023-10-11 15:58:16,070][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000024800_25395200.pth... +[2023-10-11 15:58:16,070][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000025152_25755648.pth... +[2023-10-11 15:58:16,100][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000023584_24150016.pth +[2023-10-11 15:58:16,106][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000023232_23789568.pth +[2023-10-11 15:58:18,178][85175] Updated weights for policy 1, policy_version 25160 (0.0007) +[2023-10-11 15:58:18,390][85176] Updated weights for policy 0, policy_version 24802 (0.0008) +[2023-10-11 15:58:18,539][85175] Updated weights for policy 1, policy_version 25170 (0.0009) +[2023-10-11 15:58:18,764][85176] Updated weights for policy 0, policy_version 24812 (0.0008) +[2023-10-11 15:58:18,903][85175] Updated weights for policy 1, policy_version 25180 (0.0008) +[2023-10-11 15:58:19,137][85176] Updated weights for policy 0, policy_version 24822 (0.0011) +[2023-10-11 15:58:19,503][85176] Updated weights for policy 0, policy_version 24832 (0.0010) +[2023-10-11 15:58:21,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 51216384. Throughput: 0: 1656.4, 1: 1675.8. Samples: 12806512. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 15:58:21,064][84230] Avg episode reward: [(0, '7.580'), (1, '7.200')] +[2023-10-11 15:58:22,913][85175] Updated weights for policy 1, policy_version 25190 (0.0010) +[2023-10-11 15:58:23,289][85175] Updated weights for policy 1, policy_version 25200 (0.0010) +[2023-10-11 15:58:23,657][85175] Updated weights for policy 1, policy_version 25210 (0.0008) +[2023-10-11 15:58:23,811][85176] Updated weights for policy 0, policy_version 24842 (0.0009) +[2023-10-11 15:58:24,180][85176] Updated weights for policy 0, policy_version 24852 (0.0008) +[2023-10-11 15:58:24,560][85176] Updated weights for policy 0, policy_version 24862 (0.0007) +[2023-10-11 15:58:26,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 51281920. Throughput: 0: 1646.2, 1: 1680.3. Samples: 12825732. Policy #0 lag: (min: 31.0, avg: 37.5, max: 63.0) +[2023-10-11 15:58:26,063][84230] Avg episode reward: [(0, '7.670'), (1, '7.510')] +[2023-10-11 15:58:27,592][85175] Updated weights for policy 1, policy_version 25220 (0.0009) +[2023-10-11 15:58:27,956][85175] Updated weights for policy 1, policy_version 25230 (0.0010) +[2023-10-11 15:58:28,320][85175] Updated weights for policy 1, policy_version 25240 (0.0007) +[2023-10-11 15:58:28,412][85176] Updated weights for policy 0, policy_version 24872 (0.0009) +[2023-10-11 15:58:28,784][85176] Updated weights for policy 0, policy_version 24882 (0.0007) +[2023-10-11 15:58:29,157][85176] Updated weights for policy 0, policy_version 24892 (0.0010) +[2023-10-11 15:58:31,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 51347456. Throughput: 0: 1667.9, 1: 1701.2. Samples: 12846464. 
Policy #0 lag: (min: 31.0, avg: 37.5, max: 63.0) +[2023-10-11 15:58:31,064][84230] Avg episode reward: [(0, '7.880'), (1, '7.800')] +[2023-10-11 15:58:32,267][85175] Updated weights for policy 1, policy_version 25250 (0.0008) +[2023-10-11 15:58:32,638][85175] Updated weights for policy 1, policy_version 25260 (0.0010) +[2023-10-11 15:58:33,010][85175] Updated weights for policy 1, policy_version 25270 (0.0007) +[2023-10-11 15:58:33,373][85175] Updated weights for policy 1, policy_version 25280 (0.0007) +[2023-10-11 15:58:33,608][85176] Updated weights for policy 0, policy_version 24902 (0.0010) +[2023-10-11 15:58:33,997][85176] Updated weights for policy 0, policy_version 24912 (0.0010) +[2023-10-11 15:58:34,363][85176] Updated weights for policy 0, policy_version 24922 (0.0008) +[2023-10-11 15:58:36,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 51412992. Throughput: 0: 1661.9, 1: 1672.2. Samples: 12856524. Policy #0 lag: (min: 31.0, avg: 37.5, max: 63.0) +[2023-10-11 15:58:36,063][84230] Avg episode reward: [(0, '7.990'), (1, '8.200')] +[2023-10-11 15:58:37,622][85175] Updated weights for policy 1, policy_version 25290 (0.0010) +[2023-10-11 15:58:37,991][85175] Updated weights for policy 1, policy_version 25300 (0.0009) +[2023-10-11 15:58:38,368][85175] Updated weights for policy 1, policy_version 25310 (0.0009) +[2023-10-11 15:58:38,506][85176] Updated weights for policy 0, policy_version 24932 (0.0007) +[2023-10-11 15:58:38,888][85176] Updated weights for policy 0, policy_version 24942 (0.0009) +[2023-10-11 15:58:39,257][85176] Updated weights for policy 0, policy_version 24952 (0.0011) +[2023-10-11 15:58:41,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 51478528. Throughput: 0: 1653.8, 1: 1691.3. Samples: 12876026. 
Policy #0 lag: (min: 31.0, avg: 37.5, max: 63.0) +[2023-10-11 15:58:41,063][84230] Avg episode reward: [(0, '7.830'), (1, '7.670')] +[2023-10-11 15:58:42,416][85175] Updated weights for policy 1, policy_version 25320 (0.0009) +[2023-10-11 15:58:42,786][85175] Updated weights for policy 1, policy_version 25330 (0.0007) +[2023-10-11 15:58:43,110][85176] Updated weights for policy 0, policy_version 24962 (0.0010) +[2023-10-11 15:58:43,154][85175] Updated weights for policy 1, policy_version 25340 (0.0008) +[2023-10-11 15:58:43,482][85176] Updated weights for policy 0, policy_version 24972 (0.0009) +[2023-10-11 15:58:43,855][85176] Updated weights for policy 0, policy_version 24982 (0.0008) +[2023-10-11 15:58:44,229][85176] Updated weights for policy 0, policy_version 24992 (0.0008) +[2023-10-11 15:58:46,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 51544064. Throughput: 0: 1665.8, 1: 1694.1. Samples: 12896596. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-10-11 15:58:46,064][84230] Avg episode reward: [(0, '7.600'), (1, '7.330')] +[2023-10-11 15:58:47,136][85175] Updated weights for policy 1, policy_version 25350 (0.0008) +[2023-10-11 15:58:47,506][85175] Updated weights for policy 1, policy_version 25360 (0.0008) +[2023-10-11 15:58:47,872][85175] Updated weights for policy 1, policy_version 25370 (0.0010) +[2023-10-11 15:58:48,440][85176] Updated weights for policy 0, policy_version 25002 (0.0009) +[2023-10-11 15:58:48,814][85176] Updated weights for policy 0, policy_version 25012 (0.0009) +[2023-10-11 15:58:49,190][85176] Updated weights for policy 0, policy_version 25022 (0.0010) +[2023-10-11 15:58:51,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 51609600. Throughput: 0: 1653.3, 1: 1670.1. Samples: 12906458. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-10-11 15:58:51,063][84230] Avg episode reward: [(0, '7.020'), (1, '7.540')] +[2023-10-11 15:58:51,951][85175] Updated weights for policy 1, policy_version 25380 (0.0008) +[2023-10-11 15:58:52,319][85175] Updated weights for policy 1, policy_version 25390 (0.0010) +[2023-10-11 15:58:52,682][85175] Updated weights for policy 1, policy_version 25400 (0.0009) +[2023-10-11 15:58:53,432][85176] Updated weights for policy 0, policy_version 25032 (0.0008) +[2023-10-11 15:58:53,810][85176] Updated weights for policy 0, policy_version 25042 (0.0008) +[2023-10-11 15:58:54,177][85176] Updated weights for policy 0, policy_version 25052 (0.0008) +[2023-10-11 15:58:56,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 51675136. Throughput: 0: 1660.3, 1: 1695.3. Samples: 12926282. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-10-11 15:58:56,063][84230] Avg episode reward: [(0, '7.350'), (1, '7.280')] +[2023-10-11 15:58:56,850][85175] Updated weights for policy 1, policy_version 25410 (0.0008) +[2023-10-11 15:58:57,214][85175] Updated weights for policy 1, policy_version 25420 (0.0008) +[2023-10-11 15:58:57,578][85175] Updated weights for policy 1, policy_version 25430 (0.0007) +[2023-10-11 15:58:57,943][85175] Updated weights for policy 1, policy_version 25440 (0.0007) +[2023-10-11 15:58:58,370][85176] Updated weights for policy 0, policy_version 25062 (0.0008) +[2023-10-11 15:58:58,742][85176] Updated weights for policy 0, policy_version 25072 (0.0009) +[2023-10-11 15:58:59,114][85176] Updated weights for policy 0, policy_version 25082 (0.0009) +[2023-10-11 15:59:01,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 51740672. Throughput: 0: 1662.1, 1: 1696.3. Samples: 12946894. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-10-11 15:59:01,063][84230] Avg episode reward: [(0, '7.180'), (1, '7.450')] +[2023-10-11 15:59:01,957][85175] Updated weights for policy 1, policy_version 25450 (0.0007) +[2023-10-11 15:59:02,316][85175] Updated weights for policy 1, policy_version 25460 (0.0007) +[2023-10-11 15:59:02,698][85175] Updated weights for policy 1, policy_version 25470 (0.0008) +[2023-10-11 15:59:03,280][85176] Updated weights for policy 0, policy_version 25092 (0.0009) +[2023-10-11 15:59:03,662][85176] Updated weights for policy 0, policy_version 25102 (0.0009) +[2023-10-11 15:59:04,018][85176] Updated weights for policy 0, policy_version 25112 (0.0010) +[2023-10-11 15:59:06,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 51806208. Throughput: 0: 1657.6, 1: 1679.3. Samples: 12956672. Policy #0 lag: (min: 10.0, avg: 24.8, max: 42.0) +[2023-10-11 15:59:06,064][84230] Avg episode reward: [(0, '7.280'), (1, '7.930')] +[2023-10-11 15:59:06,702][85175] Updated weights for policy 1, policy_version 25480 (0.0007) +[2023-10-11 15:59:07,076][85175] Updated weights for policy 1, policy_version 25490 (0.0007) +[2023-10-11 15:59:07,447][85175] Updated weights for policy 1, policy_version 25500 (0.0009) +[2023-10-11 15:59:08,022][85176] Updated weights for policy 0, policy_version 25122 (0.0009) +[2023-10-11 15:59:08,381][85176] Updated weights for policy 0, policy_version 25132 (0.0009) +[2023-10-11 15:59:08,757][85176] Updated weights for policy 0, policy_version 25142 (0.0009) +[2023-10-11 15:59:09,128][85176] Updated weights for policy 0, policy_version 25152 (0.0010) +[2023-10-11 15:59:11,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.5). Total num frames: 51871744. Throughput: 0: 1662.4, 1: 1692.7. Samples: 12976714. 
Policy #0 lag: (min: 10.0, avg: 24.8, max: 42.0) +[2023-10-11 15:59:11,063][84230] Avg episode reward: [(0, '7.310'), (1, '7.630')] +[2023-10-11 15:59:11,412][85175] Updated weights for policy 1, policy_version 25510 (0.0008) +[2023-10-11 15:59:11,769][85175] Updated weights for policy 1, policy_version 25520 (0.0009) +[2023-10-11 15:59:12,142][85175] Updated weights for policy 1, policy_version 25530 (0.0007) +[2023-10-11 15:59:12,985][85176] Updated weights for policy 0, policy_version 25162 (0.0007) +[2023-10-11 15:59:13,364][85176] Updated weights for policy 0, policy_version 25172 (0.0007) +[2023-10-11 15:59:13,744][85176] Updated weights for policy 0, policy_version 25182 (0.0008) +[2023-10-11 15:59:16,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 51937280. Throughput: 0: 1670.1, 1: 1692.3. Samples: 12997772. Policy #0 lag: (min: 10.0, avg: 24.8, max: 42.0) +[2023-10-11 15:59:16,064][84230] Avg episode reward: [(0, '7.730'), (1, '7.220')] +[2023-10-11 15:59:16,223][85175] Updated weights for policy 1, policy_version 25540 (0.0007) +[2023-10-11 15:59:16,581][85175] Updated weights for policy 1, policy_version 25550 (0.0009) +[2023-10-11 15:59:16,956][85175] Updated weights for policy 1, policy_version 25560 (0.0007) +[2023-10-11 15:59:17,912][85176] Updated weights for policy 0, policy_version 25192 (0.0011) +[2023-10-11 15:59:18,281][85176] Updated weights for policy 0, policy_version 25202 (0.0010) +[2023-10-11 15:59:18,654][85176] Updated weights for policy 0, policy_version 25212 (0.0011) +[2023-10-11 15:59:20,786][85175] Updated weights for policy 1, policy_version 25570 (0.0008) +[2023-10-11 15:59:21,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 52002816. Throughput: 0: 1653.4, 1: 1696.4. Samples: 13007262. 
Policy #0 lag: (min: 10.0, avg: 24.8, max: 42.0) +[2023-10-11 15:59:21,063][84230] Avg episode reward: [(0, '8.100'), (1, '7.540')] +[2023-10-11 15:59:21,157][85175] Updated weights for policy 1, policy_version 25580 (0.0008) +[2023-10-11 15:59:21,510][85175] Updated weights for policy 1, policy_version 25590 (0.0009) +[2023-10-11 15:59:21,885][85175] Updated weights for policy 1, policy_version 25600 (0.0010) +[2023-10-11 15:59:22,833][85176] Updated weights for policy 0, policy_version 25222 (0.0008) +[2023-10-11 15:59:23,198][85176] Updated weights for policy 0, policy_version 25232 (0.0007) +[2023-10-11 15:59:23,578][85176] Updated weights for policy 0, policy_version 25242 (0.0009) +[2023-10-11 15:59:26,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 52068352. Throughput: 0: 1665.7, 1: 1701.4. Samples: 13027544. Policy #0 lag: (min: 25.0, avg: 33.7, max: 57.0) +[2023-10-11 15:59:26,064][84230] Avg episode reward: [(0, '7.940'), (1, '7.940')] +[2023-10-11 15:59:26,099][85175] Updated weights for policy 1, policy_version 25610 (0.0009) +[2023-10-11 15:59:26,456][85175] Updated weights for policy 1, policy_version 25620 (0.0007) +[2023-10-11 15:59:26,827][85175] Updated weights for policy 1, policy_version 25630 (0.0007) +[2023-10-11 15:59:27,528][85176] Updated weights for policy 0, policy_version 25252 (0.0007) +[2023-10-11 15:59:27,902][85176] Updated weights for policy 0, policy_version 25262 (0.0007) +[2023-10-11 15:59:28,280][85176] Updated weights for policy 0, policy_version 25272 (0.0007) +[2023-10-11 15:59:30,847][85175] Updated weights for policy 1, policy_version 25640 (0.0007) +[2023-10-11 15:59:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 52133888. Throughput: 0: 1672.9, 1: 1700.3. Samples: 13048388. 
Policy #0 lag: (min: 25.0, avg: 33.7, max: 57.0) +[2023-10-11 15:59:31,063][84230] Avg episode reward: [(0, '7.640'), (1, '7.850')] +[2023-10-11 15:59:31,218][85175] Updated weights for policy 1, policy_version 25650 (0.0007) +[2023-10-11 15:59:31,579][85175] Updated weights for policy 1, policy_version 25660 (0.0007) +[2023-10-11 15:59:32,223][85176] Updated weights for policy 0, policy_version 25282 (0.0008) +[2023-10-11 15:59:32,600][85176] Updated weights for policy 0, policy_version 25292 (0.0007) +[2023-10-11 15:59:32,967][85176] Updated weights for policy 0, policy_version 25302 (0.0010) +[2023-10-11 15:59:33,351][85176] Updated weights for policy 0, policy_version 25312 (0.0010) +[2023-10-11 15:59:35,825][85175] Updated weights for policy 1, policy_version 25670 (0.0009) +[2023-10-11 15:59:36,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 52199424. Throughput: 0: 1656.5, 1: 1698.8. Samples: 13057444. Policy #0 lag: (min: 25.0, avg: 33.7, max: 57.0) +[2023-10-11 15:59:36,063][84230] Avg episode reward: [(0, '7.520'), (1, '7.730')] +[2023-10-11 15:59:36,184][85175] Updated weights for policy 1, policy_version 25680 (0.0009) +[2023-10-11 15:59:36,556][85175] Updated weights for policy 1, policy_version 25690 (0.0007) +[2023-10-11 15:59:37,544][85176] Updated weights for policy 0, policy_version 25322 (0.0009) +[2023-10-11 15:59:37,927][85176] Updated weights for policy 0, policy_version 25332 (0.0010) +[2023-10-11 15:59:38,303][85176] Updated weights for policy 0, policy_version 25342 (0.0009) +[2023-10-11 15:59:40,731][85175] Updated weights for policy 1, policy_version 25700 (0.0008) +[2023-10-11 15:59:41,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 52264960. Throughput: 0: 1673.1, 1: 1697.9. Samples: 13077980. 
Policy #0 lag: (min: 25.0, avg: 33.7, max: 57.0) +[2023-10-11 15:59:41,064][84230] Avg episode reward: [(0, '7.360'), (1, '7.280')] +[2023-10-11 15:59:41,096][85175] Updated weights for policy 1, policy_version 25710 (0.0011) +[2023-10-11 15:59:41,468][85175] Updated weights for policy 1, policy_version 25720 (0.0010) +[2023-10-11 15:59:42,465][85176] Updated weights for policy 0, policy_version 25352 (0.0007) +[2023-10-11 15:59:42,835][85176] Updated weights for policy 0, policy_version 25362 (0.0007) +[2023-10-11 15:59:43,214][85176] Updated weights for policy 0, policy_version 25372 (0.0009) +[2023-10-11 15:59:45,467][85175] Updated weights for policy 1, policy_version 25730 (0.0010) +[2023-10-11 15:59:45,834][85175] Updated weights for policy 1, policy_version 25740 (0.0007) +[2023-10-11 15:59:46,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 52330496. Throughput: 0: 1675.0, 1: 1697.7. Samples: 13098664. Policy #0 lag: (min: 18.0, avg: 23.2, max: 50.0) +[2023-10-11 15:59:46,063][84230] Avg episode reward: [(0, '7.640'), (1, '7.290')] +[2023-10-11 15:59:46,206][85175] Updated weights for policy 1, policy_version 25750 (0.0007) +[2023-10-11 15:59:46,560][85175] Updated weights for policy 1, policy_version 25760 (0.0007) +[2023-10-11 15:59:47,295][85176] Updated weights for policy 0, policy_version 25382 (0.0009) +[2023-10-11 15:59:47,666][85176] Updated weights for policy 0, policy_version 25392 (0.0009) +[2023-10-11 15:59:48,039][85176] Updated weights for policy 0, policy_version 25402 (0.0008) +[2023-10-11 15:59:50,448][85175] Updated weights for policy 1, policy_version 25770 (0.0009) +[2023-10-11 15:59:50,812][85175] Updated weights for policy 1, policy_version 25780 (0.0008) +[2023-10-11 15:59:51,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 52396032. Throughput: 0: 1662.4, 1: 1702.9. Samples: 13108106. 
Policy #0 lag: (min: 18.0, avg: 23.2, max: 50.0) +[2023-10-11 15:59:51,063][84230] Avg episode reward: [(0, '7.220'), (1, '7.610')] +[2023-10-11 15:59:51,179][85175] Updated weights for policy 1, policy_version 25790 (0.0010) +[2023-10-11 15:59:51,921][85176] Updated weights for policy 0, policy_version 25412 (0.0009) +[2023-10-11 15:59:52,297][85176] Updated weights for policy 0, policy_version 25422 (0.0009) +[2023-10-11 15:59:52,667][85176] Updated weights for policy 0, policy_version 25432 (0.0009) +[2023-10-11 15:59:55,335][85175] Updated weights for policy 1, policy_version 25800 (0.0009) +[2023-10-11 15:59:55,706][85175] Updated weights for policy 1, policy_version 25810 (0.0008) +[2023-10-11 15:59:56,062][85175] Updated weights for policy 1, policy_version 25820 (0.0008) +[2023-10-11 15:59:56,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 52461568. Throughput: 0: 1687.2, 1: 1697.3. Samples: 13129018. Policy #0 lag: (min: 18.0, avg: 23.2, max: 50.0) +[2023-10-11 15:59:56,063][84230] Avg episode reward: [(0, '7.500'), (1, '7.800')] +[2023-10-11 15:59:56,831][85176] Updated weights for policy 0, policy_version 25442 (0.0011) +[2023-10-11 15:59:57,210][85176] Updated weights for policy 0, policy_version 25452 (0.0011) +[2023-10-11 15:59:57,580][85176] Updated weights for policy 0, policy_version 25462 (0.0008) +[2023-10-11 15:59:57,945][85176] Updated weights for policy 0, policy_version 25472 (0.0009) +[2023-10-11 16:00:00,049][85175] Updated weights for policy 1, policy_version 25830 (0.0010) +[2023-10-11 16:00:00,425][85175] Updated weights for policy 1, policy_version 25840 (0.0009) +[2023-10-11 16:00:00,794][85175] Updated weights for policy 1, policy_version 25850 (0.0008) +[2023-10-11 16:00:01,063][84230] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 52559872. Throughput: 0: 1681.8, 1: 1681.6. Samples: 13149126. 
Policy #0 lag: (min: 18.0, avg: 23.2, max: 50.0) +[2023-10-11 16:00:01,064][84230] Avg episode reward: [(0, '7.940'), (1, '7.640')] +[2023-10-11 16:00:02,043][85176] Updated weights for policy 0, policy_version 25482 (0.0008) +[2023-10-11 16:00:02,425][85176] Updated weights for policy 0, policy_version 25492 (0.0009) +[2023-10-11 16:00:02,788][85176] Updated weights for policy 0, policy_version 25502 (0.0008) +[2023-10-11 16:00:04,680][85175] Updated weights for policy 1, policy_version 25860 (0.0009) +[2023-10-11 16:00:05,043][85175] Updated weights for policy 1, policy_version 25870 (0.0009) +[2023-10-11 16:00:05,420][85175] Updated weights for policy 1, policy_version 25880 (0.0009) +[2023-10-11 16:00:06,063][84230] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 52625408. Throughput: 0: 1676.7, 1: 1696.7. Samples: 13159068. Policy #0 lag: (min: 29.0, avg: 38.2, max: 61.0) +[2023-10-11 16:00:06,064][84230] Avg episode reward: [(0, '8.450'), (1, '7.640')] +[2023-10-11 16:00:06,816][85176] Updated weights for policy 0, policy_version 25512 (0.0008) +[2023-10-11 16:00:07,188][85176] Updated weights for policy 0, policy_version 25522 (0.0007) +[2023-10-11 16:00:07,562][85176] Updated weights for policy 0, policy_version 25532 (0.0008) +[2023-10-11 16:00:09,324][85175] Updated weights for policy 1, policy_version 25890 (0.0008) +[2023-10-11 16:00:09,689][85175] Updated weights for policy 1, policy_version 25900 (0.0008) +[2023-10-11 16:00:10,052][85175] Updated weights for policy 1, policy_version 25910 (0.0008) +[2023-10-11 16:00:10,419][85175] Updated weights for policy 1, policy_version 25920 (0.0009) +[2023-10-11 16:00:11,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 52690944. Throughput: 0: 1689.4, 1: 1693.4. Samples: 13179770. 
Policy #0 lag: (min: 29.0, avg: 38.2, max: 61.0) +[2023-10-11 16:00:11,064][84230] Avg episode reward: [(0, '8.150'), (1, '7.540')] +[2023-10-11 16:00:11,555][85176] Updated weights for policy 0, policy_version 25542 (0.0010) +[2023-10-11 16:00:11,938][85176] Updated weights for policy 0, policy_version 25552 (0.0009) +[2023-10-11 16:00:12,306][85176] Updated weights for policy 0, policy_version 25562 (0.0008) +[2023-10-11 16:00:14,603][85175] Updated weights for policy 1, policy_version 25930 (0.0009) +[2023-10-11 16:00:14,969][85175] Updated weights for policy 1, policy_version 25940 (0.0009) +[2023-10-11 16:00:15,341][85175] Updated weights for policy 1, policy_version 25950 (0.0009) +[2023-10-11 16:00:16,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 52756480. Throughput: 0: 1683.5, 1: 1671.2. Samples: 13199352. Policy #0 lag: (min: 29.0, avg: 38.2, max: 61.0) +[2023-10-11 16:00:16,064][84230] Avg episode reward: [(0, '7.180'), (1, '7.480')] +[2023-10-11 16:00:16,076][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000025952_26574848.pth... +[2023-10-11 16:00:16,116][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000024352_24936448.pth +[2023-10-11 16:00:16,236][85176] Updated weights for policy 0, policy_version 25572 (0.0010) +[2023-10-11 16:00:16,609][85176] Updated weights for policy 0, policy_version 25582 (0.0009) +[2023-10-11 16:00:16,979][85176] Updated weights for policy 0, policy_version 25592 (0.0010) +[2023-10-11 16:00:17,281][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000025600_26214400.pth... 
+[2023-10-11 16:00:17,323][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000024032_24608768.pth +[2023-10-11 16:00:19,557][85175] Updated weights for policy 1, policy_version 25960 (0.0009) +[2023-10-11 16:00:19,919][85175] Updated weights for policy 1, policy_version 25970 (0.0008) +[2023-10-11 16:00:20,284][85175] Updated weights for policy 1, policy_version 25980 (0.0008) +[2023-10-11 16:00:21,034][85176] Updated weights for policy 0, policy_version 25602 (0.0010) +[2023-10-11 16:00:21,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 52822016. Throughput: 0: 1687.9, 1: 1700.4. Samples: 13209922. Policy #0 lag: (min: 29.0, avg: 38.2, max: 61.0) +[2023-10-11 16:00:21,064][84230] Avg episode reward: [(0, '7.100'), (1, '7.600')] +[2023-10-11 16:00:21,398][85176] Updated weights for policy 0, policy_version 25612 (0.0009) +[2023-10-11 16:00:21,769][85176] Updated weights for policy 0, policy_version 25622 (0.0007) +[2023-10-11 16:00:22,145][85176] Updated weights for policy 0, policy_version 25632 (0.0007) +[2023-10-11 16:00:24,331][85175] Updated weights for policy 1, policy_version 25990 (0.0009) +[2023-10-11 16:00:24,700][85175] Updated weights for policy 1, policy_version 26000 (0.0008) +[2023-10-11 16:00:25,066][85175] Updated weights for policy 1, policy_version 26010 (0.0007) +[2023-10-11 16:00:26,054][85176] Updated weights for policy 0, policy_version 25642 (0.0008) +[2023-10-11 16:00:26,063][84230] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 52887552. Throughput: 0: 1692.0, 1: 1687.4. Samples: 13230054. 
Policy #0 lag: (min: 17.0, avg: 27.1, max: 49.0) +[2023-10-11 16:00:26,063][84230] Avg episode reward: [(0, '7.050'), (1, '7.980')] +[2023-10-11 16:00:26,431][85176] Updated weights for policy 0, policy_version 25652 (0.0008) +[2023-10-11 16:00:26,808][85176] Updated weights for policy 0, policy_version 25662 (0.0009) +[2023-10-11 16:00:29,014][85175] Updated weights for policy 1, policy_version 26020 (0.0008) +[2023-10-11 16:00:29,385][85175] Updated weights for policy 1, policy_version 26030 (0.0008) +[2023-10-11 16:00:29,760][85175] Updated weights for policy 1, policy_version 26040 (0.0008) +[2023-10-11 16:00:31,007][85176] Updated weights for policy 0, policy_version 25672 (0.0008) +[2023-10-11 16:00:31,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 52953088. Throughput: 0: 1693.7, 1: 1669.9. Samples: 13250026. Policy #0 lag: (min: 17.0, avg: 27.1, max: 49.0) +[2023-10-11 16:00:31,063][84230] Avg episode reward: [(0, '7.870'), (1, '7.900')] +[2023-10-11 16:00:31,379][85176] Updated weights for policy 0, policy_version 25682 (0.0008) +[2023-10-11 16:00:31,753][85176] Updated weights for policy 0, policy_version 25692 (0.0009) +[2023-10-11 16:00:33,753][85175] Updated weights for policy 1, policy_version 26050 (0.0008) +[2023-10-11 16:00:34,122][85175] Updated weights for policy 1, policy_version 26060 (0.0011) +[2023-10-11 16:00:34,496][85175] Updated weights for policy 1, policy_version 26070 (0.0007) +[2023-10-11 16:00:34,869][85175] Updated weights for policy 1, policy_version 26080 (0.0008) +[2023-10-11 16:00:35,749][85176] Updated weights for policy 0, policy_version 25702 (0.0008) +[2023-10-11 16:00:36,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 53018624. Throughput: 0: 1690.8, 1: 1697.3. Samples: 13260570. 
Policy #0 lag: (min: 17.0, avg: 27.1, max: 49.0) +[2023-10-11 16:00:36,063][84230] Avg episode reward: [(0, '7.840'), (1, '7.680')] +[2023-10-11 16:00:36,119][85176] Updated weights for policy 0, policy_version 25712 (0.0009) +[2023-10-11 16:00:36,495][85176] Updated weights for policy 0, policy_version 25722 (0.0009) +[2023-10-11 16:00:38,798][85175] Updated weights for policy 1, policy_version 26090 (0.0008) +[2023-10-11 16:00:39,153][85175] Updated weights for policy 1, policy_version 26100 (0.0007) +[2023-10-11 16:00:39,528][85175] Updated weights for policy 1, policy_version 26110 (0.0007) +[2023-10-11 16:00:40,636][85176] Updated weights for policy 0, policy_version 25732 (0.0008) +[2023-10-11 16:00:41,010][85176] Updated weights for policy 0, policy_version 25742 (0.0007) +[2023-10-11 16:00:41,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 53084160. Throughput: 0: 1686.7, 1: 1676.4. Samples: 13280358. Policy #0 lag: (min: 17.0, avg: 27.1, max: 49.0) +[2023-10-11 16:00:41,063][84230] Avg episode reward: [(0, '7.710'), (1, '6.970')] +[2023-10-11 16:00:41,386][85176] Updated weights for policy 0, policy_version 25752 (0.0008) +[2023-10-11 16:00:43,489][85175] Updated weights for policy 1, policy_version 26120 (0.0009) +[2023-10-11 16:00:43,853][85175] Updated weights for policy 1, policy_version 26130 (0.0009) +[2023-10-11 16:00:44,234][85175] Updated weights for policy 1, policy_version 26140 (0.0009) +[2023-10-11 16:00:45,423][85176] Updated weights for policy 0, policy_version 25762 (0.0010) +[2023-10-11 16:00:45,799][85176] Updated weights for policy 0, policy_version 25772 (0.0008) +[2023-10-11 16:00:46,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 53149696. Throughput: 0: 1682.8, 1: 1685.4. Samples: 13300692. 
Policy #0 lag: (min: 17.0, avg: 26.0, max: 49.0) +[2023-10-11 16:00:46,064][84230] Avg episode reward: [(0, '7.410'), (1, '7.100')] +[2023-10-11 16:00:46,178][85176] Updated weights for policy 0, policy_version 25782 (0.0008) +[2023-10-11 16:00:46,557][85176] Updated weights for policy 0, policy_version 25792 (0.0009) +[2023-10-11 16:00:48,418][85175] Updated weights for policy 1, policy_version 26150 (0.0007) +[2023-10-11 16:00:48,782][85175] Updated weights for policy 1, policy_version 26160 (0.0007) +[2023-10-11 16:00:49,143][85175] Updated weights for policy 1, policy_version 26170 (0.0008) +[2023-10-11 16:00:50,683][85176] Updated weights for policy 0, policy_version 25802 (0.0007) +[2023-10-11 16:00:51,051][85176] Updated weights for policy 0, policy_version 25812 (0.0008) +[2023-10-11 16:00:51,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 53215232. Throughput: 0: 1686.6, 1: 1687.8. Samples: 13310916. Policy #0 lag: (min: 17.0, avg: 26.0, max: 49.0) +[2023-10-11 16:00:51,063][84230] Avg episode reward: [(0, '7.370'), (1, '8.090')] +[2023-10-11 16:00:51,437][85176] Updated weights for policy 0, policy_version 25822 (0.0009) +[2023-10-11 16:00:53,278][85175] Updated weights for policy 1, policy_version 26180 (0.0010) +[2023-10-11 16:00:53,640][85175] Updated weights for policy 1, policy_version 26190 (0.0011) +[2023-10-11 16:00:54,009][85175] Updated weights for policy 1, policy_version 26200 (0.0010) +[2023-10-11 16:00:55,593][85176] Updated weights for policy 0, policy_version 25832 (0.0007) +[2023-10-11 16:00:55,972][85176] Updated weights for policy 0, policy_version 25842 (0.0010) +[2023-10-11 16:00:56,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 53280768. Throughput: 0: 1687.3, 1: 1664.8. Samples: 13330610. 
Policy #0 lag: (min: 17.0, avg: 26.0, max: 49.0) +[2023-10-11 16:00:56,063][84230] Avg episode reward: [(0, '7.650'), (1, '8.150')] +[2023-10-11 16:00:56,339][85176] Updated weights for policy 0, policy_version 25852 (0.0010) +[2023-10-11 16:00:57,988][85175] Updated weights for policy 1, policy_version 26210 (0.0009) +[2023-10-11 16:00:58,358][85175] Updated weights for policy 1, policy_version 26220 (0.0008) +[2023-10-11 16:00:58,720][85175] Updated weights for policy 1, policy_version 26230 (0.0007) +[2023-10-11 16:00:59,085][85175] Updated weights for policy 1, policy_version 26240 (0.0007) +[2023-10-11 16:01:00,394][85176] Updated weights for policy 0, policy_version 25862 (0.0010) +[2023-10-11 16:01:00,766][85176] Updated weights for policy 0, policy_version 25872 (0.0010) +[2023-10-11 16:01:01,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 53346304. Throughput: 0: 1675.7, 1: 1690.5. Samples: 13350832. Policy #0 lag: (min: 17.0, avg: 26.0, max: 49.0) +[2023-10-11 16:01:01,063][84230] Avg episode reward: [(0, '8.090'), (1, '7.810')] +[2023-10-11 16:01:01,136][85176] Updated weights for policy 0, policy_version 25882 (0.0010) +[2023-10-11 16:01:03,282][85175] Updated weights for policy 1, policy_version 26250 (0.0008) +[2023-10-11 16:01:03,658][85175] Updated weights for policy 1, policy_version 26260 (0.0008) +[2023-10-11 16:01:04,018][85175] Updated weights for policy 1, policy_version 26270 (0.0008) +[2023-10-11 16:01:05,252][85176] Updated weights for policy 0, policy_version 25892 (0.0009) +[2023-10-11 16:01:05,617][85176] Updated weights for policy 0, policy_version 25902 (0.0008) +[2023-10-11 16:01:05,994][85176] Updated weights for policy 0, policy_version 25912 (0.0009) +[2023-10-11 16:01:06,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 53411840. Throughput: 0: 1681.5, 1: 1671.7. Samples: 13360818. 
Policy #0 lag: (min: 20.0, avg: 25.3, max: 52.0) +[2023-10-11 16:01:06,063][84230] Avg episode reward: [(0, '8.200'), (1, '7.410')] +[2023-10-11 16:01:07,894][85175] Updated weights for policy 1, policy_version 26280 (0.0007) +[2023-10-11 16:01:08,254][85175] Updated weights for policy 1, policy_version 26290 (0.0007) +[2023-10-11 16:01:08,625][85175] Updated weights for policy 1, policy_version 26300 (0.0011) +[2023-10-11 16:01:10,099][85176] Updated weights for policy 0, policy_version 25922 (0.0008) +[2023-10-11 16:01:10,479][85176] Updated weights for policy 0, policy_version 25932 (0.0007) +[2023-10-11 16:01:10,849][85176] Updated weights for policy 0, policy_version 25942 (0.0008) +[2023-10-11 16:01:11,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 53477376. Throughput: 0: 1679.9, 1: 1675.7. Samples: 13381056. Policy #0 lag: (min: 20.0, avg: 25.3, max: 52.0) +[2023-10-11 16:01:11,064][84230] Avg episode reward: [(0, '7.650'), (1, '6.890')] +[2023-10-11 16:01:11,216][85176] Updated weights for policy 0, policy_version 25952 (0.0009) +[2023-10-11 16:01:12,681][85175] Updated weights for policy 1, policy_version 26310 (0.0011) +[2023-10-11 16:01:13,043][85175] Updated weights for policy 1, policy_version 26320 (0.0010) +[2023-10-11 16:01:13,411][85175] Updated weights for policy 1, policy_version 26330 (0.0011) +[2023-10-11 16:01:15,338][85176] Updated weights for policy 0, policy_version 25962 (0.0008) +[2023-10-11 16:01:15,713][85176] Updated weights for policy 0, policy_version 25972 (0.0009) +[2023-10-11 16:01:16,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 53542912. Throughput: 0: 1663.1, 1: 1696.4. Samples: 13401202. 
Policy #0 lag: (min: 20.0, avg: 25.3, max: 52.0) +[2023-10-11 16:01:16,064][84230] Avg episode reward: [(0, '7.050'), (1, '6.810')] +[2023-10-11 16:01:16,089][85176] Updated weights for policy 0, policy_version 25982 (0.0009) +[2023-10-11 16:01:17,430][85175] Updated weights for policy 1, policy_version 26340 (0.0010) +[2023-10-11 16:01:17,812][85175] Updated weights for policy 1, policy_version 26350 (0.0008) +[2023-10-11 16:01:18,179][85175] Updated weights for policy 1, policy_version 26360 (0.0008) +[2023-10-11 16:01:20,325][85176] Updated weights for policy 0, policy_version 25992 (0.0009) +[2023-10-11 16:01:20,690][85176] Updated weights for policy 0, policy_version 26002 (0.0010) +[2023-10-11 16:01:21,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 53608448. Throughput: 0: 1677.1, 1: 1665.9. Samples: 13411006. Policy #0 lag: (min: 20.0, avg: 25.3, max: 52.0) +[2023-10-11 16:01:21,064][85176] Updated weights for policy 0, policy_version 26012 (0.0008) +[2023-10-11 16:01:21,064][84230] Avg episode reward: [(0, '7.390'), (1, '7.100')] +[2023-10-11 16:01:22,201][85175] Updated weights for policy 1, policy_version 26370 (0.0007) +[2023-10-11 16:01:22,567][85175] Updated weights for policy 1, policy_version 26380 (0.0008) +[2023-10-11 16:01:22,940][85175] Updated weights for policy 1, policy_version 26390 (0.0011) +[2023-10-11 16:01:23,298][85175] Updated weights for policy 1, policy_version 26400 (0.0011) +[2023-10-11 16:01:25,133][85176] Updated weights for policy 0, policy_version 26022 (0.0009) +[2023-10-11 16:01:25,504][85176] Updated weights for policy 0, policy_version 26032 (0.0007) +[2023-10-11 16:01:25,872][85176] Updated weights for policy 0, policy_version 26042 (0.0008) +[2023-10-11 16:01:26,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 53673984. Throughput: 0: 1670.3, 1: 1689.6. Samples: 13431550. 
Policy #0 lag: (min: 1.0, avg: 14.8, max: 33.0) +[2023-10-11 16:01:26,063][84230] Avg episode reward: [(0, '7.860'), (1, '7.630')] +[2023-10-11 16:01:27,327][85175] Updated weights for policy 1, policy_version 26410 (0.0010) +[2023-10-11 16:01:27,691][85175] Updated weights for policy 1, policy_version 26420 (0.0010) +[2023-10-11 16:01:28,060][85175] Updated weights for policy 1, policy_version 26430 (0.0009) +[2023-10-11 16:01:29,800][85176] Updated weights for policy 0, policy_version 26052 (0.0009) +[2023-10-11 16:01:30,170][85176] Updated weights for policy 0, policy_version 26062 (0.0007) +[2023-10-11 16:01:30,541][85176] Updated weights for policy 0, policy_version 26072 (0.0007) +[2023-10-11 16:01:31,062][84230] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 53772288. Throughput: 0: 1656.2, 1: 1698.1. Samples: 13451632. Policy #0 lag: (min: 1.0, avg: 14.8, max: 33.0) +[2023-10-11 16:01:31,063][84230] Avg episode reward: [(0, '8.080'), (1, '7.980')] +[2023-10-11 16:01:32,053][85175] Updated weights for policy 1, policy_version 26440 (0.0009) +[2023-10-11 16:01:32,432][85175] Updated weights for policy 1, policy_version 26450 (0.0009) +[2023-10-11 16:01:32,800][85175] Updated weights for policy 1, policy_version 26460 (0.0008) +[2023-10-11 16:01:34,595][85176] Updated weights for policy 0, policy_version 26082 (0.0007) +[2023-10-11 16:01:34,969][85176] Updated weights for policy 0, policy_version 26092 (0.0007) +[2023-10-11 16:01:35,348][85176] Updated weights for policy 0, policy_version 26102 (0.0008) +[2023-10-11 16:01:35,706][85176] Updated weights for policy 0, policy_version 26112 (0.0010) +[2023-10-11 16:01:36,063][84230] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 53837824. Throughput: 0: 1674.9, 1: 1676.2. Samples: 13461716. 
Policy #0 lag: (min: 1.0, avg: 14.8, max: 33.0) +[2023-10-11 16:01:36,063][84230] Avg episode reward: [(0, '7.700'), (1, '7.650')] +[2023-10-11 16:01:36,735][85175] Updated weights for policy 1, policy_version 26470 (0.0009) +[2023-10-11 16:01:37,101][85175] Updated weights for policy 1, policy_version 26480 (0.0007) +[2023-10-11 16:01:37,473][85175] Updated weights for policy 1, policy_version 26490 (0.0008) +[2023-10-11 16:01:39,766][85176] Updated weights for policy 0, policy_version 26122 (0.0010) +[2023-10-11 16:01:40,146][85176] Updated weights for policy 0, policy_version 26132 (0.0011) +[2023-10-11 16:01:40,527][85176] Updated weights for policy 0, policy_version 26142 (0.0008) +[2023-10-11 16:01:41,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 53903360. Throughput: 0: 1670.0, 1: 1703.2. Samples: 13482400. Policy #0 lag: (min: 28.0, avg: 28.1, max: 34.0) +[2023-10-11 16:01:41,063][84230] Avg episode reward: [(0, '7.350'), (1, '7.360')] +[2023-10-11 16:01:41,683][85175] Updated weights for policy 1, policy_version 26500 (0.0009) +[2023-10-11 16:01:42,051][85175] Updated weights for policy 1, policy_version 26510 (0.0008) +[2023-10-11 16:01:42,421][85175] Updated weights for policy 1, policy_version 26520 (0.0009) +[2023-10-11 16:01:44,536][85176] Updated weights for policy 0, policy_version 26152 (0.0008) +[2023-10-11 16:01:44,909][85176] Updated weights for policy 0, policy_version 26162 (0.0007) +[2023-10-11 16:01:45,281][85176] Updated weights for policy 0, policy_version 26172 (0.0007) +[2023-10-11 16:01:46,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 53968896. Throughput: 0: 1659.9, 1: 1699.1. Samples: 13501988. 
Policy #0 lag: (min: 28.0, avg: 28.1, max: 34.0) +[2023-10-11 16:01:46,063][84230] Avg episode reward: [(0, '7.380'), (1, '7.390')] +[2023-10-11 16:01:46,484][85175] Updated weights for policy 1, policy_version 26530 (0.0009) +[2023-10-11 16:01:46,851][85175] Updated weights for policy 1, policy_version 26540 (0.0007) +[2023-10-11 16:01:47,214][85175] Updated weights for policy 1, policy_version 26550 (0.0007) +[2023-10-11 16:01:47,585][85175] Updated weights for policy 1, policy_version 26560 (0.0010) +[2023-10-11 16:01:49,462][85176] Updated weights for policy 0, policy_version 26182 (0.0007) +[2023-10-11 16:01:49,829][85176] Updated weights for policy 0, policy_version 26192 (0.0009) +[2023-10-11 16:01:50,211][85176] Updated weights for policy 0, policy_version 26202 (0.0007) +[2023-10-11 16:01:51,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 54034432. Throughput: 0: 1680.4, 1: 1686.3. Samples: 13512318. Policy #0 lag: (min: 28.0, avg: 28.1, max: 34.0) +[2023-10-11 16:01:51,064][84230] Avg episode reward: [(0, '8.300'), (1, '7.650')] +[2023-10-11 16:01:51,778][85175] Updated weights for policy 1, policy_version 26570 (0.0008) +[2023-10-11 16:01:52,152][85175] Updated weights for policy 1, policy_version 26580 (0.0008) +[2023-10-11 16:01:52,515][85175] Updated weights for policy 1, policy_version 26590 (0.0009) +[2023-10-11 16:01:54,303][85176] Updated weights for policy 0, policy_version 26212 (0.0008) +[2023-10-11 16:01:54,682][85176] Updated weights for policy 0, policy_version 26222 (0.0007) +[2023-10-11 16:01:55,049][85176] Updated weights for policy 0, policy_version 26232 (0.0008) +[2023-10-11 16:01:56,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 54099968. Throughput: 0: 1668.2, 1: 1696.0. Samples: 13532444. 
Policy #0 lag: (min: 28.0, avg: 28.1, max: 34.0) +[2023-10-11 16:01:56,063][84230] Avg episode reward: [(0, '7.680'), (1, '7.900')] +[2023-10-11 16:01:56,570][85175] Updated weights for policy 1, policy_version 26600 (0.0008) +[2023-10-11 16:01:56,935][85175] Updated weights for policy 1, policy_version 26610 (0.0008) +[2023-10-11 16:01:57,301][85175] Updated weights for policy 1, policy_version 26620 (0.0008) +[2023-10-11 16:01:59,011][85176] Updated weights for policy 0, policy_version 26242 (0.0007) +[2023-10-11 16:01:59,388][85176] Updated weights for policy 0, policy_version 26252 (0.0008) +[2023-10-11 16:01:59,755][85176] Updated weights for policy 0, policy_version 26262 (0.0009) +[2023-10-11 16:02:00,144][85176] Updated weights for policy 0, policy_version 26272 (0.0010) +[2023-10-11 16:02:01,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 54165504. Throughput: 0: 1662.7, 1: 1695.8. Samples: 13552334. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:02:01,064][84230] Avg episode reward: [(0, '6.880'), (1, '7.450')] +[2023-10-11 16:02:01,436][85175] Updated weights for policy 1, policy_version 26630 (0.0008) +[2023-10-11 16:02:01,800][85175] Updated weights for policy 1, policy_version 26640 (0.0007) +[2023-10-11 16:02:02,175][85175] Updated weights for policy 1, policy_version 26650 (0.0009) +[2023-10-11 16:02:04,169][85176] Updated weights for policy 0, policy_version 26282 (0.0008) +[2023-10-11 16:02:04,548][85176] Updated weights for policy 0, policy_version 26292 (0.0010) +[2023-10-11 16:02:04,920][85176] Updated weights for policy 0, policy_version 26302 (0.0011) +[2023-10-11 16:02:06,050][85175] Updated weights for policy 1, policy_version 26660 (0.0008) +[2023-10-11 16:02:06,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 54231040. Throughput: 0: 1678.6, 1: 1693.0. Samples: 13562730. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:02:06,064][84230] Avg episode reward: [(0, '7.240'), (1, '7.010')] +[2023-10-11 16:02:06,415][85175] Updated weights for policy 1, policy_version 26670 (0.0009) +[2023-10-11 16:02:06,786][85175] Updated weights for policy 1, policy_version 26680 (0.0007) +[2023-10-11 16:02:09,093][85176] Updated weights for policy 0, policy_version 26312 (0.0008) +[2023-10-11 16:02:09,467][85176] Updated weights for policy 0, policy_version 26322 (0.0009) +[2023-10-11 16:02:09,834][85176] Updated weights for policy 0, policy_version 26332 (0.0009) +[2023-10-11 16:02:10,891][85175] Updated weights for policy 1, policy_version 26690 (0.0009) +[2023-10-11 16:02:11,062][84230] Fps is (10 sec: 13107.7, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 54296576. Throughput: 0: 1667.4, 1: 1694.1. Samples: 13582820. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:02:11,063][84230] Avg episode reward: [(0, '7.290'), (1, '7.490')] +[2023-10-11 16:02:11,255][85175] Updated weights for policy 1, policy_version 26700 (0.0011) +[2023-10-11 16:02:11,622][85175] Updated weights for policy 1, policy_version 26710 (0.0010) +[2023-10-11 16:02:11,988][85175] Updated weights for policy 1, policy_version 26720 (0.0011) +[2023-10-11 16:02:13,913][85176] Updated weights for policy 0, policy_version 26342 (0.0008) +[2023-10-11 16:02:14,298][85176] Updated weights for policy 0, policy_version 26352 (0.0009) +[2023-10-11 16:02:14,675][85176] Updated weights for policy 0, policy_version 26362 (0.0007) +[2023-10-11 16:02:16,006][85175] Updated weights for policy 1, policy_version 26730 (0.0008) +[2023-10-11 16:02:16,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 54362112. Throughput: 0: 1673.5, 1: 1691.9. Samples: 13603076. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:02:16,063][84230] Avg episode reward: [(0, '7.740'), (1, '7.930')] +[2023-10-11 16:02:16,074][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000026368_27000832.pth... +[2023-10-11 16:02:16,108][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000024800_25395200.pth +[2023-10-11 16:02:16,375][85175] Updated weights for policy 1, policy_version 26740 (0.0010) +[2023-10-11 16:02:16,747][85175] Updated weights for policy 1, policy_version 26750 (0.0010) +[2023-10-11 16:02:16,821][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000026752_27394048.pth... +[2023-10-11 16:02:16,856][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000025152_25755648.pth +[2023-10-11 16:02:18,829][85176] Updated weights for policy 0, policy_version 26372 (0.0007) +[2023-10-11 16:02:19,209][85176] Updated weights for policy 0, policy_version 26382 (0.0009) +[2023-10-11 16:02:19,572][85176] Updated weights for policy 0, policy_version 26392 (0.0009) +[2023-10-11 16:02:20,770][85175] Updated weights for policy 1, policy_version 26760 (0.0009) +[2023-10-11 16:02:21,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13329.3). Total num frames: 54427648. Throughput: 0: 1675.7, 1: 1694.6. Samples: 13613380. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:02:21,064][84230] Avg episode reward: [(0, '7.680'), (1, '7.800')] +[2023-10-11 16:02:21,146][85175] Updated weights for policy 1, policy_version 26770 (0.0009) +[2023-10-11 16:02:21,517][85175] Updated weights for policy 1, policy_version 26780 (0.0008) +[2023-10-11 16:02:23,626][85176] Updated weights for policy 0, policy_version 26402 (0.0007) +[2023-10-11 16:02:23,996][85176] Updated weights for policy 0, policy_version 26412 (0.0009) +[2023-10-11 16:02:24,364][85176] Updated weights for policy 0, policy_version 26422 (0.0010) +[2023-10-11 16:02:24,747][85176] Updated weights for policy 0, policy_version 26432 (0.0007) +[2023-10-11 16:02:25,396][85175] Updated weights for policy 1, policy_version 26790 (0.0009) +[2023-10-11 16:02:25,756][85175] Updated weights for policy 1, policy_version 26800 (0.0011) +[2023-10-11 16:02:26,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 54493184. Throughput: 0: 1654.6, 1: 1693.3. Samples: 13633058. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:02:26,063][84230] Avg episode reward: [(0, '7.550'), (1, '7.220')] +[2023-10-11 16:02:26,121][85175] Updated weights for policy 1, policy_version 26810 (0.0010) +[2023-10-11 16:02:28,783][85176] Updated weights for policy 0, policy_version 26442 (0.0007) +[2023-10-11 16:02:29,158][85176] Updated weights for policy 0, policy_version 26452 (0.0009) +[2023-10-11 16:02:29,536][85176] Updated weights for policy 0, policy_version 26462 (0.0009) +[2023-10-11 16:02:30,112][85175] Updated weights for policy 1, policy_version 26820 (0.0010) +[2023-10-11 16:02:30,474][85175] Updated weights for policy 1, policy_version 26830 (0.0010) +[2023-10-11 16:02:30,851][85175] Updated weights for policy 1, policy_version 26840 (0.0008) +[2023-10-11 16:02:31,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 54558720. 
Throughput: 0: 1672.0, 1: 1687.2. Samples: 13653152. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:02:31,064][84230] Avg episode reward: [(0, '7.570'), (1, '7.410')] +[2023-10-11 16:02:33,636][85176] Updated weights for policy 0, policy_version 26472 (0.0011) +[2023-10-11 16:02:33,999][85176] Updated weights for policy 0, policy_version 26482 (0.0011) +[2023-10-11 16:02:34,367][85176] Updated weights for policy 0, policy_version 26492 (0.0007) +[2023-10-11 16:02:34,984][85175] Updated weights for policy 1, policy_version 26850 (0.0008) +[2023-10-11 16:02:35,357][85175] Updated weights for policy 1, policy_version 26860 (0.0007) +[2023-10-11 16:02:35,718][85175] Updated weights for policy 1, policy_version 26870 (0.0008) +[2023-10-11 16:02:36,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 54624256. Throughput: 0: 1667.7, 1: 1703.5. Samples: 13664022. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:02:36,063][84230] Avg episode reward: [(0, '7.480'), (1, '7.250')] +[2023-10-11 16:02:36,080][85175] Updated weights for policy 1, policy_version 26880 (0.0007) +[2023-10-11 16:02:38,507][85176] Updated weights for policy 0, policy_version 26502 (0.0009) +[2023-10-11 16:02:38,883][85176] Updated weights for policy 0, policy_version 26512 (0.0011) +[2023-10-11 16:02:39,269][85176] Updated weights for policy 0, policy_version 26522 (0.0010) +[2023-10-11 16:02:40,041][85175] Updated weights for policy 1, policy_version 26890 (0.0011) +[2023-10-11 16:02:40,411][85175] Updated weights for policy 1, policy_version 26900 (0.0007) +[2023-10-11 16:02:40,776][85175] Updated weights for policy 1, policy_version 26910 (0.0008) +[2023-10-11 16:02:41,063][84230] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 54722560. Throughput: 0: 1653.1, 1: 1709.9. Samples: 13683780. 
Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) +[2023-10-11 16:02:41,063][84230] Avg episode reward: [(0, '7.410'), (1, '8.150')] +[2023-10-11 16:02:43,305][85176] Updated weights for policy 0, policy_version 26532 (0.0008) +[2023-10-11 16:02:43,672][85176] Updated weights for policy 0, policy_version 26542 (0.0007) +[2023-10-11 16:02:44,047][85176] Updated weights for policy 0, policy_version 26552 (0.0009) +[2023-10-11 16:02:44,733][85175] Updated weights for policy 1, policy_version 26920 (0.0009) +[2023-10-11 16:02:45,096][85175] Updated weights for policy 1, policy_version 26930 (0.0009) +[2023-10-11 16:02:45,470][85175] Updated weights for policy 1, policy_version 26940 (0.0009) +[2023-10-11 16:02:46,062][84230] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 54788096. Throughput: 0: 1674.9, 1: 1686.5. Samples: 13703596. Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) +[2023-10-11 16:02:46,063][84230] Avg episode reward: [(0, '7.470'), (1, '8.150')] +[2023-10-11 16:02:48,148][85176] Updated weights for policy 0, policy_version 26562 (0.0008) +[2023-10-11 16:02:48,516][85176] Updated weights for policy 0, policy_version 26572 (0.0009) +[2023-10-11 16:02:48,887][85176] Updated weights for policy 0, policy_version 26582 (0.0010) +[2023-10-11 16:02:49,261][85176] Updated weights for policy 0, policy_version 26592 (0.0009) +[2023-10-11 16:02:49,642][85175] Updated weights for policy 1, policy_version 26950 (0.0009) +[2023-10-11 16:02:50,006][85175] Updated weights for policy 1, policy_version 26960 (0.0010) +[2023-10-11 16:02:50,374][85175] Updated weights for policy 1, policy_version 26970 (0.0007) +[2023-10-11 16:02:51,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 54853632. Throughput: 0: 1657.2, 1: 1711.4. Samples: 13714314. 
Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) +[2023-10-11 16:02:51,063][84230] Avg episode reward: [(0, '7.370'), (1, '7.540')] +[2023-10-11 16:02:53,362][85176] Updated weights for policy 0, policy_version 26602 (0.0008) +[2023-10-11 16:02:53,734][85176] Updated weights for policy 0, policy_version 26612 (0.0007) +[2023-10-11 16:02:54,107][85176] Updated weights for policy 0, policy_version 26622 (0.0009) +[2023-10-11 16:02:54,482][85175] Updated weights for policy 1, policy_version 26980 (0.0009) +[2023-10-11 16:02:54,854][85175] Updated weights for policy 1, policy_version 26990 (0.0009) +[2023-10-11 16:02:55,224][85175] Updated weights for policy 1, policy_version 27000 (0.0009) +[2023-10-11 16:02:56,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 54919168. Throughput: 0: 1655.5, 1: 1703.9. Samples: 13733992. Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) +[2023-10-11 16:02:56,063][84230] Avg episode reward: [(0, '7.800'), (1, '7.350')] +[2023-10-11 16:02:58,226][85176] Updated weights for policy 0, policy_version 26632 (0.0007) +[2023-10-11 16:02:58,603][85176] Updated weights for policy 0, policy_version 26642 (0.0007) +[2023-10-11 16:02:58,969][85176] Updated weights for policy 0, policy_version 26652 (0.0007) +[2023-10-11 16:02:59,198][85175] Updated weights for policy 1, policy_version 27010 (0.0008) +[2023-10-11 16:02:59,567][85175] Updated weights for policy 1, policy_version 27020 (0.0009) +[2023-10-11 16:02:59,934][85175] Updated weights for policy 1, policy_version 27030 (0.0007) +[2023-10-11 16:03:00,299][85175] Updated weights for policy 1, policy_version 27040 (0.0010) +[2023-10-11 16:03:01,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 54984704. Throughput: 0: 1667.9, 1: 1677.6. Samples: 13753626. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:03:01,064][84230] Avg episode reward: [(0, '7.890'), (1, '7.320')] +[2023-10-11 16:03:02,981][85176] Updated weights for policy 0, policy_version 26662 (0.0007) +[2023-10-11 16:03:03,365][85176] Updated weights for policy 0, policy_version 26672 (0.0009) +[2023-10-11 16:03:03,734][85176] Updated weights for policy 0, policy_version 26682 (0.0008) +[2023-10-11 16:03:04,297][85175] Updated weights for policy 1, policy_version 27050 (0.0010) +[2023-10-11 16:03:04,660][85175] Updated weights for policy 1, policy_version 27060 (0.0009) +[2023-10-11 16:03:05,033][85175] Updated weights for policy 1, policy_version 27070 (0.0009) +[2023-10-11 16:03:06,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 55050240. Throughput: 0: 1651.4, 1: 1704.5. Samples: 13764394. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:03:06,063][84230] Avg episode reward: [(0, '7.790'), (1, '7.170')] +[2023-10-11 16:03:07,898][85176] Updated weights for policy 0, policy_version 26692 (0.0008) +[2023-10-11 16:03:08,263][85176] Updated weights for policy 0, policy_version 26702 (0.0009) +[2023-10-11 16:03:08,634][85176] Updated weights for policy 0, policy_version 26712 (0.0008) +[2023-10-11 16:03:09,023][85175] Updated weights for policy 1, policy_version 27080 (0.0008) +[2023-10-11 16:03:09,392][85175] Updated weights for policy 1, policy_version 27090 (0.0010) +[2023-10-11 16:03:09,773][85175] Updated weights for policy 1, policy_version 27100 (0.0009) +[2023-10-11 16:03:11,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 55115776. Throughput: 0: 1660.3, 1: 1688.1. Samples: 13783736. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:03:11,064][84230] Avg episode reward: [(0, '7.690'), (1, '7.400')] +[2023-10-11 16:03:12,949][85176] Updated weights for policy 0, policy_version 26722 (0.0009) +[2023-10-11 16:03:13,349][85176] Updated weights for policy 0, policy_version 26732 (0.0008) +[2023-10-11 16:03:13,724][85176] Updated weights for policy 0, policy_version 26742 (0.0008) +[2023-10-11 16:03:13,828][85175] Updated weights for policy 1, policy_version 27110 (0.0010) +[2023-10-11 16:03:14,091][85176] Updated weights for policy 0, policy_version 26752 (0.0008) +[2023-10-11 16:03:14,195][85175] Updated weights for policy 1, policy_version 27120 (0.0010) +[2023-10-11 16:03:14,560][85175] Updated weights for policy 1, policy_version 27130 (0.0010) +[2023-10-11 16:03:16,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 55181312. Throughput: 0: 1665.6, 1: 1685.4. Samples: 13803946. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:03:16,064][84230] Avg episode reward: [(0, '7.380'), (1, '7.750')] +[2023-10-11 16:03:17,958][85176] Updated weights for policy 0, policy_version 26762 (0.0007) +[2023-10-11 16:03:18,328][85176] Updated weights for policy 0, policy_version 26772 (0.0008) +[2023-10-11 16:03:18,678][85175] Updated weights for policy 1, policy_version 27140 (0.0009) +[2023-10-11 16:03:18,703][85176] Updated weights for policy 0, policy_version 26782 (0.0007) +[2023-10-11 16:03:19,048][85175] Updated weights for policy 1, policy_version 27150 (0.0009) +[2023-10-11 16:03:19,424][85175] Updated weights for policy 1, policy_version 27160 (0.0009) +[2023-10-11 16:03:21,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 55246848. Throughput: 0: 1644.1, 1: 1700.9. Samples: 13814548. 
Policy #0 lag: (min: 28.0, avg: 33.7, max: 60.0) +[2023-10-11 16:03:21,064][84230] Avg episode reward: [(0, '7.710'), (1, '7.680')] +[2023-10-11 16:03:22,793][85176] Updated weights for policy 0, policy_version 26792 (0.0008) +[2023-10-11 16:03:23,160][85176] Updated weights for policy 0, policy_version 26802 (0.0008) +[2023-10-11 16:03:23,348][85175] Updated weights for policy 1, policy_version 27170 (0.0008) +[2023-10-11 16:03:23,534][85176] Updated weights for policy 0, policy_version 26812 (0.0009) +[2023-10-11 16:03:23,723][85175] Updated weights for policy 1, policy_version 27180 (0.0008) +[2023-10-11 16:03:24,098][85175] Updated weights for policy 1, policy_version 27190 (0.0009) +[2023-10-11 16:03:24,463][85175] Updated weights for policy 1, policy_version 27200 (0.0008) +[2023-10-11 16:03:26,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 55312384. Throughput: 0: 1662.8, 1: 1664.5. Samples: 13833512. Policy #0 lag: (min: 28.0, avg: 33.7, max: 60.0) +[2023-10-11 16:03:26,063][84230] Avg episode reward: [(0, '8.050'), (1, '7.000')] +[2023-10-11 16:03:27,724][85176] Updated weights for policy 0, policy_version 26822 (0.0007) +[2023-10-11 16:03:28,091][85176] Updated weights for policy 0, policy_version 26832 (0.0009) +[2023-10-11 16:03:28,466][85176] Updated weights for policy 0, policy_version 26842 (0.0009) +[2023-10-11 16:03:28,821][85175] Updated weights for policy 1, policy_version 27210 (0.0009) +[2023-10-11 16:03:29,193][85175] Updated weights for policy 1, policy_version 27220 (0.0009) +[2023-10-11 16:03:29,561][85175] Updated weights for policy 1, policy_version 27230 (0.0008) +[2023-10-11 16:03:31,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13653.2, 300 sec: 13440.4). Total num frames: 55377920. Throughput: 0: 1660.8, 1: 1681.8. Samples: 13854014. 
Policy #0 lag: (min: 28.0, avg: 33.7, max: 60.0) +[2023-10-11 16:03:31,064][84230] Avg episode reward: [(0, '8.200'), (1, '7.310')] +[2023-10-11 16:03:32,437][85176] Updated weights for policy 0, policy_version 26852 (0.0010) +[2023-10-11 16:03:32,811][85176] Updated weights for policy 0, policy_version 26862 (0.0010) +[2023-10-11 16:03:33,172][85176] Updated weights for policy 0, policy_version 26872 (0.0009) +[2023-10-11 16:03:33,317][85175] Updated weights for policy 1, policy_version 27240 (0.0009) +[2023-10-11 16:03:33,688][85175] Updated weights for policy 1, policy_version 27250 (0.0008) +[2023-10-11 16:03:34,053][85175] Updated weights for policy 1, policy_version 27260 (0.0008) +[2023-10-11 16:03:36,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 55443456. Throughput: 0: 1649.5, 1: 1679.1. Samples: 13864100. Policy #0 lag: (min: 28.0, avg: 33.7, max: 60.0) +[2023-10-11 16:03:36,063][84230] Avg episode reward: [(0, '7.440'), (1, '7.590')] +[2023-10-11 16:03:37,226][85176] Updated weights for policy 0, policy_version 26882 (0.0009) +[2023-10-11 16:03:37,592][85176] Updated weights for policy 0, policy_version 26892 (0.0008) +[2023-10-11 16:03:37,974][85176] Updated weights for policy 0, policy_version 26902 (0.0009) +[2023-10-11 16:03:38,165][85175] Updated weights for policy 1, policy_version 27270 (0.0008) +[2023-10-11 16:03:38,345][85176] Updated weights for policy 0, policy_version 26912 (0.0007) +[2023-10-11 16:03:38,531][85175] Updated weights for policy 1, policy_version 27280 (0.0009) +[2023-10-11 16:03:38,899][85175] Updated weights for policy 1, policy_version 27290 (0.0008) +[2023-10-11 16:03:41,062][84230] Fps is (10 sec: 13107.8, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 55508992. Throughput: 0: 1668.8, 1: 1662.8. Samples: 13883916. 
Policy #0 lag: (min: 24.0, avg: 45.5, max: 56.0) +[2023-10-11 16:03:41,063][84230] Avg episode reward: [(0, '7.400'), (1, '8.100')] +[2023-10-11 16:03:42,673][85176] Updated weights for policy 0, policy_version 26922 (0.0008) +[2023-10-11 16:03:43,034][85175] Updated weights for policy 1, policy_version 27300 (0.0007) +[2023-10-11 16:03:43,043][85176] Updated weights for policy 0, policy_version 26932 (0.0007) +[2023-10-11 16:03:43,403][85175] Updated weights for policy 1, policy_version 27310 (0.0008) +[2023-10-11 16:03:43,408][85176] Updated weights for policy 0, policy_version 26942 (0.0008) +[2023-10-11 16:03:43,778][85175] Updated weights for policy 1, policy_version 27320 (0.0008) +[2023-10-11 16:03:46,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 55574528. Throughput: 0: 1665.6, 1: 1690.3. Samples: 13904642. Policy #0 lag: (min: 24.0, avg: 45.5, max: 56.0) +[2023-10-11 16:03:46,064][84230] Avg episode reward: [(0, '6.710'), (1, '7.650')] +[2023-10-11 16:03:47,518][85176] Updated weights for policy 0, policy_version 26952 (0.0008) +[2023-10-11 16:03:47,817][85175] Updated weights for policy 1, policy_version 27330 (0.0008) +[2023-10-11 16:03:47,893][85176] Updated weights for policy 0, policy_version 26962 (0.0008) +[2023-10-11 16:03:48,187][85175] Updated weights for policy 1, policy_version 27340 (0.0009) +[2023-10-11 16:03:48,267][85176] Updated weights for policy 0, policy_version 26972 (0.0009) +[2023-10-11 16:03:48,553][85175] Updated weights for policy 1, policy_version 27350 (0.0008) +[2023-10-11 16:03:48,916][85175] Updated weights for policy 1, policy_version 27360 (0.0007) +[2023-10-11 16:03:51,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 55640064. Throughput: 0: 1655.5, 1: 1672.7. Samples: 13914160. 
Policy #0 lag: (min: 24.0, avg: 45.5, max: 56.0) +[2023-10-11 16:03:51,063][84230] Avg episode reward: [(0, '7.610'), (1, '7.240')] +[2023-10-11 16:03:52,451][85176] Updated weights for policy 0, policy_version 26982 (0.0008) +[2023-10-11 16:03:52,815][85175] Updated weights for policy 1, policy_version 27370 (0.0010) +[2023-10-11 16:03:52,823][85176] Updated weights for policy 0, policy_version 26992 (0.0007) +[2023-10-11 16:03:53,180][85175] Updated weights for policy 1, policy_version 27380 (0.0008) +[2023-10-11 16:03:53,202][85176] Updated weights for policy 0, policy_version 27002 (0.0007) +[2023-10-11 16:03:53,556][85175] Updated weights for policy 1, policy_version 27390 (0.0008) +[2023-10-11 16:03:56,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 55705600. Throughput: 0: 1671.9, 1: 1677.6. Samples: 13934460. Policy #0 lag: (min: 24.0, avg: 45.5, max: 56.0) +[2023-10-11 16:03:56,063][84230] Avg episode reward: [(0, '7.860'), (1, '7.150')] +[2023-10-11 16:03:57,191][85176] Updated weights for policy 0, policy_version 27012 (0.0008) +[2023-10-11 16:03:57,567][85176] Updated weights for policy 0, policy_version 27022 (0.0008) +[2023-10-11 16:03:57,610][85175] Updated weights for policy 1, policy_version 27400 (0.0008) +[2023-10-11 16:03:57,936][85176] Updated weights for policy 0, policy_version 27032 (0.0008) +[2023-10-11 16:03:57,978][85175] Updated weights for policy 1, policy_version 27410 (0.0009) +[2023-10-11 16:03:58,344][85175] Updated weights for policy 1, policy_version 27420 (0.0008) +[2023-10-11 16:04:01,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 55771136. Throughput: 0: 1674.1, 1: 1689.6. Samples: 13955314. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:04:01,064][84230] Avg episode reward: [(0, '8.080'), (1, '7.960')] +[2023-10-11 16:04:02,036][85176] Updated weights for policy 0, policy_version 27042 (0.0009) +[2023-10-11 16:04:02,358][85175] Updated weights for policy 1, policy_version 27430 (0.0009) +[2023-10-11 16:04:02,449][85176] Updated weights for policy 0, policy_version 27052 (0.0008) +[2023-10-11 16:04:02,726][85175] Updated weights for policy 1, policy_version 27440 (0.0009) +[2023-10-11 16:04:02,824][85176] Updated weights for policy 0, policy_version 27062 (0.0007) +[2023-10-11 16:04:03,086][85175] Updated weights for policy 1, policy_version 27450 (0.0010) +[2023-10-11 16:04:03,191][85176] Updated weights for policy 0, policy_version 27072 (0.0008) +[2023-10-11 16:04:06,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 55836672. Throughput: 0: 1667.8, 1: 1662.8. Samples: 13964424. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:04:06,064][84230] Avg episode reward: [(0, '7.800'), (1, '8.090')] +[2023-10-11 16:04:07,128][85175] Updated weights for policy 1, policy_version 27460 (0.0007) +[2023-10-11 16:04:07,209][85176] Updated weights for policy 0, policy_version 27082 (0.0008) +[2023-10-11 16:04:07,503][85175] Updated weights for policy 1, policy_version 27470 (0.0007) +[2023-10-11 16:04:07,585][85176] Updated weights for policy 0, policy_version 27092 (0.0008) +[2023-10-11 16:04:07,872][85175] Updated weights for policy 1, policy_version 27480 (0.0009) +[2023-10-11 16:04:07,961][85176] Updated weights for policy 0, policy_version 27102 (0.0009) +[2023-10-11 16:04:11,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 55902208. Throughput: 0: 1673.9, 1: 1692.8. Samples: 13985014. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:04:11,063][84230] Avg episode reward: [(0, '7.320'), (1, '7.420')] +[2023-10-11 16:04:11,916][85175] Updated weights for policy 1, policy_version 27490 (0.0009) +[2023-10-11 16:04:12,155][85176] Updated weights for policy 0, policy_version 27112 (0.0007) +[2023-10-11 16:04:12,286][85175] Updated weights for policy 1, policy_version 27500 (0.0010) +[2023-10-11 16:04:12,525][85176] Updated weights for policy 0, policy_version 27122 (0.0009) +[2023-10-11 16:04:12,654][85175] Updated weights for policy 1, policy_version 27510 (0.0008) +[2023-10-11 16:04:12,894][85176] Updated weights for policy 0, policy_version 27132 (0.0007) +[2023-10-11 16:04:13,014][85175] Updated weights for policy 1, policy_version 27520 (0.0008) +[2023-10-11 16:04:16,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 55967744. Throughput: 0: 1672.8, 1: 1699.8. Samples: 14005778. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:04:16,064][84230] Avg episode reward: [(0, '7.020'), (1, '7.520')] +[2023-10-11 16:04:16,076][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000027136_27787264.pth... +[2023-10-11 16:04:16,076][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000027520_28180480.pth... 
+[2023-10-11 16:04:16,111][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000025952_26574848.pth +[2023-10-11 16:04:16,118][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000025600_26214400.pth +[2023-10-11 16:04:16,897][85176] Updated weights for policy 0, policy_version 27142 (0.0009) +[2023-10-11 16:04:17,265][85176] Updated weights for policy 0, policy_version 27152 (0.0007) +[2023-10-11 16:04:17,293][85175] Updated weights for policy 1, policy_version 27530 (0.0009) +[2023-10-11 16:04:17,633][85176] Updated weights for policy 0, policy_version 27162 (0.0007) +[2023-10-11 16:04:17,667][85175] Updated weights for policy 1, policy_version 27540 (0.0008) +[2023-10-11 16:04:18,036][85175] Updated weights for policy 1, policy_version 27550 (0.0010) +[2023-10-11 16:04:21,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 56033280. Throughput: 0: 1669.5, 1: 1676.5. Samples: 14014670. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:04:21,063][84230] Avg episode reward: [(0, '7.460'), (1, '7.680')] +[2023-10-11 16:04:21,709][85176] Updated weights for policy 0, policy_version 27172 (0.0008) +[2023-10-11 16:04:21,976][85175] Updated weights for policy 1, policy_version 27560 (0.0007) +[2023-10-11 16:04:22,074][85176] Updated weights for policy 0, policy_version 27182 (0.0009) +[2023-10-11 16:04:22,341][85175] Updated weights for policy 1, policy_version 27570 (0.0009) +[2023-10-11 16:04:22,454][85176] Updated weights for policy 0, policy_version 27192 (0.0009) +[2023-10-11 16:04:22,714][85175] Updated weights for policy 1, policy_version 27580 (0.0009) +[2023-10-11 16:04:26,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 56098816. Throughput: 0: 1666.4, 1: 1694.8. Samples: 14035174. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:04:26,064][84230] Avg episode reward: [(0, '7.370'), (1, '7.490')] +[2023-10-11 16:04:26,454][85176] Updated weights for policy 0, policy_version 27202 (0.0009) +[2023-10-11 16:04:26,742][85175] Updated weights for policy 1, policy_version 27590 (0.0009) +[2023-10-11 16:04:26,832][85176] Updated weights for policy 0, policy_version 27212 (0.0008) +[2023-10-11 16:04:27,121][85175] Updated weights for policy 1, policy_version 27600 (0.0007) +[2023-10-11 16:04:27,211][85176] Updated weights for policy 0, policy_version 27222 (0.0008) +[2023-10-11 16:04:27,489][85175] Updated weights for policy 1, policy_version 27610 (0.0008) +[2023-10-11 16:04:27,572][85176] Updated weights for policy 0, policy_version 27232 (0.0008) +[2023-10-11 16:04:31,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 56164352. Throughput: 0: 1667.7, 1: 1694.7. Samples: 14055950. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:04:31,064][84230] Avg episode reward: [(0, '7.820'), (1, '7.420')] +[2023-10-11 16:04:31,541][85175] Updated weights for policy 1, policy_version 27620 (0.0008) +[2023-10-11 16:04:31,880][85176] Updated weights for policy 0, policy_version 27242 (0.0007) +[2023-10-11 16:04:31,910][85175] Updated weights for policy 1, policy_version 27630 (0.0008) +[2023-10-11 16:04:32,256][85176] Updated weights for policy 0, policy_version 27252 (0.0010) +[2023-10-11 16:04:32,279][85175] Updated weights for policy 1, policy_version 27640 (0.0008) +[2023-10-11 16:04:32,618][85176] Updated weights for policy 0, policy_version 27262 (0.0008) +[2023-10-11 16:04:36,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 56229888. Throughput: 0: 1668.4, 1: 1682.9. Samples: 14064966. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:04:36,063][84230] Avg episode reward: [(0, '8.050'), (1, '7.570')] +[2023-10-11 16:04:36,131][85175] Updated weights for policy 1, policy_version 27650 (0.0007) +[2023-10-11 16:04:36,499][85175] Updated weights for policy 1, policy_version 27660 (0.0007) +[2023-10-11 16:04:36,754][85176] Updated weights for policy 0, policy_version 27272 (0.0009) +[2023-10-11 16:04:36,861][85175] Updated weights for policy 1, policy_version 27670 (0.0007) +[2023-10-11 16:04:37,130][85176] Updated weights for policy 0, policy_version 27282 (0.0008) +[2023-10-11 16:04:37,230][85175] Updated weights for policy 1, policy_version 27680 (0.0009) +[2023-10-11 16:04:37,497][85176] Updated weights for policy 0, policy_version 27292 (0.0010) +[2023-10-11 16:04:41,062][84230] Fps is (10 sec: 13107.7, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 56295424. Throughput: 0: 1662.2, 1: 1695.6. Samples: 14085560. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 16:04:41,063][84230] Avg episode reward: [(0, '7.790'), (1, '7.890')] +[2023-10-11 16:04:41,337][85175] Updated weights for policy 1, policy_version 27690 (0.0007) +[2023-10-11 16:04:41,499][85176] Updated weights for policy 0, policy_version 27302 (0.0008) +[2023-10-11 16:04:41,708][85175] Updated weights for policy 1, policy_version 27700 (0.0007) +[2023-10-11 16:04:41,864][85176] Updated weights for policy 0, policy_version 27312 (0.0008) +[2023-10-11 16:04:42,085][85175] Updated weights for policy 1, policy_version 27710 (0.0009) +[2023-10-11 16:04:42,237][85176] Updated weights for policy 0, policy_version 27322 (0.0008) +[2023-10-11 16:04:46,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 56360960. Throughput: 0: 1661.4, 1: 1695.1. Samples: 14106358. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 16:04:46,063][84230] Avg episode reward: [(0, '7.680'), (1, '7.800')] +[2023-10-11 16:04:46,152][85175] Updated weights for policy 1, policy_version 27720 (0.0010) +[2023-10-11 16:04:46,429][85176] Updated weights for policy 0, policy_version 27332 (0.0009) +[2023-10-11 16:04:46,523][85175] Updated weights for policy 1, policy_version 27730 (0.0008) +[2023-10-11 16:04:46,794][85176] Updated weights for policy 0, policy_version 27342 (0.0010) +[2023-10-11 16:04:46,881][85175] Updated weights for policy 1, policy_version 27740 (0.0009) +[2023-10-11 16:04:47,166][85176] Updated weights for policy 0, policy_version 27352 (0.0008) +[2023-10-11 16:04:51,030][85175] Updated weights for policy 1, policy_version 27750 (0.0007) +[2023-10-11 16:04:51,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 56426496. Throughput: 0: 1663.1, 1: 1693.1. Samples: 14115450. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 16:04:51,064][84230] Avg episode reward: [(0, '6.850'), (1, '7.520')] +[2023-10-11 16:04:51,383][85176] Updated weights for policy 0, policy_version 27362 (0.0009) +[2023-10-11 16:04:51,390][85175] Updated weights for policy 1, policy_version 27760 (0.0009) +[2023-10-11 16:04:51,764][85175] Updated weights for policy 1, policy_version 27770 (0.0007) +[2023-10-11 16:04:51,780][85176] Updated weights for policy 0, policy_version 27372 (0.0008) +[2023-10-11 16:04:52,160][85176] Updated weights for policy 0, policy_version 27382 (0.0009) +[2023-10-11 16:04:52,526][85176] Updated weights for policy 0, policy_version 27392 (0.0008) +[2023-10-11 16:04:55,917][85175] Updated weights for policy 1, policy_version 27780 (0.0008) +[2023-10-11 16:04:56,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 56492032. Throughput: 0: 1660.9, 1: 1692.4. Samples: 14135916. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 16:04:56,063][84230] Avg episode reward: [(0, '7.200'), (1, '7.550')] +[2023-10-11 16:04:56,294][85175] Updated weights for policy 1, policy_version 27790 (0.0009) +[2023-10-11 16:04:56,592][85176] Updated weights for policy 0, policy_version 27402 (0.0007) +[2023-10-11 16:04:56,655][85175] Updated weights for policy 1, policy_version 27800 (0.0008) +[2023-10-11 16:04:56,966][85176] Updated weights for policy 0, policy_version 27412 (0.0007) +[2023-10-11 16:04:57,349][85176] Updated weights for policy 0, policy_version 27422 (0.0009) +[2023-10-11 16:05:00,841][85175] Updated weights for policy 1, policy_version 27810 (0.0008) +[2023-10-11 16:05:01,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 56557568. Throughput: 0: 1663.2, 1: 1687.0. Samples: 14156538. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:05:01,064][84230] Avg episode reward: [(0, '7.580'), (1, '7.420')] +[2023-10-11 16:05:01,207][85175] Updated weights for policy 1, policy_version 27820 (0.0009) +[2023-10-11 16:05:01,468][85176] Updated weights for policy 0, policy_version 27432 (0.0007) +[2023-10-11 16:05:01,580][85175] Updated weights for policy 1, policy_version 27830 (0.0008) +[2023-10-11 16:05:01,831][85176] Updated weights for policy 0, policy_version 27442 (0.0007) +[2023-10-11 16:05:01,949][85175] Updated weights for policy 1, policy_version 27840 (0.0007) +[2023-10-11 16:05:02,207][85176] Updated weights for policy 0, policy_version 27452 (0.0007) +[2023-10-11 16:05:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 56623104. Throughput: 0: 1666.5, 1: 1688.1. Samples: 14165626. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:05:06,063][84230] Avg episode reward: [(0, '8.070'), (1, '7.470')] +[2023-10-11 16:05:06,112][85175] Updated weights for policy 1, policy_version 27850 (0.0010) +[2023-10-11 16:05:06,266][85176] Updated weights for policy 0, policy_version 27462 (0.0010) +[2023-10-11 16:05:06,486][85175] Updated weights for policy 1, policy_version 27860 (0.0009) +[2023-10-11 16:05:06,643][85176] Updated weights for policy 0, policy_version 27472 (0.0009) +[2023-10-11 16:05:06,851][85175] Updated weights for policy 1, policy_version 27870 (0.0007) +[2023-10-11 16:05:07,019][85176] Updated weights for policy 0, policy_version 27482 (0.0007) +[2023-10-11 16:05:10,763][85175] Updated weights for policy 1, policy_version 27880 (0.0007) +[2023-10-11 16:05:11,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 56688640. Throughput: 0: 1661.0, 1: 1690.8. Samples: 14186006. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:05:11,064][84230] Avg episode reward: [(0, '7.750'), (1, '7.820')] +[2023-10-11 16:05:11,131][85175] Updated weights for policy 1, policy_version 27890 (0.0009) +[2023-10-11 16:05:11,229][85176] Updated weights for policy 0, policy_version 27492 (0.0009) +[2023-10-11 16:05:11,502][85175] Updated weights for policy 1, policy_version 27900 (0.0009) +[2023-10-11 16:05:11,598][85176] Updated weights for policy 0, policy_version 27502 (0.0008) +[2023-10-11 16:05:11,983][85176] Updated weights for policy 0, policy_version 27512 (0.0008) +[2023-10-11 16:05:15,449][85175] Updated weights for policy 1, policy_version 27910 (0.0009) +[2023-10-11 16:05:15,821][85175] Updated weights for policy 1, policy_version 27920 (0.0010) +[2023-10-11 16:05:16,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 56754176. Throughput: 0: 1658.3, 1: 1684.1. Samples: 14206356. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:05:16,063][84230] Avg episode reward: [(0, '7.560'), (1, '8.090')] +[2023-10-11 16:05:16,083][85176] Updated weights for policy 0, policy_version 27522 (0.0008) +[2023-10-11 16:05:16,191][85175] Updated weights for policy 1, policy_version 27930 (0.0008) +[2023-10-11 16:05:16,456][85176] Updated weights for policy 0, policy_version 27532 (0.0009) +[2023-10-11 16:05:16,824][85176] Updated weights for policy 0, policy_version 27542 (0.0010) +[2023-10-11 16:05:17,196][85176] Updated weights for policy 0, policy_version 27552 (0.0007) +[2023-10-11 16:05:20,189][85175] Updated weights for policy 1, policy_version 27940 (0.0008) +[2023-10-11 16:05:20,565][85175] Updated weights for policy 1, policy_version 27950 (0.0009) +[2023-10-11 16:05:20,942][85175] Updated weights for policy 1, policy_version 27960 (0.0008) +[2023-10-11 16:05:21,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 56819712. Throughput: 0: 1658.6, 1: 1693.0. Samples: 14215790. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:05:21,063][84230] Avg episode reward: [(0, '7.300'), (1, '7.710')] +[2023-10-11 16:05:21,349][85176] Updated weights for policy 0, policy_version 27562 (0.0009) +[2023-10-11 16:05:21,722][85176] Updated weights for policy 0, policy_version 27572 (0.0010) +[2023-10-11 16:05:22,094][85176] Updated weights for policy 0, policy_version 27582 (0.0009) +[2023-10-11 16:05:24,956][85175] Updated weights for policy 1, policy_version 27970 (0.0008) +[2023-10-11 16:05:25,318][85175] Updated weights for policy 1, policy_version 27980 (0.0008) +[2023-10-11 16:05:25,691][85175] Updated weights for policy 1, policy_version 27990 (0.0009) +[2023-10-11 16:05:26,056][85175] Updated weights for policy 1, policy_version 28000 (0.0008) +[2023-10-11 16:05:26,062][84230] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 56918016. 
Throughput: 0: 1655.9, 1: 1688.1. Samples: 14236038. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:05:26,063][84230] Avg episode reward: [(0, '7.440'), (1, '7.190')] +[2023-10-11 16:05:26,310][85176] Updated weights for policy 0, policy_version 27592 (0.0010) +[2023-10-11 16:05:26,689][85176] Updated weights for policy 0, policy_version 27602 (0.0011) +[2023-10-11 16:05:27,059][85176] Updated weights for policy 0, policy_version 27612 (0.0011) +[2023-10-11 16:05:30,161][85175] Updated weights for policy 1, policy_version 28010 (0.0007) +[2023-10-11 16:05:30,522][85175] Updated weights for policy 1, policy_version 28020 (0.0010) +[2023-10-11 16:05:30,894][85175] Updated weights for policy 1, policy_version 28030 (0.0010) +[2023-10-11 16:05:31,063][84230] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 56983552. Throughput: 0: 1650.5, 1: 1669.5. Samples: 14255760. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:05:31,063][84230] Avg episode reward: [(0, '7.690'), (1, '7.670')] +[2023-10-11 16:05:31,302][85176] Updated weights for policy 0, policy_version 27622 (0.0008) +[2023-10-11 16:05:31,676][85176] Updated weights for policy 0, policy_version 27632 (0.0007) +[2023-10-11 16:05:32,042][85176] Updated weights for policy 0, policy_version 27642 (0.0010) +[2023-10-11 16:05:35,028][85175] Updated weights for policy 1, policy_version 28040 (0.0010) +[2023-10-11 16:05:35,399][85175] Updated weights for policy 1, policy_version 28050 (0.0011) +[2023-10-11 16:05:35,774][85175] Updated weights for policy 1, policy_version 28060 (0.0009) +[2023-10-11 16:05:36,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 57049088. Throughput: 0: 1650.9, 1: 1686.7. Samples: 14265644. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:05:36,063][84230] Avg episode reward: [(0, '7.730'), (1, '7.710')] +[2023-10-11 16:05:36,233][85176] Updated weights for policy 0, policy_version 27652 (0.0010) +[2023-10-11 16:05:36,611][85176] Updated weights for policy 0, policy_version 27662 (0.0009) +[2023-10-11 16:05:36,987][85176] Updated weights for policy 0, policy_version 27672 (0.0011) +[2023-10-11 16:05:39,860][85175] Updated weights for policy 1, policy_version 28070 (0.0009) +[2023-10-11 16:05:40,229][85175] Updated weights for policy 1, policy_version 28080 (0.0011) +[2023-10-11 16:05:40,609][85175] Updated weights for policy 1, policy_version 28090 (0.0010) +[2023-10-11 16:05:41,053][85176] Updated weights for policy 0, policy_version 27682 (0.0009) +[2023-10-11 16:05:41,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 57114624. Throughput: 0: 1651.0, 1: 1685.5. Samples: 14286056. Policy #0 lag: (min: 20.0, avg: 28.0, max: 52.0) +[2023-10-11 16:05:41,064][84230] Avg episode reward: [(0, '7.730'), (1, '7.720')] +[2023-10-11 16:05:41,431][85176] Updated weights for policy 0, policy_version 27692 (0.0008) +[2023-10-11 16:05:41,803][85176] Updated weights for policy 0, policy_version 27702 (0.0007) +[2023-10-11 16:05:42,175][85176] Updated weights for policy 0, policy_version 27712 (0.0010) +[2023-10-11 16:05:44,810][85175] Updated weights for policy 1, policy_version 28100 (0.0008) +[2023-10-11 16:05:45,184][85175] Updated weights for policy 1, policy_version 28110 (0.0008) +[2023-10-11 16:05:45,554][85175] Updated weights for policy 1, policy_version 28120 (0.0008) +[2023-10-11 16:05:46,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 57180160. Throughput: 0: 1650.2, 1: 1663.4. Samples: 14305650. 
Policy #0 lag: (min: 20.0, avg: 28.0, max: 52.0) +[2023-10-11 16:05:46,064][84230] Avg episode reward: [(0, '7.670'), (1, '7.630')] +[2023-10-11 16:05:46,318][85176] Updated weights for policy 0, policy_version 27722 (0.0009) +[2023-10-11 16:05:46,695][85176] Updated weights for policy 0, policy_version 27732 (0.0008) +[2023-10-11 16:05:47,073][85176] Updated weights for policy 0, policy_version 27742 (0.0007) +[2023-10-11 16:05:49,696][85175] Updated weights for policy 1, policy_version 28130 (0.0009) +[2023-10-11 16:05:50,056][85175] Updated weights for policy 1, policy_version 28140 (0.0009) +[2023-10-11 16:05:50,421][85175] Updated weights for policy 1, policy_version 28150 (0.0009) +[2023-10-11 16:05:50,787][85175] Updated weights for policy 1, policy_version 28160 (0.0007) +[2023-10-11 16:05:51,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 57245696. Throughput: 0: 1649.2, 1: 1684.7. Samples: 14315654. Policy #0 lag: (min: 20.0, avg: 28.0, max: 52.0) +[2023-10-11 16:05:51,063][84230] Avg episode reward: [(0, '7.870'), (1, '7.170')] +[2023-10-11 16:05:51,117][85176] Updated weights for policy 0, policy_version 27752 (0.0008) +[2023-10-11 16:05:51,482][85176] Updated weights for policy 0, policy_version 27762 (0.0007) +[2023-10-11 16:05:51,852][85176] Updated weights for policy 0, policy_version 27772 (0.0008) +[2023-10-11 16:05:54,898][85175] Updated weights for policy 1, policy_version 28170 (0.0009) +[2023-10-11 16:05:55,269][85175] Updated weights for policy 1, policy_version 28180 (0.0011) +[2023-10-11 16:05:55,637][85175] Updated weights for policy 1, policy_version 28190 (0.0008) +[2023-10-11 16:05:55,879][85176] Updated weights for policy 0, policy_version 27782 (0.0008) +[2023-10-11 16:05:56,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 57311232. Throughput: 0: 1653.8, 1: 1681.9. Samples: 14336114. 
Policy #0 lag: (min: 20.0, avg: 28.0, max: 52.0) +[2023-10-11 16:05:56,063][84230] Avg episode reward: [(0, '7.860'), (1, '7.130')] +[2023-10-11 16:05:56,248][85176] Updated weights for policy 0, policy_version 27792 (0.0007) +[2023-10-11 16:05:56,618][85176] Updated weights for policy 0, policy_version 27802 (0.0007) +[2023-10-11 16:05:59,592][85175] Updated weights for policy 1, policy_version 28200 (0.0009) +[2023-10-11 16:05:59,962][85175] Updated weights for policy 1, policy_version 28210 (0.0008) +[2023-10-11 16:06:00,325][85175] Updated weights for policy 1, policy_version 28220 (0.0010) +[2023-10-11 16:06:00,814][85176] Updated weights for policy 0, policy_version 27812 (0.0009) +[2023-10-11 16:06:01,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 57376768. Throughput: 0: 1652.0, 1: 1658.8. Samples: 14355342. Policy #0 lag: (min: 9.0, avg: 25.0, max: 41.0) +[2023-10-11 16:06:01,063][84230] Avg episode reward: [(0, '7.620'), (1, '7.430')] +[2023-10-11 16:06:01,190][85176] Updated weights for policy 0, policy_version 27822 (0.0008) +[2023-10-11 16:06:01,574][85176] Updated weights for policy 0, policy_version 27832 (0.0010) +[2023-10-11 16:06:04,429][85175] Updated weights for policy 1, policy_version 28230 (0.0009) +[2023-10-11 16:06:04,794][85175] Updated weights for policy 1, policy_version 28240 (0.0008) +[2023-10-11 16:06:05,160][85175] Updated weights for policy 1, policy_version 28250 (0.0010) +[2023-10-11 16:06:05,521][85176] Updated weights for policy 0, policy_version 27842 (0.0011) +[2023-10-11 16:06:05,890][85176] Updated weights for policy 0, policy_version 27852 (0.0010) +[2023-10-11 16:06:06,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 57442304. Throughput: 0: 1652.8, 1: 1682.0. Samples: 14365858. 
Policy #0 lag: (min: 9.0, avg: 25.0, max: 41.0) +[2023-10-11 16:06:06,064][84230] Avg episode reward: [(0, '7.020'), (1, '7.560')] +[2023-10-11 16:06:06,274][85176] Updated weights for policy 0, policy_version 27862 (0.0010) +[2023-10-11 16:06:06,647][85176] Updated weights for policy 0, policy_version 27872 (0.0009) +[2023-10-11 16:06:09,195][85175] Updated weights for policy 1, policy_version 28260 (0.0009) +[2023-10-11 16:06:09,574][85175] Updated weights for policy 1, policy_version 28270 (0.0008) +[2023-10-11 16:06:09,947][85175] Updated weights for policy 1, policy_version 28280 (0.0008) +[2023-10-11 16:06:10,755][85176] Updated weights for policy 0, policy_version 27882 (0.0010) +[2023-10-11 16:06:11,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 57507840. Throughput: 0: 1661.6, 1: 1673.0. Samples: 14386092. Policy #0 lag: (min: 9.0, avg: 25.0, max: 41.0) +[2023-10-11 16:06:11,063][84230] Avg episode reward: [(0, '7.260'), (1, '7.640')] +[2023-10-11 16:06:11,120][85176] Updated weights for policy 0, policy_version 27892 (0.0010) +[2023-10-11 16:06:11,486][85176] Updated weights for policy 0, policy_version 27902 (0.0010) +[2023-10-11 16:06:14,059][85175] Updated weights for policy 1, policy_version 28290 (0.0009) +[2023-10-11 16:06:14,432][85175] Updated weights for policy 1, policy_version 28300 (0.0008) +[2023-10-11 16:06:14,794][85175] Updated weights for policy 1, policy_version 28310 (0.0007) +[2023-10-11 16:06:15,160][85175] Updated weights for policy 1, policy_version 28320 (0.0007) +[2023-10-11 16:06:15,669][85176] Updated weights for policy 0, policy_version 27912 (0.0009) +[2023-10-11 16:06:16,038][85176] Updated weights for policy 0, policy_version 27922 (0.0010) +[2023-10-11 16:06:16,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 57573376. Throughput: 0: 1655.7, 1: 1674.0. Samples: 14405598. 
Policy #0 lag: (min: 9.0, avg: 25.0, max: 41.0) +[2023-10-11 16:06:16,064][84230] Avg episode reward: [(0, '7.850'), (1, '7.870')] +[2023-10-11 16:06:16,074][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000028320_28999680.pth... +[2023-10-11 16:06:16,115][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000026752_27394048.pth +[2023-10-11 16:06:16,414][85176] Updated weights for policy 0, policy_version 27932 (0.0007) +[2023-10-11 16:06:16,557][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000027936_28606464.pth... +[2023-10-11 16:06:16,592][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000026368_27000832.pth +[2023-10-11 16:06:19,076][85175] Updated weights for policy 1, policy_version 28330 (0.0009) +[2023-10-11 16:06:19,452][85175] Updated weights for policy 1, policy_version 28340 (0.0009) +[2023-10-11 16:06:19,808][85175] Updated weights for policy 1, policy_version 28350 (0.0009) +[2023-10-11 16:06:20,574][85176] Updated weights for policy 0, policy_version 27942 (0.0010) +[2023-10-11 16:06:20,949][85176] Updated weights for policy 0, policy_version 27952 (0.0009) +[2023-10-11 16:06:21,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 57638912. Throughput: 0: 1660.1, 1: 1690.1. Samples: 14416404. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) +[2023-10-11 16:06:21,063][84230] Avg episode reward: [(0, '8.450'), (1, '7.650')] +[2023-10-11 16:06:21,328][85176] Updated weights for policy 0, policy_version 27962 (0.0007) +[2023-10-11 16:06:23,853][85175] Updated weights for policy 1, policy_version 28360 (0.0009) +[2023-10-11 16:06:24,221][85175] Updated weights for policy 1, policy_version 28370 (0.0008) +[2023-10-11 16:06:24,600][85175] Updated weights for policy 1, policy_version 28380 (0.0007) +[2023-10-11 16:06:25,562][85176] Updated weights for policy 0, policy_version 27972 (0.0007) +[2023-10-11 16:06:25,926][85176] Updated weights for policy 0, policy_version 27982 (0.0007) +[2023-10-11 16:06:26,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 57704448. Throughput: 0: 1665.2, 1: 1667.5. Samples: 14436026. Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) +[2023-10-11 16:06:26,063][84230] Avg episode reward: [(0, '8.200'), (1, '7.160')] +[2023-10-11 16:06:26,308][85176] Updated weights for policy 0, policy_version 27992 (0.0008) +[2023-10-11 16:06:28,400][85175] Updated weights for policy 1, policy_version 28390 (0.0009) +[2023-10-11 16:06:28,760][85175] Updated weights for policy 1, policy_version 28400 (0.0008) +[2023-10-11 16:06:29,128][85175] Updated weights for policy 1, policy_version 28410 (0.0008) +[2023-10-11 16:06:30,293][85176] Updated weights for policy 0, policy_version 28002 (0.0009) +[2023-10-11 16:06:30,657][85176] Updated weights for policy 0, policy_version 28012 (0.0010) +[2023-10-11 16:06:31,028][85176] Updated weights for policy 0, policy_version 28022 (0.0007) +[2023-10-11 16:06:31,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 57769984. Throughput: 0: 1662.2, 1: 1684.4. Samples: 14456246. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) +[2023-10-11 16:06:31,063][84230] Avg episode reward: [(0, '7.690'), (1, '7.480')] +[2023-10-11 16:06:31,398][85176] Updated weights for policy 0, policy_version 28032 (0.0008) +[2023-10-11 16:06:33,019][85175] Updated weights for policy 1, policy_version 28420 (0.0008) +[2023-10-11 16:06:33,388][85175] Updated weights for policy 1, policy_version 28430 (0.0007) +[2023-10-11 16:06:33,765][85175] Updated weights for policy 1, policy_version 28440 (0.0010) +[2023-10-11 16:06:35,362][85176] Updated weights for policy 0, policy_version 28042 (0.0007) +[2023-10-11 16:06:35,750][85176] Updated weights for policy 0, policy_version 28052 (0.0008) +[2023-10-11 16:06:36,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 57835520. Throughput: 0: 1673.0, 1: 1684.2. Samples: 14466726. Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) +[2023-10-11 16:06:36,063][84230] Avg episode reward: [(0, '7.170'), (1, '7.680')] +[2023-10-11 16:06:36,129][85176] Updated weights for policy 0, policy_version 28062 (0.0007) +[2023-10-11 16:06:37,878][85175] Updated weights for policy 1, policy_version 28450 (0.0010) +[2023-10-11 16:06:38,252][85175] Updated weights for policy 1, policy_version 28460 (0.0009) +[2023-10-11 16:06:38,612][85175] Updated weights for policy 1, policy_version 28470 (0.0011) +[2023-10-11 16:06:38,981][85175] Updated weights for policy 1, policy_version 28480 (0.0009) +[2023-10-11 16:06:40,239][85176] Updated weights for policy 0, policy_version 28072 (0.0007) +[2023-10-11 16:06:40,622][85176] Updated weights for policy 0, policy_version 28082 (0.0008) +[2023-10-11 16:06:41,001][85176] Updated weights for policy 0, policy_version 28092 (0.0007) +[2023-10-11 16:06:41,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 57901056. Throughput: 0: 1670.7, 1: 1670.1. Samples: 14486448. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 16.0) +[2023-10-11 16:06:41,064][84230] Avg episode reward: [(0, '7.330'), (1, '7.700')] +[2023-10-11 16:06:43,229][85175] Updated weights for policy 1, policy_version 28490 (0.0007) +[2023-10-11 16:06:43,610][85175] Updated weights for policy 1, policy_version 28500 (0.0008) +[2023-10-11 16:06:43,978][85175] Updated weights for policy 1, policy_version 28510 (0.0009) +[2023-10-11 16:06:45,137][85176] Updated weights for policy 0, policy_version 28102 (0.0007) +[2023-10-11 16:06:45,504][85176] Updated weights for policy 0, policy_version 28112 (0.0008) +[2023-10-11 16:06:45,876][85176] Updated weights for policy 0, policy_version 28122 (0.0009) +[2023-10-11 16:06:46,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 57966592. Throughput: 0: 1657.9, 1: 1694.4. Samples: 14506198. Policy #0 lag: (min: 14.0, avg: 14.0, max: 16.0) +[2023-10-11 16:06:46,064][84230] Avg episode reward: [(0, '7.600'), (1, '7.700')] +[2023-10-11 16:06:47,940][85175] Updated weights for policy 1, policy_version 28520 (0.0008) +[2023-10-11 16:06:48,307][85175] Updated weights for policy 1, policy_version 28530 (0.0007) +[2023-10-11 16:06:48,679][85175] Updated weights for policy 1, policy_version 28540 (0.0007) +[2023-10-11 16:06:50,097][85176] Updated weights for policy 0, policy_version 28132 (0.0008) +[2023-10-11 16:06:50,469][85176] Updated weights for policy 0, policy_version 28142 (0.0007) +[2023-10-11 16:06:50,835][85176] Updated weights for policy 0, policy_version 28152 (0.0009) +[2023-10-11 16:06:51,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 58032128. Throughput: 0: 1670.2, 1: 1673.1. Samples: 14516304. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 16.0) +[2023-10-11 16:06:51,063][84230] Avg episode reward: [(0, '8.300'), (1, '7.330')] +[2023-10-11 16:06:52,788][85175] Updated weights for policy 1, policy_version 28550 (0.0010) +[2023-10-11 16:06:53,164][85175] Updated weights for policy 1, policy_version 28560 (0.0008) +[2023-10-11 16:06:53,534][85175] Updated weights for policy 1, policy_version 28570 (0.0009) +[2023-10-11 16:06:55,029][85176] Updated weights for policy 0, policy_version 28162 (0.0009) +[2023-10-11 16:06:55,394][85176] Updated weights for policy 0, policy_version 28172 (0.0009) +[2023-10-11 16:06:55,764][85176] Updated weights for policy 0, policy_version 28182 (0.0008) +[2023-10-11 16:06:56,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 58097664. Throughput: 0: 1668.2, 1: 1673.5. Samples: 14536470. Policy #0 lag: (min: 14.0, avg: 14.0, max: 16.0) +[2023-10-11 16:06:56,064][84230] Avg episode reward: [(0, '7.990'), (1, '7.830')] +[2023-10-11 16:06:56,145][85176] Updated weights for policy 0, policy_version 28192 (0.0008) +[2023-10-11 16:06:57,670][85175] Updated weights for policy 1, policy_version 28580 (0.0008) +[2023-10-11 16:06:58,046][85175] Updated weights for policy 1, policy_version 28590 (0.0007) +[2023-10-11 16:06:58,419][85175] Updated weights for policy 1, policy_version 28600 (0.0008) +[2023-10-11 16:07:00,273][85176] Updated weights for policy 0, policy_version 28202 (0.0011) +[2023-10-11 16:07:00,644][85176] Updated weights for policy 0, policy_version 28212 (0.0009) +[2023-10-11 16:07:01,021][85176] Updated weights for policy 0, policy_version 28222 (0.0007) +[2023-10-11 16:07:01,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 58163200. Throughput: 0: 1661.0, 1: 1692.5. Samples: 14556504. 
Policy #0 lag: (min: 2.0, avg: 4.5, max: 30.0) +[2023-10-11 16:07:01,063][84230] Avg episode reward: [(0, '7.410'), (1, '7.920')] +[2023-10-11 16:07:02,556][85175] Updated weights for policy 1, policy_version 28610 (0.0009) +[2023-10-11 16:07:02,919][85175] Updated weights for policy 1, policy_version 28620 (0.0010) +[2023-10-11 16:07:03,284][85175] Updated weights for policy 1, policy_version 28630 (0.0010) +[2023-10-11 16:07:03,651][85175] Updated weights for policy 1, policy_version 28640 (0.0011) +[2023-10-11 16:07:04,910][85176] Updated weights for policy 0, policy_version 28232 (0.0007) +[2023-10-11 16:07:05,283][85176] Updated weights for policy 0, policy_version 28242 (0.0008) +[2023-10-11 16:07:05,655][85176] Updated weights for policy 0, policy_version 28252 (0.0007) +[2023-10-11 16:07:06,062][84230] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 58261504. Throughput: 0: 1674.7, 1: 1660.0. Samples: 14566462. Policy #0 lag: (min: 2.0, avg: 4.5, max: 30.0) +[2023-10-11 16:07:06,063][84230] Avg episode reward: [(0, '6.780'), (1, '7.530')] +[2023-10-11 16:07:07,857][85175] Updated weights for policy 1, policy_version 28650 (0.0007) +[2023-10-11 16:07:08,235][85175] Updated weights for policy 1, policy_version 28660 (0.0008) +[2023-10-11 16:07:08,601][85175] Updated weights for policy 1, policy_version 28670 (0.0009) +[2023-10-11 16:07:09,767][85176] Updated weights for policy 0, policy_version 28262 (0.0009) +[2023-10-11 16:07:10,148][85176] Updated weights for policy 0, policy_version 28272 (0.0008) +[2023-10-11 16:07:10,522][85176] Updated weights for policy 0, policy_version 28282 (0.0007) +[2023-10-11 16:07:11,063][84230] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 58327040. Throughput: 0: 1669.5, 1: 1676.3. Samples: 14586588. 
Policy #0 lag: (min: 2.0, avg: 4.5, max: 30.0) +[2023-10-11 16:07:11,063][84230] Avg episode reward: [(0, '7.000'), (1, '7.420')] +[2023-10-11 16:07:12,516][85175] Updated weights for policy 1, policy_version 28680 (0.0007) +[2023-10-11 16:07:12,875][85175] Updated weights for policy 1, policy_version 28690 (0.0009) +[2023-10-11 16:07:13,242][85175] Updated weights for policy 1, policy_version 28700 (0.0010) +[2023-10-11 16:07:14,643][85176] Updated weights for policy 0, policy_version 28292 (0.0008) +[2023-10-11 16:07:15,015][85176] Updated weights for policy 0, policy_version 28302 (0.0010) +[2023-10-11 16:07:15,392][85176] Updated weights for policy 0, policy_version 28312 (0.0008) +[2023-10-11 16:07:16,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 58392576. Throughput: 0: 1644.0, 1: 1690.0. Samples: 14606278. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-11 16:07:16,064][84230] Avg episode reward: [(0, '7.910'), (1, '7.720')] +[2023-10-11 16:07:17,170][85175] Updated weights for policy 1, policy_version 28710 (0.0007) +[2023-10-11 16:07:17,531][85175] Updated weights for policy 1, policy_version 28720 (0.0009) +[2023-10-11 16:07:17,907][85175] Updated weights for policy 1, policy_version 28730 (0.0010) +[2023-10-11 16:07:19,522][85176] Updated weights for policy 0, policy_version 28322 (0.0008) +[2023-10-11 16:07:19,892][85176] Updated weights for policy 0, policy_version 28332 (0.0009) +[2023-10-11 16:07:20,260][85176] Updated weights for policy 0, policy_version 28342 (0.0009) +[2023-10-11 16:07:20,637][85176] Updated weights for policy 0, policy_version 28352 (0.0007) +[2023-10-11 16:07:21,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 58458112. Throughput: 0: 1656.0, 1: 1670.3. Samples: 14616408. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-11 16:07:21,063][84230] Avg episode reward: [(0, '8.240'), (1, '7.860')] +[2023-10-11 16:07:22,000][85175] Updated weights for policy 1, policy_version 28740 (0.0008) +[2023-10-11 16:07:22,365][85175] Updated weights for policy 1, policy_version 28750 (0.0009) +[2023-10-11 16:07:22,732][85175] Updated weights for policy 1, policy_version 28760 (0.0010) +[2023-10-11 16:07:24,716][85176] Updated weights for policy 0, policy_version 28362 (0.0007) +[2023-10-11 16:07:25,091][85176] Updated weights for policy 0, policy_version 28372 (0.0008) +[2023-10-11 16:07:25,459][85176] Updated weights for policy 0, policy_version 28382 (0.0009) +[2023-10-11 16:07:26,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 58523648. Throughput: 0: 1654.6, 1: 1687.5. Samples: 14636840. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-11 16:07:26,063][84230] Avg episode reward: [(0, '7.710'), (1, '7.690')] +[2023-10-11 16:07:26,822][85175] Updated weights for policy 1, policy_version 28770 (0.0010) +[2023-10-11 16:07:27,194][85175] Updated weights for policy 1, policy_version 28780 (0.0009) +[2023-10-11 16:07:27,562][85175] Updated weights for policy 1, policy_version 28790 (0.0008) +[2023-10-11 16:07:27,926][85175] Updated weights for policy 1, policy_version 28800 (0.0008) +[2023-10-11 16:07:29,759][85176] Updated weights for policy 0, policy_version 28392 (0.0010) +[2023-10-11 16:07:30,133][85176] Updated weights for policy 0, policy_version 28402 (0.0011) +[2023-10-11 16:07:30,497][85176] Updated weights for policy 0, policy_version 28412 (0.0011) +[2023-10-11 16:07:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 58589184. Throughput: 0: 1650.5, 1: 1690.1. Samples: 14656524. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-11 16:07:31,063][84230] Avg episode reward: [(0, '7.120'), (1, '7.320')] +[2023-10-11 16:07:32,290][85175] Updated weights for policy 1, policy_version 28810 (0.0009) +[2023-10-11 16:07:32,662][85175] Updated weights for policy 1, policy_version 28820 (0.0008) +[2023-10-11 16:07:33,032][85175] Updated weights for policy 1, policy_version 28830 (0.0009) +[2023-10-11 16:07:34,510][85176] Updated weights for policy 0, policy_version 28422 (0.0008) +[2023-10-11 16:07:34,884][85176] Updated weights for policy 0, policy_version 28432 (0.0009) +[2023-10-11 16:07:35,265][85176] Updated weights for policy 0, policy_version 28442 (0.0011) +[2023-10-11 16:07:36,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 58654720. Throughput: 0: 1662.6, 1: 1675.0. Samples: 14666496. Policy #0 lag: (min: 31.0, avg: 31.1, max: 37.0) +[2023-10-11 16:07:36,063][84230] Avg episode reward: [(0, '7.080'), (1, '7.120')] +[2023-10-11 16:07:36,987][85175] Updated weights for policy 1, policy_version 28840 (0.0007) +[2023-10-11 16:07:37,351][85175] Updated weights for policy 1, policy_version 28850 (0.0008) +[2023-10-11 16:07:37,734][85175] Updated weights for policy 1, policy_version 28860 (0.0008) +[2023-10-11 16:07:39,333][85176] Updated weights for policy 0, policy_version 28452 (0.0011) +[2023-10-11 16:07:39,705][85176] Updated weights for policy 0, policy_version 28462 (0.0009) +[2023-10-11 16:07:40,078][85176] Updated weights for policy 0, policy_version 28472 (0.0007) +[2023-10-11 16:07:41,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 58720256. Throughput: 0: 1653.2, 1: 1689.1. Samples: 14686870. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 37.0) +[2023-10-11 16:07:41,063][84230] Avg episode reward: [(0, '7.630'), (1, '7.380')] +[2023-10-11 16:07:41,758][85175] Updated weights for policy 1, policy_version 28870 (0.0009) +[2023-10-11 16:07:42,126][85175] Updated weights for policy 1, policy_version 28880 (0.0009) +[2023-10-11 16:07:42,496][85175] Updated weights for policy 1, policy_version 28890 (0.0009) +[2023-10-11 16:07:44,434][85176] Updated weights for policy 0, policy_version 28482 (0.0009) +[2023-10-11 16:07:44,815][85176] Updated weights for policy 0, policy_version 28492 (0.0007) +[2023-10-11 16:07:45,195][85176] Updated weights for policy 0, policy_version 28502 (0.0009) +[2023-10-11 16:07:45,574][85176] Updated weights for policy 0, policy_version 28512 (0.0009) +[2023-10-11 16:07:46,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13329.3). Total num frames: 58785792. Throughput: 0: 1649.8, 1: 1686.6. Samples: 14706642. Policy #0 lag: (min: 31.0, avg: 31.1, max: 37.0) +[2023-10-11 16:07:46,064][84230] Avg episode reward: [(0, '7.770'), (1, '7.870')] +[2023-10-11 16:07:46,490][85175] Updated weights for policy 1, policy_version 28900 (0.0008) +[2023-10-11 16:07:46,854][85175] Updated weights for policy 1, policy_version 28910 (0.0007) +[2023-10-11 16:07:47,214][85175] Updated weights for policy 1, policy_version 28920 (0.0008) +[2023-10-11 16:07:49,627][85176] Updated weights for policy 0, policy_version 28522 (0.0009) +[2023-10-11 16:07:50,008][85176] Updated weights for policy 0, policy_version 28532 (0.0009) +[2023-10-11 16:07:50,395][85176] Updated weights for policy 0, policy_version 28542 (0.0007) +[2023-10-11 16:07:51,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 58851328. Throughput: 0: 1657.9, 1: 1683.0. Samples: 14716804. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 37.0) +[2023-10-11 16:07:51,063][84230] Avg episode reward: [(0, '7.660'), (1, '7.930')] +[2023-10-11 16:07:51,303][85175] Updated weights for policy 1, policy_version 28930 (0.0008) +[2023-10-11 16:07:51,675][85175] Updated weights for policy 1, policy_version 28940 (0.0007) +[2023-10-11 16:07:52,036][85175] Updated weights for policy 1, policy_version 28950 (0.0007) +[2023-10-11 16:07:52,400][85175] Updated weights for policy 1, policy_version 28960 (0.0010) +[2023-10-11 16:07:54,708][85176] Updated weights for policy 0, policy_version 28552 (0.0009) +[2023-10-11 16:07:55,078][85176] Updated weights for policy 0, policy_version 28562 (0.0008) +[2023-10-11 16:07:55,457][85176] Updated weights for policy 0, policy_version 28572 (0.0007) +[2023-10-11 16:07:56,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 58916864. Throughput: 0: 1651.7, 1: 1693.8. Samples: 14737136. Policy #0 lag: (min: 1.0, avg: 10.2, max: 33.0) +[2023-10-11 16:07:56,063][84230] Avg episode reward: [(0, '8.160'), (1, '7.970')] +[2023-10-11 16:07:56,292][85175] Updated weights for policy 1, policy_version 28970 (0.0009) +[2023-10-11 16:07:56,654][85175] Updated weights for policy 1, policy_version 28980 (0.0009) +[2023-10-11 16:07:57,018][85175] Updated weights for policy 1, policy_version 28990 (0.0007) +[2023-10-11 16:07:59,337][85176] Updated weights for policy 0, policy_version 28582 (0.0007) +[2023-10-11 16:07:59,713][85176] Updated weights for policy 0, policy_version 28592 (0.0008) +[2023-10-11 16:08:00,087][85176] Updated weights for policy 0, policy_version 28602 (0.0008) +[2023-10-11 16:08:01,014][85175] Updated weights for policy 1, policy_version 29000 (0.0010) +[2023-10-11 16:08:01,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13329.3). Total num frames: 58982400. Throughput: 0: 1658.0, 1: 1694.4. Samples: 14757136. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 33.0) +[2023-10-11 16:08:01,064][84230] Avg episode reward: [(0, '7.620'), (1, '8.030')] +[2023-10-11 16:08:01,394][85175] Updated weights for policy 1, policy_version 29010 (0.0010) +[2023-10-11 16:08:01,748][85175] Updated weights for policy 1, policy_version 29020 (0.0010) +[2023-10-11 16:08:04,295][85176] Updated weights for policy 0, policy_version 28612 (0.0008) +[2023-10-11 16:08:04,671][85176] Updated weights for policy 0, policy_version 28622 (0.0008) +[2023-10-11 16:08:05,042][85176] Updated weights for policy 0, policy_version 28632 (0.0008) +[2023-10-11 16:08:05,738][85175] Updated weights for policy 1, policy_version 29030 (0.0008) +[2023-10-11 16:08:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 59047936. Throughput: 0: 1665.0, 1: 1693.6. Samples: 14767548. Policy #0 lag: (min: 1.0, avg: 10.2, max: 33.0) +[2023-10-11 16:08:06,063][84230] Avg episode reward: [(0, '7.320'), (1, '7.130')] +[2023-10-11 16:08:06,103][85175] Updated weights for policy 1, policy_version 29040 (0.0007) +[2023-10-11 16:08:06,473][85175] Updated weights for policy 1, policy_version 29050 (0.0009) +[2023-10-11 16:08:08,771][85176] Updated weights for policy 0, policy_version 28642 (0.0010) +[2023-10-11 16:08:09,144][85176] Updated weights for policy 0, policy_version 28652 (0.0007) +[2023-10-11 16:08:09,529][85176] Updated weights for policy 0, policy_version 28662 (0.0008) +[2023-10-11 16:08:09,893][85176] Updated weights for policy 0, policy_version 28672 (0.0010) +[2023-10-11 16:08:10,587][85175] Updated weights for policy 1, policy_version 29060 (0.0009) +[2023-10-11 16:08:10,954][85175] Updated weights for policy 1, policy_version 29070 (0.0008) +[2023-10-11 16:08:11,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 59113472. Throughput: 0: 1650.8, 1: 1693.6. Samples: 14787336. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 33.0) +[2023-10-11 16:08:11,063][84230] Avg episode reward: [(0, '7.040'), (1, '6.940')] +[2023-10-11 16:08:11,329][85175] Updated weights for policy 1, policy_version 29080 (0.0009) +[2023-10-11 16:08:13,805][85176] Updated weights for policy 0, policy_version 28682 (0.0007) +[2023-10-11 16:08:14,179][85176] Updated weights for policy 0, policy_version 28692 (0.0010) +[2023-10-11 16:08:14,539][85176] Updated weights for policy 0, policy_version 28702 (0.0010) +[2023-10-11 16:08:15,343][85175] Updated weights for policy 1, policy_version 29090 (0.0011) +[2023-10-11 16:08:15,718][85175] Updated weights for policy 1, policy_version 29100 (0.0009) +[2023-10-11 16:08:16,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 59179008. Throughput: 0: 1667.5, 1: 1690.3. Samples: 14807626. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 16:08:16,063][84230] Avg episode reward: [(0, '7.660'), (1, '7.560')] +[2023-10-11 16:08:16,072][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000028704_29392896.pth... +[2023-10-11 16:08:16,078][85175] Updated weights for policy 1, policy_version 29110 (0.0008) +[2023-10-11 16:08:16,111][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000027136_27787264.pth +[2023-10-11 16:08:16,438][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000029120_29818880.pth... 
+[2023-10-11 16:08:16,441][85175] Updated weights for policy 1, policy_version 29120 (0.0007) +[2023-10-11 16:08:16,479][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000027520_28180480.pth +[2023-10-11 16:08:18,869][85176] Updated weights for policy 0, policy_version 28712 (0.0010) +[2023-10-11 16:08:19,237][85176] Updated weights for policy 0, policy_version 28722 (0.0008) +[2023-10-11 16:08:19,608][85176] Updated weights for policy 0, policy_version 28732 (0.0007) +[2023-10-11 16:08:20,525][85175] Updated weights for policy 1, policy_version 29130 (0.0008) +[2023-10-11 16:08:20,895][85175] Updated weights for policy 1, policy_version 29140 (0.0008) +[2023-10-11 16:08:21,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 59244544. Throughput: 0: 1670.6, 1: 1699.7. Samples: 14818160. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 16:08:21,063][84230] Avg episode reward: [(0, '7.880'), (1, '8.310')] +[2023-10-11 16:08:21,268][85175] Updated weights for policy 1, policy_version 29150 (0.0009) +[2023-10-11 16:08:23,672][85176] Updated weights for policy 0, policy_version 28742 (0.0007) +[2023-10-11 16:08:24,047][85176] Updated weights for policy 0, policy_version 28752 (0.0009) +[2023-10-11 16:08:24,427][85176] Updated weights for policy 0, policy_version 28762 (0.0009) +[2023-10-11 16:08:25,374][85175] Updated weights for policy 1, policy_version 29160 (0.0007) +[2023-10-11 16:08:25,747][85175] Updated weights for policy 1, policy_version 29170 (0.0007) +[2023-10-11 16:08:26,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 59310080. Throughput: 0: 1655.2, 1: 1696.3. Samples: 14837686. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 16:08:26,064][84230] Avg episode reward: [(0, '8.020'), (1, '7.870')] +[2023-10-11 16:08:26,115][85175] Updated weights for policy 1, policy_version 29180 (0.0008) +[2023-10-11 16:08:28,515][85176] Updated weights for policy 0, policy_version 28772 (0.0008) +[2023-10-11 16:08:28,895][85176] Updated weights for policy 0, policy_version 28782 (0.0009) +[2023-10-11 16:08:29,271][85176] Updated weights for policy 0, policy_version 28792 (0.0009) +[2023-10-11 16:08:30,266][85175] Updated weights for policy 1, policy_version 29190 (0.0009) +[2023-10-11 16:08:30,626][85175] Updated weights for policy 1, policy_version 29200 (0.0009) +[2023-10-11 16:08:31,005][85175] Updated weights for policy 1, policy_version 29210 (0.0007) +[2023-10-11 16:08:31,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 59375616. Throughput: 0: 1671.0, 1: 1689.3. Samples: 14857854. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 16:08:31,063][84230] Avg episode reward: [(0, '7.570'), (1, '7.420')] +[2023-10-11 16:08:33,331][85176] Updated weights for policy 0, policy_version 28802 (0.0010) +[2023-10-11 16:08:33,722][85176] Updated weights for policy 0, policy_version 28812 (0.0010) +[2023-10-11 16:08:34,091][85176] Updated weights for policy 0, policy_version 28822 (0.0011) +[2023-10-11 16:08:34,461][85176] Updated weights for policy 0, policy_version 28832 (0.0011) +[2023-10-11 16:08:34,886][85175] Updated weights for policy 1, policy_version 29220 (0.0007) +[2023-10-11 16:08:35,255][85175] Updated weights for policy 1, policy_version 29230 (0.0007) +[2023-10-11 16:08:35,630][85175] Updated weights for policy 1, policy_version 29240 (0.0007) +[2023-10-11 16:08:36,062][84230] Fps is (10 sec: 16384.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 59473920. Throughput: 0: 1665.9, 1: 1706.9. Samples: 14868580. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 16:08:36,063][84230] Avg episode reward: [(0, '7.560'), (1, '6.740')] +[2023-10-11 16:08:38,565][85176] Updated weights for policy 0, policy_version 28842 (0.0009) +[2023-10-11 16:08:38,934][85176] Updated weights for policy 0, policy_version 28852 (0.0008) +[2023-10-11 16:08:39,303][85176] Updated weights for policy 0, policy_version 28862 (0.0010) +[2023-10-11 16:08:39,731][85175] Updated weights for policy 1, policy_version 29250 (0.0009) +[2023-10-11 16:08:40,109][85175] Updated weights for policy 1, policy_version 29260 (0.0007) +[2023-10-11 16:08:40,476][85175] Updated weights for policy 1, policy_version 29270 (0.0008) +[2023-10-11 16:08:40,845][85175] Updated weights for policy 1, policy_version 29280 (0.0010) +[2023-10-11 16:08:41,063][84230] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 59539456. Throughput: 0: 1653.7, 1: 1704.9. Samples: 14888274. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-11 16:08:41,063][84230] Avg episode reward: [(0, '7.540'), (1, '7.380')] +[2023-10-11 16:08:43,591][85176] Updated weights for policy 0, policy_version 28872 (0.0008) +[2023-10-11 16:08:43,969][85176] Updated weights for policy 0, policy_version 28882 (0.0009) +[2023-10-11 16:08:44,335][85176] Updated weights for policy 0, policy_version 28892 (0.0008) +[2023-10-11 16:08:44,837][85175] Updated weights for policy 1, policy_version 29290 (0.0007) +[2023-10-11 16:08:45,211][85175] Updated weights for policy 1, policy_version 29300 (0.0008) +[2023-10-11 16:08:45,581][85175] Updated weights for policy 1, policy_version 29310 (0.0008) +[2023-10-11 16:08:46,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 59604992. Throughput: 0: 1676.2, 1: 1678.0. Samples: 14908074. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-11 16:08:46,063][84230] Avg episode reward: [(0, '7.430'), (1, '8.120')] +[2023-10-11 16:08:48,400][85176] Updated weights for policy 0, policy_version 28902 (0.0009) +[2023-10-11 16:08:48,772][85176] Updated weights for policy 0, policy_version 28912 (0.0010) +[2023-10-11 16:08:49,142][85176] Updated weights for policy 0, policy_version 28922 (0.0007) +[2023-10-11 16:08:49,650][85175] Updated weights for policy 1, policy_version 29320 (0.0009) +[2023-10-11 16:08:50,017][85175] Updated weights for policy 1, policy_version 29330 (0.0009) +[2023-10-11 16:08:50,385][85175] Updated weights for policy 1, policy_version 29340 (0.0009) +[2023-10-11 16:08:51,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 59670528. Throughput: 0: 1664.0, 1: 1700.6. Samples: 14918954. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-11 16:08:51,063][84230] Avg episode reward: [(0, '7.660'), (1, '8.000')] +[2023-10-11 16:08:53,060][85176] Updated weights for policy 0, policy_version 28932 (0.0009) +[2023-10-11 16:08:53,425][85176] Updated weights for policy 0, policy_version 28942 (0.0009) +[2023-10-11 16:08:53,798][85176] Updated weights for policy 0, policy_version 28952 (0.0008) +[2023-10-11 16:08:54,369][85175] Updated weights for policy 1, policy_version 29350 (0.0008) +[2023-10-11 16:08:54,737][85175] Updated weights for policy 1, policy_version 29360 (0.0007) +[2023-10-11 16:08:55,101][85175] Updated weights for policy 1, policy_version 29370 (0.0007) +[2023-10-11 16:08:56,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 59736064. Throughput: 0: 1667.4, 1: 1696.0. Samples: 14938688. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-11 16:08:56,064][84230] Avg episode reward: [(0, '7.690'), (1, '7.840')] +[2023-10-11 16:08:57,694][85176] Updated weights for policy 0, policy_version 28962 (0.0009) +[2023-10-11 16:08:58,062][85176] Updated weights for policy 0, policy_version 28972 (0.0008) +[2023-10-11 16:08:58,440][85176] Updated weights for policy 0, policy_version 28982 (0.0008) +[2023-10-11 16:08:58,813][85176] Updated weights for policy 0, policy_version 28992 (0.0009) +[2023-10-11 16:08:59,143][85175] Updated weights for policy 1, policy_version 29380 (0.0008) +[2023-10-11 16:08:59,513][85175] Updated weights for policy 1, policy_version 29390 (0.0007) +[2023-10-11 16:08:59,886][85175] Updated weights for policy 1, policy_version 29400 (0.0007) +[2023-10-11 16:09:01,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 59801600. Throughput: 0: 1675.9, 1: 1683.9. Samples: 14958814. Policy #0 lag: (min: 20.0, avg: 21.5, max: 47.0) +[2023-10-11 16:09:01,063][84230] Avg episode reward: [(0, '7.810'), (1, '7.390')] +[2023-10-11 16:09:03,134][85176] Updated weights for policy 0, policy_version 29002 (0.0010) +[2023-10-11 16:09:03,507][85176] Updated weights for policy 0, policy_version 29012 (0.0011) +[2023-10-11 16:09:03,744][85175] Updated weights for policy 1, policy_version 29410 (0.0008) +[2023-10-11 16:09:03,874][85176] Updated weights for policy 0, policy_version 29022 (0.0007) +[2023-10-11 16:09:04,103][85175] Updated weights for policy 1, policy_version 29420 (0.0010) +[2023-10-11 16:09:04,469][85175] Updated weights for policy 1, policy_version 29430 (0.0011) +[2023-10-11 16:09:04,834][85175] Updated weights for policy 1, policy_version 29440 (0.0011) +[2023-10-11 16:09:06,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 59867136. Throughput: 0: 1655.0, 1: 1712.6. Samples: 14969700. 
Policy #0 lag: (min: 20.0, avg: 21.5, max: 47.0) +[2023-10-11 16:09:06,063][84230] Avg episode reward: [(0, '7.620'), (1, '7.230')] +[2023-10-11 16:09:07,577][85176] Updated weights for policy 0, policy_version 29032 (0.0010) +[2023-10-11 16:09:07,957][85176] Updated weights for policy 0, policy_version 29042 (0.0011) +[2023-10-11 16:09:08,326][85176] Updated weights for policy 0, policy_version 29052 (0.0008) +[2023-10-11 16:09:08,877][85175] Updated weights for policy 1, policy_version 29450 (0.0007) +[2023-10-11 16:09:09,254][85175] Updated weights for policy 1, policy_version 29460 (0.0008) +[2023-10-11 16:09:09,621][85175] Updated weights for policy 1, policy_version 29470 (0.0009) +[2023-10-11 16:09:11,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 59932672. Throughput: 0: 1680.1, 1: 1687.4. Samples: 14989224. Policy #0 lag: (min: 20.0, avg: 21.5, max: 47.0) +[2023-10-11 16:09:11,064][84230] Avg episode reward: [(0, '7.550'), (1, '7.230')] +[2023-10-11 16:09:12,512][85176] Updated weights for policy 0, policy_version 29062 (0.0008) +[2023-10-11 16:09:12,889][85176] Updated weights for policy 0, policy_version 29072 (0.0008) +[2023-10-11 16:09:13,261][85176] Updated weights for policy 0, policy_version 29082 (0.0009) +[2023-10-11 16:09:13,678][85175] Updated weights for policy 1, policy_version 29480 (0.0009) +[2023-10-11 16:09:14,042][85175] Updated weights for policy 1, policy_version 29490 (0.0010) +[2023-10-11 16:09:14,412][85175] Updated weights for policy 1, policy_version 29500 (0.0008) +[2023-10-11 16:09:16,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 59998208. Throughput: 0: 1684.3, 1: 1692.5. Samples: 15009812. 
Policy #0 lag: (min: 20.0, avg: 21.5, max: 47.0) +[2023-10-11 16:09:16,064][84230] Avg episode reward: [(0, '7.920'), (1, '7.640')] +[2023-10-11 16:09:17,372][85176] Updated weights for policy 0, policy_version 29092 (0.0009) +[2023-10-11 16:09:17,748][85176] Updated weights for policy 0, policy_version 29102 (0.0009) +[2023-10-11 16:09:18,125][85176] Updated weights for policy 0, policy_version 29112 (0.0008) +[2023-10-11 16:09:18,518][85175] Updated weights for policy 1, policy_version 29510 (0.0009) +[2023-10-11 16:09:18,894][85175] Updated weights for policy 1, policy_version 29520 (0.0010) +[2023-10-11 16:09:19,259][85175] Updated weights for policy 1, policy_version 29530 (0.0009) +[2023-10-11 16:09:21,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 60063744. Throughput: 0: 1662.3, 1: 1697.5. Samples: 15019774. Policy #0 lag: (min: 20.0, avg: 21.5, max: 47.0) +[2023-10-11 16:09:21,064][84230] Avg episode reward: [(0, '7.740'), (1, '7.850')] +[2023-10-11 16:09:22,173][85176] Updated weights for policy 0, policy_version 29122 (0.0008) +[2023-10-11 16:09:22,550][85176] Updated weights for policy 0, policy_version 29132 (0.0008) +[2023-10-11 16:09:22,909][85176] Updated weights for policy 0, policy_version 29142 (0.0007) +[2023-10-11 16:09:23,224][85175] Updated weights for policy 1, policy_version 29540 (0.0009) +[2023-10-11 16:09:23,287][85176] Updated weights for policy 0, policy_version 29152 (0.0007) +[2023-10-11 16:09:23,591][85175] Updated weights for policy 1, policy_version 29550 (0.0009) +[2023-10-11 16:09:23,959][85175] Updated weights for policy 1, policy_version 29560 (0.0009) +[2023-10-11 16:09:26,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 60129280. Throughput: 0: 1689.7, 1: 1679.4. Samples: 15039886. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:09:26,064][84230] Avg episode reward: [(0, '7.810'), (1, '7.730')] +[2023-10-11 16:09:27,434][85176] Updated weights for policy 0, policy_version 29162 (0.0010) +[2023-10-11 16:09:27,801][85176] Updated weights for policy 0, policy_version 29172 (0.0007) +[2023-10-11 16:09:27,931][85175] Updated weights for policy 1, policy_version 29570 (0.0011) +[2023-10-11 16:09:28,170][85176] Updated weights for policy 0, policy_version 29182 (0.0007) +[2023-10-11 16:09:28,302][85175] Updated weights for policy 1, policy_version 29580 (0.0007) +[2023-10-11 16:09:28,680][85175] Updated weights for policy 1, policy_version 29590 (0.0008) +[2023-10-11 16:09:29,046][85175] Updated weights for policy 1, policy_version 29600 (0.0009) +[2023-10-11 16:09:31,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 60194816. Throughput: 0: 1690.2, 1: 1697.9. Samples: 15060542. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:09:31,064][84230] Avg episode reward: [(0, '6.940'), (1, '7.840')] +[2023-10-11 16:09:32,156][85176] Updated weights for policy 0, policy_version 29192 (0.0007) +[2023-10-11 16:09:32,537][85176] Updated weights for policy 0, policy_version 29202 (0.0007) +[2023-10-11 16:09:32,909][85176] Updated weights for policy 0, policy_version 29212 (0.0007) +[2023-10-11 16:09:33,133][85175] Updated weights for policy 1, policy_version 29610 (0.0008) +[2023-10-11 16:09:33,500][85175] Updated weights for policy 1, policy_version 29620 (0.0008) +[2023-10-11 16:09:33,871][85175] Updated weights for policy 1, policy_version 29630 (0.0010) +[2023-10-11 16:09:36,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 60260352. Throughput: 0: 1673.2, 1: 1686.8. Samples: 15070152. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:09:36,064][84230] Avg episode reward: [(0, '7.330'), (1, '7.210')] +[2023-10-11 16:09:36,959][85176] Updated weights for policy 0, policy_version 29222 (0.0008) +[2023-10-11 16:09:37,322][85176] Updated weights for policy 0, policy_version 29232 (0.0010) +[2023-10-11 16:09:37,699][85176] Updated weights for policy 0, policy_version 29242 (0.0007) +[2023-10-11 16:09:37,902][85175] Updated weights for policy 1, policy_version 29640 (0.0008) +[2023-10-11 16:09:38,275][85175] Updated weights for policy 1, policy_version 29650 (0.0008) +[2023-10-11 16:09:38,644][85175] Updated weights for policy 1, policy_version 29660 (0.0007) +[2023-10-11 16:09:41,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 60325888. Throughput: 0: 1689.2, 1: 1681.5. Samples: 15090370. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:09:41,063][84230] Avg episode reward: [(0, '7.750'), (1, '7.270')] +[2023-10-11 16:09:41,884][85176] Updated weights for policy 0, policy_version 29252 (0.0008) +[2023-10-11 16:09:42,262][85176] Updated weights for policy 0, policy_version 29262 (0.0009) +[2023-10-11 16:09:42,525][85175] Updated weights for policy 1, policy_version 29670 (0.0008) +[2023-10-11 16:09:42,625][85176] Updated weights for policy 0, policy_version 29272 (0.0008) +[2023-10-11 16:09:42,890][85175] Updated weights for policy 1, policy_version 29680 (0.0007) +[2023-10-11 16:09:43,261][85175] Updated weights for policy 1, policy_version 29690 (0.0008) +[2023-10-11 16:09:46,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 60391424. Throughput: 0: 1681.8, 1: 1704.3. Samples: 15111192. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:09:46,064][84230] Avg episode reward: [(0, '8.200'), (1, '7.600')] +[2023-10-11 16:09:46,736][85176] Updated weights for policy 0, policy_version 29282 (0.0010) +[2023-10-11 16:09:47,123][85176] Updated weights for policy 0, policy_version 29292 (0.0009) +[2023-10-11 16:09:47,239][85175] Updated weights for policy 1, policy_version 29700 (0.0007) +[2023-10-11 16:09:47,497][85176] Updated weights for policy 0, policy_version 29302 (0.0008) +[2023-10-11 16:09:47,609][85175] Updated weights for policy 1, policy_version 29710 (0.0008) +[2023-10-11 16:09:47,868][85176] Updated weights for policy 0, policy_version 29312 (0.0008) +[2023-10-11 16:09:47,968][85175] Updated weights for policy 1, policy_version 29720 (0.0009) +[2023-10-11 16:09:51,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 60456960. Throughput: 0: 1676.8, 1: 1668.6. Samples: 15120246. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:09:51,063][84230] Avg episode reward: [(0, '7.750'), (1, '7.930')] +[2023-10-11 16:09:51,968][85176] Updated weights for policy 0, policy_version 29322 (0.0008) +[2023-10-11 16:09:52,151][85175] Updated weights for policy 1, policy_version 29730 (0.0007) +[2023-10-11 16:09:52,337][85176] Updated weights for policy 0, policy_version 29332 (0.0008) +[2023-10-11 16:09:52,513][85175] Updated weights for policy 1, policy_version 29740 (0.0009) +[2023-10-11 16:09:52,709][85176] Updated weights for policy 0, policy_version 29342 (0.0009) +[2023-10-11 16:09:52,876][85175] Updated weights for policy 1, policy_version 29750 (0.0007) +[2023-10-11 16:09:53,244][85175] Updated weights for policy 1, policy_version 29760 (0.0007) +[2023-10-11 16:09:56,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 60522496. Throughput: 0: 1676.6, 1: 1694.8. Samples: 15140936. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:09:56,064][84230] Avg episode reward: [(0, '7.390'), (1, '8.120')] +[2023-10-11 16:09:56,721][85176] Updated weights for policy 0, policy_version 29352 (0.0008) +[2023-10-11 16:09:57,096][85176] Updated weights for policy 0, policy_version 29362 (0.0007) +[2023-10-11 16:09:57,268][85175] Updated weights for policy 1, policy_version 29770 (0.0007) +[2023-10-11 16:09:57,463][85176] Updated weights for policy 0, policy_version 29372 (0.0008) +[2023-10-11 16:09:57,638][85175] Updated weights for policy 1, policy_version 29780 (0.0011) +[2023-10-11 16:09:58,009][85175] Updated weights for policy 1, policy_version 29790 (0.0009) +[2023-10-11 16:10:01,063][84230] Fps is (10 sec: 13106.7, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 60588032. Throughput: 0: 1680.8, 1: 1696.1. Samples: 15161776. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:10:01,064][84230] Avg episode reward: [(0, '7.470'), (1, '7.550')] +[2023-10-11 16:10:01,539][85176] Updated weights for policy 0, policy_version 29382 (0.0009) +[2023-10-11 16:10:01,911][85176] Updated weights for policy 0, policy_version 29392 (0.0008) +[2023-10-11 16:10:02,028][85175] Updated weights for policy 1, policy_version 29800 (0.0008) +[2023-10-11 16:10:02,294][85176] Updated weights for policy 0, policy_version 29402 (0.0007) +[2023-10-11 16:10:02,388][85175] Updated weights for policy 1, policy_version 29810 (0.0007) +[2023-10-11 16:10:02,756][85175] Updated weights for policy 1, policy_version 29820 (0.0010) +[2023-10-11 16:10:06,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 60653568. Throughput: 0: 1680.3, 1: 1672.3. Samples: 15170640. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:10:06,063][84230] Avg episode reward: [(0, '7.600'), (1, '7.140')] +[2023-10-11 16:10:06,295][85176] Updated weights for policy 0, policy_version 29412 (0.0009) +[2023-10-11 16:10:06,673][85176] Updated weights for policy 0, policy_version 29422 (0.0008) +[2023-10-11 16:10:06,791][85175] Updated weights for policy 1, policy_version 29830 (0.0009) +[2023-10-11 16:10:07,046][85176] Updated weights for policy 0, policy_version 29432 (0.0007) +[2023-10-11 16:10:07,147][85175] Updated weights for policy 1, policy_version 29840 (0.0009) +[2023-10-11 16:10:07,523][85175] Updated weights for policy 1, policy_version 29850 (0.0009) +[2023-10-11 16:10:11,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 60719104. Throughput: 0: 1680.5, 1: 1689.8. Samples: 15191548. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:10:11,064][84230] Avg episode reward: [(0, '7.720'), (1, '7.430')] +[2023-10-11 16:10:11,183][85176] Updated weights for policy 0, policy_version 29442 (0.0009) +[2023-10-11 16:10:11,551][85176] Updated weights for policy 0, policy_version 29452 (0.0009) +[2023-10-11 16:10:11,567][85175] Updated weights for policy 1, policy_version 29860 (0.0009) +[2023-10-11 16:10:11,927][85175] Updated weights for policy 1, policy_version 29870 (0.0007) +[2023-10-11 16:10:11,930][85176] Updated weights for policy 0, policy_version 29462 (0.0008) +[2023-10-11 16:10:12,296][85175] Updated weights for policy 1, policy_version 29880 (0.0008) +[2023-10-11 16:10:12,297][85176] Updated weights for policy 0, policy_version 29472 (0.0009) +[2023-10-11 16:10:16,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 60784640. Throughput: 0: 1676.5, 1: 1697.7. Samples: 15212382. 
Policy #0 lag: (min: 14.0, avg: 21.7, max: 46.0) +[2023-10-11 16:10:16,064][84230] Avg episode reward: [(0, '7.820'), (1, '7.970')] +[2023-10-11 16:10:16,076][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000029888_30605312.pth... +[2023-10-11 16:10:16,115][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000028320_28999680.pth +[2023-10-11 16:10:16,312][85175] Updated weights for policy 1, policy_version 29890 (0.0007) +[2023-10-11 16:10:16,404][85176] Updated weights for policy 0, policy_version 29482 (0.0008) +[2023-10-11 16:10:16,688][85175] Updated weights for policy 1, policy_version 29900 (0.0008) +[2023-10-11 16:10:16,772][85176] Updated weights for policy 0, policy_version 29492 (0.0008) +[2023-10-11 16:10:17,047][85175] Updated weights for policy 1, policy_version 29910 (0.0009) +[2023-10-11 16:10:17,143][85176] Updated weights for policy 0, policy_version 29502 (0.0008) +[2023-10-11 16:10:17,218][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000029504_30212096.pth... +[2023-10-11 16:10:17,247][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000027936_28606464.pth +[2023-10-11 16:10:17,410][85175] Updated weights for policy 1, policy_version 29920 (0.0007) +[2023-10-11 16:10:21,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 60850176. Throughput: 0: 1674.6, 1: 1687.5. Samples: 15221444. 
Policy #0 lag: (min: 14.0, avg: 21.7, max: 46.0) +[2023-10-11 16:10:21,063][84230] Avg episode reward: [(0, '7.550'), (1, '8.020')] +[2023-10-11 16:10:21,256][85176] Updated weights for policy 0, policy_version 29512 (0.0007) +[2023-10-11 16:10:21,536][85175] Updated weights for policy 1, policy_version 29930 (0.0008) +[2023-10-11 16:10:21,623][85176] Updated weights for policy 0, policy_version 29522 (0.0009) +[2023-10-11 16:10:21,900][85175] Updated weights for policy 1, policy_version 29940 (0.0007) +[2023-10-11 16:10:22,006][85176] Updated weights for policy 0, policy_version 29532 (0.0007) +[2023-10-11 16:10:22,267][85175] Updated weights for policy 1, policy_version 29950 (0.0007) +[2023-10-11 16:10:26,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 60915712. Throughput: 0: 1673.5, 1: 1696.6. Samples: 15242024. Policy #0 lag: (min: 14.0, avg: 21.7, max: 46.0) +[2023-10-11 16:10:26,063][84230] Avg episode reward: [(0, '7.580'), (1, '7.820')] +[2023-10-11 16:10:26,063][85176] Updated weights for policy 0, policy_version 29542 (0.0007) +[2023-10-11 16:10:26,352][85175] Updated weights for policy 1, policy_version 29960 (0.0007) +[2023-10-11 16:10:26,437][85176] Updated weights for policy 0, policy_version 29552 (0.0009) +[2023-10-11 16:10:26,719][85175] Updated weights for policy 1, policy_version 29970 (0.0008) +[2023-10-11 16:10:26,807][85176] Updated weights for policy 0, policy_version 29562 (0.0008) +[2023-10-11 16:10:27,090][85175] Updated weights for policy 1, policy_version 29980 (0.0009) +[2023-10-11 16:10:30,913][85176] Updated weights for policy 0, policy_version 29572 (0.0009) +[2023-10-11 16:10:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 60981248. Throughput: 0: 1680.5, 1: 1691.8. Samples: 15262944. 
Policy #0 lag: (min: 14.0, avg: 21.7, max: 46.0) +[2023-10-11 16:10:31,063][84230] Avg episode reward: [(0, '7.360'), (1, '7.020')] +[2023-10-11 16:10:31,148][85175] Updated weights for policy 1, policy_version 29990 (0.0008) +[2023-10-11 16:10:31,285][85176] Updated weights for policy 0, policy_version 29582 (0.0007) +[2023-10-11 16:10:31,520][85175] Updated weights for policy 1, policy_version 30000 (0.0008) +[2023-10-11 16:10:31,660][85176] Updated weights for policy 0, policy_version 29592 (0.0007) +[2023-10-11 16:10:31,880][85175] Updated weights for policy 1, policy_version 30010 (0.0009) +[2023-10-11 16:10:35,833][85175] Updated weights for policy 1, policy_version 30020 (0.0008) +[2023-10-11 16:10:35,885][85176] Updated weights for policy 0, policy_version 29602 (0.0007) +[2023-10-11 16:10:36,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 61046784. Throughput: 0: 1675.7, 1: 1693.4. Samples: 15271856. Policy #0 lag: (min: 14.0, avg: 21.7, max: 46.0) +[2023-10-11 16:10:36,064][84230] Avg episode reward: [(0, '7.810'), (1, '7.210')] +[2023-10-11 16:10:36,206][85175] Updated weights for policy 1, policy_version 30030 (0.0008) +[2023-10-11 16:10:36,259][85176] Updated weights for policy 0, policy_version 29612 (0.0009) +[2023-10-11 16:10:36,573][85175] Updated weights for policy 1, policy_version 30040 (0.0008) +[2023-10-11 16:10:36,623][85176] Updated weights for policy 0, policy_version 29622 (0.0008) +[2023-10-11 16:10:37,001][85176] Updated weights for policy 0, policy_version 29632 (0.0009) +[2023-10-11 16:10:40,617][85175] Updated weights for policy 1, policy_version 30050 (0.0009) +[2023-10-11 16:10:40,993][85175] Updated weights for policy 1, policy_version 30060 (0.0007) +[2023-10-11 16:10:41,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 61112320. Throughput: 0: 1673.6, 1: 1692.2. Samples: 15292396. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:10:41,063][84230] Avg episode reward: [(0, '7.850'), (1, '7.920')] +[2023-10-11 16:10:41,222][85176] Updated weights for policy 0, policy_version 29642 (0.0009) +[2023-10-11 16:10:41,353][85175] Updated weights for policy 1, policy_version 30070 (0.0007) +[2023-10-11 16:10:41,588][85176] Updated weights for policy 0, policy_version 29652 (0.0009) +[2023-10-11 16:10:41,718][85175] Updated weights for policy 1, policy_version 30080 (0.0008) +[2023-10-11 16:10:41,961][85176] Updated weights for policy 0, policy_version 29662 (0.0008) +[2023-10-11 16:10:45,734][85175] Updated weights for policy 1, policy_version 30090 (0.0008) +[2023-10-11 16:10:46,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 61177856. Throughput: 0: 1664.5, 1: 1694.4. Samples: 15312922. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:10:46,063][84230] Avg episode reward: [(0, '8.000'), (1, '8.070')] +[2023-10-11 16:10:46,096][85176] Updated weights for policy 0, policy_version 29672 (0.0009) +[2023-10-11 16:10:46,099][85175] Updated weights for policy 1, policy_version 30100 (0.0009) +[2023-10-11 16:10:46,458][85176] Updated weights for policy 0, policy_version 29682 (0.0010) +[2023-10-11 16:10:46,473][85175] Updated weights for policy 1, policy_version 30110 (0.0007) +[2023-10-11 16:10:46,833][85176] Updated weights for policy 0, policy_version 29692 (0.0009) +[2023-10-11 16:10:50,540][85175] Updated weights for policy 1, policy_version 30120 (0.0007) +[2023-10-11 16:10:50,885][85176] Updated weights for policy 0, policy_version 29702 (0.0009) +[2023-10-11 16:10:50,915][85175] Updated weights for policy 1, policy_version 30130 (0.0007) +[2023-10-11 16:10:51,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 61243392. Throughput: 0: 1666.5, 1: 1697.6. Samples: 15322026. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:10:51,063][84230] Avg episode reward: [(0, '7.480'), (1, '8.000')] +[2023-10-11 16:10:51,252][85176] Updated weights for policy 0, policy_version 29712 (0.0008) +[2023-10-11 16:10:51,281][85175] Updated weights for policy 1, policy_version 30140 (0.0007) +[2023-10-11 16:10:51,624][85176] Updated weights for policy 0, policy_version 29722 (0.0008) +[2023-10-11 16:10:55,287][85175] Updated weights for policy 1, policy_version 30150 (0.0011) +[2023-10-11 16:10:55,654][85175] Updated weights for policy 1, policy_version 30160 (0.0009) +[2023-10-11 16:10:55,657][85176] Updated weights for policy 0, policy_version 29732 (0.0007) +[2023-10-11 16:10:56,020][85176] Updated weights for policy 0, policy_version 29742 (0.0008) +[2023-10-11 16:10:56,021][85175] Updated weights for policy 1, policy_version 30170 (0.0008) +[2023-10-11 16:10:56,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 61308928. Throughput: 0: 1661.4, 1: 1699.7. Samples: 15342796. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:10:56,063][84230] Avg episode reward: [(0, '7.060'), (1, '7.540')] +[2023-10-11 16:10:56,400][85176] Updated weights for policy 0, policy_version 29752 (0.0009) +[2023-10-11 16:11:00,154][85175] Updated weights for policy 1, policy_version 30180 (0.0008) +[2023-10-11 16:11:00,515][85175] Updated weights for policy 1, policy_version 30190 (0.0008) +[2023-10-11 16:11:00,525][85176] Updated weights for policy 0, policy_version 29762 (0.0008) +[2023-10-11 16:11:00,881][85175] Updated weights for policy 1, policy_version 30200 (0.0007) +[2023-10-11 16:11:00,895][85176] Updated weights for policy 0, policy_version 29772 (0.0010) +[2023-10-11 16:11:01,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 61374464. Throughput: 0: 1662.9, 1: 1677.5. Samples: 15362700. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:11:01,063][84230] Avg episode reward: [(0, '7.500'), (1, '7.450')] +[2023-10-11 16:11:01,261][85176] Updated weights for policy 0, policy_version 29782 (0.0008) +[2023-10-11 16:11:01,640][85176] Updated weights for policy 0, policy_version 29792 (0.0009) +[2023-10-11 16:11:04,814][85175] Updated weights for policy 1, policy_version 30210 (0.0008) +[2023-10-11 16:11:05,184][85175] Updated weights for policy 1, policy_version 30220 (0.0009) +[2023-10-11 16:11:05,555][85175] Updated weights for policy 1, policy_version 30230 (0.0009) +[2023-10-11 16:11:05,730][85176] Updated weights for policy 0, policy_version 29802 (0.0007) +[2023-10-11 16:11:05,928][85175] Updated weights for policy 1, policy_version 30240 (0.0010) +[2023-10-11 16:11:06,063][84230] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 61472768. Throughput: 0: 1664.3, 1: 1691.5. Samples: 15372456. Policy #0 lag: (min: 3.0, avg: 8.3, max: 35.0) +[2023-10-11 16:11:06,064][84230] Avg episode reward: [(0, '7.960'), (1, '7.330')] +[2023-10-11 16:11:06,104][85176] Updated weights for policy 0, policy_version 29812 (0.0007) +[2023-10-11 16:11:06,476][85176] Updated weights for policy 0, policy_version 29822 (0.0009) +[2023-10-11 16:11:09,898][85175] Updated weights for policy 1, policy_version 30250 (0.0009) +[2023-10-11 16:11:10,261][85175] Updated weights for policy 1, policy_version 30260 (0.0008) +[2023-10-11 16:11:10,607][85176] Updated weights for policy 0, policy_version 29832 (0.0007) +[2023-10-11 16:11:10,640][85175] Updated weights for policy 1, policy_version 30270 (0.0007) +[2023-10-11 16:11:10,985][85176] Updated weights for policy 0, policy_version 29842 (0.0007) +[2023-10-11 16:11:11,063][84230] Fps is (10 sec: 16383.7, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 61538304. Throughput: 0: 1666.9, 1: 1699.8. Samples: 15393526. 
Policy #0 lag: (min: 3.0, avg: 8.3, max: 35.0) +[2023-10-11 16:11:11,064][84230] Avg episode reward: [(0, '7.850'), (1, '7.180')] +[2023-10-11 16:11:11,355][85176] Updated weights for policy 0, policy_version 29852 (0.0009) +[2023-10-11 16:11:14,585][85175] Updated weights for policy 1, policy_version 30280 (0.0009) +[2023-10-11 16:11:14,954][85175] Updated weights for policy 1, policy_version 30290 (0.0009) +[2023-10-11 16:11:15,320][85175] Updated weights for policy 1, policy_version 30300 (0.0007) +[2023-10-11 16:11:15,334][85176] Updated weights for policy 0, policy_version 29862 (0.0008) +[2023-10-11 16:11:15,704][85176] Updated weights for policy 0, policy_version 29872 (0.0009) +[2023-10-11 16:11:16,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 61603840. Throughput: 0: 1652.0, 1: 1669.5. Samples: 15412414. Policy #0 lag: (min: 3.0, avg: 8.3, max: 35.0) +[2023-10-11 16:11:16,064][84230] Avg episode reward: [(0, '7.830'), (1, '7.590')] +[2023-10-11 16:11:16,083][85176] Updated weights for policy 0, policy_version 29882 (0.0007) +[2023-10-11 16:11:19,458][85175] Updated weights for policy 1, policy_version 30310 (0.0010) +[2023-10-11 16:11:19,821][85175] Updated weights for policy 1, policy_version 30320 (0.0010) +[2023-10-11 16:11:20,193][85175] Updated weights for policy 1, policy_version 30330 (0.0007) +[2023-10-11 16:11:20,293][85176] Updated weights for policy 0, policy_version 29892 (0.0010) +[2023-10-11 16:11:20,678][85176] Updated weights for policy 0, policy_version 29902 (0.0009) +[2023-10-11 16:11:21,045][85176] Updated weights for policy 0, policy_version 29912 (0.0009) +[2023-10-11 16:11:21,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 61669376. Throughput: 0: 1663.8, 1: 1701.8. Samples: 15423306. 
Policy #0 lag: (min: 3.0, avg: 8.3, max: 35.0) +[2023-10-11 16:11:21,063][84230] Avg episode reward: [(0, '7.600'), (1, '8.160')] +[2023-10-11 16:11:24,330][85175] Updated weights for policy 1, policy_version 30340 (0.0009) +[2023-10-11 16:11:24,693][85175] Updated weights for policy 1, policy_version 30350 (0.0010) +[2023-10-11 16:11:25,057][85175] Updated weights for policy 1, policy_version 30360 (0.0009) +[2023-10-11 16:11:25,177][85176] Updated weights for policy 0, policy_version 29922 (0.0008) +[2023-10-11 16:11:25,536][85176] Updated weights for policy 0, policy_version 29932 (0.0010) +[2023-10-11 16:11:25,913][85176] Updated weights for policy 0, policy_version 29942 (0.0008) +[2023-10-11 16:11:26,063][84230] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 61734912. Throughput: 0: 1661.7, 1: 1688.9. Samples: 15443176. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-11 16:11:26,063][84230] Avg episode reward: [(0, '7.670'), (1, '8.250')] +[2023-10-11 16:11:26,285][85176] Updated weights for policy 0, policy_version 29952 (0.0010) +[2023-10-11 16:11:29,183][85175] Updated weights for policy 1, policy_version 30370 (0.0008) +[2023-10-11 16:11:29,549][85175] Updated weights for policy 1, policy_version 30380 (0.0009) +[2023-10-11 16:11:29,923][85175] Updated weights for policy 1, policy_version 30390 (0.0009) +[2023-10-11 16:11:30,283][85175] Updated weights for policy 1, policy_version 30400 (0.0008) +[2023-10-11 16:11:30,313][85176] Updated weights for policy 0, policy_version 29962 (0.0007) +[2023-10-11 16:11:30,702][85176] Updated weights for policy 0, policy_version 29972 (0.0010) +[2023-10-11 16:11:31,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 61800448. Throughput: 0: 1649.5, 1: 1665.9. Samples: 15462114. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-11 16:11:31,063][84230] Avg episode reward: [(0, '8.070'), (1, '7.550')] +[2023-10-11 16:11:31,077][85176] Updated weights for policy 0, policy_version 29982 (0.0010) +[2023-10-11 16:11:34,212][85175] Updated weights for policy 1, policy_version 30410 (0.0008) +[2023-10-11 16:11:34,580][85175] Updated weights for policy 1, policy_version 30420 (0.0007) +[2023-10-11 16:11:34,956][85175] Updated weights for policy 1, policy_version 30430 (0.0007) +[2023-10-11 16:11:35,085][85176] Updated weights for policy 0, policy_version 29992 (0.0011) +[2023-10-11 16:11:35,460][85176] Updated weights for policy 0, policy_version 30002 (0.0010) +[2023-10-11 16:11:35,830][85176] Updated weights for policy 0, policy_version 30012 (0.0009) +[2023-10-11 16:11:36,063][84230] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 61898752. Throughput: 0: 1663.6, 1: 1700.9. Samples: 15473430. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-11 16:11:36,063][84230] Avg episode reward: [(0, '7.740'), (1, '6.900')] +[2023-10-11 16:11:39,155][85175] Updated weights for policy 1, policy_version 30440 (0.0010) +[2023-10-11 16:11:39,526][85175] Updated weights for policy 1, policy_version 30450 (0.0009) +[2023-10-11 16:11:39,873][85176] Updated weights for policy 0, policy_version 30022 (0.0009) +[2023-10-11 16:11:39,897][85175] Updated weights for policy 1, policy_version 30460 (0.0007) +[2023-10-11 16:11:40,259][85176] Updated weights for policy 0, policy_version 30032 (0.0009) +[2023-10-11 16:11:40,642][85176] Updated weights for policy 0, policy_version 30042 (0.0010) +[2023-10-11 16:11:41,062][84230] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 61964288. Throughput: 0: 1666.6, 1: 1683.9. Samples: 15493570. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-11 16:11:41,063][84230] Avg episode reward: [(0, '7.710'), (1, '6.920')] +[2023-10-11 16:11:43,814][85175] Updated weights for policy 1, policy_version 30470 (0.0009) +[2023-10-11 16:11:44,178][85175] Updated weights for policy 1, policy_version 30480 (0.0009) +[2023-10-11 16:11:44,550][85175] Updated weights for policy 1, policy_version 30490 (0.0009) +[2023-10-11 16:11:44,803][85176] Updated weights for policy 0, policy_version 30052 (0.0008) +[2023-10-11 16:11:45,181][85176] Updated weights for policy 0, policy_version 30062 (0.0009) +[2023-10-11 16:11:45,546][85176] Updated weights for policy 0, policy_version 30072 (0.0008) +[2023-10-11 16:11:46,063][84230] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 62029824. Throughput: 0: 1645.4, 1: 1691.6. Samples: 15512864. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 16:11:46,063][84230] Avg episode reward: [(0, '7.290'), (1, '7.660')] +[2023-10-11 16:11:48,497][85175] Updated weights for policy 1, policy_version 30500 (0.0008) +[2023-10-11 16:11:48,862][85175] Updated weights for policy 1, policy_version 30510 (0.0007) +[2023-10-11 16:11:49,232][85175] Updated weights for policy 1, policy_version 30520 (0.0008) +[2023-10-11 16:11:49,772][85176] Updated weights for policy 0, policy_version 30082 (0.0007) +[2023-10-11 16:11:50,155][85176] Updated weights for policy 0, policy_version 30092 (0.0007) +[2023-10-11 16:11:50,523][85176] Updated weights for policy 0, policy_version 30102 (0.0008) +[2023-10-11 16:11:50,899][85176] Updated weights for policy 0, policy_version 30112 (0.0009) +[2023-10-11 16:11:51,063][84230] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 62095360. Throughput: 0: 1661.4, 1: 1704.9. Samples: 15523942. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 16:11:51,063][84230] Avg episode reward: [(0, '7.450'), (1, '8.390')] +[2023-10-11 16:11:51,064][85000] Saving new best policy, reward=8.390! +[2023-10-11 16:11:53,258][85175] Updated weights for policy 1, policy_version 30530 (0.0008) +[2023-10-11 16:11:53,629][85175] Updated weights for policy 1, policy_version 30540 (0.0007) +[2023-10-11 16:11:54,000][85175] Updated weights for policy 1, policy_version 30550 (0.0009) +[2023-10-11 16:11:54,366][85175] Updated weights for policy 1, policy_version 30560 (0.0010) +[2023-10-11 16:11:55,121][85176] Updated weights for policy 0, policy_version 30122 (0.0008) +[2023-10-11 16:11:55,499][85176] Updated weights for policy 0, policy_version 30132 (0.0008) +[2023-10-11 16:11:55,885][85176] Updated weights for policy 0, policy_version 30142 (0.0009) +[2023-10-11 16:11:56,063][84230] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 62160896. Throughput: 0: 1660.0, 1: 1670.2. Samples: 15543384. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 16:11:56,064][84230] Avg episode reward: [(0, '8.200'), (1, '7.830')] +[2023-10-11 16:11:58,436][85175] Updated weights for policy 1, policy_version 30570 (0.0008) +[2023-10-11 16:11:58,808][85175] Updated weights for policy 1, policy_version 30580 (0.0008) +[2023-10-11 16:11:59,178][85175] Updated weights for policy 1, policy_version 30590 (0.0009) +[2023-10-11 16:11:59,950][85176] Updated weights for policy 0, policy_version 30152 (0.0008) +[2023-10-11 16:12:00,330][85176] Updated weights for policy 0, policy_version 30162 (0.0008) +[2023-10-11 16:12:00,707][85176] Updated weights for policy 0, policy_version 30172 (0.0010) +[2023-10-11 16:12:01,062][84230] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 62226432. Throughput: 0: 1644.0, 1: 1701.5. Samples: 15562960. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 16:12:01,063][84230] Avg episode reward: [(0, '8.420'), (1, '7.270')] +[2023-10-11 16:12:03,064][85175] Updated weights for policy 1, policy_version 30600 (0.0009) +[2023-10-11 16:12:03,429][85175] Updated weights for policy 1, policy_version 30610 (0.0007) +[2023-10-11 16:12:03,799][85175] Updated weights for policy 1, policy_version 30620 (0.0008) +[2023-10-11 16:12:04,877][85176] Updated weights for policy 0, policy_version 30182 (0.0008) +[2023-10-11 16:12:05,251][85176] Updated weights for policy 0, policy_version 30192 (0.0007) +[2023-10-11 16:12:05,627][85176] Updated weights for policy 0, policy_version 30202 (0.0007) +[2023-10-11 16:12:06,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 62291968. Throughput: 0: 1654.4, 1: 1685.3. Samples: 15573594. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 16:12:06,063][84230] Avg episode reward: [(0, '7.470'), (1, '6.640')] +[2023-10-11 16:12:07,838][85175] Updated weights for policy 1, policy_version 30630 (0.0009) +[2023-10-11 16:12:08,195][85175] Updated weights for policy 1, policy_version 30640 (0.0008) +[2023-10-11 16:12:08,567][85175] Updated weights for policy 1, policy_version 30650 (0.0009) +[2023-10-11 16:12:09,786][85176] Updated weights for policy 0, policy_version 30212 (0.0008) +[2023-10-11 16:12:10,162][85176] Updated weights for policy 0, policy_version 30222 (0.0010) +[2023-10-11 16:12:10,535][85176] Updated weights for policy 0, policy_version 30232 (0.0007) +[2023-10-11 16:12:11,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 62357504. Throughput: 0: 1653.2, 1: 1690.3. Samples: 15593634. 
Policy #0 lag: (min: 10.0, avg: 11.6, max: 38.0) +[2023-10-11 16:12:11,064][84230] Avg episode reward: [(0, '7.240'), (1, '6.730')] +[2023-10-11 16:12:12,551][85175] Updated weights for policy 1, policy_version 30660 (0.0008) +[2023-10-11 16:12:12,918][85175] Updated weights for policy 1, policy_version 30670 (0.0007) +[2023-10-11 16:12:13,289][85175] Updated weights for policy 1, policy_version 30680 (0.0008) +[2023-10-11 16:12:14,757][85176] Updated weights for policy 0, policy_version 30242 (0.0007) +[2023-10-11 16:12:15,140][85176] Updated weights for policy 0, policy_version 30252 (0.0010) +[2023-10-11 16:12:15,509][85176] Updated weights for policy 0, policy_version 30262 (0.0009) +[2023-10-11 16:12:15,885][85176] Updated weights for policy 0, policy_version 30272 (0.0009) +[2023-10-11 16:12:16,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 62423040. Throughput: 0: 1643.9, 1: 1715.2. Samples: 15613274. Policy #0 lag: (min: 10.0, avg: 11.6, max: 38.0) +[2023-10-11 16:12:16,064][84230] Avg episode reward: [(0, '7.140'), (1, '8.430')] +[2023-10-11 16:12:16,076][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000030272_30998528.pth... +[2023-10-11 16:12:16,076][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000030688_31424512.pth... +[2023-10-11 16:12:16,113][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000029120_29818880.pth +[2023-10-11 16:12:16,114][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000028704_29392896.pth +[2023-10-11 16:12:16,116][85000] Saving new best policy, reward=8.430! 
+[2023-10-11 16:12:17,435][85175] Updated weights for policy 1, policy_version 30690 (0.0008) +[2023-10-11 16:12:17,815][85175] Updated weights for policy 1, policy_version 30700 (0.0011) +[2023-10-11 16:12:18,187][85175] Updated weights for policy 1, policy_version 30710 (0.0009) +[2023-10-11 16:12:18,549][85175] Updated weights for policy 1, policy_version 30720 (0.0010) +[2023-10-11 16:12:19,958][85176] Updated weights for policy 0, policy_version 30282 (0.0008) +[2023-10-11 16:12:20,321][85176] Updated weights for policy 0, policy_version 30292 (0.0009) +[2023-10-11 16:12:20,695][85176] Updated weights for policy 0, policy_version 30302 (0.0010) +[2023-10-11 16:12:21,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 62488576. Throughput: 0: 1654.2, 1: 1681.6. Samples: 15623542. Policy #0 lag: (min: 10.0, avg: 11.6, max: 38.0) +[2023-10-11 16:12:21,064][84230] Avg episode reward: [(0, '7.560'), (1, '7.900')] +[2023-10-11 16:12:22,468][85175] Updated weights for policy 1, policy_version 30730 (0.0009) +[2023-10-11 16:12:22,838][85175] Updated weights for policy 1, policy_version 30740 (0.0007) +[2023-10-11 16:12:23,208][85175] Updated weights for policy 1, policy_version 30750 (0.0007) +[2023-10-11 16:12:24,774][85176] Updated weights for policy 0, policy_version 30312 (0.0008) +[2023-10-11 16:12:25,145][85176] Updated weights for policy 0, policy_version 30322 (0.0007) +[2023-10-11 16:12:25,524][85176] Updated weights for policy 0, policy_version 30332 (0.0011) +[2023-10-11 16:12:26,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 62554112. Throughput: 0: 1647.2, 1: 1698.7. Samples: 15644138. 
Policy #0 lag: (min: 10.0, avg: 11.6, max: 38.0) +[2023-10-11 16:12:26,064][84230] Avg episode reward: [(0, '7.680'), (1, '7.370')] +[2023-10-11 16:12:27,158][85175] Updated weights for policy 1, policy_version 30760 (0.0008) +[2023-10-11 16:12:27,539][85175] Updated weights for policy 1, policy_version 30770 (0.0007) +[2023-10-11 16:12:27,918][85175] Updated weights for policy 1, policy_version 30780 (0.0009) +[2023-10-11 16:12:29,587][85176] Updated weights for policy 0, policy_version 30342 (0.0007) +[2023-10-11 16:12:29,957][85176] Updated weights for policy 0, policy_version 30352 (0.0007) +[2023-10-11 16:12:30,335][85176] Updated weights for policy 0, policy_version 30362 (0.0009) +[2023-10-11 16:12:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 62619648. Throughput: 0: 1645.3, 1: 1707.7. Samples: 15663750. Policy #0 lag: (min: 10.0, avg: 11.6, max: 38.0) +[2023-10-11 16:12:31,064][84230] Avg episode reward: [(0, '8.120'), (1, '6.900')] +[2023-10-11 16:12:31,907][85175] Updated weights for policy 1, policy_version 30790 (0.0008) +[2023-10-11 16:12:32,262][85175] Updated weights for policy 1, policy_version 30800 (0.0009) +[2023-10-11 16:12:32,639][85175] Updated weights for policy 1, policy_version 30810 (0.0008) +[2023-10-11 16:12:34,512][85176] Updated weights for policy 0, policy_version 30372 (0.0009) +[2023-10-11 16:12:34,888][85176] Updated weights for policy 0, policy_version 30382 (0.0007) +[2023-10-11 16:12:35,259][85176] Updated weights for policy 0, policy_version 30392 (0.0011) +[2023-10-11 16:12:36,062][84230] Fps is (10 sec: 13107.7, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 62685184. Throughput: 0: 1656.0, 1: 1681.7. Samples: 15674140. 
Policy #0 lag: (min: 13.0, avg: 18.0, max: 45.0) +[2023-10-11 16:12:36,063][84230] Avg episode reward: [(0, '7.720'), (1, '7.120')] +[2023-10-11 16:12:36,757][85175] Updated weights for policy 1, policy_version 30820 (0.0011) +[2023-10-11 16:12:37,121][85175] Updated weights for policy 1, policy_version 30830 (0.0009) +[2023-10-11 16:12:37,497][85175] Updated weights for policy 1, policy_version 30840 (0.0007) +[2023-10-11 16:12:39,476][85176] Updated weights for policy 0, policy_version 30402 (0.0007) +[2023-10-11 16:12:39,845][85176] Updated weights for policy 0, policy_version 30412 (0.0009) +[2023-10-11 16:12:40,222][85176] Updated weights for policy 0, policy_version 30422 (0.0009) +[2023-10-11 16:12:40,600][85176] Updated weights for policy 0, policy_version 30432 (0.0008) +[2023-10-11 16:12:41,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 62750720. Throughput: 0: 1651.1, 1: 1709.4. Samples: 15694606. Policy #0 lag: (min: 13.0, avg: 18.0, max: 45.0) +[2023-10-11 16:12:41,063][84230] Avg episode reward: [(0, '7.110'), (1, '7.790')] +[2023-10-11 16:12:41,534][85175] Updated weights for policy 1, policy_version 30850 (0.0007) +[2023-10-11 16:12:41,896][85175] Updated weights for policy 1, policy_version 30860 (0.0007) +[2023-10-11 16:12:42,258][85175] Updated weights for policy 1, policy_version 30870 (0.0007) +[2023-10-11 16:12:42,624][85175] Updated weights for policy 1, policy_version 30880 (0.0011) +[2023-10-11 16:12:44,808][85176] Updated weights for policy 0, policy_version 30442 (0.0009) +[2023-10-11 16:12:45,189][85176] Updated weights for policy 0, policy_version 30452 (0.0010) +[2023-10-11 16:12:45,549][85176] Updated weights for policy 0, policy_version 30462 (0.0009) +[2023-10-11 16:12:46,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 62816256. Throughput: 0: 1652.3, 1: 1706.0. Samples: 15714086. 
Policy #0 lag: (min: 13.0, avg: 18.0, max: 45.0) +[2023-10-11 16:12:46,063][84230] Avg episode reward: [(0, '7.240'), (1, '8.060')] +[2023-10-11 16:12:46,751][85175] Updated weights for policy 1, policy_version 30890 (0.0008) +[2023-10-11 16:12:47,112][85175] Updated weights for policy 1, policy_version 30900 (0.0008) +[2023-10-11 16:12:47,486][85175] Updated weights for policy 1, policy_version 30910 (0.0008) +[2023-10-11 16:12:49,723][85176] Updated weights for policy 0, policy_version 30472 (0.0010) +[2023-10-11 16:12:50,088][85176] Updated weights for policy 0, policy_version 30482 (0.0011) +[2023-10-11 16:12:50,475][85176] Updated weights for policy 0, policy_version 30492 (0.0009) +[2023-10-11 16:12:51,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 62881792. Throughput: 0: 1657.1, 1: 1689.7. Samples: 15724200. Policy #0 lag: (min: 13.0, avg: 18.0, max: 45.0) +[2023-10-11 16:12:51,063][84230] Avg episode reward: [(0, '8.100'), (1, '7.890')] +[2023-10-11 16:12:51,525][85175] Updated weights for policy 1, policy_version 30920 (0.0008) +[2023-10-11 16:12:51,887][85175] Updated weights for policy 1, policy_version 30930 (0.0007) +[2023-10-11 16:12:52,264][85175] Updated weights for policy 1, policy_version 30940 (0.0007) +[2023-10-11 16:12:54,456][85176] Updated weights for policy 0, policy_version 30502 (0.0007) +[2023-10-11 16:12:54,817][85176] Updated weights for policy 0, policy_version 30512 (0.0009) +[2023-10-11 16:12:55,204][85176] Updated weights for policy 0, policy_version 30522 (0.0007) +[2023-10-11 16:12:56,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 62947328. Throughput: 0: 1648.3, 1: 1698.0. Samples: 15744214. 
Policy #0 lag: (min: 13.0, avg: 18.0, max: 45.0) +[2023-10-11 16:12:56,063][84230] Avg episode reward: [(0, '8.120'), (1, '7.600')] +[2023-10-11 16:12:56,326][85175] Updated weights for policy 1, policy_version 30950 (0.0009) +[2023-10-11 16:12:56,695][85175] Updated weights for policy 1, policy_version 30960 (0.0009) +[2023-10-11 16:12:57,074][85175] Updated weights for policy 1, policy_version 30970 (0.0008) +[2023-10-11 16:12:59,422][85176] Updated weights for policy 0, policy_version 30532 (0.0008) +[2023-10-11 16:12:59,791][85176] Updated weights for policy 0, policy_version 30542 (0.0011) +[2023-10-11 16:13:00,156][85176] Updated weights for policy 0, policy_version 30552 (0.0009) +[2023-10-11 16:13:01,047][85175] Updated weights for policy 1, policy_version 30980 (0.0008) +[2023-10-11 16:13:01,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 63012864. Throughput: 0: 1655.1, 1: 1699.7. Samples: 15764242. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 16:13:01,063][84230] Avg episode reward: [(0, '7.640'), (1, '7.790')] +[2023-10-11 16:13:01,413][85175] Updated weights for policy 1, policy_version 30990 (0.0009) +[2023-10-11 16:13:01,780][85175] Updated weights for policy 1, policy_version 31000 (0.0007) +[2023-10-11 16:13:04,321][85176] Updated weights for policy 0, policy_version 30562 (0.0010) +[2023-10-11 16:13:04,685][85176] Updated weights for policy 0, policy_version 30572 (0.0009) +[2023-10-11 16:13:05,053][85176] Updated weights for policy 0, policy_version 30582 (0.0008) +[2023-10-11 16:13:05,422][85176] Updated weights for policy 0, policy_version 30592 (0.0007) +[2023-10-11 16:13:05,723][85175] Updated weights for policy 1, policy_version 31010 (0.0008) +[2023-10-11 16:13:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 63078400. Throughput: 0: 1657.7, 1: 1699.7. Samples: 15774622. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 16:13:06,063][84230] Avg episode reward: [(0, '7.510'), (1, '7.370')] +[2023-10-11 16:13:06,086][85175] Updated weights for policy 1, policy_version 31020 (0.0009) +[2023-10-11 16:13:06,462][85175] Updated weights for policy 1, policy_version 31030 (0.0009) +[2023-10-11 16:13:06,830][85175] Updated weights for policy 1, policy_version 31040 (0.0008) +[2023-10-11 16:13:09,408][85176] Updated weights for policy 0, policy_version 30602 (0.0007) +[2023-10-11 16:13:09,788][85176] Updated weights for policy 0, policy_version 30612 (0.0007) +[2023-10-11 16:13:10,158][85176] Updated weights for policy 0, policy_version 30622 (0.0007) +[2023-10-11 16:13:10,867][85175] Updated weights for policy 1, policy_version 31050 (0.0009) +[2023-10-11 16:13:11,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 63143936. Throughput: 0: 1651.6, 1: 1700.7. Samples: 15794988. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 16:13:11,063][84230] Avg episode reward: [(0, '7.750'), (1, '7.750')] +[2023-10-11 16:13:11,242][85175] Updated weights for policy 1, policy_version 31060 (0.0009) +[2023-10-11 16:13:11,604][85175] Updated weights for policy 1, policy_version 31070 (0.0009) +[2023-10-11 16:13:14,388][85176] Updated weights for policy 0, policy_version 30632 (0.0008) +[2023-10-11 16:13:14,772][85176] Updated weights for policy 0, policy_version 30642 (0.0007) +[2023-10-11 16:13:15,135][85176] Updated weights for policy 0, policy_version 30652 (0.0008) +[2023-10-11 16:13:15,813][85175] Updated weights for policy 1, policy_version 31080 (0.0009) +[2023-10-11 16:13:16,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 63209472. Throughput: 0: 1658.0, 1: 1696.8. Samples: 15814716. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 16:13:16,064][84230] Avg episode reward: [(0, '7.710'), (1, '8.120')] +[2023-10-11 16:13:16,188][85175] Updated weights for policy 1, policy_version 31090 (0.0007) +[2023-10-11 16:13:16,547][85175] Updated weights for policy 1, policy_version 31100 (0.0008) +[2023-10-11 16:13:19,215][85176] Updated weights for policy 0, policy_version 30662 (0.0009) +[2023-10-11 16:13:19,584][85176] Updated weights for policy 0, policy_version 30672 (0.0009) +[2023-10-11 16:13:19,963][85176] Updated weights for policy 0, policy_version 30682 (0.0009) +[2023-10-11 16:13:20,532][85175] Updated weights for policy 1, policy_version 31110 (0.0008) +[2023-10-11 16:13:20,900][85175] Updated weights for policy 1, policy_version 31120 (0.0007) +[2023-10-11 16:13:21,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 63275008. Throughput: 0: 1660.0, 1: 1694.9. Samples: 15825114. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 16:13:21,064][84230] Avg episode reward: [(0, '7.860'), (1, '8.060')] +[2023-10-11 16:13:21,273][85175] Updated weights for policy 1, policy_version 31130 (0.0007) +[2023-10-11 16:13:24,048][85176] Updated weights for policy 0, policy_version 30692 (0.0011) +[2023-10-11 16:13:24,427][85176] Updated weights for policy 0, policy_version 30702 (0.0008) +[2023-10-11 16:13:24,799][85176] Updated weights for policy 0, policy_version 30712 (0.0008) +[2023-10-11 16:13:25,303][85175] Updated weights for policy 1, policy_version 31140 (0.0008) +[2023-10-11 16:13:25,673][85175] Updated weights for policy 1, policy_version 31150 (0.0009) +[2023-10-11 16:13:26,044][85175] Updated weights for policy 1, policy_version 31160 (0.0008) +[2023-10-11 16:13:26,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 63340544. Throughput: 0: 1650.6, 1: 1694.6. Samples: 15845138. 
Policy #0 lag: (min: 31.0, avg: 32.9, max: 61.0) +[2023-10-11 16:13:26,064][84230] Avg episode reward: [(0, '7.690'), (1, '7.890')] +[2023-10-11 16:13:28,959][85176] Updated weights for policy 0, policy_version 30722 (0.0008) +[2023-10-11 16:13:29,373][85176] Updated weights for policy 0, policy_version 30732 (0.0009) +[2023-10-11 16:13:29,740][85176] Updated weights for policy 0, policy_version 30742 (0.0011) +[2023-10-11 16:13:30,000][85175] Updated weights for policy 1, policy_version 31170 (0.0007) +[2023-10-11 16:13:30,115][85176] Updated weights for policy 0, policy_version 30752 (0.0008) +[2023-10-11 16:13:30,373][85175] Updated weights for policy 1, policy_version 31180 (0.0008) +[2023-10-11 16:13:30,741][85175] Updated weights for policy 1, policy_version 31190 (0.0011) +[2023-10-11 16:13:31,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 63406080. Throughput: 0: 1660.2, 1: 1686.1. Samples: 15864668. Policy #0 lag: (min: 31.0, avg: 32.9, max: 61.0) +[2023-10-11 16:13:31,063][84230] Avg episode reward: [(0, '8.070'), (1, '7.440')] +[2023-10-11 16:13:31,116][85175] Updated weights for policy 1, policy_version 31200 (0.0007) +[2023-10-11 16:13:34,071][85176] Updated weights for policy 0, policy_version 30762 (0.0008) +[2023-10-11 16:13:34,452][85176] Updated weights for policy 0, policy_version 30772 (0.0007) +[2023-10-11 16:13:34,825][85176] Updated weights for policy 0, policy_version 30782 (0.0008) +[2023-10-11 16:13:34,994][85175] Updated weights for policy 1, policy_version 31210 (0.0007) +[2023-10-11 16:13:35,358][85175] Updated weights for policy 1, policy_version 31220 (0.0007) +[2023-10-11 16:13:35,719][85175] Updated weights for policy 1, policy_version 31230 (0.0007) +[2023-10-11 16:13:36,063][84230] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 63504384. Throughput: 0: 1662.8, 1: 1706.4. Samples: 15875814. 
Policy #0 lag: (min: 31.0, avg: 32.9, max: 61.0) +[2023-10-11 16:13:36,063][84230] Avg episode reward: [(0, '7.320'), (1, '7.650')] +[2023-10-11 16:13:38,530][85176] Updated weights for policy 0, policy_version 30792 (0.0010) +[2023-10-11 16:13:38,906][85176] Updated weights for policy 0, policy_version 30802 (0.0011) +[2023-10-11 16:13:39,290][85176] Updated weights for policy 0, policy_version 30812 (0.0009) +[2023-10-11 16:13:39,671][85175] Updated weights for policy 1, policy_version 31240 (0.0009) +[2023-10-11 16:13:40,042][85175] Updated weights for policy 1, policy_version 31250 (0.0011) +[2023-10-11 16:13:40,416][85175] Updated weights for policy 1, policy_version 31260 (0.0009) +[2023-10-11 16:13:41,063][84230] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 63569920. Throughput: 0: 1658.1, 1: 1702.7. Samples: 15895452. Policy #0 lag: (min: 31.0, avg: 32.9, max: 61.0) +[2023-10-11 16:13:41,063][84230] Avg episode reward: [(0, '7.240'), (1, '8.130')] +[2023-10-11 16:13:43,527][85176] Updated weights for policy 0, policy_version 30822 (0.0008) +[2023-10-11 16:13:43,904][85176] Updated weights for policy 0, policy_version 30832 (0.0008) +[2023-10-11 16:13:44,277][85176] Updated weights for policy 0, policy_version 30842 (0.0008) +[2023-10-11 16:13:44,474][85175] Updated weights for policy 1, policy_version 31270 (0.0009) +[2023-10-11 16:13:44,840][85175] Updated weights for policy 1, policy_version 31280 (0.0007) +[2023-10-11 16:13:45,200][85175] Updated weights for policy 1, policy_version 31290 (0.0010) +[2023-10-11 16:13:46,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 63635456. Throughput: 0: 1671.2, 1: 1678.1. Samples: 15914960. 
Policy #0 lag: (min: 31.0, avg: 32.9, max: 61.0) +[2023-10-11 16:13:46,063][84230] Avg episode reward: [(0, '7.270'), (1, '7.930')] +[2023-10-11 16:13:48,419][85176] Updated weights for policy 0, policy_version 30852 (0.0007) +[2023-10-11 16:13:48,794][85176] Updated weights for policy 0, policy_version 30862 (0.0008) +[2023-10-11 16:13:49,169][85176] Updated weights for policy 0, policy_version 30872 (0.0007) +[2023-10-11 16:13:49,181][85175] Updated weights for policy 1, policy_version 31300 (0.0011) +[2023-10-11 16:13:49,542][85175] Updated weights for policy 1, policy_version 31310 (0.0009) +[2023-10-11 16:13:49,920][85175] Updated weights for policy 1, policy_version 31320 (0.0008) +[2023-10-11 16:13:51,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 63700992. Throughput: 0: 1663.5, 1: 1709.9. Samples: 15926424. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-11 16:13:51,064][84230] Avg episode reward: [(0, '7.830'), (1, '7.780')] +[2023-10-11 16:13:53,319][85176] Updated weights for policy 0, policy_version 30882 (0.0008) +[2023-10-11 16:13:53,695][85176] Updated weights for policy 0, policy_version 30892 (0.0010) +[2023-10-11 16:13:53,832][85175] Updated weights for policy 1, policy_version 31330 (0.0009) +[2023-10-11 16:13:54,062][85176] Updated weights for policy 0, policy_version 30902 (0.0008) +[2023-10-11 16:13:54,198][85175] Updated weights for policy 1, policy_version 31340 (0.0008) +[2023-10-11 16:13:54,438][85176] Updated weights for policy 0, policy_version 30912 (0.0008) +[2023-10-11 16:13:54,562][85175] Updated weights for policy 1, policy_version 31350 (0.0009) +[2023-10-11 16:13:54,934][85175] Updated weights for policy 1, policy_version 31360 (0.0011) +[2023-10-11 16:13:56,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 63766528. Throughput: 0: 1653.2, 1: 1696.0. Samples: 15945700. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-11 16:13:56,063][84230] Avg episode reward: [(0, '7.990'), (1, '7.890')] +[2023-10-11 16:13:58,473][85176] Updated weights for policy 0, policy_version 30922 (0.0008) +[2023-10-11 16:13:58,841][85175] Updated weights for policy 1, policy_version 31370 (0.0007) +[2023-10-11 16:13:58,842][85176] Updated weights for policy 0, policy_version 30932 (0.0007) +[2023-10-11 16:13:59,201][85175] Updated weights for policy 1, policy_version 31380 (0.0008) +[2023-10-11 16:13:59,205][85176] Updated weights for policy 0, policy_version 30942 (0.0008) +[2023-10-11 16:13:59,562][85175] Updated weights for policy 1, policy_version 31390 (0.0011) +[2023-10-11 16:14:01,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 63832064. Throughput: 0: 1673.4, 1: 1687.1. Samples: 15965940. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-11 16:14:01,064][84230] Avg episode reward: [(0, '8.350'), (1, '7.930')] +[2023-10-11 16:14:03,468][85176] Updated weights for policy 0, policy_version 30952 (0.0009) +[2023-10-11 16:14:03,820][85175] Updated weights for policy 1, policy_version 31400 (0.0008) +[2023-10-11 16:14:03,853][85176] Updated weights for policy 0, policy_version 30962 (0.0009) +[2023-10-11 16:14:04,186][85175] Updated weights for policy 1, policy_version 31410 (0.0009) +[2023-10-11 16:14:04,220][85176] Updated weights for policy 0, policy_version 30972 (0.0010) +[2023-10-11 16:14:04,558][85175] Updated weights for policy 1, policy_version 31420 (0.0009) +[2023-10-11 16:14:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 63897600. Throughput: 0: 1661.4, 1: 1715.8. Samples: 15977088. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-11 16:14:06,063][84230] Avg episode reward: [(0, '7.570'), (1, '7.740')] +[2023-10-11 16:14:08,242][85176] Updated weights for policy 0, policy_version 30982 (0.0007) +[2023-10-11 16:14:08,613][85176] Updated weights for policy 0, policy_version 30992 (0.0008) +[2023-10-11 16:14:08,634][85175] Updated weights for policy 1, policy_version 31430 (0.0008) +[2023-10-11 16:14:08,983][85176] Updated weights for policy 0, policy_version 31002 (0.0008) +[2023-10-11 16:14:08,999][85175] Updated weights for policy 1, policy_version 31440 (0.0008) +[2023-10-11 16:14:09,365][85175] Updated weights for policy 1, policy_version 31450 (0.0009) +[2023-10-11 16:14:11,062][84230] Fps is (10 sec: 13107.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 63963136. Throughput: 0: 1662.0, 1: 1693.3. Samples: 15996128. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-11 16:14:11,063][84230] Avg episode reward: [(0, '7.570'), (1, '7.740')] +[2023-10-11 16:14:12,992][85176] Updated weights for policy 0, policy_version 31012 (0.0008) +[2023-10-11 16:14:13,366][85176] Updated weights for policy 0, policy_version 31022 (0.0007) +[2023-10-11 16:14:13,386][85175] Updated weights for policy 1, policy_version 31460 (0.0009) +[2023-10-11 16:14:13,744][85176] Updated weights for policy 0, policy_version 31032 (0.0007) +[2023-10-11 16:14:13,750][85175] Updated weights for policy 1, policy_version 31470 (0.0007) +[2023-10-11 16:14:14,116][85175] Updated weights for policy 1, policy_version 31480 (0.0009) +[2023-10-11 16:14:16,063][84230] Fps is (10 sec: 13106.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 64028672. Throughput: 0: 1682.1, 1: 1694.7. Samples: 16016622. 
Policy #0 lag: (min: 30.0, avg: 37.9, max: 62.0) +[2023-10-11 16:14:16,064][84230] Avg episode reward: [(0, '7.090'), (1, '7.430')] +[2023-10-11 16:14:16,075][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000031040_31784960.pth... +[2023-10-11 16:14:16,075][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000031488_32243712.pth... +[2023-10-11 16:14:16,113][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000029888_30605312.pth +[2023-10-11 16:14:16,115][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000029504_30212096.pth +[2023-10-11 16:14:16,118][85000] Saving a milestone ./train_atari/atari_frostbite_APPO/checkpoint_p1/milestones/checkpoint_000031488_32243712.pth +[2023-10-11 16:14:16,121][84801] Saving a milestone ./train_atari/atari_frostbite_APPO/checkpoint_p0/milestones/checkpoint_000031040_31784960.pth +[2023-10-11 16:14:17,939][85176] Updated weights for policy 0, policy_version 31042 (0.0008) +[2023-10-11 16:14:18,306][85175] Updated weights for policy 1, policy_version 31490 (0.0007) +[2023-10-11 16:14:18,320][85176] Updated weights for policy 0, policy_version 31052 (0.0009) +[2023-10-11 16:14:18,682][85175] Updated weights for policy 1, policy_version 31500 (0.0008) +[2023-10-11 16:14:18,687][85176] Updated weights for policy 0, policy_version 31062 (0.0008) +[2023-10-11 16:14:19,047][85175] Updated weights for policy 1, policy_version 31510 (0.0009) +[2023-10-11 16:14:19,057][85176] Updated weights for policy 0, policy_version 31072 (0.0010) +[2023-10-11 16:14:19,409][85175] Updated weights for policy 1, policy_version 31520 (0.0008) +[2023-10-11 16:14:21,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 64094208. Throughput: 0: 1667.9, 1: 1693.9. Samples: 16027096. 
Policy #0 lag: (min: 30.0, avg: 37.9, max: 62.0) +[2023-10-11 16:14:21,063][84230] Avg episode reward: [(0, '7.500'), (1, '7.990')] +[2023-10-11 16:14:23,128][85176] Updated weights for policy 0, policy_version 31082 (0.0009) +[2023-10-11 16:14:23,400][85175] Updated weights for policy 1, policy_version 31530 (0.0007) +[2023-10-11 16:14:23,495][85176] Updated weights for policy 0, policy_version 31092 (0.0008) +[2023-10-11 16:14:23,771][85175] Updated weights for policy 1, policy_version 31540 (0.0007) +[2023-10-11 16:14:23,873][85176] Updated weights for policy 0, policy_version 31102 (0.0007) +[2023-10-11 16:14:24,137][85175] Updated weights for policy 1, policy_version 31550 (0.0009) +[2023-10-11 16:14:26,062][84230] Fps is (10 sec: 13107.8, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 64159744. Throughput: 0: 1675.3, 1: 1678.1. Samples: 16046356. Policy #0 lag: (min: 30.0, avg: 37.9, max: 62.0) +[2023-10-11 16:14:26,063][84230] Avg episode reward: [(0, '7.850'), (1, '7.870')] +[2023-10-11 16:14:27,814][85176] Updated weights for policy 0, policy_version 31112 (0.0008) +[2023-10-11 16:14:28,182][85176] Updated weights for policy 0, policy_version 31122 (0.0009) +[2023-10-11 16:14:28,184][85175] Updated weights for policy 1, policy_version 31560 (0.0008) +[2023-10-11 16:14:28,550][85175] Updated weights for policy 1, policy_version 31570 (0.0009) +[2023-10-11 16:14:28,551][85176] Updated weights for policy 0, policy_version 31132 (0.0008) +[2023-10-11 16:14:28,917][85175] Updated weights for policy 1, policy_version 31580 (0.0012) +[2023-10-11 16:14:31,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 64225280. Throughput: 0: 1686.4, 1: 1698.0. Samples: 16067258. 
Policy #0 lag: (min: 30.0, avg: 37.9, max: 62.0) +[2023-10-11 16:14:31,064][84230] Avg episode reward: [(0, '8.180'), (1, '7.340')] +[2023-10-11 16:14:32,330][85176] Updated weights for policy 0, policy_version 31142 (0.0008) +[2023-10-11 16:14:32,705][85176] Updated weights for policy 0, policy_version 31152 (0.0009) +[2023-10-11 16:14:32,965][85175] Updated weights for policy 1, policy_version 31590 (0.0009) +[2023-10-11 16:14:33,065][85176] Updated weights for policy 0, policy_version 31162 (0.0008) +[2023-10-11 16:14:33,348][85175] Updated weights for policy 1, policy_version 31600 (0.0008) +[2023-10-11 16:14:33,716][85175] Updated weights for policy 1, policy_version 31610 (0.0008) +[2023-10-11 16:14:36,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 64290816. Throughput: 0: 1665.1, 1: 1677.0. Samples: 16076818. Policy #0 lag: (min: 30.0, avg: 37.9, max: 62.0) +[2023-10-11 16:14:36,064][84230] Avg episode reward: [(0, '7.860'), (1, '7.640')] +[2023-10-11 16:14:37,217][85176] Updated weights for policy 0, policy_version 31172 (0.0009) +[2023-10-11 16:14:37,589][85176] Updated weights for policy 0, policy_version 31182 (0.0007) +[2023-10-11 16:14:37,744][85175] Updated weights for policy 1, policy_version 31620 (0.0009) +[2023-10-11 16:14:37,963][85176] Updated weights for policy 0, policy_version 31192 (0.0008) +[2023-10-11 16:14:38,104][85175] Updated weights for policy 1, policy_version 31630 (0.0007) +[2023-10-11 16:14:38,468][85175] Updated weights for policy 1, policy_version 31640 (0.0010) +[2023-10-11 16:14:41,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.5). Total num frames: 64356352. Throughput: 0: 1686.1, 1: 1676.9. Samples: 16097034. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:14:41,063][84230] Avg episode reward: [(0, '7.570'), (1, '8.390')] +[2023-10-11 16:14:41,979][85176] Updated weights for policy 0, policy_version 31202 (0.0008) +[2023-10-11 16:14:42,344][85176] Updated weights for policy 0, policy_version 31212 (0.0008) +[2023-10-11 16:14:42,442][85175] Updated weights for policy 1, policy_version 31650 (0.0010) +[2023-10-11 16:14:42,723][85176] Updated weights for policy 0, policy_version 31222 (0.0007) +[2023-10-11 16:14:42,814][85175] Updated weights for policy 1, policy_version 31660 (0.0008) +[2023-10-11 16:14:43,088][85176] Updated weights for policy 0, policy_version 31232 (0.0008) +[2023-10-11 16:14:43,180][85175] Updated weights for policy 1, policy_version 31670 (0.0009) +[2023-10-11 16:14:43,540][85175] Updated weights for policy 1, policy_version 31680 (0.0007) +[2023-10-11 16:14:46,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 64421888. Throughput: 0: 1690.9, 1: 1687.6. Samples: 16117972. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:14:46,063][84230] Avg episode reward: [(0, '7.390'), (1, '7.940')] +[2023-10-11 16:14:46,987][85176] Updated weights for policy 0, policy_version 31242 (0.0008) +[2023-10-11 16:14:47,361][85176] Updated weights for policy 0, policy_version 31252 (0.0009) +[2023-10-11 16:14:47,671][85175] Updated weights for policy 1, policy_version 31690 (0.0011) +[2023-10-11 16:14:47,722][85176] Updated weights for policy 0, policy_version 31262 (0.0010) +[2023-10-11 16:14:48,038][85175] Updated weights for policy 1, policy_version 31700 (0.0010) +[2023-10-11 16:14:48,410][85175] Updated weights for policy 1, policy_version 31710 (0.0007) +[2023-10-11 16:14:51,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 64487424. Throughput: 0: 1679.3, 1: 1659.2. Samples: 16127322. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:14:51,064][84230] Avg episode reward: [(0, '7.590'), (1, '7.620')] +[2023-10-11 16:14:51,837][85176] Updated weights for policy 0, policy_version 31272 (0.0009) +[2023-10-11 16:14:52,216][85176] Updated weights for policy 0, policy_version 31282 (0.0009) +[2023-10-11 16:14:52,581][85176] Updated weights for policy 0, policy_version 31292 (0.0009) +[2023-10-11 16:14:52,601][85175] Updated weights for policy 1, policy_version 31720 (0.0008) +[2023-10-11 16:14:52,960][85175] Updated weights for policy 1, policy_version 31730 (0.0007) +[2023-10-11 16:14:53,327][85175] Updated weights for policy 1, policy_version 31740 (0.0009) +[2023-10-11 16:14:56,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 64552960. Throughput: 0: 1690.8, 1: 1673.8. Samples: 16147536. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:14:56,063][84230] Avg episode reward: [(0, '7.840'), (1, '7.300')] +[2023-10-11 16:14:56,723][85176] Updated weights for policy 0, policy_version 31302 (0.0009) +[2023-10-11 16:14:57,104][85176] Updated weights for policy 0, policy_version 31312 (0.0009) +[2023-10-11 16:14:57,469][85176] Updated weights for policy 0, policy_version 31322 (0.0008) +[2023-10-11 16:14:57,574][85175] Updated weights for policy 1, policy_version 31750 (0.0008) +[2023-10-11 16:14:57,971][85175] Updated weights for policy 1, policy_version 31760 (0.0008) +[2023-10-11 16:14:58,333][85175] Updated weights for policy 1, policy_version 31770 (0.0007) +[2023-10-11 16:15:01,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 64618496. Throughput: 0: 1689.3, 1: 1676.4. Samples: 16168076. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:15:01,063][84230] Avg episode reward: [(0, '7.700'), (1, '7.600')] +[2023-10-11 16:15:01,444][85176] Updated weights for policy 0, policy_version 31332 (0.0008) +[2023-10-11 16:15:01,819][85176] Updated weights for policy 0, policy_version 31342 (0.0007) +[2023-10-11 16:15:02,186][85176] Updated weights for policy 0, policy_version 31352 (0.0009) +[2023-10-11 16:15:02,305][85175] Updated weights for policy 1, policy_version 31780 (0.0007) +[2023-10-11 16:15:02,671][85175] Updated weights for policy 1, policy_version 31790 (0.0007) +[2023-10-11 16:15:03,039][85175] Updated weights for policy 1, policy_version 31800 (0.0010) +[2023-10-11 16:15:06,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 64684032. Throughput: 0: 1677.4, 1: 1657.1. Samples: 16177146. Policy #0 lag: (min: 5.0, avg: 10.2, max: 37.0) +[2023-10-11 16:15:06,063][84230] Avg episode reward: [(0, '7.710'), (1, '7.900')] +[2023-10-11 16:15:06,399][85176] Updated weights for policy 0, policy_version 31362 (0.0009) +[2023-10-11 16:15:06,813][85176] Updated weights for policy 0, policy_version 31372 (0.0009) +[2023-10-11 16:15:07,178][85175] Updated weights for policy 1, policy_version 31810 (0.0011) +[2023-10-11 16:15:07,180][85176] Updated weights for policy 0, policy_version 31382 (0.0010) +[2023-10-11 16:15:07,540][85175] Updated weights for policy 1, policy_version 31820 (0.0008) +[2023-10-11 16:15:07,549][85176] Updated weights for policy 0, policy_version 31392 (0.0008) +[2023-10-11 16:15:07,912][85175] Updated weights for policy 1, policy_version 31830 (0.0008) +[2023-10-11 16:15:08,282][85175] Updated weights for policy 1, policy_version 31840 (0.0007) +[2023-10-11 16:15:11,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.5). Total num frames: 64749568. Throughput: 0: 1686.9, 1: 1674.1. Samples: 16197604. 
Policy #0 lag: (min: 5.0, avg: 10.2, max: 37.0) +[2023-10-11 16:15:11,063][84230] Avg episode reward: [(0, '7.450'), (1, '8.200')] +[2023-10-11 16:15:11,712][85176] Updated weights for policy 0, policy_version 31402 (0.0008) +[2023-10-11 16:15:12,088][85176] Updated weights for policy 0, policy_version 31412 (0.0009) +[2023-10-11 16:15:12,412][85175] Updated weights for policy 1, policy_version 31850 (0.0010) +[2023-10-11 16:15:12,467][85176] Updated weights for policy 0, policy_version 31422 (0.0009) +[2023-10-11 16:15:12,777][85175] Updated weights for policy 1, policy_version 31860 (0.0010) +[2023-10-11 16:15:13,142][85175] Updated weights for policy 1, policy_version 31870 (0.0011) +[2023-10-11 16:15:16,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 64815104. Throughput: 0: 1679.6, 1: 1667.7. Samples: 16217888. Policy #0 lag: (min: 5.0, avg: 10.2, max: 37.0) +[2023-10-11 16:15:16,064][84230] Avg episode reward: [(0, '7.970'), (1, '7.620')] +[2023-10-11 16:15:16,719][85176] Updated weights for policy 0, policy_version 31432 (0.0007) +[2023-10-11 16:15:17,094][85176] Updated weights for policy 0, policy_version 31442 (0.0011) +[2023-10-11 16:15:17,239][85175] Updated weights for policy 1, policy_version 31880 (0.0009) +[2023-10-11 16:15:17,453][85176] Updated weights for policy 0, policy_version 31452 (0.0009) +[2023-10-11 16:15:17,610][85175] Updated weights for policy 1, policy_version 31890 (0.0008) +[2023-10-11 16:15:17,977][85175] Updated weights for policy 1, policy_version 31900 (0.0008) +[2023-10-11 16:15:21,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 64880640. Throughput: 0: 1682.1, 1: 1658.9. Samples: 16227164. 
Policy #0 lag: (min: 5.0, avg: 10.2, max: 37.0) +[2023-10-11 16:15:21,063][84230] Avg episode reward: [(0, '7.810'), (1, '7.590')] +[2023-10-11 16:15:21,626][85176] Updated weights for policy 0, policy_version 31462 (0.0009) +[2023-10-11 16:15:21,937][85175] Updated weights for policy 1, policy_version 31910 (0.0008) +[2023-10-11 16:15:21,990][85176] Updated weights for policy 0, policy_version 31472 (0.0007) +[2023-10-11 16:15:22,303][85175] Updated weights for policy 1, policy_version 31920 (0.0009) +[2023-10-11 16:15:22,366][85176] Updated weights for policy 0, policy_version 31482 (0.0007) +[2023-10-11 16:15:22,666][85175] Updated weights for policy 1, policy_version 31930 (0.0011) +[2023-10-11 16:15:26,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 64946176. Throughput: 0: 1676.2, 1: 1669.8. Samples: 16247604. Policy #0 lag: (min: 5.0, avg: 10.2, max: 37.0) +[2023-10-11 16:15:26,064][84230] Avg episode reward: [(0, '7.510'), (1, '7.630')] +[2023-10-11 16:15:26,492][85176] Updated weights for policy 0, policy_version 31492 (0.0008) +[2023-10-11 16:15:26,874][85176] Updated weights for policy 0, policy_version 31502 (0.0008) +[2023-10-11 16:15:26,886][85175] Updated weights for policy 1, policy_version 31940 (0.0009) +[2023-10-11 16:15:27,238][85176] Updated weights for policy 0, policy_version 31512 (0.0007) +[2023-10-11 16:15:27,248][85175] Updated weights for policy 1, policy_version 31950 (0.0007) +[2023-10-11 16:15:27,622][85175] Updated weights for policy 1, policy_version 31960 (0.0009) +[2023-10-11 16:15:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 65011712. Throughput: 0: 1665.2, 1: 1669.2. Samples: 16268020. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 39.0) +[2023-10-11 16:15:31,063][84230] Avg episode reward: [(0, '7.440'), (1, '7.580')] +[2023-10-11 16:15:31,357][85176] Updated weights for policy 0, policy_version 31522 (0.0008) +[2023-10-11 16:15:31,696][85175] Updated weights for policy 1, policy_version 31970 (0.0008) +[2023-10-11 16:15:31,729][85176] Updated weights for policy 0, policy_version 31532 (0.0008) +[2023-10-11 16:15:32,069][85175] Updated weights for policy 1, policy_version 31980 (0.0008) +[2023-10-11 16:15:32,101][85176] Updated weights for policy 0, policy_version 31542 (0.0008) +[2023-10-11 16:15:32,440][85175] Updated weights for policy 1, policy_version 31990 (0.0008) +[2023-10-11 16:15:32,460][85176] Updated weights for policy 0, policy_version 31552 (0.0007) +[2023-10-11 16:15:32,807][85175] Updated weights for policy 1, policy_version 32000 (0.0010) +[2023-10-11 16:15:36,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 65077248. Throughput: 0: 1656.1, 1: 1672.9. Samples: 16277124. Policy #0 lag: (min: 31.0, avg: 31.1, max: 39.0) +[2023-10-11 16:15:36,063][84230] Avg episode reward: [(0, '7.680'), (1, '7.640')] +[2023-10-11 16:15:36,618][85176] Updated weights for policy 0, policy_version 31562 (0.0010) +[2023-10-11 16:15:36,696][85175] Updated weights for policy 1, policy_version 32010 (0.0008) +[2023-10-11 16:15:36,989][85176] Updated weights for policy 0, policy_version 31572 (0.0008) +[2023-10-11 16:15:37,061][85175] Updated weights for policy 1, policy_version 32020 (0.0010) +[2023-10-11 16:15:37,358][85176] Updated weights for policy 0, policy_version 31582 (0.0008) +[2023-10-11 16:15:37,426][85175] Updated weights for policy 1, policy_version 32030 (0.0008) +[2023-10-11 16:15:41,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 65142784. Throughput: 0: 1662.1, 1: 1681.4. Samples: 16297994. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 39.0) +[2023-10-11 16:15:41,063][84230] Avg episode reward: [(0, '7.720'), (1, '7.450')] +[2023-10-11 16:15:41,501][85176] Updated weights for policy 0, policy_version 31592 (0.0007) +[2023-10-11 16:15:41,670][85175] Updated weights for policy 1, policy_version 32040 (0.0008) +[2023-10-11 16:15:41,864][85176] Updated weights for policy 0, policy_version 31602 (0.0009) +[2023-10-11 16:15:42,035][85175] Updated weights for policy 1, policy_version 32050 (0.0007) +[2023-10-11 16:15:42,235][85176] Updated weights for policy 0, policy_version 31612 (0.0008) +[2023-10-11 16:15:42,404][85175] Updated weights for policy 1, policy_version 32060 (0.0007) +[2023-10-11 16:15:46,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 65208320. Throughput: 0: 1661.1, 1: 1681.2. Samples: 16318482. Policy #0 lag: (min: 31.0, avg: 31.1, max: 39.0) +[2023-10-11 16:15:46,063][84230] Avg episode reward: [(0, '7.690'), (1, '7.370')] +[2023-10-11 16:15:46,101][85176] Updated weights for policy 0, policy_version 31622 (0.0011) +[2023-10-11 16:15:46,467][85176] Updated weights for policy 0, policy_version 31632 (0.0009) +[2023-10-11 16:15:46,542][85175] Updated weights for policy 1, policy_version 32070 (0.0009) +[2023-10-11 16:15:46,845][85176] Updated weights for policy 0, policy_version 31642 (0.0008) +[2023-10-11 16:15:46,925][85175] Updated weights for policy 1, policy_version 32080 (0.0007) +[2023-10-11 16:15:47,288][85175] Updated weights for policy 1, policy_version 32090 (0.0009) +[2023-10-11 16:15:51,049][85176] Updated weights for policy 0, policy_version 31652 (0.0008) +[2023-10-11 16:15:51,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 65273856. Throughput: 0: 1660.4, 1: 1679.2. Samples: 16327426. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 39.0) +[2023-10-11 16:15:51,063][84230] Avg episode reward: [(0, '7.650'), (1, '8.000')] +[2023-10-11 16:15:51,371][85175] Updated weights for policy 1, policy_version 32100 (0.0008) +[2023-10-11 16:15:51,424][85176] Updated weights for policy 0, policy_version 31662 (0.0008) +[2023-10-11 16:15:51,744][85175] Updated weights for policy 1, policy_version 32110 (0.0007) +[2023-10-11 16:15:51,787][85176] Updated weights for policy 0, policy_version 31672 (0.0007) +[2023-10-11 16:15:52,110][85175] Updated weights for policy 1, policy_version 32120 (0.0007) +[2023-10-11 16:15:56,054][85176] Updated weights for policy 0, policy_version 31682 (0.0008) +[2023-10-11 16:15:56,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 65339392. Throughput: 0: 1657.4, 1: 1681.6. Samples: 16347860. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:15:56,063][84230] Avg episode reward: [(0, '7.570'), (1, '8.190')] +[2023-10-11 16:15:56,141][85175] Updated weights for policy 1, policy_version 32130 (0.0008) +[2023-10-11 16:15:56,427][85176] Updated weights for policy 0, policy_version 31692 (0.0008) +[2023-10-11 16:15:56,505][85175] Updated weights for policy 1, policy_version 32140 (0.0008) +[2023-10-11 16:15:56,790][85176] Updated weights for policy 0, policy_version 31702 (0.0008) +[2023-10-11 16:15:56,874][85175] Updated weights for policy 1, policy_version 32150 (0.0009) +[2023-10-11 16:15:57,160][85176] Updated weights for policy 0, policy_version 31712 (0.0008) +[2023-10-11 16:15:57,246][85175] Updated weights for policy 1, policy_version 32160 (0.0009) +[2023-10-11 16:16:01,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 65404928. Throughput: 0: 1651.6, 1: 1690.5. Samples: 16368282. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:16:01,063][84230] Avg episode reward: [(0, '7.710'), (1, '7.650')] +[2023-10-11 16:16:01,227][85175] Updated weights for policy 1, policy_version 32170 (0.0010) +[2023-10-11 16:16:01,259][85176] Updated weights for policy 0, policy_version 31722 (0.0010) +[2023-10-11 16:16:01,595][85175] Updated weights for policy 1, policy_version 32180 (0.0010) +[2023-10-11 16:16:01,637][85176] Updated weights for policy 0, policy_version 31732 (0.0009) +[2023-10-11 16:16:01,971][85175] Updated weights for policy 1, policy_version 32190 (0.0010) +[2023-10-11 16:16:02,001][85176] Updated weights for policy 0, policy_version 31742 (0.0009) +[2023-10-11 16:16:05,896][85175] Updated weights for policy 1, policy_version 32200 (0.0009) +[2023-10-11 16:16:06,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 65470464. Throughput: 0: 1647.3, 1: 1688.1. Samples: 16377258. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:16:06,064][84230] Avg episode reward: [(0, '7.660'), (1, '7.120')] +[2023-10-11 16:16:06,127][85176] Updated weights for policy 0, policy_version 31752 (0.0008) +[2023-10-11 16:16:06,270][85175] Updated weights for policy 1, policy_version 32210 (0.0009) +[2023-10-11 16:16:06,506][85176] Updated weights for policy 0, policy_version 31762 (0.0008) +[2023-10-11 16:16:06,636][85175] Updated weights for policy 1, policy_version 32220 (0.0007) +[2023-10-11 16:16:06,870][85176] Updated weights for policy 0, policy_version 31772 (0.0009) +[2023-10-11 16:16:10,578][85175] Updated weights for policy 1, policy_version 32230 (0.0007) +[2023-10-11 16:16:10,951][85175] Updated weights for policy 1, policy_version 32240 (0.0010) +[2023-10-11 16:16:11,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 65536000. Throughput: 0: 1651.3, 1: 1692.4. Samples: 16398068. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:16:11,063][84230] Avg episode reward: [(0, '7.490'), (1, '7.660')] +[2023-10-11 16:16:11,173][85176] Updated weights for policy 0, policy_version 31782 (0.0008) +[2023-10-11 16:16:11,324][85175] Updated weights for policy 1, policy_version 32250 (0.0008) +[2023-10-11 16:16:11,542][85176] Updated weights for policy 0, policy_version 31792 (0.0007) +[2023-10-11 16:16:11,920][85176] Updated weights for policy 0, policy_version 31802 (0.0007) +[2023-10-11 16:16:15,430][85175] Updated weights for policy 1, policy_version 32260 (0.0009) +[2023-10-11 16:16:15,795][85175] Updated weights for policy 1, policy_version 32270 (0.0008) +[2023-10-11 16:16:16,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 65601536. Throughput: 0: 1653.8, 1: 1689.0. Samples: 16418448. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:16:16,064][84230] Avg episode reward: [(0, '7.380'), (1, '8.060')] +[2023-10-11 16:16:16,155][85176] Updated weights for policy 0, policy_version 31812 (0.0007) +[2023-10-11 16:16:16,168][85175] Updated weights for policy 1, policy_version 32280 (0.0007) +[2023-10-11 16:16:16,454][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000032288_33062912.pth... +[2023-10-11 16:16:16,483][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000030688_31424512.pth +[2023-10-11 16:16:16,529][85176] Updated weights for policy 0, policy_version 31822 (0.0009) +[2023-10-11 16:16:16,900][85176] Updated weights for policy 0, policy_version 31832 (0.0009) +[2023-10-11 16:16:17,195][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000031840_32604160.pth... 
+[2023-10-11 16:16:17,225][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000030272_30998528.pth +[2023-10-11 16:16:20,323][85175] Updated weights for policy 1, policy_version 32290 (0.0008) +[2023-10-11 16:16:20,689][85175] Updated weights for policy 1, policy_version 32300 (0.0009) +[2023-10-11 16:16:21,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 65667072. Throughput: 0: 1655.5, 1: 1690.4. Samples: 16427690. Policy #0 lag: (min: 24.0, avg: 49.5, max: 56.0) +[2023-10-11 16:16:21,064][84230] Avg episode reward: [(0, '7.970'), (1, '7.790')] +[2023-10-11 16:16:21,066][85175] Updated weights for policy 1, policy_version 32310 (0.0009) +[2023-10-11 16:16:21,071][85176] Updated weights for policy 0, policy_version 31842 (0.0009) +[2023-10-11 16:16:21,426][85175] Updated weights for policy 1, policy_version 32320 (0.0009) +[2023-10-11 16:16:21,450][85176] Updated weights for policy 0, policy_version 31852 (0.0009) +[2023-10-11 16:16:21,819][85176] Updated weights for policy 0, policy_version 31862 (0.0009) +[2023-10-11 16:16:22,187][85176] Updated weights for policy 0, policy_version 31872 (0.0008) +[2023-10-11 16:16:25,443][85175] Updated weights for policy 1, policy_version 32330 (0.0009) +[2023-10-11 16:16:25,822][85175] Updated weights for policy 1, policy_version 32340 (0.0009) +[2023-10-11 16:16:26,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 65732608. Throughput: 0: 1646.0, 1: 1688.0. Samples: 16448028. 
Policy #0 lag: (min: 24.0, avg: 49.5, max: 56.0) +[2023-10-11 16:16:26,063][84230] Avg episode reward: [(0, '7.680'), (1, '7.310')] +[2023-10-11 16:16:26,189][85175] Updated weights for policy 1, policy_version 32350 (0.0007) +[2023-10-11 16:16:26,285][85176] Updated weights for policy 0, policy_version 31882 (0.0009) +[2023-10-11 16:16:26,656][85176] Updated weights for policy 0, policy_version 31892 (0.0008) +[2023-10-11 16:16:27,034][85176] Updated weights for policy 0, policy_version 31902 (0.0008) +[2023-10-11 16:16:30,350][85175] Updated weights for policy 1, policy_version 32360 (0.0008) +[2023-10-11 16:16:30,720][85175] Updated weights for policy 1, policy_version 32370 (0.0009) +[2023-10-11 16:16:31,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 65798144. Throughput: 0: 1645.1, 1: 1678.2. Samples: 16468030. Policy #0 lag: (min: 24.0, avg: 49.5, max: 56.0) +[2023-10-11 16:16:31,063][84230] Avg episode reward: [(0, '7.740'), (1, '7.650')] +[2023-10-11 16:16:31,087][85175] Updated weights for policy 1, policy_version 32380 (0.0007) +[2023-10-11 16:16:31,195][85176] Updated weights for policy 0, policy_version 31912 (0.0007) +[2023-10-11 16:16:31,569][85176] Updated weights for policy 0, policy_version 31922 (0.0008) +[2023-10-11 16:16:31,932][85176] Updated weights for policy 0, policy_version 31932 (0.0009) +[2023-10-11 16:16:35,241][85175] Updated weights for policy 1, policy_version 32390 (0.0009) +[2023-10-11 16:16:35,632][85175] Updated weights for policy 1, policy_version 32400 (0.0009) +[2023-10-11 16:16:35,921][85176] Updated weights for policy 0, policy_version 31942 (0.0008) +[2023-10-11 16:16:36,002][85175] Updated weights for policy 1, policy_version 32410 (0.0009) +[2023-10-11 16:16:36,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 65863680. Throughput: 0: 1646.4, 1: 1691.7. Samples: 16477642. 
Policy #0 lag: (min: 24.0, avg: 49.5, max: 56.0) +[2023-10-11 16:16:36,063][84230] Avg episode reward: [(0, '7.500'), (1, '7.320')] +[2023-10-11 16:16:36,294][85176] Updated weights for policy 0, policy_version 31952 (0.0010) +[2023-10-11 16:16:36,669][85176] Updated weights for policy 0, policy_version 31962 (0.0011) +[2023-10-11 16:16:40,054][85175] Updated weights for policy 1, policy_version 32420 (0.0009) +[2023-10-11 16:16:40,422][85175] Updated weights for policy 1, policy_version 32430 (0.0007) +[2023-10-11 16:16:40,735][85176] Updated weights for policy 0, policy_version 31972 (0.0009) +[2023-10-11 16:16:40,793][85175] Updated weights for policy 1, policy_version 32440 (0.0008) +[2023-10-11 16:16:41,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 65929216. Throughput: 0: 1652.1, 1: 1685.4. Samples: 16498050. Policy #0 lag: (min: 24.0, avg: 49.5, max: 56.0) +[2023-10-11 16:16:41,063][84230] Avg episode reward: [(0, '7.640'), (1, '7.350')] +[2023-10-11 16:16:41,106][85176] Updated weights for policy 0, policy_version 31982 (0.0008) +[2023-10-11 16:16:41,481][85176] Updated weights for policy 0, policy_version 31992 (0.0009) +[2023-10-11 16:16:44,824][85175] Updated weights for policy 1, policy_version 32450 (0.0009) +[2023-10-11 16:16:45,192][85175] Updated weights for policy 1, policy_version 32460 (0.0008) +[2023-10-11 16:16:45,553][85175] Updated weights for policy 1, policy_version 32470 (0.0009) +[2023-10-11 16:16:45,615][85176] Updated weights for policy 0, policy_version 32002 (0.0010) +[2023-10-11 16:16:45,921][85175] Updated weights for policy 1, policy_version 32480 (0.0008) +[2023-10-11 16:16:45,988][85176] Updated weights for policy 0, policy_version 32012 (0.0008) +[2023-10-11 16:16:46,063][84230] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 66027520. Throughput: 0: 1656.9, 1: 1667.5. Samples: 16517882. 
Policy #0 lag: (min: 16.0, avg: 30.4, max: 48.0) +[2023-10-11 16:16:46,063][84230] Avg episode reward: [(0, '7.710'), (1, '7.450')] +[2023-10-11 16:16:46,370][85176] Updated weights for policy 0, policy_version 32022 (0.0007) +[2023-10-11 16:16:46,740][85176] Updated weights for policy 0, policy_version 32032 (0.0008) +[2023-10-11 16:16:49,962][85175] Updated weights for policy 1, policy_version 32490 (0.0007) +[2023-10-11 16:16:50,326][85175] Updated weights for policy 1, policy_version 32500 (0.0009) +[2023-10-11 16:16:50,701][85175] Updated weights for policy 1, policy_version 32510 (0.0009) +[2023-10-11 16:16:50,900][85176] Updated weights for policy 0, policy_version 32042 (0.0007) +[2023-10-11 16:16:51,063][84230] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 66093056. Throughput: 0: 1659.4, 1: 1683.6. Samples: 16527694. Policy #0 lag: (min: 16.0, avg: 30.4, max: 48.0) +[2023-10-11 16:16:51,064][84230] Avg episode reward: [(0, '7.570'), (1, '7.820')] +[2023-10-11 16:16:51,276][85176] Updated weights for policy 0, policy_version 32052 (0.0008) +[2023-10-11 16:16:51,648][85176] Updated weights for policy 0, policy_version 32062 (0.0008) +[2023-10-11 16:16:54,732][85175] Updated weights for policy 1, policy_version 32520 (0.0009) +[2023-10-11 16:16:55,100][85175] Updated weights for policy 1, policy_version 32530 (0.0008) +[2023-10-11 16:16:55,458][85175] Updated weights for policy 1, policy_version 32540 (0.0010) +[2023-10-11 16:16:55,681][85176] Updated weights for policy 0, policy_version 32072 (0.0009) +[2023-10-11 16:16:56,061][85176] Updated weights for policy 0, policy_version 32082 (0.0008) +[2023-10-11 16:16:56,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13329.3). Total num frames: 66158592. Throughput: 0: 1663.5, 1: 1681.6. Samples: 16548596. 
Policy #0 lag: (min: 16.0, avg: 30.4, max: 48.0) +[2023-10-11 16:16:56,064][84230] Avg episode reward: [(0, '7.540'), (1, '7.960')] +[2023-10-11 16:16:56,435][85176] Updated weights for policy 0, policy_version 32092 (0.0010) +[2023-10-11 16:16:59,549][85175] Updated weights for policy 1, policy_version 32550 (0.0008) +[2023-10-11 16:16:59,907][85175] Updated weights for policy 1, policy_version 32560 (0.0011) +[2023-10-11 16:17:00,274][85175] Updated weights for policy 1, policy_version 32570 (0.0009) +[2023-10-11 16:17:00,472][85176] Updated weights for policy 0, policy_version 32102 (0.0010) +[2023-10-11 16:17:00,849][85176] Updated weights for policy 0, policy_version 32112 (0.0009) +[2023-10-11 16:17:01,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 66224128. Throughput: 0: 1660.0, 1: 1661.2. Samples: 16567902. Policy #0 lag: (min: 16.0, avg: 30.4, max: 48.0) +[2023-10-11 16:17:01,063][84230] Avg episode reward: [(0, '7.980'), (1, '7.650')] +[2023-10-11 16:17:01,226][85176] Updated weights for policy 0, policy_version 32122 (0.0009) +[2023-10-11 16:17:04,121][85175] Updated weights for policy 1, policy_version 32580 (0.0008) +[2023-10-11 16:17:04,499][85175] Updated weights for policy 1, policy_version 32590 (0.0007) +[2023-10-11 16:17:04,874][85175] Updated weights for policy 1, policy_version 32600 (0.0007) +[2023-10-11 16:17:05,384][85176] Updated weights for policy 0, policy_version 32132 (0.0008) +[2023-10-11 16:17:05,768][85176] Updated weights for policy 0, policy_version 32142 (0.0009) +[2023-10-11 16:17:06,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 66289664. Throughput: 0: 1664.9, 1: 1690.9. Samples: 16578702. 
Policy #0 lag: (min: 16.0, avg: 30.4, max: 48.0) +[2023-10-11 16:17:06,063][84230] Avg episode reward: [(0, '8.030'), (1, '7.590')] +[2023-10-11 16:17:06,143][85176] Updated weights for policy 0, policy_version 32152 (0.0008) +[2023-10-11 16:17:09,026][85175] Updated weights for policy 1, policy_version 32610 (0.0008) +[2023-10-11 16:17:09,399][85175] Updated weights for policy 1, policy_version 32620 (0.0009) +[2023-10-11 16:17:09,760][85175] Updated weights for policy 1, policy_version 32630 (0.0007) +[2023-10-11 16:17:10,127][85175] Updated weights for policy 1, policy_version 32640 (0.0007) +[2023-10-11 16:17:10,229][85176] Updated weights for policy 0, policy_version 32162 (0.0007) +[2023-10-11 16:17:10,597][85176] Updated weights for policy 0, policy_version 32172 (0.0008) +[2023-10-11 16:17:10,975][85176] Updated weights for policy 0, policy_version 32182 (0.0009) +[2023-10-11 16:17:11,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 66355200. Throughput: 0: 1668.8, 1: 1678.7. Samples: 16598670. Policy #0 lag: (min: 30.0, avg: 38.0, max: 62.0) +[2023-10-11 16:17:11,064][84230] Avg episode reward: [(0, '7.900'), (1, '7.360')] +[2023-10-11 16:17:11,333][85176] Updated weights for policy 0, policy_version 32192 (0.0009) +[2023-10-11 16:17:14,087][85175] Updated weights for policy 1, policy_version 32650 (0.0008) +[2023-10-11 16:17:14,456][85175] Updated weights for policy 1, policy_version 32660 (0.0008) +[2023-10-11 16:17:14,815][85175] Updated weights for policy 1, policy_version 32670 (0.0009) +[2023-10-11 16:17:15,482][85176] Updated weights for policy 0, policy_version 32202 (0.0008) +[2023-10-11 16:17:15,858][85176] Updated weights for policy 0, policy_version 32212 (0.0009) +[2023-10-11 16:17:16,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 66420736. Throughput: 0: 1658.1, 1: 1683.9. Samples: 16618418. 
Policy #0 lag: (min: 30.0, avg: 38.0, max: 62.0) +[2023-10-11 16:17:16,063][84230] Avg episode reward: [(0, '7.750'), (1, '7.360')] +[2023-10-11 16:17:16,237][85176] Updated weights for policy 0, policy_version 32222 (0.0007) +[2023-10-11 16:17:19,062][85175] Updated weights for policy 1, policy_version 32680 (0.0008) +[2023-10-11 16:17:19,429][85175] Updated weights for policy 1, policy_version 32690 (0.0008) +[2023-10-11 16:17:19,806][85175] Updated weights for policy 1, policy_version 32700 (0.0011) +[2023-10-11 16:17:20,061][85176] Updated weights for policy 0, policy_version 32232 (0.0008) +[2023-10-11 16:17:20,429][85176] Updated weights for policy 0, policy_version 32242 (0.0008) +[2023-10-11 16:17:20,798][85176] Updated weights for policy 0, policy_version 32252 (0.0007) +[2023-10-11 16:17:21,062][84230] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 13440.5). Total num frames: 66519040. Throughput: 0: 1671.9, 1: 1700.1. Samples: 16629382. Policy #0 lag: (min: 30.0, avg: 38.0, max: 62.0) +[2023-10-11 16:17:21,063][84230] Avg episode reward: [(0, '7.300'), (1, '7.430')] +[2023-10-11 16:17:23,965][85175] Updated weights for policy 1, policy_version 32710 (0.0010) +[2023-10-11 16:17:24,347][85175] Updated weights for policy 1, policy_version 32720 (0.0008) +[2023-10-11 16:17:24,721][85175] Updated weights for policy 1, policy_version 32730 (0.0009) +[2023-10-11 16:17:25,029][85176] Updated weights for policy 0, policy_version 32262 (0.0007) +[2023-10-11 16:17:25,406][85176] Updated weights for policy 0, policy_version 32272 (0.0008) +[2023-10-11 16:17:25,786][85176] Updated weights for policy 0, policy_version 32282 (0.0009) +[2023-10-11 16:17:26,063][84230] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 66584576. Throughput: 0: 1668.8, 1: 1684.6. Samples: 16648956. 
Policy #0 lag: (min: 30.0, avg: 38.0, max: 62.0) +[2023-10-11 16:17:26,064][84230] Avg episode reward: [(0, '7.600'), (1, '7.800')] +[2023-10-11 16:17:28,709][85175] Updated weights for policy 1, policy_version 32740 (0.0008) +[2023-10-11 16:17:29,072][85175] Updated weights for policy 1, policy_version 32750 (0.0007) +[2023-10-11 16:17:29,441][85175] Updated weights for policy 1, policy_version 32760 (0.0007) +[2023-10-11 16:17:29,938][85176] Updated weights for policy 0, policy_version 32292 (0.0009) +[2023-10-11 16:17:30,304][85176] Updated weights for policy 0, policy_version 32302 (0.0009) +[2023-10-11 16:17:30,678][85176] Updated weights for policy 0, policy_version 32312 (0.0009) +[2023-10-11 16:17:31,062][84230] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 66650112. Throughput: 0: 1648.8, 1: 1693.4. Samples: 16668284. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 16:17:31,063][84230] Avg episode reward: [(0, '7.750'), (1, '7.760')] +[2023-10-11 16:17:33,363][85175] Updated weights for policy 1, policy_version 32770 (0.0008) +[2023-10-11 16:17:33,729][85175] Updated weights for policy 1, policy_version 32780 (0.0008) +[2023-10-11 16:17:34,089][85175] Updated weights for policy 1, policy_version 32790 (0.0008) +[2023-10-11 16:17:34,461][85175] Updated weights for policy 1, policy_version 32800 (0.0010) +[2023-10-11 16:17:34,653][85176] Updated weights for policy 0, policy_version 32322 (0.0007) +[2023-10-11 16:17:35,013][85176] Updated weights for policy 0, policy_version 32332 (0.0007) +[2023-10-11 16:17:35,392][85176] Updated weights for policy 0, policy_version 32342 (0.0008) +[2023-10-11 16:17:35,764][85176] Updated weights for policy 0, policy_version 32352 (0.0009) +[2023-10-11 16:17:36,062][84230] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 66715648. Throughput: 0: 1668.8, 1: 1699.0. Samples: 16679244. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 16:17:36,063][84230] Avg episode reward: [(0, '8.350'), (1, '7.610')] +[2023-10-11 16:17:38,424][85175] Updated weights for policy 1, policy_version 32810 (0.0007) +[2023-10-11 16:17:38,783][85175] Updated weights for policy 1, policy_version 32820 (0.0008) +[2023-10-11 16:17:39,158][85175] Updated weights for policy 1, policy_version 32830 (0.0008) +[2023-10-11 16:17:39,935][85176] Updated weights for policy 0, policy_version 32362 (0.0007) +[2023-10-11 16:17:40,307][85176] Updated weights for policy 0, policy_version 32372 (0.0008) +[2023-10-11 16:17:40,693][85176] Updated weights for policy 0, policy_version 32382 (0.0008) +[2023-10-11 16:17:41,063][84230] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 66781184. Throughput: 0: 1663.6, 1: 1673.1. Samples: 16698750. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 16:17:41,064][84230] Avg episode reward: [(0, '7.710'), (1, '7.620')] +[2023-10-11 16:17:43,160][85175] Updated weights for policy 1, policy_version 32840 (0.0007) +[2023-10-11 16:17:43,527][85175] Updated weights for policy 1, policy_version 32850 (0.0009) +[2023-10-11 16:17:43,899][85175] Updated weights for policy 1, policy_version 32860 (0.0011) +[2023-10-11 16:17:44,783][85176] Updated weights for policy 0, policy_version 32392 (0.0008) +[2023-10-11 16:17:45,172][85176] Updated weights for policy 0, policy_version 32402 (0.0010) +[2023-10-11 16:17:45,546][85176] Updated weights for policy 0, policy_version 32412 (0.0009) +[2023-10-11 16:17:46,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 66846720. Throughput: 0: 1641.2, 1: 1701.9. Samples: 16718340. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 16:17:46,063][84230] Avg episode reward: [(0, '7.230'), (1, '7.520')] +[2023-10-11 16:17:47,763][85175] Updated weights for policy 1, policy_version 32870 (0.0010) +[2023-10-11 16:17:48,127][85175] Updated weights for policy 1, policy_version 32880 (0.0008) +[2023-10-11 16:17:48,493][85175] Updated weights for policy 1, policy_version 32890 (0.0008) +[2023-10-11 16:17:49,815][85176] Updated weights for policy 0, policy_version 32422 (0.0009) +[2023-10-11 16:17:50,195][85176] Updated weights for policy 0, policy_version 32432 (0.0008) +[2023-10-11 16:17:50,569][85176] Updated weights for policy 0, policy_version 32442 (0.0008) +[2023-10-11 16:17:51,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 66912256. Throughput: 0: 1658.2, 1: 1673.5. Samples: 16728626. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 16:17:51,063][84230] Avg episode reward: [(0, '7.090'), (1, '7.800')] +[2023-10-11 16:17:52,648][85175] Updated weights for policy 1, policy_version 32900 (0.0009) +[2023-10-11 16:17:53,018][85175] Updated weights for policy 1, policy_version 32910 (0.0008) +[2023-10-11 16:17:53,377][85175] Updated weights for policy 1, policy_version 32920 (0.0008) +[2023-10-11 16:17:54,794][85176] Updated weights for policy 0, policy_version 32452 (0.0008) +[2023-10-11 16:17:55,167][85176] Updated weights for policy 0, policy_version 32462 (0.0007) +[2023-10-11 16:17:55,539][85176] Updated weights for policy 0, policy_version 32472 (0.0008) +[2023-10-11 16:17:56,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 66977792. Throughput: 0: 1658.0, 1: 1677.4. Samples: 16748762. 
Policy #0 lag: (min: 20.0, avg: 23.2, max: 52.0) +[2023-10-11 16:17:56,063][84230] Avg episode reward: [(0, '7.140'), (1, '7.580')] +[2023-10-11 16:17:57,456][85175] Updated weights for policy 1, policy_version 32930 (0.0007) +[2023-10-11 16:17:57,827][85175] Updated weights for policy 1, policy_version 32940 (0.0010) +[2023-10-11 16:17:58,196][85175] Updated weights for policy 1, policy_version 32950 (0.0008) +[2023-10-11 16:17:58,556][85175] Updated weights for policy 1, policy_version 32960 (0.0007) +[2023-10-11 16:17:59,652][85176] Updated weights for policy 0, policy_version 32482 (0.0007) +[2023-10-11 16:18:00,020][85176] Updated weights for policy 0, policy_version 32492 (0.0007) +[2023-10-11 16:18:00,398][85176] Updated weights for policy 0, policy_version 32502 (0.0009) +[2023-10-11 16:18:00,777][85176] Updated weights for policy 0, policy_version 32512 (0.0008) +[2023-10-11 16:18:01,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 67043328. Throughput: 0: 1648.4, 1: 1692.3. Samples: 16768752. Policy #0 lag: (min: 20.0, avg: 23.2, max: 52.0) +[2023-10-11 16:18:01,064][84230] Avg episode reward: [(0, '7.580'), (1, '7.700')] +[2023-10-11 16:18:02,471][85175] Updated weights for policy 1, policy_version 32970 (0.0010) +[2023-10-11 16:18:02,838][85175] Updated weights for policy 1, policy_version 32980 (0.0009) +[2023-10-11 16:18:03,204][85175] Updated weights for policy 1, policy_version 32990 (0.0009) +[2023-10-11 16:18:04,935][85176] Updated weights for policy 0, policy_version 32522 (0.0008) +[2023-10-11 16:18:05,301][85176] Updated weights for policy 0, policy_version 32532 (0.0007) +[2023-10-11 16:18:05,681][85176] Updated weights for policy 0, policy_version 32542 (0.0010) +[2023-10-11 16:18:06,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 67108864. Throughput: 0: 1657.4, 1: 1665.5. Samples: 16778912. 
Policy #0 lag: (min: 20.0, avg: 23.2, max: 52.0) +[2023-10-11 16:18:06,064][84230] Avg episode reward: [(0, '8.400'), (1, '7.410')] +[2023-10-11 16:18:07,257][85175] Updated weights for policy 1, policy_version 33000 (0.0008) +[2023-10-11 16:18:07,628][85175] Updated weights for policy 1, policy_version 33010 (0.0008) +[2023-10-11 16:18:08,001][85175] Updated weights for policy 1, policy_version 33020 (0.0008) +[2023-10-11 16:18:09,855][85176] Updated weights for policy 0, policy_version 32552 (0.0007) +[2023-10-11 16:18:10,237][85176] Updated weights for policy 0, policy_version 32562 (0.0010) +[2023-10-11 16:18:10,604][85176] Updated weights for policy 0, policy_version 32572 (0.0008) +[2023-10-11 16:18:11,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 67174400. Throughput: 0: 1654.8, 1: 1688.9. Samples: 16799420. Policy #0 lag: (min: 20.0, avg: 23.2, max: 52.0) +[2023-10-11 16:18:11,064][84230] Avg episode reward: [(0, '8.110'), (1, '7.260')] +[2023-10-11 16:18:12,269][85175] Updated weights for policy 1, policy_version 33030 (0.0007) +[2023-10-11 16:18:12,663][85175] Updated weights for policy 1, policy_version 33040 (0.0007) +[2023-10-11 16:18:13,042][85175] Updated weights for policy 1, policy_version 33050 (0.0007) +[2023-10-11 16:18:14,615][85176] Updated weights for policy 0, policy_version 32582 (0.0008) +[2023-10-11 16:18:14,990][85176] Updated weights for policy 0, policy_version 32592 (0.0007) +[2023-10-11 16:18:15,366][85176] Updated weights for policy 0, policy_version 32602 (0.0009) +[2023-10-11 16:18:16,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 67239936. Throughput: 0: 1650.2, 1: 1697.0. Samples: 16818910. 
Policy #0 lag: (min: 20.0, avg: 23.2, max: 52.0) +[2023-10-11 16:18:16,063][84230] Avg episode reward: [(0, '7.630'), (1, '7.650')] +[2023-10-11 16:18:16,076][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000032608_33390592.pth... +[2023-10-11 16:18:16,076][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000033056_33849344.pth... +[2023-10-11 16:18:16,111][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000031040_31784960.pth +[2023-10-11 16:18:16,113][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000031488_32243712.pth +[2023-10-11 16:18:16,872][85175] Updated weights for policy 1, policy_version 33060 (0.0008) +[2023-10-11 16:18:17,230][85175] Updated weights for policy 1, policy_version 33070 (0.0010) +[2023-10-11 16:18:17,604][85175] Updated weights for policy 1, policy_version 33080 (0.0009) +[2023-10-11 16:18:19,508][85176] Updated weights for policy 0, policy_version 32612 (0.0007) +[2023-10-11 16:18:19,882][85176] Updated weights for policy 0, policy_version 32622 (0.0008) +[2023-10-11 16:18:20,259][85176] Updated weights for policy 0, policy_version 32632 (0.0009) +[2023-10-11 16:18:21,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 67305472. Throughput: 0: 1659.4, 1: 1676.5. Samples: 16829358. 
Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-11 16:18:21,063][84230] Avg episode reward: [(0, '7.250'), (1, '7.810')] +[2023-10-11 16:18:21,646][85175] Updated weights for policy 1, policy_version 33090 (0.0008) +[2023-10-11 16:18:22,008][85175] Updated weights for policy 1, policy_version 33100 (0.0007) +[2023-10-11 16:18:22,385][85175] Updated weights for policy 1, policy_version 33110 (0.0007) +[2023-10-11 16:18:22,751][85175] Updated weights for policy 1, policy_version 33120 (0.0008) +[2023-10-11 16:18:24,158][85176] Updated weights for policy 0, policy_version 32642 (0.0009) +[2023-10-11 16:18:24,530][85176] Updated weights for policy 0, policy_version 32652 (0.0009) +[2023-10-11 16:18:24,898][85176] Updated weights for policy 0, policy_version 32662 (0.0008) +[2023-10-11 16:18:25,271][85176] Updated weights for policy 0, policy_version 32672 (0.0007) +[2023-10-11 16:18:26,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 67371008. Throughput: 0: 1656.3, 1: 1701.5. Samples: 16849850. Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-11 16:18:26,063][84230] Avg episode reward: [(0, '7.060'), (1, '7.450')] +[2023-10-11 16:18:26,818][85175] Updated weights for policy 1, policy_version 33130 (0.0008) +[2023-10-11 16:18:27,190][85175] Updated weights for policy 1, policy_version 33140 (0.0007) +[2023-10-11 16:18:27,566][85175] Updated weights for policy 1, policy_version 33150 (0.0009) +[2023-10-11 16:18:29,157][85176] Updated weights for policy 0, policy_version 32682 (0.0008) +[2023-10-11 16:18:29,529][85176] Updated weights for policy 0, policy_version 32692 (0.0009) +[2023-10-11 16:18:29,912][85176] Updated weights for policy 0, policy_version 32702 (0.0009) +[2023-10-11 16:18:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 67436544. Throughput: 0: 1669.6, 1: 1701.8. Samples: 16870050. 
Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-11 16:18:31,063][84230] Avg episode reward: [(0, '7.450'), (1, '7.510')] +[2023-10-11 16:18:31,650][85175] Updated weights for policy 1, policy_version 33160 (0.0010) +[2023-10-11 16:18:32,025][85175] Updated weights for policy 1, policy_version 33170 (0.0011) +[2023-10-11 16:18:32,398][85175] Updated weights for policy 1, policy_version 33180 (0.0007) +[2023-10-11 16:18:33,877][85176] Updated weights for policy 0, policy_version 32712 (0.0008) +[2023-10-11 16:18:34,254][85176] Updated weights for policy 0, policy_version 32722 (0.0008) +[2023-10-11 16:18:34,621][85176] Updated weights for policy 0, policy_version 32732 (0.0007) +[2023-10-11 16:18:36,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 67502080. Throughput: 0: 1678.0, 1: 1696.7. Samples: 16880486. Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-11 16:18:36,064][84230] Avg episode reward: [(0, '8.350'), (1, '7.620')] +[2023-10-11 16:18:36,524][85175] Updated weights for policy 1, policy_version 33190 (0.0007) +[2023-10-11 16:18:36,900][85175] Updated weights for policy 1, policy_version 33200 (0.0008) +[2023-10-11 16:18:37,258][85175] Updated weights for policy 1, policy_version 33210 (0.0008) +[2023-10-11 16:18:38,760][85176] Updated weights for policy 0, policy_version 32742 (0.0010) +[2023-10-11 16:18:39,132][85176] Updated weights for policy 0, policy_version 32752 (0.0007) +[2023-10-11 16:18:39,509][85176] Updated weights for policy 0, policy_version 32762 (0.0009) +[2023-10-11 16:18:41,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 67567616. Throughput: 0: 1655.0, 1: 1705.4. Samples: 16899982. 
Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-11 16:18:41,063][84230] Avg episode reward: [(0, '8.190'), (1, '8.030')] +[2023-10-11 16:18:41,142][85175] Updated weights for policy 1, policy_version 33220 (0.0008) +[2023-10-11 16:18:41,512][85175] Updated weights for policy 1, policy_version 33230 (0.0007) +[2023-10-11 16:18:41,883][85175] Updated weights for policy 1, policy_version 33240 (0.0008) +[2023-10-11 16:18:43,808][85176] Updated weights for policy 0, policy_version 32772 (0.0009) +[2023-10-11 16:18:44,172][85176] Updated weights for policy 0, policy_version 32782 (0.0007) +[2023-10-11 16:18:44,550][85176] Updated weights for policy 0, policy_version 32792 (0.0007) +[2023-10-11 16:18:45,900][85175] Updated weights for policy 1, policy_version 33250 (0.0009) +[2023-10-11 16:18:46,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 67633152. Throughput: 0: 1671.7, 1: 1702.7. Samples: 16920600. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:18:46,064][84230] Avg episode reward: [(0, '8.040'), (1, '7.640')] +[2023-10-11 16:18:46,265][85175] Updated weights for policy 1, policy_version 33260 (0.0009) +[2023-10-11 16:18:46,633][85175] Updated weights for policy 1, policy_version 33270 (0.0008) +[2023-10-11 16:18:47,002][85175] Updated weights for policy 1, policy_version 33280 (0.0009) +[2023-10-11 16:18:48,549][85176] Updated weights for policy 0, policy_version 32802 (0.0008) +[2023-10-11 16:18:48,923][85176] Updated weights for policy 0, policy_version 32812 (0.0010) +[2023-10-11 16:18:49,295][85176] Updated weights for policy 0, policy_version 32822 (0.0007) +[2023-10-11 16:18:49,663][85176] Updated weights for policy 0, policy_version 32832 (0.0007) +[2023-10-11 16:18:51,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 67698688. Throughput: 0: 1674.9, 1: 1701.8. Samples: 16930864. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:18:51,064][84230] Avg episode reward: [(0, '7.850'), (1, '7.420')] +[2023-10-11 16:18:51,116][85175] Updated weights for policy 1, policy_version 33290 (0.0008) +[2023-10-11 16:18:51,482][85175] Updated weights for policy 1, policy_version 33300 (0.0007) +[2023-10-11 16:18:51,862][85175] Updated weights for policy 1, policy_version 33310 (0.0007) +[2023-10-11 16:18:53,706][85176] Updated weights for policy 0, policy_version 32842 (0.0008) +[2023-10-11 16:18:54,079][85176] Updated weights for policy 0, policy_version 32852 (0.0008) +[2023-10-11 16:18:54,455][85176] Updated weights for policy 0, policy_version 32862 (0.0008) +[2023-10-11 16:18:55,712][85175] Updated weights for policy 1, policy_version 33320 (0.0007) +[2023-10-11 16:18:56,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 67764224. Throughput: 0: 1653.8, 1: 1700.6. Samples: 16950368. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:18:56,063][84230] Avg episode reward: [(0, '7.600'), (1, '7.060')] +[2023-10-11 16:18:56,075][85175] Updated weights for policy 1, policy_version 33330 (0.0007) +[2023-10-11 16:18:56,448][85175] Updated weights for policy 1, policy_version 33340 (0.0007) +[2023-10-11 16:18:58,878][85176] Updated weights for policy 0, policy_version 32872 (0.0010) +[2023-10-11 16:18:59,244][85176] Updated weights for policy 0, policy_version 32882 (0.0011) +[2023-10-11 16:18:59,624][85176] Updated weights for policy 0, policy_version 32892 (0.0010) +[2023-10-11 16:19:00,295][85175] Updated weights for policy 1, policy_version 33350 (0.0009) +[2023-10-11 16:19:00,667][85175] Updated weights for policy 1, policy_version 33360 (0.0010) +[2023-10-11 16:19:01,029][85175] Updated weights for policy 1, policy_version 33370 (0.0009) +[2023-10-11 16:19:01,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 67829760. 
Throughput: 0: 1673.4, 1: 1695.6. Samples: 16970514. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:19:01,064][84230] Avg episode reward: [(0, '7.200'), (1, '7.030')] +[2023-10-11 16:19:03,822][85176] Updated weights for policy 0, policy_version 32902 (0.0008) +[2023-10-11 16:19:04,200][85176] Updated weights for policy 0, policy_version 32912 (0.0009) +[2023-10-11 16:19:04,572][85176] Updated weights for policy 0, policy_version 32922 (0.0008) +[2023-10-11 16:19:05,251][85175] Updated weights for policy 1, policy_version 33380 (0.0008) +[2023-10-11 16:19:05,624][85175] Updated weights for policy 1, policy_version 33390 (0.0008) +[2023-10-11 16:19:05,982][85175] Updated weights for policy 1, policy_version 33400 (0.0008) +[2023-10-11 16:19:06,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 67895296. Throughput: 0: 1672.2, 1: 1701.9. Samples: 16981190. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:19:06,063][84230] Avg episode reward: [(0, '7.410'), (1, '7.350')] +[2023-10-11 16:19:08,639][85176] Updated weights for policy 0, policy_version 32932 (0.0009) +[2023-10-11 16:19:09,017][85176] Updated weights for policy 0, policy_version 32942 (0.0010) +[2023-10-11 16:19:09,394][85176] Updated weights for policy 0, policy_version 32952 (0.0008) +[2023-10-11 16:19:10,036][85175] Updated weights for policy 1, policy_version 33410 (0.0008) +[2023-10-11 16:19:10,399][85175] Updated weights for policy 1, policy_version 33420 (0.0009) +[2023-10-11 16:19:10,757][85175] Updated weights for policy 1, policy_version 33430 (0.0007) +[2023-10-11 16:19:11,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 67960832. Throughput: 0: 1651.8, 1: 1700.2. Samples: 17000692. 
Policy #0 lag: (min: 6.0, avg: 8.1, max: 38.0) +[2023-10-11 16:19:11,064][84230] Avg episode reward: [(0, '7.710'), (1, '7.450')] +[2023-10-11 16:19:11,130][85175] Updated weights for policy 1, policy_version 33440 (0.0008) +[2023-10-11 16:19:13,565][85176] Updated weights for policy 0, policy_version 32962 (0.0007) +[2023-10-11 16:19:13,936][85176] Updated weights for policy 0, policy_version 32972 (0.0009) +[2023-10-11 16:19:14,302][85176] Updated weights for policy 0, policy_version 32982 (0.0011) +[2023-10-11 16:19:14,678][85176] Updated weights for policy 0, policy_version 32992 (0.0010) +[2023-10-11 16:19:15,047][85175] Updated weights for policy 1, policy_version 33450 (0.0008) +[2023-10-11 16:19:15,409][85175] Updated weights for policy 1, policy_version 33460 (0.0007) +[2023-10-11 16:19:15,783][85175] Updated weights for policy 1, policy_version 33470 (0.0007) +[2023-10-11 16:19:16,062][84230] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 68059136. Throughput: 0: 1662.7, 1: 1681.8. Samples: 17020554. Policy #0 lag: (min: 6.0, avg: 8.1, max: 38.0) +[2023-10-11 16:19:16,063][84230] Avg episode reward: [(0, '8.190'), (1, '7.870')] +[2023-10-11 16:19:18,621][85176] Updated weights for policy 0, policy_version 33002 (0.0009) +[2023-10-11 16:19:18,994][85176] Updated weights for policy 0, policy_version 33012 (0.0007) +[2023-10-11 16:19:19,363][85176] Updated weights for policy 0, policy_version 33022 (0.0007) +[2023-10-11 16:19:19,917][85175] Updated weights for policy 1, policy_version 33480 (0.0010) +[2023-10-11 16:19:20,290][85175] Updated weights for policy 1, policy_version 33490 (0.0010) +[2023-10-11 16:19:20,660][85175] Updated weights for policy 1, policy_version 33500 (0.0011) +[2023-10-11 16:19:21,062][84230] Fps is (10 sec: 16384.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 68124672. Throughput: 0: 1654.9, 1: 1697.3. Samples: 17031334. 
Policy #0 lag: (min: 6.0, avg: 8.1, max: 38.0) +[2023-10-11 16:19:21,063][84230] Avg episode reward: [(0, '7.700'), (1, '7.960')] +[2023-10-11 16:19:23,419][85176] Updated weights for policy 0, policy_version 33032 (0.0008) +[2023-10-11 16:19:23,785][85176] Updated weights for policy 0, policy_version 33042 (0.0009) +[2023-10-11 16:19:24,159][85176] Updated weights for policy 0, policy_version 33052 (0.0011) +[2023-10-11 16:19:24,705][85175] Updated weights for policy 1, policy_version 33510 (0.0009) +[2023-10-11 16:19:25,075][85175] Updated weights for policy 1, policy_version 33520 (0.0010) +[2023-10-11 16:19:25,441][85175] Updated weights for policy 1, policy_version 33530 (0.0010) +[2023-10-11 16:19:26,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 68190208. Throughput: 0: 1657.0, 1: 1700.0. Samples: 17051044. Policy #0 lag: (min: 6.0, avg: 8.1, max: 38.0) +[2023-10-11 16:19:26,063][84230] Avg episode reward: [(0, '7.590'), (1, '7.610')] +[2023-10-11 16:19:28,338][85176] Updated weights for policy 0, policy_version 33062 (0.0009) +[2023-10-11 16:19:28,703][85176] Updated weights for policy 0, policy_version 33072 (0.0010) +[2023-10-11 16:19:29,077][85176] Updated weights for policy 0, policy_version 33082 (0.0007) +[2023-10-11 16:19:29,660][85175] Updated weights for policy 1, policy_version 33540 (0.0009) +[2023-10-11 16:19:30,018][85175] Updated weights for policy 1, policy_version 33550 (0.0007) +[2023-10-11 16:19:30,392][85175] Updated weights for policy 1, policy_version 33560 (0.0008) +[2023-10-11 16:19:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 68255744. Throughput: 0: 1662.5, 1: 1672.4. Samples: 17070672. 
Policy #0 lag: (min: 6.0, avg: 8.1, max: 38.0) +[2023-10-11 16:19:31,063][84230] Avg episode reward: [(0, '7.670'), (1, '7.360')] +[2023-10-11 16:19:33,108][85176] Updated weights for policy 0, policy_version 33092 (0.0007) +[2023-10-11 16:19:33,489][85176] Updated weights for policy 0, policy_version 33102 (0.0010) +[2023-10-11 16:19:33,864][85176] Updated weights for policy 0, policy_version 33112 (0.0010) +[2023-10-11 16:19:34,397][85175] Updated weights for policy 1, policy_version 33570 (0.0008) +[2023-10-11 16:19:34,760][85175] Updated weights for policy 1, policy_version 33580 (0.0008) +[2023-10-11 16:19:35,131][85175] Updated weights for policy 1, policy_version 33590 (0.0008) +[2023-10-11 16:19:35,496][85175] Updated weights for policy 1, policy_version 33600 (0.0008) +[2023-10-11 16:19:36,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 68321280. Throughput: 0: 1655.1, 1: 1697.9. Samples: 17081748. Policy #0 lag: (min: 1.0, avg: 10.1, max: 33.0) +[2023-10-11 16:19:36,063][84230] Avg episode reward: [(0, '8.050'), (1, '7.540')] +[2023-10-11 16:19:37,837][85176] Updated weights for policy 0, policy_version 33122 (0.0010) +[2023-10-11 16:19:38,210][85176] Updated weights for policy 0, policy_version 33132 (0.0008) +[2023-10-11 16:19:38,591][85176] Updated weights for policy 0, policy_version 33142 (0.0008) +[2023-10-11 16:19:38,954][85176] Updated weights for policy 0, policy_version 33152 (0.0008) +[2023-10-11 16:19:39,481][85175] Updated weights for policy 1, policy_version 33610 (0.0010) +[2023-10-11 16:19:39,855][85175] Updated weights for policy 1, policy_version 33620 (0.0009) +[2023-10-11 16:19:40,221][85175] Updated weights for policy 1, policy_version 33630 (0.0009) +[2023-10-11 16:19:41,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 68386816. Throughput: 0: 1666.9, 1: 1692.0. Samples: 17101522. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 33.0) +[2023-10-11 16:19:41,063][84230] Avg episode reward: [(0, '7.450'), (1, '7.640')] +[2023-10-11 16:19:42,973][85176] Updated weights for policy 0, policy_version 33162 (0.0008) +[2023-10-11 16:19:43,347][85176] Updated weights for policy 0, policy_version 33172 (0.0011) +[2023-10-11 16:19:43,729][85176] Updated weights for policy 0, policy_version 33182 (0.0008) +[2023-10-11 16:19:44,160][85175] Updated weights for policy 1, policy_version 33640 (0.0009) +[2023-10-11 16:19:44,525][85175] Updated weights for policy 1, policy_version 33650 (0.0008) +[2023-10-11 16:19:44,895][85175] Updated weights for policy 1, policy_version 33660 (0.0008) +[2023-10-11 16:19:46,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 68452352. Throughput: 0: 1673.1, 1: 1683.0. Samples: 17121538. Policy #0 lag: (min: 1.0, avg: 10.1, max: 33.0) +[2023-10-11 16:19:46,064][84230] Avg episode reward: [(0, '7.670'), (1, '7.860')] +[2023-10-11 16:19:47,755][85176] Updated weights for policy 0, policy_version 33192 (0.0009) +[2023-10-11 16:19:48,124][85176] Updated weights for policy 0, policy_version 33202 (0.0010) +[2023-10-11 16:19:48,498][85176] Updated weights for policy 0, policy_version 33212 (0.0007) +[2023-10-11 16:19:48,789][85175] Updated weights for policy 1, policy_version 33670 (0.0009) +[2023-10-11 16:19:49,172][85175] Updated weights for policy 1, policy_version 33680 (0.0010) +[2023-10-11 16:19:49,539][85175] Updated weights for policy 1, policy_version 33690 (0.0009) +[2023-10-11 16:19:51,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 68517888. Throughput: 0: 1649.0, 1: 1705.3. Samples: 17132134. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 33.0) +[2023-10-11 16:19:51,063][84230] Avg episode reward: [(0, '7.560'), (1, '7.790')] +[2023-10-11 16:19:52,804][85176] Updated weights for policy 0, policy_version 33222 (0.0009) +[2023-10-11 16:19:53,177][85176] Updated weights for policy 0, policy_version 33232 (0.0007) +[2023-10-11 16:19:53,548][85176] Updated weights for policy 0, policy_version 33242 (0.0007) +[2023-10-11 16:19:53,575][85175] Updated weights for policy 1, policy_version 33700 (0.0009) +[2023-10-11 16:19:53,953][85175] Updated weights for policy 1, policy_version 33710 (0.0010) +[2023-10-11 16:19:54,316][85175] Updated weights for policy 1, policy_version 33720 (0.0007) +[2023-10-11 16:19:56,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 68583424. Throughput: 0: 1666.0, 1: 1678.5. Samples: 17151196. Policy #0 lag: (min: 1.0, avg: 10.1, max: 33.0) +[2023-10-11 16:19:56,064][84230] Avg episode reward: [(0, '7.870'), (1, '7.570')] +[2023-10-11 16:19:57,585][85176] Updated weights for policy 0, policy_version 33252 (0.0008) +[2023-10-11 16:19:57,963][85176] Updated weights for policy 0, policy_version 33262 (0.0010) +[2023-10-11 16:19:58,342][85176] Updated weights for policy 0, policy_version 33272 (0.0008) +[2023-10-11 16:19:58,413][85175] Updated weights for policy 1, policy_version 33730 (0.0008) +[2023-10-11 16:19:58,779][85175] Updated weights for policy 1, policy_version 33740 (0.0009) +[2023-10-11 16:19:59,150][85175] Updated weights for policy 1, policy_version 33750 (0.0009) +[2023-10-11 16:19:59,518][85175] Updated weights for policy 1, policy_version 33760 (0.0008) +[2023-10-11 16:20:01,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 68648960. Throughput: 0: 1671.2, 1: 1690.3. Samples: 17171824. 
Policy #0 lag: (min: 31.0, avg: 32.4, max: 57.0) +[2023-10-11 16:20:01,063][84230] Avg episode reward: [(0, '7.820'), (1, '7.350')] +[2023-10-11 16:20:02,486][85176] Updated weights for policy 0, policy_version 33282 (0.0008) +[2023-10-11 16:20:02,857][85176] Updated weights for policy 0, policy_version 33292 (0.0011) +[2023-10-11 16:20:03,229][85176] Updated weights for policy 0, policy_version 33302 (0.0008) +[2023-10-11 16:20:03,595][85175] Updated weights for policy 1, policy_version 33770 (0.0009) +[2023-10-11 16:20:03,603][85176] Updated weights for policy 0, policy_version 33312 (0.0007) +[2023-10-11 16:20:03,964][85175] Updated weights for policy 1, policy_version 33780 (0.0009) +[2023-10-11 16:20:04,332][85175] Updated weights for policy 1, policy_version 33790 (0.0011) +[2023-10-11 16:20:06,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 68714496. Throughput: 0: 1654.1, 1: 1695.6. Samples: 17182070. Policy #0 lag: (min: 31.0, avg: 32.4, max: 57.0) +[2023-10-11 16:20:06,063][84230] Avg episode reward: [(0, '8.000'), (1, '7.840')] +[2023-10-11 16:20:07,733][85176] Updated weights for policy 0, policy_version 33322 (0.0009) +[2023-10-11 16:20:08,106][85176] Updated weights for policy 0, policy_version 33332 (0.0008) +[2023-10-11 16:20:08,446][85175] Updated weights for policy 1, policy_version 33800 (0.0008) +[2023-10-11 16:20:08,487][85176] Updated weights for policy 0, policy_version 33342 (0.0007) +[2023-10-11 16:20:08,821][85175] Updated weights for policy 1, policy_version 33810 (0.0007) +[2023-10-11 16:20:09,191][85175] Updated weights for policy 1, policy_version 33820 (0.0008) +[2023-10-11 16:20:11,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 68780032. Throughput: 0: 1678.3, 1: 1672.7. Samples: 17201836. 
Policy #0 lag: (min: 31.0, avg: 32.4, max: 57.0) +[2023-10-11 16:20:11,063][84230] Avg episode reward: [(0, '7.750'), (1, '7.930')] +[2023-10-11 16:20:12,637][85176] Updated weights for policy 0, policy_version 33352 (0.0009) +[2023-10-11 16:20:13,017][85176] Updated weights for policy 0, policy_version 33362 (0.0009) +[2023-10-11 16:20:13,166][85175] Updated weights for policy 1, policy_version 33830 (0.0008) +[2023-10-11 16:20:13,386][85176] Updated weights for policy 0, policy_version 33372 (0.0009) +[2023-10-11 16:20:13,545][85175] Updated weights for policy 1, policy_version 33840 (0.0008) +[2023-10-11 16:20:13,905][85175] Updated weights for policy 1, policy_version 33850 (0.0007) +[2023-10-11 16:20:16,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 68845568. Throughput: 0: 1676.0, 1: 1703.3. Samples: 17222740. Policy #0 lag: (min: 31.0, avg: 32.4, max: 57.0) +[2023-10-11 16:20:16,063][84230] Avg episode reward: [(0, '7.600'), (1, '7.870')] +[2023-10-11 16:20:16,072][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000033856_34668544.pth... +[2023-10-11 16:20:16,072][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000033376_34177024.pth... 
+[2023-10-11 16:20:16,104][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000031840_32604160.pth +[2023-10-11 16:20:16,111][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000032288_33062912.pth +[2023-10-11 16:20:17,372][85176] Updated weights for policy 0, policy_version 33382 (0.0008) +[2023-10-11 16:20:17,748][85176] Updated weights for policy 0, policy_version 33392 (0.0008) +[2023-10-11 16:20:18,044][85175] Updated weights for policy 1, policy_version 33860 (0.0007) +[2023-10-11 16:20:18,127][85176] Updated weights for policy 0, policy_version 33402 (0.0009) +[2023-10-11 16:20:18,399][85175] Updated weights for policy 1, policy_version 33870 (0.0007) +[2023-10-11 16:20:18,780][85175] Updated weights for policy 1, policy_version 33880 (0.0007) +[2023-10-11 16:20:21,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 68911104. Throughput: 0: 1657.5, 1: 1690.4. Samples: 17232404. Policy #0 lag: (min: 31.0, avg: 32.4, max: 57.0) +[2023-10-11 16:20:21,063][84230] Avg episode reward: [(0, '7.600'), (1, '7.220')] +[2023-10-11 16:20:22,160][85176] Updated weights for policy 0, policy_version 33412 (0.0008) +[2023-10-11 16:20:22,535][85176] Updated weights for policy 0, policy_version 33422 (0.0008) +[2023-10-11 16:20:22,804][85175] Updated weights for policy 1, policy_version 33890 (0.0008) +[2023-10-11 16:20:22,903][85176] Updated weights for policy 0, policy_version 33432 (0.0009) +[2023-10-11 16:20:23,176][85175] Updated weights for policy 1, policy_version 33900 (0.0009) +[2023-10-11 16:20:23,554][85175] Updated weights for policy 1, policy_version 33910 (0.0008) +[2023-10-11 16:20:23,921][85175] Updated weights for policy 1, policy_version 33920 (0.0008) +[2023-10-11 16:20:26,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 68976640. Throughput: 0: 1671.0, 1: 1681.9. Samples: 17252404. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:20:26,063][84230] Avg episode reward: [(0, '7.420'), (1, '7.000')] +[2023-10-11 16:20:26,807][85176] Updated weights for policy 0, policy_version 33442 (0.0008) +[2023-10-11 16:20:27,181][85176] Updated weights for policy 0, policy_version 33452 (0.0011) +[2023-10-11 16:20:27,550][85176] Updated weights for policy 0, policy_version 33462 (0.0009) +[2023-10-11 16:20:27,710][85175] Updated weights for policy 1, policy_version 33930 (0.0010) +[2023-10-11 16:20:27,926][85176] Updated weights for policy 0, policy_version 33472 (0.0007) +[2023-10-11 16:20:28,086][85175] Updated weights for policy 1, policy_version 33940 (0.0008) +[2023-10-11 16:20:28,457][85175] Updated weights for policy 1, policy_version 33950 (0.0010) +[2023-10-11 16:20:31,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 69042176. Throughput: 0: 1678.6, 1: 1698.8. Samples: 17273520. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:20:31,064][84230] Avg episode reward: [(0, '7.570'), (1, '7.480')] +[2023-10-11 16:20:32,200][85176] Updated weights for policy 0, policy_version 33482 (0.0010) +[2023-10-11 16:20:32,509][85175] Updated weights for policy 1, policy_version 33960 (0.0008) +[2023-10-11 16:20:32,569][85176] Updated weights for policy 0, policy_version 33492 (0.0008) +[2023-10-11 16:20:32,866][85175] Updated weights for policy 1, policy_version 33970 (0.0007) +[2023-10-11 16:20:32,954][85176] Updated weights for policy 0, policy_version 33502 (0.0009) +[2023-10-11 16:20:33,233][85175] Updated weights for policy 1, policy_version 33980 (0.0008) +[2023-10-11 16:20:36,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 69107712. Throughput: 0: 1670.4, 1: 1672.1. Samples: 17282548. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:20:36,063][84230] Avg episode reward: [(0, '7.600'), (1, '7.340')] +[2023-10-11 16:20:36,980][85176] Updated weights for policy 0, policy_version 33512 (0.0008) +[2023-10-11 16:20:37,273][85175] Updated weights for policy 1, policy_version 33990 (0.0008) +[2023-10-11 16:20:37,358][85176] Updated weights for policy 0, policy_version 33522 (0.0008) +[2023-10-11 16:20:37,664][85175] Updated weights for policy 1, policy_version 34000 (0.0009) +[2023-10-11 16:20:37,727][85176] Updated weights for policy 0, policy_version 33532 (0.0008) +[2023-10-11 16:20:38,029][85175] Updated weights for policy 1, policy_version 34010 (0.0007) +[2023-10-11 16:20:41,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 69173248. Throughput: 0: 1678.4, 1: 1700.7. Samples: 17303254. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:20:41,064][84230] Avg episode reward: [(0, '7.880'), (1, '7.420')] +[2023-10-11 16:20:41,819][85176] Updated weights for policy 0, policy_version 33542 (0.0010) +[2023-10-11 16:20:41,944][85175] Updated weights for policy 1, policy_version 34020 (0.0008) +[2023-10-11 16:20:42,195][85176] Updated weights for policy 0, policy_version 33552 (0.0010) +[2023-10-11 16:20:42,310][85175] Updated weights for policy 1, policy_version 34030 (0.0007) +[2023-10-11 16:20:42,569][85176] Updated weights for policy 0, policy_version 33562 (0.0010) +[2023-10-11 16:20:42,676][85175] Updated weights for policy 1, policy_version 34040 (0.0007) +[2023-10-11 16:20:46,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 69238784. Throughput: 0: 1677.2, 1: 1705.7. Samples: 17324054. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:20:46,063][84230] Avg episode reward: [(0, '8.010'), (1, '7.620')] +[2023-10-11 16:20:46,585][85176] Updated weights for policy 0, policy_version 33572 (0.0011) +[2023-10-11 16:20:46,627][85175] Updated weights for policy 1, policy_version 34050 (0.0008) +[2023-10-11 16:20:46,964][85176] Updated weights for policy 0, policy_version 33582 (0.0009) +[2023-10-11 16:20:46,993][85175] Updated weights for policy 1, policy_version 34060 (0.0007) +[2023-10-11 16:20:47,339][85176] Updated weights for policy 0, policy_version 33592 (0.0007) +[2023-10-11 16:20:47,363][85175] Updated weights for policy 1, policy_version 34070 (0.0007) +[2023-10-11 16:20:47,731][85175] Updated weights for policy 1, policy_version 34080 (0.0007) +[2023-10-11 16:20:51,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 69304320. Throughput: 0: 1670.9, 1: 1683.2. Samples: 17333008. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:20:51,063][84230] Avg episode reward: [(0, '7.750'), (1, '7.980')] +[2023-10-11 16:20:51,523][85176] Updated weights for policy 0, policy_version 33602 (0.0009) +[2023-10-11 16:20:51,885][85175] Updated weights for policy 1, policy_version 34090 (0.0007) +[2023-10-11 16:20:51,907][85176] Updated weights for policy 0, policy_version 33612 (0.0009) +[2023-10-11 16:20:52,248][85175] Updated weights for policy 1, policy_version 34100 (0.0008) +[2023-10-11 16:20:52,280][85176] Updated weights for policy 0, policy_version 33622 (0.0008) +[2023-10-11 16:20:52,613][85175] Updated weights for policy 1, policy_version 34110 (0.0008) +[2023-10-11 16:20:52,643][85176] Updated weights for policy 0, policy_version 33632 (0.0007) +[2023-10-11 16:20:56,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 69369856. Throughput: 0: 1666.0, 1: 1705.1. Samples: 17353536. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:20:56,064][84230] Avg episode reward: [(0, '7.890'), (1, '8.050')] +[2023-10-11 16:20:56,564][85175] Updated weights for policy 1, policy_version 34120 (0.0011) +[2023-10-11 16:20:56,881][85176] Updated weights for policy 0, policy_version 33642 (0.0007) +[2023-10-11 16:20:56,923][85175] Updated weights for policy 1, policy_version 34130 (0.0009) +[2023-10-11 16:20:57,252][85176] Updated weights for policy 0, policy_version 33652 (0.0009) +[2023-10-11 16:20:57,293][85175] Updated weights for policy 1, policy_version 34140 (0.0007) +[2023-10-11 16:20:57,630][85176] Updated weights for policy 0, policy_version 33662 (0.0010) +[2023-10-11 16:21:01,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 69435392. Throughput: 0: 1667.4, 1: 1704.5. Samples: 17374476. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:21:01,063][84230] Avg episode reward: [(0, '7.420'), (1, '8.090')] +[2023-10-11 16:21:01,449][85175] Updated weights for policy 1, policy_version 34150 (0.0010) +[2023-10-11 16:21:01,814][85175] Updated weights for policy 1, policy_version 34160 (0.0008) +[2023-10-11 16:21:01,878][85176] Updated weights for policy 0, policy_version 33672 (0.0007) +[2023-10-11 16:21:02,191][85175] Updated weights for policy 1, policy_version 34170 (0.0008) +[2023-10-11 16:21:02,251][85176] Updated weights for policy 0, policy_version 33682 (0.0008) +[2023-10-11 16:21:02,613][85176] Updated weights for policy 0, policy_version 33692 (0.0008) +[2023-10-11 16:21:06,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 69500928. Throughput: 0: 1667.7, 1: 1694.1. Samples: 17383686. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:21:06,064][84230] Avg episode reward: [(0, '7.520'), (1, '7.740')] +[2023-10-11 16:21:06,151][85175] Updated weights for policy 1, policy_version 34180 (0.0011) +[2023-10-11 16:21:06,529][85175] Updated weights for policy 1, policy_version 34190 (0.0008) +[2023-10-11 16:21:06,576][85176] Updated weights for policy 0, policy_version 33702 (0.0009) +[2023-10-11 16:21:06,884][85175] Updated weights for policy 1, policy_version 34200 (0.0007) +[2023-10-11 16:21:06,939][85176] Updated weights for policy 0, policy_version 33712 (0.0008) +[2023-10-11 16:21:07,316][85176] Updated weights for policy 0, policy_version 33722 (0.0007) +[2023-10-11 16:21:10,798][85175] Updated weights for policy 1, policy_version 34210 (0.0008) +[2023-10-11 16:21:11,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 69566464. Throughput: 0: 1664.6, 1: 1709.8. Samples: 17404250. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:21:11,063][84230] Avg episode reward: [(0, '7.370'), (1, '7.280')] +[2023-10-11 16:21:11,165][85175] Updated weights for policy 1, policy_version 34220 (0.0007) +[2023-10-11 16:21:11,485][85176] Updated weights for policy 0, policy_version 33732 (0.0009) +[2023-10-11 16:21:11,538][85175] Updated weights for policy 1, policy_version 34230 (0.0007) +[2023-10-11 16:21:11,859][85176] Updated weights for policy 0, policy_version 33742 (0.0009) +[2023-10-11 16:21:11,904][85175] Updated weights for policy 1, policy_version 34240 (0.0007) +[2023-10-11 16:21:12,230][85176] Updated weights for policy 0, policy_version 33752 (0.0008) +[2023-10-11 16:21:16,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 69632000. Throughput: 0: 1659.7, 1: 1705.6. Samples: 17424962. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:21:16,064][84230] Avg episode reward: [(0, '7.840'), (1, '7.250')] +[2023-10-11 16:21:16,081][85175] Updated weights for policy 1, policy_version 34250 (0.0010) +[2023-10-11 16:21:16,206][85176] Updated weights for policy 0, policy_version 33762 (0.0008) +[2023-10-11 16:21:16,445][85175] Updated weights for policy 1, policy_version 34260 (0.0007) +[2023-10-11 16:21:16,580][85176] Updated weights for policy 0, policy_version 33772 (0.0007) +[2023-10-11 16:21:16,813][85175] Updated weights for policy 1, policy_version 34270 (0.0009) +[2023-10-11 16:21:16,963][85176] Updated weights for policy 0, policy_version 33782 (0.0009) +[2023-10-11 16:21:17,333][85176] Updated weights for policy 0, policy_version 33792 (0.0008) +[2023-10-11 16:21:20,857][85175] Updated weights for policy 1, policy_version 34280 (0.0008) +[2023-10-11 16:21:21,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 69697536. Throughput: 0: 1664.3, 1: 1702.2. Samples: 17434038. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:21:21,064][84230] Avg episode reward: [(0, '8.120'), (1, '7.680')] +[2023-10-11 16:21:21,221][85175] Updated weights for policy 1, policy_version 34290 (0.0008) +[2023-10-11 16:21:21,465][85176] Updated weights for policy 0, policy_version 33802 (0.0009) +[2023-10-11 16:21:21,588][85175] Updated weights for policy 1, policy_version 34300 (0.0008) +[2023-10-11 16:21:21,835][85176] Updated weights for policy 0, policy_version 33812 (0.0008) +[2023-10-11 16:21:22,207][85176] Updated weights for policy 0, policy_version 33822 (0.0009) +[2023-10-11 16:21:25,664][85175] Updated weights for policy 1, policy_version 34310 (0.0008) +[2023-10-11 16:21:26,054][85175] Updated weights for policy 1, policy_version 34320 (0.0009) +[2023-10-11 16:21:26,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 69763072. 
Throughput: 0: 1657.7, 1: 1700.7. Samples: 17454380. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:21:26,063][84230] Avg episode reward: [(0, '8.010'), (1, '7.940')] +[2023-10-11 16:21:26,239][85176] Updated weights for policy 0, policy_version 33832 (0.0008) +[2023-10-11 16:21:26,420][85175] Updated weights for policy 1, policy_version 34330 (0.0008) +[2023-10-11 16:21:26,611][85176] Updated weights for policy 0, policy_version 33842 (0.0008) +[2023-10-11 16:21:26,984][85176] Updated weights for policy 0, policy_version 33852 (0.0007) +[2023-10-11 16:21:30,440][85175] Updated weights for policy 1, policy_version 34340 (0.0009) +[2023-10-11 16:21:30,807][85175] Updated weights for policy 1, policy_version 34350 (0.0010) +[2023-10-11 16:21:31,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 69828608. Throughput: 0: 1658.8, 1: 1688.2. Samples: 17474670. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:21:31,063][84230] Avg episode reward: [(0, '7.230'), (1, '7.560')] +[2023-10-11 16:21:31,176][85175] Updated weights for policy 1, policy_version 34360 (0.0008) +[2023-10-11 16:21:31,339][85176] Updated weights for policy 0, policy_version 33862 (0.0007) +[2023-10-11 16:21:31,709][85176] Updated weights for policy 0, policy_version 33872 (0.0010) +[2023-10-11 16:21:32,074][85176] Updated weights for policy 0, policy_version 33882 (0.0010) +[2023-10-11 16:21:35,230][85175] Updated weights for policy 1, policy_version 34370 (0.0009) +[2023-10-11 16:21:35,604][85175] Updated weights for policy 1, policy_version 34380 (0.0009) +[2023-10-11 16:21:35,975][85175] Updated weights for policy 1, policy_version 34390 (0.0008) +[2023-10-11 16:21:36,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 69894144. Throughput: 0: 1662.6, 1: 1692.0. Samples: 17483966. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:21:36,063][84230] Avg episode reward: [(0, '7.420'), (1, '7.430')] +[2023-10-11 16:21:36,155][85176] Updated weights for policy 0, policy_version 33892 (0.0009) +[2023-10-11 16:21:36,337][85175] Updated weights for policy 1, policy_version 34400 (0.0008) +[2023-10-11 16:21:36,525][85176] Updated weights for policy 0, policy_version 33902 (0.0007) +[2023-10-11 16:21:36,914][85176] Updated weights for policy 0, policy_version 33912 (0.0008) +[2023-10-11 16:21:40,478][85175] Updated weights for policy 1, policy_version 34410 (0.0008) +[2023-10-11 16:21:40,783][85176] Updated weights for policy 0, policy_version 33922 (0.0008) +[2023-10-11 16:21:40,849][85175] Updated weights for policy 1, policy_version 34420 (0.0008) +[2023-10-11 16:21:41,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 69959680. Throughput: 0: 1668.0, 1: 1688.3. Samples: 17504566. Policy #0 lag: (min: 29.0, avg: 35.4, max: 61.0) +[2023-10-11 16:21:41,063][84230] Avg episode reward: [(0, '7.600'), (1, '7.550')] +[2023-10-11 16:21:41,164][85176] Updated weights for policy 0, policy_version 33932 (0.0008) +[2023-10-11 16:21:41,220][85175] Updated weights for policy 1, policy_version 34430 (0.0008) +[2023-10-11 16:21:41,536][85176] Updated weights for policy 0, policy_version 33942 (0.0009) +[2023-10-11 16:21:41,905][85176] Updated weights for policy 0, policy_version 33952 (0.0008) +[2023-10-11 16:21:45,217][85175] Updated weights for policy 1, policy_version 34440 (0.0009) +[2023-10-11 16:21:45,590][85175] Updated weights for policy 1, policy_version 34450 (0.0007) +[2023-10-11 16:21:45,960][85175] Updated weights for policy 1, policy_version 34460 (0.0007) +[2023-10-11 16:21:46,063][84230] Fps is (10 sec: 13106.6, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 70025216. Throughput: 0: 1664.3, 1: 1668.8. Samples: 17524468. 
Policy #0 lag: (min: 29.0, avg: 35.4, max: 61.0) +[2023-10-11 16:21:46,064][84230] Avg episode reward: [(0, '7.750'), (1, '7.290')] +[2023-10-11 16:21:46,069][85176] Updated weights for policy 0, policy_version 33962 (0.0007) +[2023-10-11 16:21:46,445][85176] Updated weights for policy 0, policy_version 33972 (0.0007) +[2023-10-11 16:21:46,820][85176] Updated weights for policy 0, policy_version 33982 (0.0007) +[2023-10-11 16:21:49,994][85175] Updated weights for policy 1, policy_version 34470 (0.0009) +[2023-10-11 16:21:50,364][85175] Updated weights for policy 1, policy_version 34480 (0.0010) +[2023-10-11 16:21:50,743][85175] Updated weights for policy 1, policy_version 34490 (0.0008) +[2023-10-11 16:21:51,062][84230] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 70123520. Throughput: 0: 1661.0, 1: 1684.3. Samples: 17534222. Policy #0 lag: (min: 29.0, avg: 35.4, max: 61.0) +[2023-10-11 16:21:51,063][84230] Avg episode reward: [(0, '8.050'), (1, '7.480')] +[2023-10-11 16:21:51,150][85176] Updated weights for policy 0, policy_version 33992 (0.0010) +[2023-10-11 16:21:51,522][85176] Updated weights for policy 0, policy_version 34002 (0.0010) +[2023-10-11 16:21:51,900][85176] Updated weights for policy 0, policy_version 34012 (0.0011) +[2023-10-11 16:21:54,945][85175] Updated weights for policy 1, policy_version 34500 (0.0009) +[2023-10-11 16:21:55,309][85175] Updated weights for policy 1, policy_version 34510 (0.0011) +[2023-10-11 16:21:55,674][85175] Updated weights for policy 1, policy_version 34520 (0.0009) +[2023-10-11 16:21:55,911][85176] Updated weights for policy 0, policy_version 34022 (0.0009) +[2023-10-11 16:21:56,062][84230] Fps is (10 sec: 16384.8, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 70189056. Throughput: 0: 1665.7, 1: 1681.2. Samples: 17554860. 
Policy #0 lag: (min: 29.0, avg: 35.4, max: 61.0) +[2023-10-11 16:21:56,063][84230] Avg episode reward: [(0, '7.670'), (1, '7.870')] +[2023-10-11 16:21:56,288][85176] Updated weights for policy 0, policy_version 34032 (0.0010) +[2023-10-11 16:21:56,661][85176] Updated weights for policy 0, policy_version 34042 (0.0008) +[2023-10-11 16:21:59,663][85175] Updated weights for policy 1, policy_version 34530 (0.0010) +[2023-10-11 16:22:00,027][85175] Updated weights for policy 1, policy_version 34540 (0.0010) +[2023-10-11 16:22:00,397][85175] Updated weights for policy 1, policy_version 34550 (0.0008) +[2023-10-11 16:22:00,767][85175] Updated weights for policy 1, policy_version 34560 (0.0007) +[2023-10-11 16:22:00,797][85176] Updated weights for policy 0, policy_version 34052 (0.0010) +[2023-10-11 16:22:01,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 70254592. Throughput: 0: 1662.8, 1: 1657.4. Samples: 17574368. Policy #0 lag: (min: 29.0, avg: 35.4, max: 61.0) +[2023-10-11 16:22:01,064][84230] Avg episode reward: [(0, '7.960'), (1, '7.900')] +[2023-10-11 16:22:01,179][85176] Updated weights for policy 0, policy_version 34062 (0.0010) +[2023-10-11 16:22:01,562][85176] Updated weights for policy 0, policy_version 34072 (0.0007) +[2023-10-11 16:22:04,921][85175] Updated weights for policy 1, policy_version 34570 (0.0009) +[2023-10-11 16:22:05,293][85175] Updated weights for policy 1, policy_version 34580 (0.0007) +[2023-10-11 16:22:05,666][85176] Updated weights for policy 0, policy_version 34082 (0.0007) +[2023-10-11 16:22:05,667][85175] Updated weights for policy 1, policy_version 34590 (0.0009) +[2023-10-11 16:22:06,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 70320128. Throughput: 0: 1666.0, 1: 1680.3. Samples: 17584622. 
Policy #0 lag: (min: 4.0, avg: 12.0, max: 36.0) +[2023-10-11 16:22:06,063][84230] Avg episode reward: [(0, '7.890'), (1, '7.840')] +[2023-10-11 16:22:06,067][85176] Updated weights for policy 0, policy_version 34092 (0.0008) +[2023-10-11 16:22:06,433][85176] Updated weights for policy 0, policy_version 34102 (0.0011) +[2023-10-11 16:22:06,814][85176] Updated weights for policy 0, policy_version 34112 (0.0008) +[2023-10-11 16:22:09,754][85175] Updated weights for policy 1, policy_version 34600 (0.0007) +[2023-10-11 16:22:10,116][85175] Updated weights for policy 1, policy_version 34610 (0.0007) +[2023-10-11 16:22:10,487][85175] Updated weights for policy 1, policy_version 34620 (0.0007) +[2023-10-11 16:22:10,878][85176] Updated weights for policy 0, policy_version 34122 (0.0008) +[2023-10-11 16:22:11,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 70385664. Throughput: 0: 1672.2, 1: 1680.1. Samples: 17605234. Policy #0 lag: (min: 4.0, avg: 12.0, max: 36.0) +[2023-10-11 16:22:11,063][84230] Avg episode reward: [(0, '7.750'), (1, '7.680')] +[2023-10-11 16:22:11,256][85176] Updated weights for policy 0, policy_version 34132 (0.0007) +[2023-10-11 16:22:11,623][85176] Updated weights for policy 0, policy_version 34142 (0.0007) +[2023-10-11 16:22:14,609][85175] Updated weights for policy 1, policy_version 34630 (0.0007) +[2023-10-11 16:22:14,996][85175] Updated weights for policy 1, policy_version 34640 (0.0009) +[2023-10-11 16:22:15,350][85175] Updated weights for policy 1, policy_version 34650 (0.0009) +[2023-10-11 16:22:15,588][85176] Updated weights for policy 0, policy_version 34152 (0.0007) +[2023-10-11 16:22:15,962][85176] Updated weights for policy 0, policy_version 34162 (0.0007) +[2023-10-11 16:22:16,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 70451200. Throughput: 0: 1664.3, 1: 1662.2. Samples: 17624362. 
Policy #0 lag: (min: 4.0, avg: 12.0, max: 36.0) +[2023-10-11 16:22:16,063][84230] Avg episode reward: [(0, '7.600'), (1, '7.540')] +[2023-10-11 16:22:16,075][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000034656_35487744.pth... +[2023-10-11 16:22:16,111][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000033056_33849344.pth +[2023-10-11 16:22:16,339][85176] Updated weights for policy 0, policy_version 34172 (0.0007) +[2023-10-11 16:22:16,480][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000034176_34996224.pth... +[2023-10-11 16:22:16,519][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000032608_33390592.pth +[2023-10-11 16:22:19,354][85175] Updated weights for policy 1, policy_version 34660 (0.0008) +[2023-10-11 16:22:19,726][85175] Updated weights for policy 1, policy_version 34670 (0.0008) +[2023-10-11 16:22:20,085][85175] Updated weights for policy 1, policy_version 34680 (0.0007) +[2023-10-11 16:22:20,191][85176] Updated weights for policy 0, policy_version 34182 (0.0008) +[2023-10-11 16:22:20,564][85176] Updated weights for policy 0, policy_version 34192 (0.0008) +[2023-10-11 16:22:20,936][85176] Updated weights for policy 0, policy_version 34202 (0.0009) +[2023-10-11 16:22:21,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 70516736. Throughput: 0: 1672.7, 1: 1689.1. Samples: 17635244. 
Policy #0 lag: (min: 4.0, avg: 12.0, max: 36.0) +[2023-10-11 16:22:21,064][84230] Avg episode reward: [(0, '7.560'), (1, '7.220')] +[2023-10-11 16:22:24,108][85175] Updated weights for policy 1, policy_version 34690 (0.0008) +[2023-10-11 16:22:24,482][85175] Updated weights for policy 1, policy_version 34700 (0.0007) +[2023-10-11 16:22:24,856][85175] Updated weights for policy 1, policy_version 34710 (0.0008) +[2023-10-11 16:22:25,153][85176] Updated weights for policy 0, policy_version 34212 (0.0009) +[2023-10-11 16:22:25,222][85175] Updated weights for policy 1, policy_version 34720 (0.0008) +[2023-10-11 16:22:25,519][85176] Updated weights for policy 0, policy_version 34222 (0.0009) +[2023-10-11 16:22:25,908][85176] Updated weights for policy 0, policy_version 34232 (0.0008) +[2023-10-11 16:22:26,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 70582272. Throughput: 0: 1669.0, 1: 1681.0. Samples: 17655318. Policy #0 lag: (min: 4.0, avg: 12.0, max: 36.0) +[2023-10-11 16:22:26,063][84230] Avg episode reward: [(0, '7.560'), (1, '6.870')] +[2023-10-11 16:22:29,191][85175] Updated weights for policy 1, policy_version 34730 (0.0007) +[2023-10-11 16:22:29,555][85175] Updated weights for policy 1, policy_version 34740 (0.0010) +[2023-10-11 16:22:29,925][85175] Updated weights for policy 1, policy_version 34750 (0.0010) +[2023-10-11 16:22:29,978][85176] Updated weights for policy 0, policy_version 34242 (0.0008) +[2023-10-11 16:22:30,357][85176] Updated weights for policy 0, policy_version 34252 (0.0008) +[2023-10-11 16:22:30,724][85176] Updated weights for policy 0, policy_version 34262 (0.0007) +[2023-10-11 16:22:31,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 70647808. Throughput: 0: 1653.9, 1: 1679.1. Samples: 17674450. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 34.0) +[2023-10-11 16:22:31,063][84230] Avg episode reward: [(0, '8.200'), (1, '7.270')] +[2023-10-11 16:22:31,099][85176] Updated weights for policy 0, policy_version 34272 (0.0008) +[2023-10-11 16:22:33,995][85175] Updated weights for policy 1, policy_version 34760 (0.0008) +[2023-10-11 16:22:34,354][85175] Updated weights for policy 1, policy_version 34770 (0.0010) +[2023-10-11 16:22:34,723][85175] Updated weights for policy 1, policy_version 34780 (0.0011) +[2023-10-11 16:22:35,361][85176] Updated weights for policy 0, policy_version 34282 (0.0008) +[2023-10-11 16:22:35,743][85176] Updated weights for policy 0, policy_version 34292 (0.0009) +[2023-10-11 16:22:36,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 70713344. Throughput: 0: 1671.3, 1: 1693.2. Samples: 17685626. Policy #0 lag: (min: 31.0, avg: 31.0, max: 34.0) +[2023-10-11 16:22:36,063][84230] Avg episode reward: [(0, '7.890'), (1, '7.510')] +[2023-10-11 16:22:36,115][85176] Updated weights for policy 0, policy_version 34302 (0.0009) +[2023-10-11 16:22:38,754][85175] Updated weights for policy 1, policy_version 34790 (0.0010) +[2023-10-11 16:22:39,121][85175] Updated weights for policy 1, policy_version 34800 (0.0007) +[2023-10-11 16:22:39,478][85175] Updated weights for policy 1, policy_version 34810 (0.0008) +[2023-10-11 16:22:40,308][85176] Updated weights for policy 0, policy_version 34312 (0.0008) +[2023-10-11 16:22:40,688][85176] Updated weights for policy 0, policy_version 34322 (0.0007) +[2023-10-11 16:22:41,054][85176] Updated weights for policy 0, policy_version 34332 (0.0007) +[2023-10-11 16:22:41,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13329.3). Total num frames: 70778880. Throughput: 0: 1669.3, 1: 1671.6. Samples: 17705202. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 34.0) +[2023-10-11 16:22:41,064][84230] Avg episode reward: [(0, '7.740'), (1, '7.900')] +[2023-10-11 16:22:43,593][85175] Updated weights for policy 1, policy_version 34820 (0.0008) +[2023-10-11 16:22:43,957][85175] Updated weights for policy 1, policy_version 34830 (0.0009) +[2023-10-11 16:22:44,323][85175] Updated weights for policy 1, policy_version 34840 (0.0009) +[2023-10-11 16:22:45,241][85176] Updated weights for policy 0, policy_version 34342 (0.0008) +[2023-10-11 16:22:45,617][85176] Updated weights for policy 0, policy_version 34352 (0.0009) +[2023-10-11 16:22:45,998][85176] Updated weights for policy 0, policy_version 34362 (0.0009) +[2023-10-11 16:22:46,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 70844416. Throughput: 0: 1660.0, 1: 1687.3. Samples: 17724994. Policy #0 lag: (min: 31.0, avg: 31.0, max: 34.0) +[2023-10-11 16:22:46,063][84230] Avg episode reward: [(0, '7.260'), (1, '7.670')] +[2023-10-11 16:22:48,439][85175] Updated weights for policy 1, policy_version 34850 (0.0008) +[2023-10-11 16:22:48,807][85175] Updated weights for policy 1, policy_version 34860 (0.0009) +[2023-10-11 16:22:49,173][85175] Updated weights for policy 1, policy_version 34870 (0.0009) +[2023-10-11 16:22:49,538][85175] Updated weights for policy 1, policy_version 34880 (0.0009) +[2023-10-11 16:22:50,037][85176] Updated weights for policy 0, policy_version 34372 (0.0008) +[2023-10-11 16:22:50,411][85176] Updated weights for policy 0, policy_version 34382 (0.0011) +[2023-10-11 16:22:50,786][85176] Updated weights for policy 0, policy_version 34392 (0.0008) +[2023-10-11 16:22:51,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 70909952. Throughput: 0: 1668.1, 1: 1689.6. Samples: 17735720. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 34.0) +[2023-10-11 16:22:51,064][84230] Avg episode reward: [(0, '7.600'), (1, '7.740')] +[2023-10-11 16:22:53,515][85175] Updated weights for policy 1, policy_version 34890 (0.0009) +[2023-10-11 16:22:53,886][85175] Updated weights for policy 1, policy_version 34900 (0.0008) +[2023-10-11 16:22:54,264][85175] Updated weights for policy 1, policy_version 34910 (0.0009) +[2023-10-11 16:22:54,862][85176] Updated weights for policy 0, policy_version 34402 (0.0007) +[2023-10-11 16:22:55,257][85176] Updated weights for policy 0, policy_version 34412 (0.0010) +[2023-10-11 16:22:55,642][85176] Updated weights for policy 0, policy_version 34422 (0.0008) +[2023-10-11 16:22:56,008][85176] Updated weights for policy 0, policy_version 34432 (0.0008) +[2023-10-11 16:22:56,063][84230] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 71008256. Throughput: 0: 1669.3, 1: 1668.3. Samples: 17755430. Policy #0 lag: (min: 1.0, avg: 12.3, max: 33.0) +[2023-10-11 16:22:56,064][84230] Avg episode reward: [(0, '7.150'), (1, '7.720')] +[2023-10-11 16:22:58,264][85175] Updated weights for policy 1, policy_version 34920 (0.0007) +[2023-10-11 16:22:58,632][85175] Updated weights for policy 1, policy_version 34930 (0.0007) +[2023-10-11 16:22:59,005][85175] Updated weights for policy 1, policy_version 34940 (0.0009) +[2023-10-11 16:23:00,123][85176] Updated weights for policy 0, policy_version 34442 (0.0009) +[2023-10-11 16:23:00,492][85176] Updated weights for policy 0, policy_version 34452 (0.0009) +[2023-10-11 16:23:00,861][85176] Updated weights for policy 0, policy_version 34462 (0.0007) +[2023-10-11 16:23:01,063][84230] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 71073792. Throughput: 0: 1651.9, 1: 1694.7. Samples: 17774956. 
Policy #0 lag: (min: 1.0, avg: 12.3, max: 33.0) +[2023-10-11 16:23:01,063][84230] Avg episode reward: [(0, '8.200'), (1, '7.640')] +[2023-10-11 16:23:03,126][85175] Updated weights for policy 1, policy_version 34950 (0.0010) +[2023-10-11 16:23:03,512][85175] Updated weights for policy 1, policy_version 34960 (0.0009) +[2023-10-11 16:23:03,888][85175] Updated weights for policy 1, policy_version 34970 (0.0012) +[2023-10-11 16:23:04,932][85176] Updated weights for policy 0, policy_version 34472 (0.0008) +[2023-10-11 16:23:05,314][85176] Updated weights for policy 0, policy_version 34482 (0.0007) +[2023-10-11 16:23:05,694][85176] Updated weights for policy 0, policy_version 34492 (0.0009) +[2023-10-11 16:23:06,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 71139328. Throughput: 0: 1663.3, 1: 1676.4. Samples: 17785532. Policy #0 lag: (min: 1.0, avg: 12.3, max: 33.0) +[2023-10-11 16:23:06,064][84230] Avg episode reward: [(0, '8.450'), (1, '7.210')] +[2023-10-11 16:23:07,882][85175] Updated weights for policy 1, policy_version 34980 (0.0009) +[2023-10-11 16:23:08,244][85175] Updated weights for policy 1, policy_version 34990 (0.0007) +[2023-10-11 16:23:08,612][85175] Updated weights for policy 1, policy_version 35000 (0.0007) +[2023-10-11 16:23:09,786][85176] Updated weights for policy 0, policy_version 34502 (0.0008) +[2023-10-11 16:23:10,152][85176] Updated weights for policy 0, policy_version 34512 (0.0007) +[2023-10-11 16:23:10,536][85176] Updated weights for policy 0, policy_version 34522 (0.0007) +[2023-10-11 16:23:11,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 71204864. Throughput: 0: 1660.0, 1: 1674.9. Samples: 17805390. 
Policy #0 lag: (min: 1.0, avg: 12.3, max: 33.0) +[2023-10-11 16:23:11,063][84230] Avg episode reward: [(0, '8.150'), (1, '7.680')] +[2023-10-11 16:23:12,725][85175] Updated weights for policy 1, policy_version 35010 (0.0008) +[2023-10-11 16:23:13,096][85175] Updated weights for policy 1, policy_version 35020 (0.0010) +[2023-10-11 16:23:13,465][85175] Updated weights for policy 1, policy_version 35030 (0.0011) +[2023-10-11 16:23:13,841][85175] Updated weights for policy 1, policy_version 35040 (0.0011) +[2023-10-11 16:23:14,500][85176] Updated weights for policy 0, policy_version 34532 (0.0007) +[2023-10-11 16:23:14,869][85176] Updated weights for policy 0, policy_version 34542 (0.0007) +[2023-10-11 16:23:15,252][85176] Updated weights for policy 0, policy_version 34552 (0.0009) +[2023-10-11 16:23:16,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 71270400. Throughput: 0: 1652.2, 1: 1696.3. Samples: 17825130. Policy #0 lag: (min: 17.0, avg: 29.1, max: 49.0) +[2023-10-11 16:23:16,063][84230] Avg episode reward: [(0, '7.600'), (1, '7.790')] +[2023-10-11 16:23:17,798][85175] Updated weights for policy 1, policy_version 35050 (0.0009) +[2023-10-11 16:23:18,159][85175] Updated weights for policy 1, policy_version 35060 (0.0010) +[2023-10-11 16:23:18,537][85175] Updated weights for policy 1, policy_version 35070 (0.0010) +[2023-10-11 16:23:19,429][85176] Updated weights for policy 0, policy_version 34562 (0.0010) +[2023-10-11 16:23:19,814][85176] Updated weights for policy 0, policy_version 34572 (0.0011) +[2023-10-11 16:23:20,201][85176] Updated weights for policy 0, policy_version 34582 (0.0010) +[2023-10-11 16:23:20,572][85176] Updated weights for policy 0, policy_version 34592 (0.0008) +[2023-10-11 16:23:21,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 71335936. Throughput: 0: 1662.7, 1: 1671.6. Samples: 17835672. 
Policy #0 lag: (min: 17.0, avg: 29.1, max: 49.0) +[2023-10-11 16:23:21,063][84230] Avg episode reward: [(0, '7.150'), (1, '7.700')] +[2023-10-11 16:23:22,632][85175] Updated weights for policy 1, policy_version 35080 (0.0009) +[2023-10-11 16:23:23,004][85175] Updated weights for policy 1, policy_version 35090 (0.0008) +[2023-10-11 16:23:23,376][85175] Updated weights for policy 1, policy_version 35100 (0.0009) +[2023-10-11 16:23:24,643][85176] Updated weights for policy 0, policy_version 34602 (0.0008) +[2023-10-11 16:23:25,026][85176] Updated weights for policy 0, policy_version 34612 (0.0007) +[2023-10-11 16:23:25,408][85176] Updated weights for policy 0, policy_version 34622 (0.0010) +[2023-10-11 16:23:26,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 71401472. Throughput: 0: 1656.5, 1: 1687.8. Samples: 17855696. Policy #0 lag: (min: 17.0, avg: 29.1, max: 49.0) +[2023-10-11 16:23:26,063][84230] Avg episode reward: [(0, '7.110'), (1, '7.610')] +[2023-10-11 16:23:27,430][85175] Updated weights for policy 1, policy_version 35110 (0.0009) +[2023-10-11 16:23:27,798][85175] Updated weights for policy 1, policy_version 35120 (0.0010) +[2023-10-11 16:23:28,169][85175] Updated weights for policy 1, policy_version 35130 (0.0008) +[2023-10-11 16:23:29,514][85176] Updated weights for policy 0, policy_version 34632 (0.0009) +[2023-10-11 16:23:29,877][85176] Updated weights for policy 0, policy_version 34642 (0.0009) +[2023-10-11 16:23:30,257][85176] Updated weights for policy 0, policy_version 34652 (0.0009) +[2023-10-11 16:23:31,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 71467008. Throughput: 0: 1648.0, 1: 1696.0. Samples: 17875474. 
Policy #0 lag: (min: 17.0, avg: 29.1, max: 49.0) +[2023-10-11 16:23:31,063][84230] Avg episode reward: [(0, '7.520'), (1, '7.280')] +[2023-10-11 16:23:32,067][85175] Updated weights for policy 1, policy_version 35140 (0.0009) +[2023-10-11 16:23:32,435][85175] Updated weights for policy 1, policy_version 35150 (0.0009) +[2023-10-11 16:23:32,798][85175] Updated weights for policy 1, policy_version 35160 (0.0008) +[2023-10-11 16:23:34,340][85176] Updated weights for policy 0, policy_version 34662 (0.0009) +[2023-10-11 16:23:34,715][85176] Updated weights for policy 0, policy_version 34672 (0.0010) +[2023-10-11 16:23:35,085][85176] Updated weights for policy 0, policy_version 34682 (0.0009) +[2023-10-11 16:23:36,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 71532544. Throughput: 0: 1665.6, 1: 1675.0. Samples: 17886046. Policy #0 lag: (min: 17.0, avg: 29.1, max: 49.0) +[2023-10-11 16:23:36,063][84230] Avg episode reward: [(0, '8.120'), (1, '7.220')] +[2023-10-11 16:23:36,596][85175] Updated weights for policy 1, policy_version 35170 (0.0007) +[2023-10-11 16:23:36,958][85175] Updated weights for policy 1, policy_version 35180 (0.0009) +[2023-10-11 16:23:37,327][85175] Updated weights for policy 1, policy_version 35190 (0.0010) +[2023-10-11 16:23:37,694][85175] Updated weights for policy 1, policy_version 35200 (0.0007) +[2023-10-11 16:23:39,186][85176] Updated weights for policy 0, policy_version 34692 (0.0010) +[2023-10-11 16:23:39,554][85176] Updated weights for policy 0, policy_version 34702 (0.0010) +[2023-10-11 16:23:39,929][85176] Updated weights for policy 0, policy_version 34712 (0.0010) +[2023-10-11 16:23:41,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 71598080. Throughput: 0: 1654.0, 1: 1700.2. Samples: 17906372. 
Policy #0 lag: (min: 31.0, avg: 37.2, max: 63.0) +[2023-10-11 16:23:41,064][84230] Avg episode reward: [(0, '8.010'), (1, '7.870')] +[2023-10-11 16:23:41,851][85175] Updated weights for policy 1, policy_version 35210 (0.0010) +[2023-10-11 16:23:42,210][85175] Updated weights for policy 1, policy_version 35220 (0.0009) +[2023-10-11 16:23:42,583][85175] Updated weights for policy 1, policy_version 35230 (0.0009) +[2023-10-11 16:23:44,053][85176] Updated weights for policy 0, policy_version 34722 (0.0011) +[2023-10-11 16:23:44,462][85176] Updated weights for policy 0, policy_version 34732 (0.0011) +[2023-10-11 16:23:44,835][85176] Updated weights for policy 0, policy_version 34742 (0.0009) +[2023-10-11 16:23:45,203][85176] Updated weights for policy 0, policy_version 34752 (0.0007) +[2023-10-11 16:23:46,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 71663616. Throughput: 0: 1659.1, 1: 1703.3. Samples: 17926264. Policy #0 lag: (min: 31.0, avg: 37.2, max: 63.0) +[2023-10-11 16:23:46,063][84230] Avg episode reward: [(0, '7.750'), (1, '8.000')] +[2023-10-11 16:23:46,728][85175] Updated weights for policy 1, policy_version 35240 (0.0010) +[2023-10-11 16:23:47,093][85175] Updated weights for policy 1, policy_version 35250 (0.0010) +[2023-10-11 16:23:47,464][85175] Updated weights for policy 1, policy_version 35260 (0.0010) +[2023-10-11 16:23:49,350][85176] Updated weights for policy 0, policy_version 34762 (0.0010) +[2023-10-11 16:23:49,722][85176] Updated weights for policy 0, policy_version 34772 (0.0008) +[2023-10-11 16:23:50,091][85176] Updated weights for policy 0, policy_version 34782 (0.0008) +[2023-10-11 16:23:51,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 71729152. Throughput: 0: 1667.6, 1: 1688.2. Samples: 17936540. 
Policy #0 lag: (min: 31.0, avg: 37.2, max: 63.0) +[2023-10-11 16:23:51,064][84230] Avg episode reward: [(0, '7.880'), (1, '8.070')] +[2023-10-11 16:23:51,682][85175] Updated weights for policy 1, policy_version 35270 (0.0010) +[2023-10-11 16:23:52,057][85175] Updated weights for policy 1, policy_version 35280 (0.0010) +[2023-10-11 16:23:52,418][85175] Updated weights for policy 1, policy_version 35290 (0.0009) +[2023-10-11 16:23:54,112][85176] Updated weights for policy 0, policy_version 34792 (0.0009) +[2023-10-11 16:23:54,488][85176] Updated weights for policy 0, policy_version 34802 (0.0009) +[2023-10-11 16:23:54,863][85176] Updated weights for policy 0, policy_version 34812 (0.0010) +[2023-10-11 16:23:56,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 71794688. Throughput: 0: 1654.5, 1: 1696.2. Samples: 17956174. Policy #0 lag: (min: 31.0, avg: 37.2, max: 63.0) +[2023-10-11 16:23:56,064][84230] Avg episode reward: [(0, '7.570'), (1, '7.300')] +[2023-10-11 16:23:56,396][85175] Updated weights for policy 1, policy_version 35300 (0.0009) +[2023-10-11 16:23:56,766][85175] Updated weights for policy 1, policy_version 35310 (0.0008) +[2023-10-11 16:23:57,141][85175] Updated weights for policy 1, policy_version 35320 (0.0010) +[2023-10-11 16:23:59,026][85176] Updated weights for policy 0, policy_version 34822 (0.0010) +[2023-10-11 16:23:59,402][85176] Updated weights for policy 0, policy_version 34832 (0.0010) +[2023-10-11 16:23:59,778][85176] Updated weights for policy 0, policy_version 34842 (0.0008) +[2023-10-11 16:24:01,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 71860224. Throughput: 0: 1669.2, 1: 1691.2. Samples: 17976350. 
Policy #0 lag: (min: 31.0, avg: 37.2, max: 63.0) +[2023-10-11 16:24:01,063][84230] Avg episode reward: [(0, '7.610'), (1, '7.280')] +[2023-10-11 16:24:01,177][85175] Updated weights for policy 1, policy_version 35330 (0.0009) +[2023-10-11 16:24:01,548][85175] Updated weights for policy 1, policy_version 35340 (0.0007) +[2023-10-11 16:24:01,917][85175] Updated weights for policy 1, policy_version 35350 (0.0007) +[2023-10-11 16:24:02,281][85175] Updated weights for policy 1, policy_version 35360 (0.0009) +[2023-10-11 16:24:03,956][85176] Updated weights for policy 0, policy_version 34852 (0.0009) +[2023-10-11 16:24:04,334][85176] Updated weights for policy 0, policy_version 34862 (0.0007) +[2023-10-11 16:24:04,708][85176] Updated weights for policy 0, policy_version 34872 (0.0007) +[2023-10-11 16:24:06,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 71925760. Throughput: 0: 1672.9, 1: 1683.4. Samples: 17986706. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:24:06,063][84230] Avg episode reward: [(0, '7.520'), (1, '7.420')] +[2023-10-11 16:24:06,359][85175] Updated weights for policy 1, policy_version 35370 (0.0009) +[2023-10-11 16:24:06,723][85175] Updated weights for policy 1, policy_version 35380 (0.0009) +[2023-10-11 16:24:07,093][85175] Updated weights for policy 1, policy_version 35390 (0.0010) +[2023-10-11 16:24:08,609][85176] Updated weights for policy 0, policy_version 34882 (0.0008) +[2023-10-11 16:24:08,986][85176] Updated weights for policy 0, policy_version 34892 (0.0007) +[2023-10-11 16:24:09,368][85176] Updated weights for policy 0, policy_version 34902 (0.0008) +[2023-10-11 16:24:09,744][85176] Updated weights for policy 0, policy_version 34912 (0.0007) +[2023-10-11 16:24:11,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 71991296. Throughput: 0: 1655.2, 1: 1691.0. Samples: 18006278. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:24:11,064][84230] Avg episode reward: [(0, '8.120'), (1, '7.840')] +[2023-10-11 16:24:11,157][85175] Updated weights for policy 1, policy_version 35400 (0.0008) +[2023-10-11 16:24:11,520][85175] Updated weights for policy 1, policy_version 35410 (0.0009) +[2023-10-11 16:24:11,885][85175] Updated weights for policy 1, policy_version 35420 (0.0007) +[2023-10-11 16:24:13,806][85176] Updated weights for policy 0, policy_version 34922 (0.0009) +[2023-10-11 16:24:14,166][85176] Updated weights for policy 0, policy_version 34932 (0.0010) +[2023-10-11 16:24:14,545][85176] Updated weights for policy 0, policy_version 34942 (0.0009) +[2023-10-11 16:24:15,988][85175] Updated weights for policy 1, policy_version 35430 (0.0008) +[2023-10-11 16:24:16,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 72056832. Throughput: 0: 1666.5, 1: 1690.8. Samples: 18026552. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:24:16,063][84230] Avg episode reward: [(0, '8.150'), (1, '7.930')] +[2023-10-11 16:24:16,072][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000034944_35782656.pth... +[2023-10-11 16:24:16,104][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000033376_34177024.pth +[2023-10-11 16:24:16,352][85175] Updated weights for policy 1, policy_version 35440 (0.0008) +[2023-10-11 16:24:16,712][85175] Updated weights for policy 1, policy_version 35450 (0.0010) +[2023-10-11 16:24:16,936][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000035456_36306944.pth... 
+[2023-10-11 16:24:16,975][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000033856_34668544.pth +[2023-10-11 16:24:18,630][85176] Updated weights for policy 0, policy_version 34952 (0.0007) +[2023-10-11 16:24:19,003][85176] Updated weights for policy 0, policy_version 34962 (0.0008) +[2023-10-11 16:24:19,373][85176] Updated weights for policy 0, policy_version 34972 (0.0007) +[2023-10-11 16:24:20,727][85175] Updated weights for policy 1, policy_version 35460 (0.0008) +[2023-10-11 16:24:21,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 72122368. Throughput: 0: 1663.2, 1: 1687.6. Samples: 18036830. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:24:21,063][84230] Avg episode reward: [(0, '8.010'), (1, '7.700')] +[2023-10-11 16:24:21,100][85175] Updated weights for policy 1, policy_version 35470 (0.0007) +[2023-10-11 16:24:21,471][85175] Updated weights for policy 1, policy_version 35480 (0.0008) +[2023-10-11 16:24:23,578][85176] Updated weights for policy 0, policy_version 34982 (0.0009) +[2023-10-11 16:24:23,957][85176] Updated weights for policy 0, policy_version 34992 (0.0010) +[2023-10-11 16:24:24,332][85176] Updated weights for policy 0, policy_version 35002 (0.0010) +[2023-10-11 16:24:25,497][85175] Updated weights for policy 1, policy_version 35490 (0.0007) +[2023-10-11 16:24:25,869][85175] Updated weights for policy 1, policy_version 35500 (0.0010) +[2023-10-11 16:24:26,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 72187904. Throughput: 0: 1649.6, 1: 1680.7. Samples: 18056236. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:24:26,063][84230] Avg episode reward: [(0, '7.600'), (1, '7.220')] +[2023-10-11 16:24:26,233][85175] Updated weights for policy 1, policy_version 35510 (0.0007) +[2023-10-11 16:24:26,600][85175] Updated weights for policy 1, policy_version 35520 (0.0010) +[2023-10-11 16:24:28,333][85176] Updated weights for policy 0, policy_version 35012 (0.0008) +[2023-10-11 16:24:28,711][85176] Updated weights for policy 0, policy_version 35022 (0.0007) +[2023-10-11 16:24:29,087][85176] Updated weights for policy 0, policy_version 35032 (0.0010) +[2023-10-11 16:24:30,761][85175] Updated weights for policy 1, policy_version 35530 (0.0008) +[2023-10-11 16:24:31,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 72253440. Throughput: 0: 1669.8, 1: 1675.8. Samples: 18076816. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-11 16:24:31,063][84230] Avg episode reward: [(0, '7.300'), (1, '6.930')] +[2023-10-11 16:24:31,135][85175] Updated weights for policy 1, policy_version 35540 (0.0008) +[2023-10-11 16:24:31,512][85175] Updated weights for policy 1, policy_version 35550 (0.0010) +[2023-10-11 16:24:33,177][85176] Updated weights for policy 0, policy_version 35042 (0.0008) +[2023-10-11 16:24:33,568][85176] Updated weights for policy 0, policy_version 35052 (0.0007) +[2023-10-11 16:24:33,934][85176] Updated weights for policy 0, policy_version 35062 (0.0008) +[2023-10-11 16:24:34,312][85176] Updated weights for policy 0, policy_version 35072 (0.0008) +[2023-10-11 16:24:35,518][85175] Updated weights for policy 1, policy_version 35560 (0.0010) +[2023-10-11 16:24:35,884][85175] Updated weights for policy 1, policy_version 35570 (0.0008) +[2023-10-11 16:24:36,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 72318976. Throughput: 0: 1656.8, 1: 1686.1. Samples: 18086972. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-11 16:24:36,063][84230] Avg episode reward: [(0, '7.110'), (1, '7.640')] +[2023-10-11 16:24:36,248][85175] Updated weights for policy 1, policy_version 35580 (0.0010) +[2023-10-11 16:24:38,382][85176] Updated weights for policy 0, policy_version 35082 (0.0008) +[2023-10-11 16:24:38,750][85176] Updated weights for policy 0, policy_version 35092 (0.0010) +[2023-10-11 16:24:39,113][85176] Updated weights for policy 0, policy_version 35102 (0.0007) +[2023-10-11 16:24:40,396][85175] Updated weights for policy 1, policy_version 35590 (0.0009) +[2023-10-11 16:24:40,783][85175] Updated weights for policy 1, policy_version 35600 (0.0010) +[2023-10-11 16:24:41,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 72384512. Throughput: 0: 1656.7, 1: 1687.3. Samples: 18106654. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-11 16:24:41,063][84230] Avg episode reward: [(0, '7.540'), (1, '7.780')] +[2023-10-11 16:24:41,157][85175] Updated weights for policy 1, policy_version 35610 (0.0008) +[2023-10-11 16:24:43,367][85176] Updated weights for policy 0, policy_version 35112 (0.0008) +[2023-10-11 16:24:43,740][85176] Updated weights for policy 0, policy_version 35122 (0.0007) +[2023-10-11 16:24:44,107][85176] Updated weights for policy 0, policy_version 35132 (0.0008) +[2023-10-11 16:24:45,242][85175] Updated weights for policy 1, policy_version 35620 (0.0009) +[2023-10-11 16:24:45,622][85175] Updated weights for policy 1, policy_version 35630 (0.0010) +[2023-10-11 16:24:45,989][85175] Updated weights for policy 1, policy_version 35640 (0.0011) +[2023-10-11 16:24:46,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 72450048. Throughput: 0: 1667.8, 1: 1677.0. Samples: 18126866. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-11 16:24:46,063][84230] Avg episode reward: [(0, '7.580'), (1, '7.650')] +[2023-10-11 16:24:48,126][85176] Updated weights for policy 0, policy_version 35142 (0.0008) +[2023-10-11 16:24:48,498][85176] Updated weights for policy 0, policy_version 35152 (0.0008) +[2023-10-11 16:24:48,866][85176] Updated weights for policy 0, policy_version 35162 (0.0008) +[2023-10-11 16:24:50,006][85175] Updated weights for policy 1, policy_version 35650 (0.0010) +[2023-10-11 16:24:50,386][85175] Updated weights for policy 1, policy_version 35660 (0.0008) +[2023-10-11 16:24:50,750][85175] Updated weights for policy 1, policy_version 35670 (0.0007) +[2023-10-11 16:24:51,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 72515584. Throughput: 0: 1654.2, 1: 1686.6. Samples: 18137040. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) +[2023-10-11 16:24:51,063][84230] Avg episode reward: [(0, '8.200'), (1, '7.560')] +[2023-10-11 16:24:51,130][85175] Updated weights for policy 1, policy_version 35680 (0.0009) +[2023-10-11 16:24:53,006][85176] Updated weights for policy 0, policy_version 35172 (0.0009) +[2023-10-11 16:24:53,386][85176] Updated weights for policy 0, policy_version 35182 (0.0010) +[2023-10-11 16:24:53,755][85176] Updated weights for policy 0, policy_version 35192 (0.0007) +[2023-10-11 16:24:55,263][85175] Updated weights for policy 1, policy_version 35690 (0.0007) +[2023-10-11 16:24:55,635][85175] Updated weights for policy 1, policy_version 35700 (0.0007) +[2023-10-11 16:24:56,000][85175] Updated weights for policy 1, policy_version 35710 (0.0008) +[2023-10-11 16:24:56,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 72581120. Throughput: 0: 1661.4, 1: 1684.1. Samples: 18156826. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:24:56,063][84230] Avg episode reward: [(0, '8.050'), (1, '7.550')] +[2023-10-11 16:24:57,843][85176] Updated weights for policy 0, policy_version 35202 (0.0010) +[2023-10-11 16:24:58,212][85176] Updated weights for policy 0, policy_version 35212 (0.0009) +[2023-10-11 16:24:58,595][85176] Updated weights for policy 0, policy_version 35222 (0.0010) +[2023-10-11 16:24:58,958][85176] Updated weights for policy 0, policy_version 35232 (0.0009) +[2023-10-11 16:25:00,002][85175] Updated weights for policy 1, policy_version 35720 (0.0008) +[2023-10-11 16:25:00,377][85175] Updated weights for policy 1, policy_version 35730 (0.0009) +[2023-10-11 16:25:00,744][85175] Updated weights for policy 1, policy_version 35740 (0.0007) +[2023-10-11 16:25:01,063][84230] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 72679424. Throughput: 0: 1670.9, 1: 1665.6. Samples: 18176696. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:25:01,064][84230] Avg episode reward: [(0, '7.750'), (1, '7.540')] +[2023-10-11 16:25:03,226][85176] Updated weights for policy 0, policy_version 35242 (0.0007) +[2023-10-11 16:25:03,597][85176] Updated weights for policy 0, policy_version 35252 (0.0010) +[2023-10-11 16:25:03,969][85176] Updated weights for policy 0, policy_version 35262 (0.0012) +[2023-10-11 16:25:04,694][85175] Updated weights for policy 1, policy_version 35750 (0.0008) +[2023-10-11 16:25:05,064][85175] Updated weights for policy 1, policy_version 35760 (0.0007) +[2023-10-11 16:25:05,426][85175] Updated weights for policy 1, policy_version 35770 (0.0009) +[2023-10-11 16:25:06,063][84230] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 72744960. Throughput: 0: 1656.8, 1: 1683.8. Samples: 18187158. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:25:06,063][84230] Avg episode reward: [(0, '7.450'), (1, '7.650')] +[2023-10-11 16:25:08,087][85176] Updated weights for policy 0, policy_version 35272 (0.0008) +[2023-10-11 16:25:08,458][85176] Updated weights for policy 0, policy_version 35282 (0.0007) +[2023-10-11 16:25:08,826][85176] Updated weights for policy 0, policy_version 35292 (0.0010) +[2023-10-11 16:25:09,508][85175] Updated weights for policy 1, policy_version 35780 (0.0010) +[2023-10-11 16:25:09,872][85175] Updated weights for policy 1, policy_version 35790 (0.0010) +[2023-10-11 16:25:10,239][85175] Updated weights for policy 1, policy_version 35800 (0.0010) +[2023-10-11 16:25:11,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 72810496. Throughput: 0: 1669.1, 1: 1679.9. Samples: 18206940. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:25:11,064][84230] Avg episode reward: [(0, '7.450'), (1, '7.390')] +[2023-10-11 16:25:12,869][85176] Updated weights for policy 0, policy_version 35302 (0.0008) +[2023-10-11 16:25:13,239][85176] Updated weights for policy 0, policy_version 35312 (0.0009) +[2023-10-11 16:25:13,609][85176] Updated weights for policy 0, policy_version 35322 (0.0008) +[2023-10-11 16:25:14,243][85175] Updated weights for policy 1, policy_version 35810 (0.0007) +[2023-10-11 16:25:14,609][85175] Updated weights for policy 1, policy_version 35820 (0.0007) +[2023-10-11 16:25:14,979][85175] Updated weights for policy 1, policy_version 35830 (0.0007) +[2023-10-11 16:25:15,352][85175] Updated weights for policy 1, policy_version 35840 (0.0009) +[2023-10-11 16:25:16,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 72876032. Throughput: 0: 1668.7, 1: 1661.7. Samples: 18226684. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:25:16,064][84230] Avg episode reward: [(0, '8.150'), (1, '7.660')] +[2023-10-11 16:25:17,742][85176] Updated weights for policy 0, policy_version 35332 (0.0008) +[2023-10-11 16:25:18,137][85176] Updated weights for policy 0, policy_version 35342 (0.0009) +[2023-10-11 16:25:18,518][85176] Updated weights for policy 0, policy_version 35352 (0.0007) +[2023-10-11 16:25:19,384][85175] Updated weights for policy 1, policy_version 35850 (0.0010) +[2023-10-11 16:25:19,758][85175] Updated weights for policy 1, policy_version 35860 (0.0011) +[2023-10-11 16:25:20,119][85175] Updated weights for policy 1, policy_version 35870 (0.0010) +[2023-10-11 16:25:21,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 72941568. Throughput: 0: 1652.5, 1: 1683.5. Samples: 18237092. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:25:21,064][84230] Avg episode reward: [(0, '8.000'), (1, '7.730')] +[2023-10-11 16:25:22,647][85176] Updated weights for policy 0, policy_version 35362 (0.0008) +[2023-10-11 16:25:23,023][85176] Updated weights for policy 0, policy_version 35372 (0.0007) +[2023-10-11 16:25:23,391][85176] Updated weights for policy 0, policy_version 35382 (0.0007) +[2023-10-11 16:25:23,768][85176] Updated weights for policy 0, policy_version 35392 (0.0007) +[2023-10-11 16:25:24,322][85175] Updated weights for policy 1, policy_version 35880 (0.0010) +[2023-10-11 16:25:24,694][85175] Updated weights for policy 1, policy_version 35890 (0.0008) +[2023-10-11 16:25:25,060][85175] Updated weights for policy 1, policy_version 35900 (0.0007) +[2023-10-11 16:25:26,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 73007104. Throughput: 0: 1662.6, 1: 1672.4. Samples: 18256732. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:25:26,064][84230] Avg episode reward: [(0, '7.870'), (1, '7.750')] +[2023-10-11 16:25:27,873][85176] Updated weights for policy 0, policy_version 35402 (0.0010) +[2023-10-11 16:25:28,243][85176] Updated weights for policy 0, policy_version 35412 (0.0009) +[2023-10-11 16:25:28,624][85176] Updated weights for policy 0, policy_version 35422 (0.0008) +[2023-10-11 16:25:29,133][85175] Updated weights for policy 1, policy_version 35910 (0.0007) +[2023-10-11 16:25:29,526][85175] Updated weights for policy 1, policy_version 35920 (0.0007) +[2023-10-11 16:25:29,887][85175] Updated weights for policy 1, policy_version 35930 (0.0007) +[2023-10-11 16:25:31,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 73072640. Throughput: 0: 1662.9, 1: 1663.9. Samples: 18276574. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:25:31,063][84230] Avg episode reward: [(0, '7.720'), (1, '7.690')] +[2023-10-11 16:25:32,578][85176] Updated weights for policy 0, policy_version 35432 (0.0008) +[2023-10-11 16:25:32,957][85176] Updated weights for policy 0, policy_version 35442 (0.0008) +[2023-10-11 16:25:33,330][85176] Updated weights for policy 0, policy_version 35452 (0.0010) +[2023-10-11 16:25:33,786][85175] Updated weights for policy 1, policy_version 35940 (0.0009) +[2023-10-11 16:25:34,143][85175] Updated weights for policy 1, policy_version 35950 (0.0007) +[2023-10-11 16:25:34,509][85175] Updated weights for policy 1, policy_version 35960 (0.0007) +[2023-10-11 16:25:36,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 73138176. Throughput: 0: 1646.2, 1: 1688.5. Samples: 18287102. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:25:36,064][84230] Avg episode reward: [(0, '7.410'), (1, '7.560')] +[2023-10-11 16:25:37,489][85176] Updated weights for policy 0, policy_version 35462 (0.0011) +[2023-10-11 16:25:37,848][85176] Updated weights for policy 0, policy_version 35472 (0.0010) +[2023-10-11 16:25:38,219][85176] Updated weights for policy 0, policy_version 35482 (0.0010) +[2023-10-11 16:25:38,480][85175] Updated weights for policy 1, policy_version 35970 (0.0009) +[2023-10-11 16:25:38,846][85175] Updated weights for policy 1, policy_version 35980 (0.0010) +[2023-10-11 16:25:39,211][85175] Updated weights for policy 1, policy_version 35990 (0.0008) +[2023-10-11 16:25:39,581][85175] Updated weights for policy 1, policy_version 36000 (0.0009) +[2023-10-11 16:25:41,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 73203712. Throughput: 0: 1662.8, 1: 1669.3. Samples: 18306772. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:25:41,064][84230] Avg episode reward: [(0, '7.820'), (1, '7.170')] +[2023-10-11 16:25:42,359][85176] Updated weights for policy 0, policy_version 35492 (0.0008) +[2023-10-11 16:25:42,735][85176] Updated weights for policy 0, policy_version 35502 (0.0009) +[2023-10-11 16:25:43,111][85176] Updated weights for policy 0, policy_version 35512 (0.0008) +[2023-10-11 16:25:43,655][85175] Updated weights for policy 1, policy_version 36010 (0.0008) +[2023-10-11 16:25:44,022][85175] Updated weights for policy 1, policy_version 36020 (0.0009) +[2023-10-11 16:25:44,401][85175] Updated weights for policy 1, policy_version 36030 (0.0009) +[2023-10-11 16:25:46,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 73269248. Throughput: 0: 1660.6, 1: 1688.2. Samples: 18327390. 
Policy #0 lag: (min: 13.0, avg: 25.0, max: 45.0) +[2023-10-11 16:25:46,063][84230] Avg episode reward: [(0, '7.710'), (1, '7.690')] +[2023-10-11 16:25:47,243][85176] Updated weights for policy 0, policy_version 35522 (0.0008) +[2023-10-11 16:25:47,614][85176] Updated weights for policy 0, policy_version 35532 (0.0009) +[2023-10-11 16:25:47,981][85176] Updated weights for policy 0, policy_version 35542 (0.0010) +[2023-10-11 16:25:48,360][85176] Updated weights for policy 0, policy_version 35552 (0.0009) +[2023-10-11 16:25:48,464][85175] Updated weights for policy 1, policy_version 36040 (0.0009) +[2023-10-11 16:25:48,832][85175] Updated weights for policy 1, policy_version 36050 (0.0008) +[2023-10-11 16:25:49,204][85175] Updated weights for policy 1, policy_version 36060 (0.0008) +[2023-10-11 16:25:51,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 73334784. Throughput: 0: 1649.9, 1: 1686.8. Samples: 18337310. Policy #0 lag: (min: 13.0, avg: 25.0, max: 45.0) +[2023-10-11 16:25:51,063][84230] Avg episode reward: [(0, '7.750'), (1, '7.910')] +[2023-10-11 16:25:52,424][85176] Updated weights for policy 0, policy_version 35562 (0.0007) +[2023-10-11 16:25:52,788][85176] Updated weights for policy 0, policy_version 35572 (0.0007) +[2023-10-11 16:25:53,158][85176] Updated weights for policy 0, policy_version 35582 (0.0008) +[2023-10-11 16:25:53,170][85175] Updated weights for policy 1, policy_version 36070 (0.0008) +[2023-10-11 16:25:53,530][85175] Updated weights for policy 1, policy_version 36080 (0.0009) +[2023-10-11 16:25:53,901][85175] Updated weights for policy 1, policy_version 36090 (0.0010) +[2023-10-11 16:25:56,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 73400320. Throughput: 0: 1661.7, 1: 1676.1. Samples: 18357144. 
Policy #0 lag: (min: 13.0, avg: 25.0, max: 45.0) +[2023-10-11 16:25:56,063][84230] Avg episode reward: [(0, '7.450'), (1, '7.760')] +[2023-10-11 16:25:57,458][85176] Updated weights for policy 0, policy_version 35592 (0.0008) +[2023-10-11 16:25:57,827][85176] Updated weights for policy 0, policy_version 35602 (0.0008) +[2023-10-11 16:25:57,913][85175] Updated weights for policy 1, policy_version 36100 (0.0008) +[2023-10-11 16:25:58,206][85176] Updated weights for policy 0, policy_version 35612 (0.0009) +[2023-10-11 16:25:58,278][85175] Updated weights for policy 1, policy_version 36110 (0.0007) +[2023-10-11 16:25:58,645][85175] Updated weights for policy 1, policy_version 36120 (0.0010) +[2023-10-11 16:26:01,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 73465856. Throughput: 0: 1659.2, 1: 1699.1. Samples: 18377808. Policy #0 lag: (min: 13.0, avg: 25.0, max: 45.0) +[2023-10-11 16:26:01,064][84230] Avg episode reward: [(0, '7.450'), (1, '7.200')] +[2023-10-11 16:26:02,254][85176] Updated weights for policy 0, policy_version 35622 (0.0009) +[2023-10-11 16:26:02,620][85176] Updated weights for policy 0, policy_version 35632 (0.0009) +[2023-10-11 16:26:02,893][85175] Updated weights for policy 1, policy_version 36130 (0.0009) +[2023-10-11 16:26:02,997][85176] Updated weights for policy 0, policy_version 35642 (0.0010) +[2023-10-11 16:26:03,257][85175] Updated weights for policy 1, policy_version 36140 (0.0007) +[2023-10-11 16:26:03,636][85175] Updated weights for policy 1, policy_version 36150 (0.0008) +[2023-10-11 16:26:03,994][85175] Updated weights for policy 1, policy_version 36160 (0.0010) +[2023-10-11 16:26:06,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 73531392. Throughput: 0: 1659.2, 1: 1681.8. Samples: 18387434. 
Policy #0 lag: (min: 13.0, avg: 25.0, max: 45.0) +[2023-10-11 16:26:06,063][84230] Avg episode reward: [(0, '7.750'), (1, '7.450')] +[2023-10-11 16:26:07,209][85176] Updated weights for policy 0, policy_version 35652 (0.0008) +[2023-10-11 16:26:07,603][85176] Updated weights for policy 0, policy_version 35662 (0.0010) +[2023-10-11 16:26:07,968][85176] Updated weights for policy 0, policy_version 35672 (0.0007) +[2023-10-11 16:26:08,074][85175] Updated weights for policy 1, policy_version 36170 (0.0007) +[2023-10-11 16:26:08,445][85175] Updated weights for policy 1, policy_version 36180 (0.0008) +[2023-10-11 16:26:08,814][85175] Updated weights for policy 1, policy_version 36190 (0.0009) +[2023-10-11 16:26:11,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 73596928. Throughput: 0: 1667.2, 1: 1681.8. Samples: 18407438. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-11 16:26:11,063][84230] Avg episode reward: [(0, '8.200'), (1, '7.730')] +[2023-10-11 16:26:12,020][85176] Updated weights for policy 0, policy_version 35682 (0.0008) +[2023-10-11 16:26:12,391][85176] Updated weights for policy 0, policy_version 35692 (0.0008) +[2023-10-11 16:26:12,765][85176] Updated weights for policy 0, policy_version 35702 (0.0009) +[2023-10-11 16:26:12,964][85175] Updated weights for policy 1, policy_version 36200 (0.0009) +[2023-10-11 16:26:13,142][85176] Updated weights for policy 0, policy_version 35712 (0.0009) +[2023-10-11 16:26:13,331][85175] Updated weights for policy 1, policy_version 36210 (0.0007) +[2023-10-11 16:26:13,698][85175] Updated weights for policy 1, policy_version 36220 (0.0007) +[2023-10-11 16:26:16,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 73662464. Throughput: 0: 1665.1, 1: 1698.1. Samples: 18427916. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-11 16:26:16,063][84230] Avg episode reward: [(0, '8.350'), (1, '7.580')] +[2023-10-11 16:26:16,071][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000036224_37093376.pth... +[2023-10-11 16:26:16,071][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000035712_36569088.pth... +[2023-10-11 16:26:16,111][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000034656_35487744.pth +[2023-10-11 16:26:16,112][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000034176_34996224.pth +[2023-10-11 16:26:17,156][85176] Updated weights for policy 0, policy_version 35722 (0.0008) +[2023-10-11 16:26:17,524][85176] Updated weights for policy 0, policy_version 35732 (0.0010) +[2023-10-11 16:26:17,798][85175] Updated weights for policy 1, policy_version 36230 (0.0007) +[2023-10-11 16:26:17,897][85176] Updated weights for policy 0, policy_version 35742 (0.0009) +[2023-10-11 16:26:18,200][85175] Updated weights for policy 1, policy_version 36240 (0.0008) +[2023-10-11 16:26:18,572][85175] Updated weights for policy 1, policy_version 36250 (0.0008) +[2023-10-11 16:26:21,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 73728000. Throughput: 0: 1667.4, 1: 1671.3. Samples: 18437344. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-11 16:26:21,064][84230] Avg episode reward: [(0, '7.730'), (1, '7.260')] +[2023-10-11 16:26:21,816][85176] Updated weights for policy 0, policy_version 35752 (0.0007) +[2023-10-11 16:26:22,184][85176] Updated weights for policy 0, policy_version 35762 (0.0007) +[2023-10-11 16:26:22,553][85176] Updated weights for policy 0, policy_version 35772 (0.0008) +[2023-10-11 16:26:22,553][85175] Updated weights for policy 1, policy_version 36260 (0.0010) +[2023-10-11 16:26:22,930][85175] Updated weights for policy 1, policy_version 36270 (0.0008) +[2023-10-11 16:26:23,297][85175] Updated weights for policy 1, policy_version 36280 (0.0007) +[2023-10-11 16:26:26,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 73793536. Throughput: 0: 1672.0, 1: 1685.2. Samples: 18457848. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-11 16:26:26,063][84230] Avg episode reward: [(0, '7.280'), (1, '7.290')] +[2023-10-11 16:26:26,589][85176] Updated weights for policy 0, policy_version 35782 (0.0008) +[2023-10-11 16:26:26,965][85176] Updated weights for policy 0, policy_version 35792 (0.0010) +[2023-10-11 16:26:27,339][85176] Updated weights for policy 0, policy_version 35802 (0.0009) +[2023-10-11 16:26:27,379][85175] Updated weights for policy 1, policy_version 36290 (0.0008) +[2023-10-11 16:26:27,744][85175] Updated weights for policy 1, policy_version 36300 (0.0009) +[2023-10-11 16:26:28,120][85175] Updated weights for policy 1, policy_version 36310 (0.0008) +[2023-10-11 16:26:28,499][85175] Updated weights for policy 1, policy_version 36320 (0.0007) +[2023-10-11 16:26:31,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 73859072. Throughput: 0: 1674.6, 1: 1689.6. Samples: 18478782. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-11 16:26:31,064][84230] Avg episode reward: [(0, '7.070'), (1, '7.510')] +[2023-10-11 16:26:31,481][85176] Updated weights for policy 0, policy_version 35812 (0.0008) +[2023-10-11 16:26:31,857][85176] Updated weights for policy 0, policy_version 35822 (0.0008) +[2023-10-11 16:26:32,232][85176] Updated weights for policy 0, policy_version 35832 (0.0009) +[2023-10-11 16:26:32,341][85175] Updated weights for policy 1, policy_version 36330 (0.0007) +[2023-10-11 16:26:32,703][85175] Updated weights for policy 1, policy_version 36340 (0.0008) +[2023-10-11 16:26:33,071][85175] Updated weights for policy 1, policy_version 36350 (0.0008) +[2023-10-11 16:26:36,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 73924608. Throughput: 0: 1673.2, 1: 1672.8. Samples: 18487876. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:26:36,063][84230] Avg episode reward: [(0, '7.950'), (1, '7.920')] +[2023-10-11 16:26:36,350][85176] Updated weights for policy 0, policy_version 35842 (0.0010) +[2023-10-11 16:26:36,731][85176] Updated weights for policy 0, policy_version 35852 (0.0010) +[2023-10-11 16:26:36,983][85175] Updated weights for policy 1, policy_version 36360 (0.0008) +[2023-10-11 16:26:37,111][85176] Updated weights for policy 0, policy_version 35862 (0.0010) +[2023-10-11 16:26:37,343][85175] Updated weights for policy 1, policy_version 36370 (0.0008) +[2023-10-11 16:26:37,474][85176] Updated weights for policy 0, policy_version 35872 (0.0009) +[2023-10-11 16:26:37,711][85175] Updated weights for policy 1, policy_version 36380 (0.0007) +[2023-10-11 16:26:41,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 73990144. Throughput: 0: 1669.9, 1: 1692.1. Samples: 18508432. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:26:41,063][84230] Avg episode reward: [(0, '8.030'), (1, '8.310')] +[2023-10-11 16:26:41,643][85176] Updated weights for policy 0, policy_version 35882 (0.0007) +[2023-10-11 16:26:41,713][85175] Updated weights for policy 1, policy_version 36390 (0.0008) +[2023-10-11 16:26:42,012][85176] Updated weights for policy 0, policy_version 35892 (0.0007) +[2023-10-11 16:26:42,077][85175] Updated weights for policy 1, policy_version 36400 (0.0009) +[2023-10-11 16:26:42,374][85176] Updated weights for policy 0, policy_version 35902 (0.0007) +[2023-10-11 16:26:42,441][85175] Updated weights for policy 1, policy_version 36410 (0.0008) +[2023-10-11 16:26:46,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 74055680. Throughput: 0: 1666.1, 1: 1694.4. Samples: 18529032. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:26:46,064][84230] Avg episode reward: [(0, '8.050'), (1, '7.870')] +[2023-10-11 16:26:46,647][85175] Updated weights for policy 1, policy_version 36420 (0.0009) +[2023-10-11 16:26:46,682][85176] Updated weights for policy 0, policy_version 35912 (0.0008) +[2023-10-11 16:26:47,015][85175] Updated weights for policy 1, policy_version 36430 (0.0009) +[2023-10-11 16:26:47,058][85176] Updated weights for policy 0, policy_version 35922 (0.0008) +[2023-10-11 16:26:47,389][85175] Updated weights for policy 1, policy_version 36440 (0.0009) +[2023-10-11 16:26:47,429][85176] Updated weights for policy 0, policy_version 35932 (0.0008) +[2023-10-11 16:26:51,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 74121216. Throughput: 0: 1664.1, 1: 1683.1. Samples: 18538060. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:26:51,064][84230] Avg episode reward: [(0, '7.440'), (1, '7.390')] +[2023-10-11 16:26:51,432][85176] Updated weights for policy 0, policy_version 35942 (0.0009) +[2023-10-11 16:26:51,439][85175] Updated weights for policy 1, policy_version 36450 (0.0007) +[2023-10-11 16:26:51,807][85175] Updated weights for policy 1, policy_version 36460 (0.0008) +[2023-10-11 16:26:51,809][85176] Updated weights for policy 0, policy_version 35952 (0.0009) +[2023-10-11 16:26:52,177][85175] Updated weights for policy 1, policy_version 36470 (0.0008) +[2023-10-11 16:26:52,186][85176] Updated weights for policy 0, policy_version 35962 (0.0008) +[2023-10-11 16:26:52,544][85175] Updated weights for policy 1, policy_version 36480 (0.0008) +[2023-10-11 16:26:56,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.1, 300 sec: 13329.4). Total num frames: 74186752. Throughput: 0: 1662.8, 1: 1696.0. Samples: 18558588. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:26:56,064][84230] Avg episode reward: [(0, '7.580'), (1, '7.260')] +[2023-10-11 16:26:56,431][85176] Updated weights for policy 0, policy_version 35972 (0.0008) +[2023-10-11 16:26:56,600][85175] Updated weights for policy 1, policy_version 36490 (0.0008) +[2023-10-11 16:26:56,814][85176] Updated weights for policy 0, policy_version 35982 (0.0010) +[2023-10-11 16:26:56,960][85175] Updated weights for policy 1, policy_version 36500 (0.0009) +[2023-10-11 16:26:57,186][85176] Updated weights for policy 0, policy_version 35992 (0.0008) +[2023-10-11 16:26:57,333][85175] Updated weights for policy 1, policy_version 36510 (0.0007) +[2023-10-11 16:27:01,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 74252288. Throughput: 0: 1660.2, 1: 1703.3. Samples: 18579274. 
Policy #0 lag: (min: 9.0, avg: 28.3, max: 32.0) +[2023-10-11 16:27:01,063][84230] Avg episode reward: [(0, '7.890'), (1, '7.540')] +[2023-10-11 16:27:01,293][85175] Updated weights for policy 1, policy_version 36520 (0.0009) +[2023-10-11 16:27:01,411][85176] Updated weights for policy 0, policy_version 36002 (0.0010) +[2023-10-11 16:27:01,668][85175] Updated weights for policy 1, policy_version 36530 (0.0008) +[2023-10-11 16:27:01,780][85176] Updated weights for policy 0, policy_version 36012 (0.0007) +[2023-10-11 16:27:02,032][85175] Updated weights for policy 1, policy_version 36540 (0.0007) +[2023-10-11 16:27:02,145][85176] Updated weights for policy 0, policy_version 36022 (0.0008) +[2023-10-11 16:27:02,527][85176] Updated weights for policy 0, policy_version 36032 (0.0010) +[2023-10-11 16:27:06,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 74317824. Throughput: 0: 1657.6, 1: 1697.6. Samples: 18588326. Policy #0 lag: (min: 9.0, avg: 28.3, max: 32.0) +[2023-10-11 16:27:06,063][84230] Avg episode reward: [(0, '8.350'), (1, '7.700')] +[2023-10-11 16:27:06,088][85175] Updated weights for policy 1, policy_version 36550 (0.0009) +[2023-10-11 16:27:06,470][85175] Updated weights for policy 1, policy_version 36560 (0.0007) +[2023-10-11 16:27:06,573][85176] Updated weights for policy 0, policy_version 36042 (0.0009) +[2023-10-11 16:27:06,840][85175] Updated weights for policy 1, policy_version 36570 (0.0008) +[2023-10-11 16:27:06,940][85176] Updated weights for policy 0, policy_version 36052 (0.0009) +[2023-10-11 16:27:07,322][85176] Updated weights for policy 0, policy_version 36062 (0.0008) +[2023-10-11 16:27:10,897][85175] Updated weights for policy 1, policy_version 36580 (0.0007) +[2023-10-11 16:27:11,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 74383360. Throughput: 0: 1654.2, 1: 1701.5. Samples: 18608856. 
Policy #0 lag: (min: 9.0, avg: 28.3, max: 32.0) +[2023-10-11 16:27:11,063][84230] Avg episode reward: [(0, '7.890'), (1, '7.640')] +[2023-10-11 16:27:11,257][85175] Updated weights for policy 1, policy_version 36590 (0.0007) +[2023-10-11 16:27:11,259][85176] Updated weights for policy 0, policy_version 36072 (0.0007) +[2023-10-11 16:27:11,619][85175] Updated weights for policy 1, policy_version 36600 (0.0007) +[2023-10-11 16:27:11,637][85176] Updated weights for policy 0, policy_version 36082 (0.0007) +[2023-10-11 16:27:12,011][85176] Updated weights for policy 0, policy_version 36092 (0.0009) +[2023-10-11 16:27:15,722][85175] Updated weights for policy 1, policy_version 36610 (0.0008) +[2023-10-11 16:27:16,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 74448896. Throughput: 0: 1657.1, 1: 1695.7. Samples: 18629654. Policy #0 lag: (min: 9.0, avg: 28.3, max: 32.0) +[2023-10-11 16:27:16,063][84230] Avg episode reward: [(0, '7.440'), (1, '7.010')] +[2023-10-11 16:27:16,090][85175] Updated weights for policy 1, policy_version 36620 (0.0008) +[2023-10-11 16:27:16,110][85176] Updated weights for policy 0, policy_version 36102 (0.0007) +[2023-10-11 16:27:16,453][85175] Updated weights for policy 1, policy_version 36630 (0.0007) +[2023-10-11 16:27:16,480][85176] Updated weights for policy 0, policy_version 36112 (0.0008) +[2023-10-11 16:27:16,811][85175] Updated weights for policy 1, policy_version 36640 (0.0008) +[2023-10-11 16:27:16,856][85176] Updated weights for policy 0, policy_version 36122 (0.0007) +[2023-10-11 16:27:20,978][85175] Updated weights for policy 1, policy_version 36650 (0.0008) +[2023-10-11 16:27:21,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 74514432. Throughput: 0: 1656.6, 1: 1692.4. Samples: 18638582. 
Policy #0 lag: (min: 9.0, avg: 28.3, max: 32.0) +[2023-10-11 16:27:21,063][84230] Avg episode reward: [(0, '7.150'), (1, '7.310')] +[2023-10-11 16:27:21,134][85176] Updated weights for policy 0, policy_version 36132 (0.0009) +[2023-10-11 16:27:21,344][85175] Updated weights for policy 1, policy_version 36660 (0.0009) +[2023-10-11 16:27:21,507][85176] Updated weights for policy 0, policy_version 36142 (0.0007) +[2023-10-11 16:27:21,712][85175] Updated weights for policy 1, policy_version 36670 (0.0008) +[2023-10-11 16:27:21,889][85176] Updated weights for policy 0, policy_version 36152 (0.0009) +[2023-10-11 16:27:25,778][85175] Updated weights for policy 1, policy_version 36680 (0.0009) +[2023-10-11 16:27:26,050][85176] Updated weights for policy 0, policy_version 36162 (0.0007) +[2023-10-11 16:27:26,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 74579968. Throughput: 0: 1655.7, 1: 1686.9. Samples: 18658852. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:27:26,063][84230] Avg episode reward: [(0, '7.740'), (1, '7.800')] +[2023-10-11 16:27:26,162][85175] Updated weights for policy 1, policy_version 36690 (0.0010) +[2023-10-11 16:27:26,428][85176] Updated weights for policy 0, policy_version 36172 (0.0007) +[2023-10-11 16:27:26,533][85175] Updated weights for policy 1, policy_version 36700 (0.0009) +[2023-10-11 16:27:26,802][85176] Updated weights for policy 0, policy_version 36182 (0.0009) +[2023-10-11 16:27:27,168][85176] Updated weights for policy 0, policy_version 36192 (0.0010) +[2023-10-11 16:27:30,729][85175] Updated weights for policy 1, policy_version 36710 (0.0007) +[2023-10-11 16:27:31,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 74645504. Throughput: 0: 1663.7, 1: 1681.7. Samples: 18679576. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:27:31,063][84230] Avg episode reward: [(0, '8.000'), (1, '7.780')] +[2023-10-11 16:27:31,087][85175] Updated weights for policy 1, policy_version 36720 (0.0009) +[2023-10-11 16:27:31,149][85176] Updated weights for policy 0, policy_version 36202 (0.0008) +[2023-10-11 16:27:31,448][85175] Updated weights for policy 1, policy_version 36730 (0.0008) +[2023-10-11 16:27:31,520][85176] Updated weights for policy 0, policy_version 36212 (0.0009) +[2023-10-11 16:27:31,895][85176] Updated weights for policy 0, policy_version 36222 (0.0008) +[2023-10-11 16:27:35,597][85175] Updated weights for policy 1, policy_version 36740 (0.0007) +[2023-10-11 16:27:35,963][85175] Updated weights for policy 1, policy_version 36750 (0.0008) +[2023-10-11 16:27:36,006][85176] Updated weights for policy 0, policy_version 36232 (0.0008) +[2023-10-11 16:27:36,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 74711040. Throughput: 0: 1665.0, 1: 1680.3. Samples: 18688598. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:27:36,063][84230] Avg episode reward: [(0, '7.860'), (1, '7.640')] +[2023-10-11 16:27:36,330][85175] Updated weights for policy 1, policy_version 36760 (0.0008) +[2023-10-11 16:27:36,370][85176] Updated weights for policy 0, policy_version 36242 (0.0007) +[2023-10-11 16:27:36,742][85176] Updated weights for policy 0, policy_version 36252 (0.0009) +[2023-10-11 16:27:40,259][85175] Updated weights for policy 1, policy_version 36770 (0.0007) +[2023-10-11 16:27:40,619][85175] Updated weights for policy 1, policy_version 36780 (0.0008) +[2023-10-11 16:27:40,963][85176] Updated weights for policy 0, policy_version 36262 (0.0008) +[2023-10-11 16:27:40,987][85175] Updated weights for policy 1, policy_version 36790 (0.0007) +[2023-10-11 16:27:41,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 74776576. 
Throughput: 0: 1668.6, 1: 1685.4. Samples: 18709516. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:27:41,064][84230] Avg episode reward: [(0, '7.450'), (1, '7.190')] +[2023-10-11 16:27:41,354][85176] Updated weights for policy 0, policy_version 36272 (0.0008) +[2023-10-11 16:27:41,356][85175] Updated weights for policy 1, policy_version 36800 (0.0008) +[2023-10-11 16:27:41,734][85176] Updated weights for policy 0, policy_version 36282 (0.0008) +[2023-10-11 16:27:45,445][85175] Updated weights for policy 1, policy_version 36810 (0.0011) +[2023-10-11 16:27:45,695][85176] Updated weights for policy 0, policy_version 36292 (0.0007) +[2023-10-11 16:27:45,811][85175] Updated weights for policy 1, policy_version 36820 (0.0008) +[2023-10-11 16:27:46,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 74842112. Throughput: 0: 1669.9, 1: 1672.3. Samples: 18729670. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:27:46,063][84230] Avg episode reward: [(0, '7.600'), (1, '7.580')] +[2023-10-11 16:27:46,077][85176] Updated weights for policy 0, policy_version 36302 (0.0008) +[2023-10-11 16:27:46,178][85175] Updated weights for policy 1, policy_version 36830 (0.0008) +[2023-10-11 16:27:46,440][85176] Updated weights for policy 0, policy_version 36312 (0.0008) +[2023-10-11 16:27:50,172][85175] Updated weights for policy 1, policy_version 36840 (0.0009) +[2023-10-11 16:27:50,531][85176] Updated weights for policy 0, policy_version 36322 (0.0009) +[2023-10-11 16:27:50,532][85175] Updated weights for policy 1, policy_version 36850 (0.0008) +[2023-10-11 16:27:50,893][85175] Updated weights for policy 1, policy_version 36860 (0.0007) +[2023-10-11 16:27:50,906][85176] Updated weights for policy 0, policy_version 36332 (0.0009) +[2023-10-11 16:27:51,063][84230] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 74940416. Throughput: 0: 1671.4, 1: 1682.4. Samples: 18739248. 
Policy #0 lag: (min: 16.0, avg: 38.6, max: 48.0) +[2023-10-11 16:27:51,063][84230] Avg episode reward: [(0, '8.050'), (1, '8.120')] +[2023-10-11 16:27:51,277][85176] Updated weights for policy 0, policy_version 36342 (0.0009) +[2023-10-11 16:27:51,648][85176] Updated weights for policy 0, policy_version 36352 (0.0009) +[2023-10-11 16:27:55,205][85175] Updated weights for policy 1, policy_version 36870 (0.0007) +[2023-10-11 16:27:55,604][85175] Updated weights for policy 1, policy_version 36880 (0.0008) +[2023-10-11 16:27:55,820][85176] Updated weights for policy 0, policy_version 36362 (0.0010) +[2023-10-11 16:27:55,976][85175] Updated weights for policy 1, policy_version 36890 (0.0008) +[2023-10-11 16:27:56,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 74973184. Throughput: 0: 1667.6, 1: 1681.9. Samples: 18759580. Policy #0 lag: (min: 16.0, avg: 38.6, max: 48.0) +[2023-10-11 16:27:56,064][84230] Avg episode reward: [(0, '8.350'), (1, '7.670')] +[2023-10-11 16:27:56,189][85176] Updated weights for policy 0, policy_version 36372 (0.0008) +[2023-10-11 16:27:56,565][85176] Updated weights for policy 0, policy_version 36382 (0.0008) +[2023-10-11 16:27:59,928][85175] Updated weights for policy 1, policy_version 36900 (0.0008) +[2023-10-11 16:28:00,293][85175] Updated weights for policy 1, policy_version 36910 (0.0008) +[2023-10-11 16:28:00,483][85176] Updated weights for policy 0, policy_version 36392 (0.0008) +[2023-10-11 16:28:00,669][85175] Updated weights for policy 1, policy_version 36920 (0.0008) +[2023-10-11 16:28:00,846][85176] Updated weights for policy 0, policy_version 36402 (0.0009) +[2023-10-11 16:28:01,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 75071488. Throughput: 0: 1658.3, 1: 1663.1. Samples: 18779118. 
Policy #0 lag: (min: 16.0, avg: 38.6, max: 48.0) +[2023-10-11 16:28:01,063][84230] Avg episode reward: [(0, '8.180'), (1, '7.700')] +[2023-10-11 16:28:01,216][85176] Updated weights for policy 0, policy_version 36412 (0.0008) +[2023-10-11 16:28:04,693][85175] Updated weights for policy 1, policy_version 36930 (0.0007) +[2023-10-11 16:28:05,055][85175] Updated weights for policy 1, policy_version 36940 (0.0007) +[2023-10-11 16:28:05,354][85176] Updated weights for policy 0, policy_version 36422 (0.0009) +[2023-10-11 16:28:05,427][85175] Updated weights for policy 1, policy_version 36950 (0.0009) +[2023-10-11 16:28:05,718][85176] Updated weights for policy 0, policy_version 36432 (0.0009) +[2023-10-11 16:28:05,788][85175] Updated weights for policy 1, policy_version 36960 (0.0008) +[2023-10-11 16:28:06,062][84230] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 75137024. Throughput: 0: 1668.0, 1: 1680.4. Samples: 18789256. Policy #0 lag: (min: 16.0, avg: 38.6, max: 48.0) +[2023-10-11 16:28:06,063][84230] Avg episode reward: [(0, '7.430'), (1, '7.060')] +[2023-10-11 16:28:06,096][85176] Updated weights for policy 0, policy_version 36442 (0.0009) +[2023-10-11 16:28:09,850][85175] Updated weights for policy 1, policy_version 36970 (0.0010) +[2023-10-11 16:28:10,215][85175] Updated weights for policy 1, policy_version 36980 (0.0010) +[2023-10-11 16:28:10,318][85176] Updated weights for policy 0, policy_version 36452 (0.0007) +[2023-10-11 16:28:10,580][85175] Updated weights for policy 1, policy_version 36990 (0.0008) +[2023-10-11 16:28:10,689][85176] Updated weights for policy 0, policy_version 36462 (0.0009) +[2023-10-11 16:28:11,057][85176] Updated weights for policy 0, policy_version 36472 (0.0009) +[2023-10-11 16:28:11,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 75202560. Throughput: 0: 1667.9, 1: 1684.7. Samples: 18809716. 
Policy #0 lag: (min: 16.0, avg: 38.6, max: 48.0) +[2023-10-11 16:28:11,063][84230] Avg episode reward: [(0, '7.000'), (1, '7.450')] +[2023-10-11 16:28:14,739][85175] Updated weights for policy 1, policy_version 37000 (0.0008) +[2023-10-11 16:28:15,108][85175] Updated weights for policy 1, policy_version 37010 (0.0009) +[2023-10-11 16:28:15,198][85176] Updated weights for policy 0, policy_version 36482 (0.0008) +[2023-10-11 16:28:15,476][85175] Updated weights for policy 1, policy_version 37020 (0.0007) +[2023-10-11 16:28:15,563][85176] Updated weights for policy 0, policy_version 36492 (0.0009) +[2023-10-11 16:28:15,934][85176] Updated weights for policy 0, policy_version 36502 (0.0008) +[2023-10-11 16:28:16,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 75268096. Throughput: 0: 1650.5, 1: 1665.5. Samples: 18828796. Policy #0 lag: (min: 8.0, avg: 20.2, max: 40.0) +[2023-10-11 16:28:16,063][84230] Avg episode reward: [(0, '7.450'), (1, '7.100')] +[2023-10-11 16:28:16,070][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000037024_37912576.pth... +[2023-10-11 16:28:16,100][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000035456_36306944.pth +[2023-10-11 16:28:16,313][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000036512_37388288.pth... 
+[2023-10-11 16:28:16,315][85176] Updated weights for policy 0, policy_version 36512 (0.0007) +[2023-10-11 16:28:16,342][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000034944_35782656.pth +[2023-10-11 16:28:19,313][85175] Updated weights for policy 1, policy_version 37030 (0.0009) +[2023-10-11 16:28:19,690][85175] Updated weights for policy 1, policy_version 37040 (0.0007) +[2023-10-11 16:28:20,062][85175] Updated weights for policy 1, policy_version 37050 (0.0009) +[2023-10-11 16:28:20,539][85176] Updated weights for policy 0, policy_version 36522 (0.0007) +[2023-10-11 16:28:20,911][85176] Updated weights for policy 0, policy_version 36532 (0.0008) +[2023-10-11 16:28:21,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 75333632. Throughput: 0: 1656.4, 1: 1694.8. Samples: 18839400. Policy #0 lag: (min: 8.0, avg: 20.2, max: 40.0) +[2023-10-11 16:28:21,063][84230] Avg episode reward: [(0, '8.320'), (1, '7.460')] +[2023-10-11 16:28:21,297][85176] Updated weights for policy 0, policy_version 36542 (0.0010) +[2023-10-11 16:28:24,167][85175] Updated weights for policy 1, policy_version 37060 (0.0009) +[2023-10-11 16:28:24,537][85175] Updated weights for policy 1, policy_version 37070 (0.0010) +[2023-10-11 16:28:24,900][85175] Updated weights for policy 1, policy_version 37080 (0.0008) +[2023-10-11 16:28:25,471][85176] Updated weights for policy 0, policy_version 36552 (0.0011) +[2023-10-11 16:28:25,832][85176] Updated weights for policy 0, policy_version 36562 (0.0008) +[2023-10-11 16:28:26,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 75399168. Throughput: 0: 1651.9, 1: 1678.0. Samples: 18859364. 
Policy #0 lag: (min: 8.0, avg: 20.2, max: 40.0) +[2023-10-11 16:28:26,064][84230] Avg episode reward: [(0, '8.170'), (1, '8.000')] +[2023-10-11 16:28:26,206][85176] Updated weights for policy 0, policy_version 36572 (0.0008) +[2023-10-11 16:28:28,883][85175] Updated weights for policy 1, policy_version 37090 (0.0008) +[2023-10-11 16:28:29,253][85175] Updated weights for policy 1, policy_version 37100 (0.0009) +[2023-10-11 16:28:29,625][85175] Updated weights for policy 1, policy_version 37110 (0.0009) +[2023-10-11 16:28:29,995][85175] Updated weights for policy 1, policy_version 37120 (0.0010) +[2023-10-11 16:28:30,374][85176] Updated weights for policy 0, policy_version 36582 (0.0008) +[2023-10-11 16:28:30,751][85176] Updated weights for policy 0, policy_version 36592 (0.0008) +[2023-10-11 16:28:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13329.3). Total num frames: 75464704. Throughput: 0: 1645.5, 1: 1669.9. Samples: 18878860. Policy #0 lag: (min: 8.0, avg: 20.2, max: 40.0) +[2023-10-11 16:28:31,063][84230] Avg episode reward: [(0, '7.750'), (1, '7.960')] +[2023-10-11 16:28:31,131][85176] Updated weights for policy 0, policy_version 36602 (0.0010) +[2023-10-11 16:28:33,914][85175] Updated weights for policy 1, policy_version 37130 (0.0009) +[2023-10-11 16:28:34,288][85175] Updated weights for policy 1, policy_version 37140 (0.0008) +[2023-10-11 16:28:34,656][85175] Updated weights for policy 1, policy_version 37150 (0.0007) +[2023-10-11 16:28:35,233][85176] Updated weights for policy 0, policy_version 36612 (0.0007) +[2023-10-11 16:28:35,604][85176] Updated weights for policy 0, policy_version 36622 (0.0010) +[2023-10-11 16:28:35,967][85176] Updated weights for policy 0, policy_version 36632 (0.0010) +[2023-10-11 16:28:36,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 75530240. Throughput: 0: 1653.7, 1: 1693.1. Samples: 18889854. 
Policy #0 lag: (min: 8.0, avg: 20.2, max: 40.0) +[2023-10-11 16:28:36,063][84230] Avg episode reward: [(0, '7.600'), (1, '7.680')] +[2023-10-11 16:28:38,546][85175] Updated weights for policy 1, policy_version 37160 (0.0010) +[2023-10-11 16:28:38,912][85175] Updated weights for policy 1, policy_version 37170 (0.0009) +[2023-10-11 16:28:39,287][85175] Updated weights for policy 1, policy_version 37180 (0.0008) +[2023-10-11 16:28:39,894][85176] Updated weights for policy 0, policy_version 36642 (0.0008) +[2023-10-11 16:28:40,268][85176] Updated weights for policy 0, policy_version 36652 (0.0010) +[2023-10-11 16:28:40,640][85176] Updated weights for policy 0, policy_version 36662 (0.0009) +[2023-10-11 16:28:41,012][85176] Updated weights for policy 0, policy_version 36672 (0.0010) +[2023-10-11 16:28:41,063][84230] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 75628544. Throughput: 0: 1667.2, 1: 1671.5. Samples: 18909820. Policy #0 lag: (min: 2.0, avg: 13.3, max: 34.0) +[2023-10-11 16:28:41,064][84230] Avg episode reward: [(0, '7.150'), (1, '7.320')] +[2023-10-11 16:28:43,347][85175] Updated weights for policy 1, policy_version 37190 (0.0009) +[2023-10-11 16:28:43,734][85175] Updated weights for policy 1, policy_version 37200 (0.0008) +[2023-10-11 16:28:44,100][85175] Updated weights for policy 1, policy_version 37210 (0.0009) +[2023-10-11 16:28:44,987][85176] Updated weights for policy 0, policy_version 36682 (0.0010) +[2023-10-11 16:28:45,365][85176] Updated weights for policy 0, policy_version 36692 (0.0009) +[2023-10-11 16:28:45,741][85176] Updated weights for policy 0, policy_version 36702 (0.0009) +[2023-10-11 16:28:46,063][84230] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 75694080. Throughput: 0: 1653.4, 1: 1690.4. Samples: 18929588. 
Policy #0 lag: (min: 2.0, avg: 13.3, max: 34.0) +[2023-10-11 16:28:46,064][84230] Avg episode reward: [(0, '7.430'), (1, '7.150')] +[2023-10-11 16:28:48,159][85175] Updated weights for policy 1, policy_version 37220 (0.0008) +[2023-10-11 16:28:48,535][85175] Updated weights for policy 1, policy_version 37230 (0.0008) +[2023-10-11 16:28:48,897][85175] Updated weights for policy 1, policy_version 37240 (0.0008) +[2023-10-11 16:28:49,697][85176] Updated weights for policy 0, policy_version 36712 (0.0008) +[2023-10-11 16:28:50,067][85176] Updated weights for policy 0, policy_version 36722 (0.0008) +[2023-10-11 16:28:50,435][85176] Updated weights for policy 0, policy_version 36732 (0.0007) +[2023-10-11 16:28:51,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 75759616. Throughput: 0: 1669.1, 1: 1687.3. Samples: 18940296. Policy #0 lag: (min: 2.0, avg: 13.3, max: 34.0) +[2023-10-11 16:28:51,063][84230] Avg episode reward: [(0, '7.430'), (1, '7.510')] +[2023-10-11 16:28:52,968][85175] Updated weights for policy 1, policy_version 37250 (0.0008) +[2023-10-11 16:28:53,344][85175] Updated weights for policy 1, policy_version 37260 (0.0007) +[2023-10-11 16:28:53,715][85175] Updated weights for policy 1, policy_version 37270 (0.0007) +[2023-10-11 16:28:54,076][85175] Updated weights for policy 1, policy_version 37280 (0.0009) +[2023-10-11 16:28:54,655][85176] Updated weights for policy 0, policy_version 36742 (0.0008) +[2023-10-11 16:28:55,033][85176] Updated weights for policy 0, policy_version 36752 (0.0008) +[2023-10-11 16:28:55,405][85176] Updated weights for policy 0, policy_version 36762 (0.0007) +[2023-10-11 16:28:56,063][84230] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 75825152. Throughput: 0: 1672.6, 1: 1672.0. Samples: 18960224. 
Policy #0 lag: (min: 2.0, avg: 13.3, max: 34.0) +[2023-10-11 16:28:56,064][84230] Avg episode reward: [(0, '8.340'), (1, '7.420')] +[2023-10-11 16:28:58,145][85175] Updated weights for policy 1, policy_version 37290 (0.0007) +[2023-10-11 16:28:58,514][85175] Updated weights for policy 1, policy_version 37300 (0.0007) +[2023-10-11 16:28:58,886][85175] Updated weights for policy 1, policy_version 37310 (0.0009) +[2023-10-11 16:28:59,382][85176] Updated weights for policy 0, policy_version 36772 (0.0010) +[2023-10-11 16:28:59,761][85176] Updated weights for policy 0, policy_version 36782 (0.0007) +[2023-10-11 16:29:00,124][85176] Updated weights for policy 0, policy_version 36792 (0.0007) +[2023-10-11 16:29:01,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 75890688. Throughput: 0: 1661.9, 1: 1702.2. Samples: 18980180. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-11 16:29:01,064][84230] Avg episode reward: [(0, '8.220'), (1, '7.720')] +[2023-10-11 16:29:02,889][85175] Updated weights for policy 1, policy_version 37320 (0.0009) +[2023-10-11 16:29:03,245][85175] Updated weights for policy 1, policy_version 37330 (0.0009) +[2023-10-11 16:29:03,619][85175] Updated weights for policy 1, policy_version 37340 (0.0007) +[2023-10-11 16:29:04,256][85176] Updated weights for policy 0, policy_version 36802 (0.0007) +[2023-10-11 16:29:04,624][85176] Updated weights for policy 0, policy_version 36812 (0.0009) +[2023-10-11 16:29:05,011][85176] Updated weights for policy 0, policy_version 36822 (0.0008) +[2023-10-11 16:29:05,377][85176] Updated weights for policy 0, policy_version 36832 (0.0009) +[2023-10-11 16:29:06,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 75956224. Throughput: 0: 1683.2, 1: 1677.0. Samples: 18990610. 
Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-11 16:29:06,063][84230] Avg episode reward: [(0, '7.900'), (1, '7.840')] +[2023-10-11 16:29:07,623][85175] Updated weights for policy 1, policy_version 37350 (0.0007) +[2023-10-11 16:29:07,987][85175] Updated weights for policy 1, policy_version 37360 (0.0007) +[2023-10-11 16:29:08,371][85175] Updated weights for policy 1, policy_version 37370 (0.0009) +[2023-10-11 16:29:09,338][85176] Updated weights for policy 0, policy_version 36842 (0.0008) +[2023-10-11 16:29:09,705][85176] Updated weights for policy 0, policy_version 36852 (0.0008) +[2023-10-11 16:29:10,083][85176] Updated weights for policy 0, policy_version 36862 (0.0010) +[2023-10-11 16:29:11,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 76021760. Throughput: 0: 1671.2, 1: 1687.1. Samples: 19010486. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-11 16:29:11,064][84230] Avg episode reward: [(0, '7.570'), (1, '7.740')] +[2023-10-11 16:29:12,460][85175] Updated weights for policy 1, policy_version 37380 (0.0011) +[2023-10-11 16:29:12,826][85175] Updated weights for policy 1, policy_version 37390 (0.0008) +[2023-10-11 16:29:13,187][85175] Updated weights for policy 1, policy_version 37400 (0.0008) +[2023-10-11 16:29:14,268][85176] Updated weights for policy 0, policy_version 36872 (0.0011) +[2023-10-11 16:29:14,646][85176] Updated weights for policy 0, policy_version 36882 (0.0007) +[2023-10-11 16:29:15,023][85176] Updated weights for policy 0, policy_version 36892 (0.0007) +[2023-10-11 16:29:16,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 76087296. Throughput: 0: 1668.6, 1: 1704.2. Samples: 19030636. 
Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-11 16:29:16,063][84230] Avg episode reward: [(0, '7.430'), (1, '7.030')] +[2023-10-11 16:29:17,312][85175] Updated weights for policy 1, policy_version 37410 (0.0009) +[2023-10-11 16:29:17,683][85175] Updated weights for policy 1, policy_version 37420 (0.0008) +[2023-10-11 16:29:18,057][85175] Updated weights for policy 1, policy_version 37430 (0.0010) +[2023-10-11 16:29:18,424][85175] Updated weights for policy 1, policy_version 37440 (0.0009) +[2023-10-11 16:29:19,222][85176] Updated weights for policy 0, policy_version 36902 (0.0007) +[2023-10-11 16:29:19,599][85176] Updated weights for policy 0, policy_version 36912 (0.0007) +[2023-10-11 16:29:19,978][85176] Updated weights for policy 0, policy_version 36922 (0.0009) +[2023-10-11 16:29:21,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 76152832. Throughput: 0: 1688.3, 1: 1667.1. Samples: 19040846. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-11 16:29:21,064][84230] Avg episode reward: [(0, '7.240'), (1, '7.350')] +[2023-10-11 16:29:22,525][85175] Updated weights for policy 1, policy_version 37450 (0.0009) +[2023-10-11 16:29:22,904][85175] Updated weights for policy 1, policy_version 37460 (0.0009) +[2023-10-11 16:29:23,281][85175] Updated weights for policy 1, policy_version 37470 (0.0010) +[2023-10-11 16:29:24,118][85176] Updated weights for policy 0, policy_version 36932 (0.0011) +[2023-10-11 16:29:24,493][85176] Updated weights for policy 0, policy_version 36942 (0.0009) +[2023-10-11 16:29:24,859][85176] Updated weights for policy 0, policy_version 36952 (0.0008) +[2023-10-11 16:29:26,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 76218368. Throughput: 0: 1657.6, 1: 1692.0. Samples: 19060552. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:29:26,063][84230] Avg episode reward: [(0, '8.010'), (1, '7.800')] +[2023-10-11 16:29:27,414][85175] Updated weights for policy 1, policy_version 37480 (0.0008) +[2023-10-11 16:29:27,780][85175] Updated weights for policy 1, policy_version 37490 (0.0009) +[2023-10-11 16:29:28,141][85175] Updated weights for policy 1, policy_version 37500 (0.0009) +[2023-10-11 16:29:28,986][85176] Updated weights for policy 0, policy_version 36962 (0.0007) +[2023-10-11 16:29:29,355][85176] Updated weights for policy 0, policy_version 36972 (0.0011) +[2023-10-11 16:29:29,734][85176] Updated weights for policy 0, policy_version 36982 (0.0010) +[2023-10-11 16:29:30,103][85176] Updated weights for policy 0, policy_version 36992 (0.0008) +[2023-10-11 16:29:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 76283904. Throughput: 0: 1663.5, 1: 1695.3. Samples: 19080732. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:29:31,064][84230] Avg episode reward: [(0, '8.010'), (1, '8.150')] +[2023-10-11 16:29:32,054][85175] Updated weights for policy 1, policy_version 37510 (0.0009) +[2023-10-11 16:29:32,441][85175] Updated weights for policy 1, policy_version 37520 (0.0010) +[2023-10-11 16:29:32,808][85175] Updated weights for policy 1, policy_version 37530 (0.0009) +[2023-10-11 16:29:34,069][85176] Updated weights for policy 0, policy_version 37002 (0.0009) +[2023-10-11 16:29:34,449][85176] Updated weights for policy 0, policy_version 37012 (0.0009) +[2023-10-11 16:29:34,829][85176] Updated weights for policy 0, policy_version 37022 (0.0009) +[2023-10-11 16:29:36,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 76349440. Throughput: 0: 1668.9, 1: 1683.8. Samples: 19091166. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:29:36,063][84230] Avg episode reward: [(0, '8.050'), (1, '7.640')] +[2023-10-11 16:29:36,849][85175] Updated weights for policy 1, policy_version 37540 (0.0008) +[2023-10-11 16:29:37,217][85175] Updated weights for policy 1, policy_version 37550 (0.0011) +[2023-10-11 16:29:37,585][85175] Updated weights for policy 1, policy_version 37560 (0.0011) +[2023-10-11 16:29:38,909][85176] Updated weights for policy 0, policy_version 37032 (0.0009) +[2023-10-11 16:29:39,278][85176] Updated weights for policy 0, policy_version 37042 (0.0010) +[2023-10-11 16:29:39,652][85176] Updated weights for policy 0, policy_version 37052 (0.0010) +[2023-10-11 16:29:41,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 76414976. Throughput: 0: 1647.5, 1: 1697.6. Samples: 19110754. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:29:41,063][84230] Avg episode reward: [(0, '7.450'), (1, '7.420')] +[2023-10-11 16:29:41,739][85175] Updated weights for policy 1, policy_version 37570 (0.0010) +[2023-10-11 16:29:42,103][85175] Updated weights for policy 1, policy_version 37580 (0.0007) +[2023-10-11 16:29:42,470][85175] Updated weights for policy 1, policy_version 37590 (0.0008) +[2023-10-11 16:29:42,834][85175] Updated weights for policy 1, policy_version 37600 (0.0010) +[2023-10-11 16:29:43,667][85176] Updated weights for policy 0, policy_version 37062 (0.0010) +[2023-10-11 16:29:44,038][85176] Updated weights for policy 0, policy_version 37072 (0.0008) +[2023-10-11 16:29:44,409][85176] Updated weights for policy 0, policy_version 37082 (0.0010) +[2023-10-11 16:29:46,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 76480512. Throughput: 0: 1669.6, 1: 1689.1. Samples: 19131322. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:29:46,063][84230] Avg episode reward: [(0, '7.300'), (1, '7.290')] +[2023-10-11 16:29:46,839][85175] Updated weights for policy 1, policy_version 37610 (0.0007) +[2023-10-11 16:29:47,211][85175] Updated weights for policy 1, policy_version 37620 (0.0007) +[2023-10-11 16:29:47,581][85175] Updated weights for policy 1, policy_version 37630 (0.0008) +[2023-10-11 16:29:48,577][85176] Updated weights for policy 0, policy_version 37092 (0.0011) +[2023-10-11 16:29:48,944][85176] Updated weights for policy 0, policy_version 37102 (0.0010) +[2023-10-11 16:29:49,324][85176] Updated weights for policy 0, policy_version 37112 (0.0007) +[2023-10-11 16:29:51,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 76546048. Throughput: 0: 1664.2, 1: 1684.9. Samples: 19141320. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:29:51,063][84230] Avg episode reward: [(0, '7.750'), (1, '7.510')] +[2023-10-11 16:29:51,591][85175] Updated weights for policy 1, policy_version 37640 (0.0010) +[2023-10-11 16:29:51,950][85175] Updated weights for policy 1, policy_version 37650 (0.0007) +[2023-10-11 16:29:52,317][85175] Updated weights for policy 1, policy_version 37660 (0.0009) +[2023-10-11 16:29:53,292][85176] Updated weights for policy 0, policy_version 37122 (0.0009) +[2023-10-11 16:29:53,671][85176] Updated weights for policy 0, policy_version 37132 (0.0008) +[2023-10-11 16:29:54,045][85176] Updated weights for policy 0, policy_version 37142 (0.0008) +[2023-10-11 16:29:54,422][85176] Updated weights for policy 0, policy_version 37152 (0.0009) +[2023-10-11 16:29:56,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 76611584. Throughput: 0: 1656.7, 1: 1687.8. Samples: 19160988. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:29:56,063][84230] Avg episode reward: [(0, '8.150'), (1, '7.420')] +[2023-10-11 16:29:56,313][85175] Updated weights for policy 1, policy_version 37670 (0.0008) +[2023-10-11 16:29:56,673][85175] Updated weights for policy 1, policy_version 37680 (0.0007) +[2023-10-11 16:29:57,045][85175] Updated weights for policy 1, policy_version 37690 (0.0008) +[2023-10-11 16:29:58,516][85176] Updated weights for policy 0, policy_version 37162 (0.0008) +[2023-10-11 16:29:58,880][85176] Updated weights for policy 0, policy_version 37172 (0.0009) +[2023-10-11 16:29:59,256][85176] Updated weights for policy 0, policy_version 37182 (0.0009) +[2023-10-11 16:30:01,062][85175] Updated weights for policy 1, policy_version 37700 (0.0008) +[2023-10-11 16:30:01,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 76677120. Throughput: 0: 1668.9, 1: 1689.4. Samples: 19181760. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:30:01,063][84230] Avg episode reward: [(0, '8.280'), (1, '7.800')] +[2023-10-11 16:30:01,422][85175] Updated weights for policy 1, policy_version 37710 (0.0009) +[2023-10-11 16:30:01,796][85175] Updated weights for policy 1, policy_version 37720 (0.0007) +[2023-10-11 16:30:03,387][85176] Updated weights for policy 0, policy_version 37192 (0.0009) +[2023-10-11 16:30:03,770][85176] Updated weights for policy 0, policy_version 37202 (0.0008) +[2023-10-11 16:30:04,142][85176] Updated weights for policy 0, policy_version 37212 (0.0009) +[2023-10-11 16:30:05,868][85175] Updated weights for policy 1, policy_version 37730 (0.0009) +[2023-10-11 16:30:06,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 76742656. Throughput: 0: 1657.7, 1: 1691.0. Samples: 19191538. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:30:06,063][84230] Avg episode reward: [(0, '7.820'), (1, '7.760')] +[2023-10-11 16:30:06,234][85175] Updated weights for policy 1, policy_version 37740 (0.0008) +[2023-10-11 16:30:06,599][85175] Updated weights for policy 1, policy_version 37750 (0.0011) +[2023-10-11 16:30:06,975][85175] Updated weights for policy 1, policy_version 37760 (0.0008) +[2023-10-11 16:30:08,165][85176] Updated weights for policy 0, policy_version 37222 (0.0009) +[2023-10-11 16:30:08,538][85176] Updated weights for policy 0, policy_version 37232 (0.0007) +[2023-10-11 16:30:08,910][85176] Updated weights for policy 0, policy_version 37242 (0.0009) +[2023-10-11 16:30:11,025][85175] Updated weights for policy 1, policy_version 37770 (0.0007) +[2023-10-11 16:30:11,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 76808192. Throughput: 0: 1661.0, 1: 1695.1. Samples: 19211576. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:30:11,063][84230] Avg episode reward: [(0, '7.620'), (1, '7.630')] +[2023-10-11 16:30:11,401][85175] Updated weights for policy 1, policy_version 37780 (0.0007) +[2023-10-11 16:30:11,770][85175] Updated weights for policy 1, policy_version 37790 (0.0007) +[2023-10-11 16:30:12,992][85176] Updated weights for policy 0, policy_version 37252 (0.0008) +[2023-10-11 16:30:13,364][85176] Updated weights for policy 0, policy_version 37262 (0.0007) +[2023-10-11 16:30:13,754][85176] Updated weights for policy 0, policy_version 37272 (0.0008) +[2023-10-11 16:30:15,601][85175] Updated weights for policy 1, policy_version 37800 (0.0009) +[2023-10-11 16:30:15,968][85175] Updated weights for policy 1, policy_version 37810 (0.0010) +[2023-10-11 16:30:16,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 76873728. Throughput: 0: 1674.8, 1: 1695.7. Samples: 19232402. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:30:16,063][84230] Avg episode reward: [(0, '7.270'), (1, '7.150')] +[2023-10-11 16:30:16,072][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000037280_38174720.pth... +[2023-10-11 16:30:16,108][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000035712_36569088.pth +[2023-10-11 16:30:16,332][85175] Updated weights for policy 1, policy_version 37820 (0.0008) +[2023-10-11 16:30:16,473][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000037824_38731776.pth... +[2023-10-11 16:30:16,503][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000036224_37093376.pth +[2023-10-11 16:30:17,817][85176] Updated weights for policy 0, policy_version 37282 (0.0009) +[2023-10-11 16:30:18,188][85176] Updated weights for policy 0, policy_version 37292 (0.0011) +[2023-10-11 16:30:18,565][85176] Updated weights for policy 0, policy_version 37302 (0.0007) +[2023-10-11 16:30:18,938][85176] Updated weights for policy 0, policy_version 37312 (0.0008) +[2023-10-11 16:30:20,385][85175] Updated weights for policy 1, policy_version 37830 (0.0008) +[2023-10-11 16:30:20,750][85175] Updated weights for policy 1, policy_version 37840 (0.0009) +[2023-10-11 16:30:21,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 76939264. Throughput: 0: 1657.7, 1: 1704.4. Samples: 19242460. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:30:21,063][84230] Avg episode reward: [(0, '7.240'), (1, '7.170')] +[2023-10-11 16:30:21,119][85175] Updated weights for policy 1, policy_version 37850 (0.0010) +[2023-10-11 16:30:23,120][85176] Updated weights for policy 0, policy_version 37322 (0.0007) +[2023-10-11 16:30:23,491][85176] Updated weights for policy 0, policy_version 37332 (0.0007) +[2023-10-11 16:30:23,866][85176] Updated weights for policy 0, policy_version 37342 (0.0008) +[2023-10-11 16:30:25,204][85175] Updated weights for policy 1, policy_version 37860 (0.0009) +[2023-10-11 16:30:25,575][85175] Updated weights for policy 1, policy_version 37870 (0.0009) +[2023-10-11 16:30:25,939][85175] Updated weights for policy 1, policy_version 37880 (0.0009) +[2023-10-11 16:30:26,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 77004800. Throughput: 0: 1668.9, 1: 1702.0. Samples: 19262442. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-10-11 16:30:26,063][84230] Avg episode reward: [(0, '7.430'), (1, '7.270')] +[2023-10-11 16:30:27,884][85176] Updated weights for policy 0, policy_version 37352 (0.0008) +[2023-10-11 16:30:28,266][85176] Updated weights for policy 0, policy_version 37362 (0.0008) +[2023-10-11 16:30:28,633][85176] Updated weights for policy 0, policy_version 37372 (0.0010) +[2023-10-11 16:30:30,023][85175] Updated weights for policy 1, policy_version 37890 (0.0008) +[2023-10-11 16:30:30,386][85175] Updated weights for policy 1, policy_version 37900 (0.0009) +[2023-10-11 16:30:30,755][85175] Updated weights for policy 1, policy_version 37910 (0.0007) +[2023-10-11 16:30:31,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 77070336. Throughput: 0: 1675.0, 1: 1687.2. Samples: 19282622. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-10-11 16:30:31,064][84230] Avg episode reward: [(0, '7.550'), (1, '7.580')] +[2023-10-11 16:30:31,113][85175] Updated weights for policy 1, policy_version 37920 (0.0008) +[2023-10-11 16:30:32,847][85176] Updated weights for policy 0, policy_version 37382 (0.0008) +[2023-10-11 16:30:33,230][85176] Updated weights for policy 0, policy_version 37392 (0.0008) +[2023-10-11 16:30:33,609][85176] Updated weights for policy 0, policy_version 37402 (0.0010) +[2023-10-11 16:30:35,220][85175] Updated weights for policy 1, policy_version 37930 (0.0007) +[2023-10-11 16:30:35,598][85175] Updated weights for policy 1, policy_version 37940 (0.0007) +[2023-10-11 16:30:35,965][85175] Updated weights for policy 1, policy_version 37950 (0.0007) +[2023-10-11 16:30:36,062][84230] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 77168640. Throughput: 0: 1657.8, 1: 1703.1. Samples: 19292560. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-10-11 16:30:36,063][84230] Avg episode reward: [(0, '7.320'), (1, '7.270')] +[2023-10-11 16:30:37,820][85176] Updated weights for policy 0, policy_version 37412 (0.0010) +[2023-10-11 16:30:38,197][85176] Updated weights for policy 0, policy_version 37422 (0.0010) +[2023-10-11 16:30:38,568][85176] Updated weights for policy 0, policy_version 37432 (0.0011) +[2023-10-11 16:30:39,965][85175] Updated weights for policy 1, policy_version 37960 (0.0010) +[2023-10-11 16:30:40,325][85175] Updated weights for policy 1, policy_version 37970 (0.0008) +[2023-10-11 16:30:40,701][85175] Updated weights for policy 1, policy_version 37980 (0.0009) +[2023-10-11 16:30:41,063][84230] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 77234176. Throughput: 0: 1671.5, 1: 1705.4. Samples: 19312952. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-10-11 16:30:41,064][84230] Avg episode reward: [(0, '7.920'), (1, '7.390')] +[2023-10-11 16:30:42,725][85176] Updated weights for policy 0, policy_version 37442 (0.0010) +[2023-10-11 16:30:43,105][85176] Updated weights for policy 0, policy_version 37452 (0.0008) +[2023-10-11 16:30:43,474][85176] Updated weights for policy 0, policy_version 37462 (0.0009) +[2023-10-11 16:30:43,853][85176] Updated weights for policy 0, policy_version 37472 (0.0009) +[2023-10-11 16:30:44,789][85175] Updated weights for policy 1, policy_version 37990 (0.0010) +[2023-10-11 16:30:45,149][85175] Updated weights for policy 1, policy_version 38000 (0.0007) +[2023-10-11 16:30:45,512][85175] Updated weights for policy 1, policy_version 38010 (0.0007) +[2023-10-11 16:30:46,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 77299712. Throughput: 0: 1671.5, 1: 1680.7. Samples: 19332608. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-10-11 16:30:46,063][84230] Avg episode reward: [(0, '7.800'), (1, '7.770')] +[2023-10-11 16:30:47,908][85176] Updated weights for policy 0, policy_version 37482 (0.0008) +[2023-10-11 16:30:48,270][85176] Updated weights for policy 0, policy_version 37492 (0.0008) +[2023-10-11 16:30:48,652][85176] Updated weights for policy 0, policy_version 37502 (0.0007) +[2023-10-11 16:30:49,534][85175] Updated weights for policy 1, policy_version 38020 (0.0008) +[2023-10-11 16:30:49,913][85175] Updated weights for policy 1, policy_version 38030 (0.0008) +[2023-10-11 16:30:50,286][85175] Updated weights for policy 1, policy_version 38040 (0.0010) +[2023-10-11 16:30:51,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 77365248. Throughput: 0: 1657.7, 1: 1704.5. Samples: 19342838. 
Policy #0 lag: (min: 23.0, avg: 38.9, max: 40.0) +[2023-10-11 16:30:51,063][84230] Avg episode reward: [(0, '7.700'), (1, '8.090')] +[2023-10-11 16:30:52,770][85176] Updated weights for policy 0, policy_version 37512 (0.0010) +[2023-10-11 16:30:53,152][85176] Updated weights for policy 0, policy_version 37522 (0.0010) +[2023-10-11 16:30:53,519][85176] Updated weights for policy 0, policy_version 37532 (0.0009) +[2023-10-11 16:30:54,407][85175] Updated weights for policy 1, policy_version 38050 (0.0010) +[2023-10-11 16:30:54,778][85175] Updated weights for policy 1, policy_version 38060 (0.0007) +[2023-10-11 16:30:55,151][85175] Updated weights for policy 1, policy_version 38070 (0.0007) +[2023-10-11 16:30:55,523][85175] Updated weights for policy 1, policy_version 38080 (0.0008) +[2023-10-11 16:30:56,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 77430784. Throughput: 0: 1667.6, 1: 1698.7. Samples: 19363058. Policy #0 lag: (min: 23.0, avg: 38.9, max: 40.0) +[2023-10-11 16:30:56,063][84230] Avg episode reward: [(0, '7.840'), (1, '7.640')] +[2023-10-11 16:30:57,761][85176] Updated weights for policy 0, policy_version 37542 (0.0008) +[2023-10-11 16:30:58,140][85176] Updated weights for policy 0, policy_version 37552 (0.0007) +[2023-10-11 16:30:58,509][85176] Updated weights for policy 0, policy_version 37562 (0.0009) +[2023-10-11 16:30:59,469][85175] Updated weights for policy 1, policy_version 38090 (0.0008) +[2023-10-11 16:30:59,839][85175] Updated weights for policy 1, policy_version 38100 (0.0008) +[2023-10-11 16:31:00,198][85175] Updated weights for policy 1, policy_version 38110 (0.0008) +[2023-10-11 16:31:01,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 77496320. Throughput: 0: 1665.1, 1: 1676.3. Samples: 19382766. 
Policy #0 lag: (min: 23.0, avg: 38.9, max: 40.0) +[2023-10-11 16:31:01,063][84230] Avg episode reward: [(0, '7.720'), (1, '7.540')] +[2023-10-11 16:31:02,332][85176] Updated weights for policy 0, policy_version 37572 (0.0009) +[2023-10-11 16:31:02,706][85176] Updated weights for policy 0, policy_version 37582 (0.0007) +[2023-10-11 16:31:03,077][85176] Updated weights for policy 0, policy_version 37592 (0.0007) +[2023-10-11 16:31:04,205][85175] Updated weights for policy 1, policy_version 38120 (0.0008) +[2023-10-11 16:31:04,576][85175] Updated weights for policy 1, policy_version 38130 (0.0007) +[2023-10-11 16:31:04,947][85175] Updated weights for policy 1, policy_version 38140 (0.0009) +[2023-10-11 16:31:06,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 77561856. Throughput: 0: 1653.6, 1: 1698.3. Samples: 19393298. Policy #0 lag: (min: 23.0, avg: 38.9, max: 40.0) +[2023-10-11 16:31:06,064][84230] Avg episode reward: [(0, '7.740'), (1, '7.180')] +[2023-10-11 16:31:07,042][85176] Updated weights for policy 0, policy_version 37602 (0.0007) +[2023-10-11 16:31:07,425][85176] Updated weights for policy 0, policy_version 37612 (0.0007) +[2023-10-11 16:31:07,790][85176] Updated weights for policy 0, policy_version 37622 (0.0007) +[2023-10-11 16:31:08,168][85176] Updated weights for policy 0, policy_version 37632 (0.0009) +[2023-10-11 16:31:08,999][85175] Updated weights for policy 1, policy_version 38150 (0.0009) +[2023-10-11 16:31:09,382][85175] Updated weights for policy 1, policy_version 38160 (0.0011) +[2023-10-11 16:31:09,750][85175] Updated weights for policy 1, policy_version 38170 (0.0007) +[2023-10-11 16:31:11,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 77627392. Throughput: 0: 1665.1, 1: 1684.2. Samples: 19413160. 
Policy #0 lag: (min: 23.0, avg: 38.9, max: 40.0) +[2023-10-11 16:31:11,064][84230] Avg episode reward: [(0, '7.580'), (1, '7.420')] +[2023-10-11 16:31:12,328][85176] Updated weights for policy 0, policy_version 37642 (0.0008) +[2023-10-11 16:31:12,712][85176] Updated weights for policy 0, policy_version 37652 (0.0009) +[2023-10-11 16:31:13,083][85176] Updated weights for policy 0, policy_version 37662 (0.0009) +[2023-10-11 16:31:13,823][85175] Updated weights for policy 1, policy_version 38180 (0.0010) +[2023-10-11 16:31:14,199][85175] Updated weights for policy 1, policy_version 38190 (0.0009) +[2023-10-11 16:31:14,567][85175] Updated weights for policy 1, policy_version 38200 (0.0008) +[2023-10-11 16:31:16,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 77692928. Throughput: 0: 1662.8, 1: 1684.0. Samples: 19433230. Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) +[2023-10-11 16:31:16,064][84230] Avg episode reward: [(0, '7.420'), (1, '7.530')] +[2023-10-11 16:31:17,178][85176] Updated weights for policy 0, policy_version 37672 (0.0010) +[2023-10-11 16:31:17,544][85176] Updated weights for policy 0, policy_version 37682 (0.0011) +[2023-10-11 16:31:17,917][85176] Updated weights for policy 0, policy_version 37692 (0.0009) +[2023-10-11 16:31:18,473][85175] Updated weights for policy 1, policy_version 38210 (0.0010) +[2023-10-11 16:31:18,840][85175] Updated weights for policy 1, policy_version 38220 (0.0011) +[2023-10-11 16:31:19,210][85175] Updated weights for policy 1, policy_version 38230 (0.0009) +[2023-10-11 16:31:19,590][85175] Updated weights for policy 1, policy_version 38240 (0.0011) +[2023-10-11 16:31:21,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 77758464. Throughput: 0: 1654.5, 1: 1693.5. Samples: 19443220. 
Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) +[2023-10-11 16:31:21,064][84230] Avg episode reward: [(0, '7.900'), (1, '7.560')] +[2023-10-11 16:31:22,259][85176] Updated weights for policy 0, policy_version 37702 (0.0008) +[2023-10-11 16:31:22,645][85176] Updated weights for policy 0, policy_version 37712 (0.0012) +[2023-10-11 16:31:23,007][85176] Updated weights for policy 0, policy_version 37722 (0.0010) +[2023-10-11 16:31:23,651][85175] Updated weights for policy 1, policy_version 38250 (0.0007) +[2023-10-11 16:31:24,016][85175] Updated weights for policy 1, policy_version 38260 (0.0010) +[2023-10-11 16:31:24,396][85175] Updated weights for policy 1, policy_version 38270 (0.0010) +[2023-10-11 16:31:26,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 77824000. Throughput: 0: 1661.3, 1: 1670.0. Samples: 19462860. Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) +[2023-10-11 16:31:26,063][84230] Avg episode reward: [(0, '7.890'), (1, '7.600')] +[2023-10-11 16:31:27,097][85176] Updated weights for policy 0, policy_version 37732 (0.0008) +[2023-10-11 16:31:27,461][85176] Updated weights for policy 0, policy_version 37742 (0.0009) +[2023-10-11 16:31:27,834][85176] Updated weights for policy 0, policy_version 37752 (0.0009) +[2023-10-11 16:31:28,363][85175] Updated weights for policy 1, policy_version 38280 (0.0009) +[2023-10-11 16:31:28,725][85175] Updated weights for policy 1, policy_version 38290 (0.0008) +[2023-10-11 16:31:29,107][85175] Updated weights for policy 1, policy_version 38300 (0.0010) +[2023-10-11 16:31:31,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 77889536. Throughput: 0: 1661.5, 1: 1693.9. Samples: 19483604. 
Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) +[2023-10-11 16:31:31,064][84230] Avg episode reward: [(0, '7.600'), (1, '7.880')] +[2023-10-11 16:31:31,825][85176] Updated weights for policy 0, policy_version 37762 (0.0009) +[2023-10-11 16:31:32,195][85176] Updated weights for policy 0, policy_version 37772 (0.0008) +[2023-10-11 16:31:32,564][85176] Updated weights for policy 0, policy_version 37782 (0.0010) +[2023-10-11 16:31:32,944][85176] Updated weights for policy 0, policy_version 37792 (0.0010) +[2023-10-11 16:31:33,045][85175] Updated weights for policy 1, policy_version 38310 (0.0008) +[2023-10-11 16:31:33,420][85175] Updated weights for policy 1, policy_version 38320 (0.0009) +[2023-10-11 16:31:33,796][85175] Updated weights for policy 1, policy_version 38330 (0.0008) +[2023-10-11 16:31:36,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 77955072. Throughput: 0: 1659.6, 1: 1684.5. Samples: 19493320. Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) +[2023-10-11 16:31:36,063][84230] Avg episode reward: [(0, '7.600'), (1, '7.380')] +[2023-10-11 16:31:37,044][85176] Updated weights for policy 0, policy_version 37802 (0.0009) +[2023-10-11 16:31:37,419][85176] Updated weights for policy 0, policy_version 37812 (0.0009) +[2023-10-11 16:31:37,799][85176] Updated weights for policy 0, policy_version 37822 (0.0009) +[2023-10-11 16:31:37,882][85175] Updated weights for policy 1, policy_version 38340 (0.0009) +[2023-10-11 16:31:38,259][85175] Updated weights for policy 1, policy_version 38350 (0.0008) +[2023-10-11 16:31:38,627][85175] Updated weights for policy 1, policy_version 38360 (0.0007) +[2023-10-11 16:31:41,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 78020608. Throughput: 0: 1669.9, 1: 1672.2. Samples: 19513454. 
Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) +[2023-10-11 16:31:41,063][84230] Avg episode reward: [(0, '7.390'), (1, '7.450')] +[2023-10-11 16:31:42,031][85176] Updated weights for policy 0, policy_version 37832 (0.0010) +[2023-10-11 16:31:42,406][85176] Updated weights for policy 0, policy_version 37842 (0.0009) +[2023-10-11 16:31:42,665][85175] Updated weights for policy 1, policy_version 38370 (0.0007) +[2023-10-11 16:31:42,772][85176] Updated weights for policy 0, policy_version 37852 (0.0009) +[2023-10-11 16:31:43,028][85175] Updated weights for policy 1, policy_version 38380 (0.0007) +[2023-10-11 16:31:43,393][85175] Updated weights for policy 1, policy_version 38390 (0.0011) +[2023-10-11 16:31:43,768][85175] Updated weights for policy 1, policy_version 38400 (0.0008) +[2023-10-11 16:31:46,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 78086144. Throughput: 0: 1670.8, 1: 1693.3. Samples: 19534154. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:31:46,064][84230] Avg episode reward: [(0, '7.820'), (1, '7.800')] +[2023-10-11 16:31:46,870][85176] Updated weights for policy 0, policy_version 37862 (0.0009) +[2023-10-11 16:31:47,236][85176] Updated weights for policy 0, policy_version 37872 (0.0010) +[2023-10-11 16:31:47,615][85176] Updated weights for policy 0, policy_version 37882 (0.0010) +[2023-10-11 16:31:47,935][85175] Updated weights for policy 1, policy_version 38410 (0.0010) +[2023-10-11 16:31:48,308][85175] Updated weights for policy 1, policy_version 38420 (0.0009) +[2023-10-11 16:31:48,668][85175] Updated weights for policy 1, policy_version 38430 (0.0007) +[2023-10-11 16:31:51,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 78151680. Throughput: 0: 1668.1, 1: 1665.4. Samples: 19543306. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:31:51,064][84230] Avg episode reward: [(0, '8.100'), (1, '8.700')] +[2023-10-11 16:31:51,065][85000] Saving new best policy, reward=8.700! +[2023-10-11 16:31:51,774][85176] Updated weights for policy 0, policy_version 37892 (0.0008) +[2023-10-11 16:31:52,150][85176] Updated weights for policy 0, policy_version 37902 (0.0009) +[2023-10-11 16:31:52,518][85176] Updated weights for policy 0, policy_version 37912 (0.0007) +[2023-10-11 16:31:52,692][85175] Updated weights for policy 1, policy_version 38440 (0.0009) +[2023-10-11 16:31:53,069][85175] Updated weights for policy 1, policy_version 38450 (0.0007) +[2023-10-11 16:31:53,446][85175] Updated weights for policy 1, policy_version 38460 (0.0007) +[2023-10-11 16:31:56,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 78217216. Throughput: 0: 1667.6, 1: 1677.7. Samples: 19563696. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:31:56,064][84230] Avg episode reward: [(0, '7.970'), (1, '8.180')] +[2023-10-11 16:31:56,699][85176] Updated weights for policy 0, policy_version 37922 (0.0007) +[2023-10-11 16:31:57,074][85176] Updated weights for policy 0, policy_version 37932 (0.0011) +[2023-10-11 16:31:57,446][85176] Updated weights for policy 0, policy_version 37942 (0.0008) +[2023-10-11 16:31:57,510][85175] Updated weights for policy 1, policy_version 38470 (0.0008) +[2023-10-11 16:31:57,816][85176] Updated weights for policy 0, policy_version 37952 (0.0008) +[2023-10-11 16:31:57,901][85175] Updated weights for policy 1, policy_version 38480 (0.0008) +[2023-10-11 16:31:58,270][85175] Updated weights for policy 1, policy_version 38490 (0.0008) +[2023-10-11 16:32:01,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 78282752. Throughput: 0: 1673.6, 1: 1692.1. Samples: 19584684. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:32:01,064][84230] Avg episode reward: [(0, '7.450'), (1, '7.640')] +[2023-10-11 16:32:01,757][85176] Updated weights for policy 0, policy_version 37962 (0.0008) +[2023-10-11 16:32:02,138][85176] Updated weights for policy 0, policy_version 37972 (0.0009) +[2023-10-11 16:32:02,361][85175] Updated weights for policy 1, policy_version 38500 (0.0010) +[2023-10-11 16:32:02,501][85176] Updated weights for policy 0, policy_version 37982 (0.0008) +[2023-10-11 16:32:02,728][85175] Updated weights for policy 1, policy_version 38510 (0.0007) +[2023-10-11 16:32:03,093][85175] Updated weights for policy 1, policy_version 38520 (0.0010) +[2023-10-11 16:32:06,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 78348288. Throughput: 0: 1677.8, 1: 1666.8. Samples: 19593728. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:32:06,063][84230] Avg episode reward: [(0, '7.300'), (1, '7.220')] +[2023-10-11 16:32:06,302][85176] Updated weights for policy 0, policy_version 37992 (0.0009) +[2023-10-11 16:32:06,677][85176] Updated weights for policy 0, policy_version 38002 (0.0008) +[2023-10-11 16:32:07,043][85176] Updated weights for policy 0, policy_version 38012 (0.0007) +[2023-10-11 16:32:07,119][85175] Updated weights for policy 1, policy_version 38530 (0.0010) +[2023-10-11 16:32:07,489][85175] Updated weights for policy 1, policy_version 38540 (0.0008) +[2023-10-11 16:32:07,845][85175] Updated weights for policy 1, policy_version 38550 (0.0008) +[2023-10-11 16:32:08,209][85175] Updated weights for policy 1, policy_version 38560 (0.0007) +[2023-10-11 16:32:11,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 78413824. Throughput: 0: 1685.6, 1: 1690.2. Samples: 19614768. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:32:11,063][84230] Avg episode reward: [(0, '7.300'), (1, '7.170')] +[2023-10-11 16:32:11,151][85176] Updated weights for policy 0, policy_version 38022 (0.0009) +[2023-10-11 16:32:11,517][85176] Updated weights for policy 0, policy_version 38032 (0.0008) +[2023-10-11 16:32:11,895][85176] Updated weights for policy 0, policy_version 38042 (0.0008) +[2023-10-11 16:32:12,283][85175] Updated weights for policy 1, policy_version 38570 (0.0009) +[2023-10-11 16:32:12,657][85175] Updated weights for policy 1, policy_version 38580 (0.0010) +[2023-10-11 16:32:13,030][85175] Updated weights for policy 1, policy_version 38590 (0.0009) +[2023-10-11 16:32:15,948][85176] Updated weights for policy 0, policy_version 38052 (0.0007) +[2023-10-11 16:32:16,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 78479360. Throughput: 0: 1684.7, 1: 1689.2. Samples: 19635432. Policy #0 lag: (min: 31.0, avg: 31.4, max: 44.0) +[2023-10-11 16:32:16,064][84230] Avg episode reward: [(0, '7.510'), (1, '7.220')] +[2023-10-11 16:32:16,072][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000038592_39518208.pth... +[2023-10-11 16:32:16,102][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000037024_37912576.pth +[2023-10-11 16:32:16,320][85176] Updated weights for policy 0, policy_version 38062 (0.0007) +[2023-10-11 16:32:16,693][85176] Updated weights for policy 0, policy_version 38072 (0.0010) +[2023-10-11 16:32:16,991][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000038080_38993920.pth... 
+[2023-10-11 16:32:17,026][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000036512_37388288.pth +[2023-10-11 16:32:17,089][85175] Updated weights for policy 1, policy_version 38600 (0.0008) +[2023-10-11 16:32:17,450][85175] Updated weights for policy 1, policy_version 38610 (0.0010) +[2023-10-11 16:32:17,820][85175] Updated weights for policy 1, policy_version 38620 (0.0011) +[2023-10-11 16:32:20,696][85176] Updated weights for policy 0, policy_version 38082 (0.0007) +[2023-10-11 16:32:21,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 78544896. Throughput: 0: 1680.9, 1: 1675.2. Samples: 19644346. Policy #0 lag: (min: 31.0, avg: 31.4, max: 44.0) +[2023-10-11 16:32:21,063][84230] Avg episode reward: [(0, '8.040'), (1, '8.110')] +[2023-10-11 16:32:21,065][85176] Updated weights for policy 0, policy_version 38092 (0.0009) +[2023-10-11 16:32:21,438][85176] Updated weights for policy 0, policy_version 38102 (0.0010) +[2023-10-11 16:32:21,810][85176] Updated weights for policy 0, policy_version 38112 (0.0007) +[2023-10-11 16:32:21,846][85175] Updated weights for policy 1, policy_version 38630 (0.0010) +[2023-10-11 16:32:22,213][85175] Updated weights for policy 1, policy_version 38640 (0.0011) +[2023-10-11 16:32:22,590][85175] Updated weights for policy 1, policy_version 38650 (0.0011) +[2023-10-11 16:32:26,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 78610432. Throughput: 0: 1677.0, 1: 1688.9. Samples: 19664918. 
Policy #0 lag: (min: 31.0, avg: 31.4, max: 44.0) +[2023-10-11 16:32:26,063][84230] Avg episode reward: [(0, '7.960'), (1, '7.960')] +[2023-10-11 16:32:26,094][85176] Updated weights for policy 0, policy_version 38122 (0.0010) +[2023-10-11 16:32:26,469][85176] Updated weights for policy 0, policy_version 38132 (0.0009) +[2023-10-11 16:32:26,581][85175] Updated weights for policy 1, policy_version 38660 (0.0010) +[2023-10-11 16:32:26,844][85176] Updated weights for policy 0, policy_version 38142 (0.0008) +[2023-10-11 16:32:26,946][85175] Updated weights for policy 1, policy_version 38670 (0.0008) +[2023-10-11 16:32:27,320][85175] Updated weights for policy 1, policy_version 38680 (0.0007) +[2023-10-11 16:32:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 78675968. Throughput: 0: 1672.6, 1: 1692.6. Samples: 19685588. Policy #0 lag: (min: 31.0, avg: 31.4, max: 44.0) +[2023-10-11 16:32:31,063][84230] Avg episode reward: [(0, '7.840'), (1, '7.640')] +[2023-10-11 16:32:31,119][85176] Updated weights for policy 0, policy_version 38152 (0.0007) +[2023-10-11 16:32:31,241][85175] Updated weights for policy 1, policy_version 38690 (0.0007) +[2023-10-11 16:32:31,489][85176] Updated weights for policy 0, policy_version 38162 (0.0007) +[2023-10-11 16:32:31,603][85175] Updated weights for policy 1, policy_version 38700 (0.0007) +[2023-10-11 16:32:31,865][85176] Updated weights for policy 0, policy_version 38172 (0.0008) +[2023-10-11 16:32:31,978][85175] Updated weights for policy 1, policy_version 38710 (0.0010) +[2023-10-11 16:32:32,346][85175] Updated weights for policy 1, policy_version 38720 (0.0008) +[2023-10-11 16:32:35,814][85176] Updated weights for policy 0, policy_version 38182 (0.0010) +[2023-10-11 16:32:36,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 78741504. Throughput: 0: 1671.9, 1: 1691.6. Samples: 19694666. 
Policy #0 lag: (min: 31.0, avg: 31.4, max: 44.0) +[2023-10-11 16:32:36,064][84230] Avg episode reward: [(0, '7.580'), (1, '7.360')] +[2023-10-11 16:32:36,174][85176] Updated weights for policy 0, policy_version 38192 (0.0009) +[2023-10-11 16:32:36,448][85175] Updated weights for policy 1, policy_version 38730 (0.0007) +[2023-10-11 16:32:36,552][85176] Updated weights for policy 0, policy_version 38202 (0.0007) +[2023-10-11 16:32:36,811][85175] Updated weights for policy 1, policy_version 38740 (0.0007) +[2023-10-11 16:32:37,176][85175] Updated weights for policy 1, policy_version 38750 (0.0010) +[2023-10-11 16:32:40,677][85176] Updated weights for policy 0, policy_version 38212 (0.0007) +[2023-10-11 16:32:41,058][85176] Updated weights for policy 0, policy_version 38222 (0.0008) +[2023-10-11 16:32:41,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 78807040. Throughput: 0: 1675.8, 1: 1696.1. Samples: 19715430. Policy #0 lag: (min: 31.0, avg: 31.4, max: 44.0) +[2023-10-11 16:32:41,063][84230] Avg episode reward: [(0, '7.580'), (1, '7.420')] +[2023-10-11 16:32:41,228][85175] Updated weights for policy 1, policy_version 38760 (0.0008) +[2023-10-11 16:32:41,424][85176] Updated weights for policy 0, policy_version 38232 (0.0007) +[2023-10-11 16:32:41,599][85175] Updated weights for policy 1, policy_version 38770 (0.0007) +[2023-10-11 16:32:41,969][85175] Updated weights for policy 1, policy_version 38780 (0.0008) +[2023-10-11 16:32:45,612][85176] Updated weights for policy 0, policy_version 38242 (0.0008) +[2023-10-11 16:32:45,990][85176] Updated weights for policy 0, policy_version 38252 (0.0007) +[2023-10-11 16:32:46,039][85175] Updated weights for policy 1, policy_version 38790 (0.0007) +[2023-10-11 16:32:46,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 78872576. Throughput: 0: 1664.2, 1: 1700.6. Samples: 19736102. 
Policy #0 lag: (min: 26.0, avg: 28.0, max: 57.0) +[2023-10-11 16:32:46,064][84230] Avg episode reward: [(0, '8.040'), (1, '7.720')] +[2023-10-11 16:32:46,359][85176] Updated weights for policy 0, policy_version 38262 (0.0007) +[2023-10-11 16:32:46,433][85175] Updated weights for policy 1, policy_version 38800 (0.0008) +[2023-10-11 16:32:46,730][85176] Updated weights for policy 0, policy_version 38272 (0.0008) +[2023-10-11 16:32:46,800][85175] Updated weights for policy 1, policy_version 38810 (0.0007) +[2023-10-11 16:32:50,795][85175] Updated weights for policy 1, policy_version 38820 (0.0009) +[2023-10-11 16:32:50,863][85176] Updated weights for policy 0, policy_version 38282 (0.0007) +[2023-10-11 16:32:51,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 78938112. Throughput: 0: 1664.1, 1: 1697.3. Samples: 19744992. Policy #0 lag: (min: 26.0, avg: 28.0, max: 57.0) +[2023-10-11 16:32:51,063][84230] Avg episode reward: [(0, '8.160'), (1, '8.010')] +[2023-10-11 16:32:51,159][85175] Updated weights for policy 1, policy_version 38830 (0.0007) +[2023-10-11 16:32:51,232][85176] Updated weights for policy 0, policy_version 38292 (0.0008) +[2023-10-11 16:32:51,521][85175] Updated weights for policy 1, policy_version 38840 (0.0008) +[2023-10-11 16:32:51,614][85176] Updated weights for policy 0, policy_version 38302 (0.0009) +[2023-10-11 16:32:55,628][85175] Updated weights for policy 1, policy_version 38850 (0.0008) +[2023-10-11 16:32:55,729][85176] Updated weights for policy 0, policy_version 38312 (0.0007) +[2023-10-11 16:32:55,990][85175] Updated weights for policy 1, policy_version 38860 (0.0008) +[2023-10-11 16:32:56,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 79003648. Throughput: 0: 1655.4, 1: 1696.1. Samples: 19765584. 
Policy #0 lag: (min: 26.0, avg: 28.0, max: 57.0) +[2023-10-11 16:32:56,063][84230] Avg episode reward: [(0, '7.730'), (1, '7.860')] +[2023-10-11 16:32:56,106][85176] Updated weights for policy 0, policy_version 38322 (0.0007) +[2023-10-11 16:32:56,356][85175] Updated weights for policy 1, policy_version 38870 (0.0007) +[2023-10-11 16:32:56,479][85176] Updated weights for policy 0, policy_version 38332 (0.0009) +[2023-10-11 16:32:56,722][85175] Updated weights for policy 1, policy_version 38880 (0.0007) +[2023-10-11 16:33:00,775][85176] Updated weights for policy 0, policy_version 38342 (0.0008) +[2023-10-11 16:33:00,896][85175] Updated weights for policy 1, policy_version 38890 (0.0008) +[2023-10-11 16:33:01,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 79069184. Throughput: 0: 1654.7, 1: 1692.5. Samples: 19786058. Policy #0 lag: (min: 26.0, avg: 28.0, max: 57.0) +[2023-10-11 16:33:01,063][84230] Avg episode reward: [(0, '7.580'), (1, '7.370')] +[2023-10-11 16:33:01,153][85176] Updated weights for policy 0, policy_version 38352 (0.0008) +[2023-10-11 16:33:01,262][85175] Updated weights for policy 1, policy_version 38900 (0.0007) +[2023-10-11 16:33:01,528][85176] Updated weights for policy 0, policy_version 38362 (0.0008) +[2023-10-11 16:33:01,629][85175] Updated weights for policy 1, policy_version 38910 (0.0007) +[2023-10-11 16:33:05,539][85175] Updated weights for policy 1, policy_version 38920 (0.0011) +[2023-10-11 16:33:05,761][85176] Updated weights for policy 0, policy_version 38372 (0.0007) +[2023-10-11 16:33:05,900][85175] Updated weights for policy 1, policy_version 38930 (0.0009) +[2023-10-11 16:33:06,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 79134720. Throughput: 0: 1654.8, 1: 1695.9. Samples: 19795128. 
Policy #0 lag: (min: 26.0, avg: 28.0, max: 57.0) +[2023-10-11 16:33:06,063][84230] Avg episode reward: [(0, '7.570'), (1, '7.180')] +[2023-10-11 16:33:06,123][85176] Updated weights for policy 0, policy_version 38382 (0.0009) +[2023-10-11 16:33:06,271][85175] Updated weights for policy 1, policy_version 38940 (0.0008) +[2023-10-11 16:33:06,507][85176] Updated weights for policy 0, policy_version 38392 (0.0010) +[2023-10-11 16:33:10,134][85175] Updated weights for policy 1, policy_version 38950 (0.0008) +[2023-10-11 16:33:10,502][85175] Updated weights for policy 1, policy_version 38960 (0.0008) +[2023-10-11 16:33:10,573][85176] Updated weights for policy 0, policy_version 38402 (0.0011) +[2023-10-11 16:33:10,875][85175] Updated weights for policy 1, policy_version 38970 (0.0008) +[2023-10-11 16:33:10,951][85176] Updated weights for policy 0, policy_version 38412 (0.0009) +[2023-10-11 16:33:11,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 79200256. Throughput: 0: 1648.3, 1: 1702.2. Samples: 19815692. Policy #0 lag: (min: 26.0, avg: 28.0, max: 57.0) +[2023-10-11 16:33:11,063][84230] Avg episode reward: [(0, '7.530'), (1, '7.510')] +[2023-10-11 16:33:11,316][85176] Updated weights for policy 0, policy_version 38422 (0.0009) +[2023-10-11 16:33:11,688][85176] Updated weights for policy 0, policy_version 38432 (0.0008) +[2023-10-11 16:33:14,951][85175] Updated weights for policy 1, policy_version 38980 (0.0009) +[2023-10-11 16:33:15,326][85175] Updated weights for policy 1, policy_version 38990 (0.0008) +[2023-10-11 16:33:15,693][85175] Updated weights for policy 1, policy_version 39000 (0.0007) +[2023-10-11 16:33:15,822][85176] Updated weights for policy 0, policy_version 38442 (0.0007) +[2023-10-11 16:33:16,063][84230] Fps is (10 sec: 16383.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 79298560. Throughput: 0: 1646.7, 1: 1683.9. Samples: 19835462. 
Policy #0 lag: (min: 20.0, avg: 28.0, max: 52.0) +[2023-10-11 16:33:16,064][84230] Avg episode reward: [(0, '7.550'), (1, '7.390')] +[2023-10-11 16:33:16,189][85176] Updated weights for policy 0, policy_version 38452 (0.0009) +[2023-10-11 16:33:16,566][85176] Updated weights for policy 0, policy_version 38462 (0.0009) +[2023-10-11 16:33:19,701][85175] Updated weights for policy 1, policy_version 39010 (0.0008) +[2023-10-11 16:33:20,071][85175] Updated weights for policy 1, policy_version 39020 (0.0007) +[2023-10-11 16:33:20,436][85175] Updated weights for policy 1, policy_version 39030 (0.0008) +[2023-10-11 16:33:20,802][85175] Updated weights for policy 1, policy_version 39040 (0.0008) +[2023-10-11 16:33:20,824][85176] Updated weights for policy 0, policy_version 38472 (0.0008) +[2023-10-11 16:33:21,062][84230] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 79364096. Throughput: 0: 1650.7, 1: 1698.8. Samples: 19845392. Policy #0 lag: (min: 20.0, avg: 28.0, max: 52.0) +[2023-10-11 16:33:21,063][84230] Avg episode reward: [(0, '7.750'), (1, '7.390')] +[2023-10-11 16:33:21,199][85176] Updated weights for policy 0, policy_version 38482 (0.0007) +[2023-10-11 16:33:21,584][85176] Updated weights for policy 0, policy_version 38492 (0.0009) +[2023-10-11 16:33:24,790][85175] Updated weights for policy 1, policy_version 39050 (0.0008) +[2023-10-11 16:33:25,157][85175] Updated weights for policy 1, policy_version 39060 (0.0008) +[2023-10-11 16:33:25,534][85175] Updated weights for policy 1, policy_version 39070 (0.0007) +[2023-10-11 16:33:25,696][85176] Updated weights for policy 0, policy_version 38502 (0.0008) +[2023-10-11 16:33:26,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 79429632. Throughput: 0: 1645.2, 1: 1698.9. Samples: 19865916. 
Policy #0 lag: (min: 20.0, avg: 28.0, max: 52.0) +[2023-10-11 16:33:26,064][84230] Avg episode reward: [(0, '7.900'), (1, '7.800')] +[2023-10-11 16:33:26,068][85176] Updated weights for policy 0, policy_version 38512 (0.0010) +[2023-10-11 16:33:26,434][85176] Updated weights for policy 0, policy_version 38522 (0.0009) +[2023-10-11 16:33:29,679][85175] Updated weights for policy 1, policy_version 39080 (0.0007) +[2023-10-11 16:33:30,037][85175] Updated weights for policy 1, policy_version 39090 (0.0007) +[2023-10-11 16:33:30,408][85175] Updated weights for policy 1, policy_version 39100 (0.0007) +[2023-10-11 16:33:30,527][85176] Updated weights for policy 0, policy_version 38532 (0.0008) +[2023-10-11 16:33:30,897][85176] Updated weights for policy 0, policy_version 38542 (0.0007) +[2023-10-11 16:33:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 79495168. Throughput: 0: 1649.1, 1: 1672.5. Samples: 19885572. Policy #0 lag: (min: 20.0, avg: 28.0, max: 52.0) +[2023-10-11 16:33:31,063][84230] Avg episode reward: [(0, '8.180'), (1, '7.760')] +[2023-10-11 16:33:31,276][85176] Updated weights for policy 0, policy_version 38552 (0.0007) +[2023-10-11 16:33:34,495][85175] Updated weights for policy 1, policy_version 39110 (0.0010) +[2023-10-11 16:33:34,869][85175] Updated weights for policy 1, policy_version 39120 (0.0009) +[2023-10-11 16:33:35,242][85175] Updated weights for policy 1, policy_version 39130 (0.0008) +[2023-10-11 16:33:35,367][85176] Updated weights for policy 0, policy_version 38562 (0.0008) +[2023-10-11 16:33:35,733][85176] Updated weights for policy 0, policy_version 38572 (0.0009) +[2023-10-11 16:33:36,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 79560704. Throughput: 0: 1653.1, 1: 1705.5. Samples: 19896132. 
Policy #0 lag: (min: 20.0, avg: 28.0, max: 52.0) +[2023-10-11 16:33:36,064][84230] Avg episode reward: [(0, '7.650'), (1, '7.800')] +[2023-10-11 16:33:36,109][85176] Updated weights for policy 0, policy_version 38582 (0.0008) +[2023-10-11 16:33:36,488][85176] Updated weights for policy 0, policy_version 38592 (0.0008) +[2023-10-11 16:33:39,210][85175] Updated weights for policy 1, policy_version 39140 (0.0009) +[2023-10-11 16:33:39,574][85175] Updated weights for policy 1, policy_version 39150 (0.0010) +[2023-10-11 16:33:39,942][85175] Updated weights for policy 1, policy_version 39160 (0.0007) +[2023-10-11 16:33:40,598][85176] Updated weights for policy 0, policy_version 38602 (0.0012) +[2023-10-11 16:33:40,986][85176] Updated weights for policy 0, policy_version 38612 (0.0007) +[2023-10-11 16:33:41,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 79626240. Throughput: 0: 1656.4, 1: 1693.5. Samples: 19916326. Policy #0 lag: (min: 20.0, avg: 28.0, max: 52.0) +[2023-10-11 16:33:41,063][84230] Avg episode reward: [(0, '7.820'), (1, '8.000')] +[2023-10-11 16:33:41,362][85176] Updated weights for policy 0, policy_version 38622 (0.0008) +[2023-10-11 16:33:43,958][85175] Updated weights for policy 1, policy_version 39170 (0.0007) +[2023-10-11 16:33:44,319][85175] Updated weights for policy 1, policy_version 39180 (0.0008) +[2023-10-11 16:33:44,688][85175] Updated weights for policy 1, policy_version 39190 (0.0007) +[2023-10-11 16:33:45,051][85175] Updated weights for policy 1, policy_version 39200 (0.0008) +[2023-10-11 16:33:45,315][85176] Updated weights for policy 0, policy_version 38632 (0.0008) +[2023-10-11 16:33:45,690][85176] Updated weights for policy 0, policy_version 38642 (0.0007) +[2023-10-11 16:33:46,061][85176] Updated weights for policy 0, policy_version 38652 (0.0007) +[2023-10-11 16:33:46,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13329.3). Total num frames: 79691776. 
Throughput: 0: 1647.9, 1: 1679.9. Samples: 19935808. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-11 16:33:46,064][84230] Avg episode reward: [(0, '7.900'), (1, '7.690')] +[2023-10-11 16:33:49,179][85175] Updated weights for policy 1, policy_version 39210 (0.0008) +[2023-10-11 16:33:49,538][85175] Updated weights for policy 1, policy_version 39220 (0.0009) +[2023-10-11 16:33:49,902][85175] Updated weights for policy 1, policy_version 39230 (0.0009) +[2023-10-11 16:33:50,172][85176] Updated weights for policy 0, policy_version 38662 (0.0007) +[2023-10-11 16:33:50,542][85176] Updated weights for policy 0, policy_version 38672 (0.0007) +[2023-10-11 16:33:50,911][85176] Updated weights for policy 0, policy_version 38682 (0.0008) +[2023-10-11 16:33:51,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 79757312. Throughput: 0: 1660.5, 1: 1706.6. Samples: 19946646. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-11 16:33:51,064][84230] Avg episode reward: [(0, '7.870'), (1, '7.280')] +[2023-10-11 16:33:53,897][85175] Updated weights for policy 1, policy_version 39240 (0.0010) +[2023-10-11 16:33:54,261][85175] Updated weights for policy 1, policy_version 39250 (0.0007) +[2023-10-11 16:33:54,624][85175] Updated weights for policy 1, policy_version 39260 (0.0010) +[2023-10-11 16:33:54,940][85176] Updated weights for policy 0, policy_version 38692 (0.0010) +[2023-10-11 16:33:55,305][85176] Updated weights for policy 0, policy_version 38702 (0.0010) +[2023-10-11 16:33:55,675][85176] Updated weights for policy 0, policy_version 38712 (0.0012) +[2023-10-11 16:33:56,063][84230] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 79855616. Throughput: 0: 1668.5, 1: 1677.9. Samples: 19966280. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-11 16:33:56,063][84230] Avg episode reward: [(0, '7.580'), (1, '7.170')] +[2023-10-11 16:33:58,608][85175] Updated weights for policy 1, policy_version 39270 (0.0008) +[2023-10-11 16:33:58,985][85175] Updated weights for policy 1, policy_version 39280 (0.0009) +[2023-10-11 16:33:59,347][85175] Updated weights for policy 1, policy_version 39290 (0.0009) +[2023-10-11 16:33:59,774][85176] Updated weights for policy 0, policy_version 38722 (0.0008) +[2023-10-11 16:34:00,149][85176] Updated weights for policy 0, policy_version 38732 (0.0007) +[2023-10-11 16:34:00,519][85176] Updated weights for policy 0, policy_version 38742 (0.0009) +[2023-10-11 16:34:00,896][85176] Updated weights for policy 0, policy_version 38752 (0.0009) +[2023-10-11 16:34:01,063][84230] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 79921152. Throughput: 0: 1654.1, 1: 1689.6. Samples: 19985928. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-11 16:34:01,064][84230] Avg episode reward: [(0, '7.420'), (1, '6.990')] +[2023-10-11 16:34:03,263][85175] Updated weights for policy 1, policy_version 39300 (0.0009) +[2023-10-11 16:34:03,632][85175] Updated weights for policy 1, policy_version 39310 (0.0009) +[2023-10-11 16:34:04,010][85175] Updated weights for policy 1, policy_version 39320 (0.0009) +[2023-10-11 16:34:05,114][85176] Updated weights for policy 0, policy_version 38762 (0.0008) +[2023-10-11 16:34:05,488][85176] Updated weights for policy 0, policy_version 38772 (0.0009) +[2023-10-11 16:34:05,854][85176] Updated weights for policy 0, policy_version 38782 (0.0009) +[2023-10-11 16:34:06,063][84230] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 79986688. Throughput: 0: 1673.1, 1: 1693.7. Samples: 19996898. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-11 16:34:06,063][84230] Avg episode reward: [(0, '7.870'), (1, '8.020')] +[2023-10-11 16:34:07,977][85175] Updated weights for policy 1, policy_version 39330 (0.0010) +[2023-10-11 16:34:08,343][85175] Updated weights for policy 1, policy_version 39340 (0.0008) +[2023-10-11 16:34:08,726][85175] Updated weights for policy 1, policy_version 39350 (0.0008) +[2023-10-11 16:34:09,091][85175] Updated weights for policy 1, policy_version 39360 (0.0008) +[2023-10-11 16:34:09,702][85176] Updated weights for policy 0, policy_version 38792 (0.0008) +[2023-10-11 16:34:10,068][85176] Updated weights for policy 0, policy_version 38802 (0.0007) +[2023-10-11 16:34:10,443][85176] Updated weights for policy 0, policy_version 38812 (0.0009) +[2023-10-11 16:34:11,063][84230] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 80052224. Throughput: 0: 1673.6, 1: 1677.6. Samples: 20016718. Policy #0 lag: (min: 3.0, avg: 10.8, max: 35.0) +[2023-10-11 16:34:11,064][84230] Avg episode reward: [(0, '7.450'), (1, '8.030')] +[2023-10-11 16:34:13,134][85175] Updated weights for policy 1, policy_version 39370 (0.0007) +[2023-10-11 16:34:13,504][85175] Updated weights for policy 1, policy_version 39380 (0.0008) +[2023-10-11 16:34:13,868][85175] Updated weights for policy 1, policy_version 39390 (0.0010) +[2023-10-11 16:34:14,680][85176] Updated weights for policy 0, policy_version 38822 (0.0010) +[2023-10-11 16:34:15,059][85176] Updated weights for policy 0, policy_version 38832 (0.0010) +[2023-10-11 16:34:15,416][85176] Updated weights for policy 0, policy_version 38842 (0.0008) +[2023-10-11 16:34:16,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 80117760. Throughput: 0: 1648.3, 1: 1704.5. Samples: 20036452. 
Policy #0 lag: (min: 3.0, avg: 10.8, max: 35.0) +[2023-10-11 16:34:16,063][84230] Avg episode reward: [(0, '7.750'), (1, '8.250')] +[2023-10-11 16:34:16,072][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000039392_40337408.pth... +[2023-10-11 16:34:16,072][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000038848_39780352.pth... +[2023-10-11 16:34:16,107][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000037824_38731776.pth +[2023-10-11 16:34:16,111][85000] Saving a milestone ./train_atari/atari_frostbite_APPO/checkpoint_p1/milestones/checkpoint_000039392_40337408.pth +[2023-10-11 16:34:16,112][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000037280_38174720.pth +[2023-10-11 16:34:16,116][84801] Saving a milestone ./train_atari/atari_frostbite_APPO/checkpoint_p0/milestones/checkpoint_000038848_39780352.pth +[2023-10-11 16:34:17,857][85175] Updated weights for policy 1, policy_version 39400 (0.0007) +[2023-10-11 16:34:18,216][85175] Updated weights for policy 1, policy_version 39410 (0.0007) +[2023-10-11 16:34:18,590][85175] Updated weights for policy 1, policy_version 39420 (0.0007) +[2023-10-11 16:34:19,397][85176] Updated weights for policy 0, policy_version 38852 (0.0010) +[2023-10-11 16:34:19,776][85176] Updated weights for policy 0, policy_version 38862 (0.0008) +[2023-10-11 16:34:20,145][85176] Updated weights for policy 0, policy_version 38872 (0.0010) +[2023-10-11 16:34:21,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 80183296. Throughput: 0: 1672.6, 1: 1682.5. Samples: 20047110. 
Policy #0 lag: (min: 3.0, avg: 10.8, max: 35.0) +[2023-10-11 16:34:21,063][84230] Avg episode reward: [(0, '7.750'), (1, '7.800')] +[2023-10-11 16:34:22,427][85175] Updated weights for policy 1, policy_version 39430 (0.0008) +[2023-10-11 16:34:22,791][85175] Updated weights for policy 1, policy_version 39440 (0.0010) +[2023-10-11 16:34:23,153][85175] Updated weights for policy 1, policy_version 39450 (0.0009) +[2023-10-11 16:34:24,306][85176] Updated weights for policy 0, policy_version 38882 (0.0011) +[2023-10-11 16:34:24,675][85176] Updated weights for policy 0, policy_version 38892 (0.0010) +[2023-10-11 16:34:25,053][85176] Updated weights for policy 0, policy_version 38902 (0.0009) +[2023-10-11 16:34:25,426][85176] Updated weights for policy 0, policy_version 38912 (0.0009) +[2023-10-11 16:34:26,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 80248832. Throughput: 0: 1662.5, 1: 1691.1. Samples: 20067238. Policy #0 lag: (min: 3.0, avg: 10.8, max: 35.0) +[2023-10-11 16:34:26,063][84230] Avg episode reward: [(0, '7.720'), (1, '7.420')] +[2023-10-11 16:34:27,272][85175] Updated weights for policy 1, policy_version 39460 (0.0007) +[2023-10-11 16:34:27,667][85175] Updated weights for policy 1, policy_version 39470 (0.0009) +[2023-10-11 16:34:28,041][85175] Updated weights for policy 1, policy_version 39480 (0.0009) +[2023-10-11 16:34:29,531][85176] Updated weights for policy 0, policy_version 38922 (0.0008) +[2023-10-11 16:34:29,905][85176] Updated weights for policy 0, policy_version 38932 (0.0007) +[2023-10-11 16:34:30,272][85176] Updated weights for policy 0, policy_version 38942 (0.0009) +[2023-10-11 16:34:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 80314368. Throughput: 0: 1649.6, 1: 1710.4. Samples: 20087010. 
Policy #0 lag: (min: 3.0, avg: 10.8, max: 35.0) +[2023-10-11 16:34:31,063][84230] Avg episode reward: [(0, '7.870'), (1, '6.980')] +[2023-10-11 16:34:32,194][85175] Updated weights for policy 1, policy_version 39490 (0.0010) +[2023-10-11 16:34:32,554][85175] Updated weights for policy 1, policy_version 39500 (0.0008) +[2023-10-11 16:34:32,933][85175] Updated weights for policy 1, policy_version 39510 (0.0007) +[2023-10-11 16:34:33,295][85175] Updated weights for policy 1, policy_version 39520 (0.0009) +[2023-10-11 16:34:34,433][85176] Updated weights for policy 0, policy_version 38952 (0.0011) +[2023-10-11 16:34:34,802][85176] Updated weights for policy 0, policy_version 38962 (0.0011) +[2023-10-11 16:34:35,169][85176] Updated weights for policy 0, policy_version 38972 (0.0011) +[2023-10-11 16:34:36,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 80379904. Throughput: 0: 1668.6, 1: 1681.6. Samples: 20097406. Policy #0 lag: (min: 16.0, avg: 31.1, max: 48.0) +[2023-10-11 16:34:36,063][84230] Avg episode reward: [(0, '7.690'), (1, '7.300')] +[2023-10-11 16:34:37,208][85175] Updated weights for policy 1, policy_version 39530 (0.0008) +[2023-10-11 16:34:37,584][85175] Updated weights for policy 1, policy_version 39540 (0.0011) +[2023-10-11 16:34:37,958][85175] Updated weights for policy 1, policy_version 39550 (0.0008) +[2023-10-11 16:34:39,387][85176] Updated weights for policy 0, policy_version 38982 (0.0008) +[2023-10-11 16:34:39,767][85176] Updated weights for policy 0, policy_version 38992 (0.0007) +[2023-10-11 16:34:40,141][85176] Updated weights for policy 0, policy_version 39002 (0.0009) +[2023-10-11 16:34:41,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 80445440. Throughput: 0: 1654.6, 1: 1706.3. Samples: 20117518. 
Policy #0 lag: (min: 16.0, avg: 31.1, max: 48.0) +[2023-10-11 16:34:41,063][84230] Avg episode reward: [(0, '7.580'), (1, '7.450')] +[2023-10-11 16:34:42,107][85175] Updated weights for policy 1, policy_version 39560 (0.0007) +[2023-10-11 16:34:42,487][85175] Updated weights for policy 1, policy_version 39570 (0.0007) +[2023-10-11 16:34:42,847][85175] Updated weights for policy 1, policy_version 39580 (0.0010) +[2023-10-11 16:34:44,261][85176] Updated weights for policy 0, policy_version 39012 (0.0008) +[2023-10-11 16:34:44,637][85176] Updated weights for policy 0, policy_version 39022 (0.0010) +[2023-10-11 16:34:45,010][85176] Updated weights for policy 0, policy_version 39032 (0.0008) +[2023-10-11 16:34:46,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 80510976. Throughput: 0: 1655.2, 1: 1706.9. Samples: 20137222. Policy #0 lag: (min: 16.0, avg: 31.1, max: 48.0) +[2023-10-11 16:34:46,063][84230] Avg episode reward: [(0, '7.600'), (1, '7.700')] +[2023-10-11 16:34:46,923][85175] Updated weights for policy 1, policy_version 39590 (0.0007) +[2023-10-11 16:34:47,287][85175] Updated weights for policy 1, policy_version 39600 (0.0007) +[2023-10-11 16:34:47,642][85175] Updated weights for policy 1, policy_version 39610 (0.0008) +[2023-10-11 16:34:49,109][85176] Updated weights for policy 0, policy_version 39042 (0.0010) +[2023-10-11 16:34:49,477][85176] Updated weights for policy 0, policy_version 39052 (0.0008) +[2023-10-11 16:34:49,854][85176] Updated weights for policy 0, policy_version 39062 (0.0010) +[2023-10-11 16:34:50,222][85176] Updated weights for policy 0, policy_version 39072 (0.0007) +[2023-10-11 16:34:51,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 80576512. Throughput: 0: 1662.4, 1: 1684.1. Samples: 20147490. 
Policy #0 lag: (min: 16.0, avg: 31.1, max: 48.0) +[2023-10-11 16:34:51,063][84230] Avg episode reward: [(0, '7.600'), (1, '7.890')] +[2023-10-11 16:34:51,767][85175] Updated weights for policy 1, policy_version 39620 (0.0007) +[2023-10-11 16:34:52,145][85175] Updated weights for policy 1, policy_version 39630 (0.0008) +[2023-10-11 16:34:52,514][85175] Updated weights for policy 1, policy_version 39640 (0.0009) +[2023-10-11 16:34:54,267][85176] Updated weights for policy 0, policy_version 39082 (0.0009) +[2023-10-11 16:34:54,638][85176] Updated weights for policy 0, policy_version 39092 (0.0011) +[2023-10-11 16:34:55,009][85176] Updated weights for policy 0, policy_version 39102 (0.0008) +[2023-10-11 16:34:56,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 80642048. Throughput: 0: 1646.7, 1: 1704.2. Samples: 20167508. Policy #0 lag: (min: 16.0, avg: 31.1, max: 48.0) +[2023-10-11 16:34:56,063][84230] Avg episode reward: [(0, '8.050'), (1, '7.480')] +[2023-10-11 16:34:56,374][85175] Updated weights for policy 1, policy_version 39650 (0.0010) +[2023-10-11 16:34:56,743][85175] Updated weights for policy 1, policy_version 39660 (0.0009) +[2023-10-11 16:34:57,113][85175] Updated weights for policy 1, policy_version 39670 (0.0008) +[2023-10-11 16:34:57,490][85175] Updated weights for policy 1, policy_version 39680 (0.0007) +[2023-10-11 16:34:59,083][85176] Updated weights for policy 0, policy_version 39112 (0.0008) +[2023-10-11 16:34:59,451][85176] Updated weights for policy 0, policy_version 39122 (0.0009) +[2023-10-11 16:34:59,826][85176] Updated weights for policy 0, policy_version 39132 (0.0009) +[2023-10-11 16:35:01,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 80707584. Throughput: 0: 1660.8, 1: 1703.5. Samples: 20187848. 
Policy #0 lag: (min: 16.0, avg: 31.1, max: 48.0) +[2023-10-11 16:35:01,064][84230] Avg episode reward: [(0, '7.900'), (1, '7.480')] +[2023-10-11 16:35:01,452][85175] Updated weights for policy 1, policy_version 39690 (0.0007) +[2023-10-11 16:35:01,822][85175] Updated weights for policy 1, policy_version 39700 (0.0007) +[2023-10-11 16:35:02,186][85175] Updated weights for policy 1, policy_version 39710 (0.0007) +[2023-10-11 16:35:04,096][85176] Updated weights for policy 0, policy_version 39142 (0.0009) +[2023-10-11 16:35:04,470][85176] Updated weights for policy 0, policy_version 39152 (0.0008) +[2023-10-11 16:35:04,843][85176] Updated weights for policy 0, policy_version 39162 (0.0008) +[2023-10-11 16:35:06,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 80773120. Throughput: 0: 1657.9, 1: 1697.4. Samples: 20198100. Policy #0 lag: (min: 16.0, avg: 36.3, max: 48.0) +[2023-10-11 16:35:06,063][84230] Avg episode reward: [(0, '7.590'), (1, '7.480')] +[2023-10-11 16:35:06,182][85175] Updated weights for policy 1, policy_version 39720 (0.0008) +[2023-10-11 16:35:06,539][85175] Updated weights for policy 1, policy_version 39730 (0.0009) +[2023-10-11 16:35:06,900][85175] Updated weights for policy 1, policy_version 39740 (0.0008) +[2023-10-11 16:35:08,972][85176] Updated weights for policy 0, policy_version 39172 (0.0009) +[2023-10-11 16:35:09,355][85176] Updated weights for policy 0, policy_version 39182 (0.0009) +[2023-10-11 16:35:09,717][85176] Updated weights for policy 0, policy_version 39192 (0.0008) +[2023-10-11 16:35:10,853][85175] Updated weights for policy 1, policy_version 39750 (0.0008) +[2023-10-11 16:35:11,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 80838656. Throughput: 0: 1648.3, 1: 1706.3. Samples: 20218192. 
Policy #0 lag: (min: 16.0, avg: 36.3, max: 48.0) +[2023-10-11 16:35:11,063][84230] Avg episode reward: [(0, '7.400'), (1, '7.350')] +[2023-10-11 16:35:11,224][85175] Updated weights for policy 1, policy_version 39760 (0.0009) +[2023-10-11 16:35:11,601][85175] Updated weights for policy 1, policy_version 39770 (0.0009) +[2023-10-11 16:35:13,941][85176] Updated weights for policy 0, policy_version 39202 (0.0007) +[2023-10-11 16:35:14,326][85176] Updated weights for policy 0, policy_version 39212 (0.0009) +[2023-10-11 16:35:14,689][85176] Updated weights for policy 0, policy_version 39222 (0.0007) +[2023-10-11 16:35:15,070][85176] Updated weights for policy 0, policy_version 39232 (0.0008) +[2023-10-11 16:35:15,661][85175] Updated weights for policy 1, policy_version 39780 (0.0010) +[2023-10-11 16:35:16,057][85175] Updated weights for policy 1, policy_version 39790 (0.0007) +[2023-10-11 16:35:16,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 80904192. Throughput: 0: 1662.7, 1: 1703.2. Samples: 20238472. Policy #0 lag: (min: 16.0, avg: 36.3, max: 48.0) +[2023-10-11 16:35:16,063][84230] Avg episode reward: [(0, '7.280'), (1, '7.420')] +[2023-10-11 16:35:16,426][85175] Updated weights for policy 1, policy_version 39800 (0.0008) +[2023-10-11 16:35:19,179][85176] Updated weights for policy 0, policy_version 39242 (0.0010) +[2023-10-11 16:35:19,563][85176] Updated weights for policy 0, policy_version 39252 (0.0009) +[2023-10-11 16:35:19,946][85176] Updated weights for policy 0, policy_version 39262 (0.0007) +[2023-10-11 16:35:20,361][85175] Updated weights for policy 1, policy_version 39810 (0.0010) +[2023-10-11 16:35:20,721][85175] Updated weights for policy 1, policy_version 39820 (0.0008) +[2023-10-11 16:35:21,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 80969728. Throughput: 0: 1660.0, 1: 1702.6. Samples: 20248726. 
Policy #0 lag: (min: 16.0, avg: 36.3, max: 48.0) +[2023-10-11 16:35:21,063][84230] Avg episode reward: [(0, '7.750'), (1, '8.100')] +[2023-10-11 16:35:21,094][85175] Updated weights for policy 1, policy_version 39830 (0.0007) +[2023-10-11 16:35:21,452][85175] Updated weights for policy 1, policy_version 39840 (0.0010) +[2023-10-11 16:35:23,877][85176] Updated weights for policy 0, policy_version 39272 (0.0007) +[2023-10-11 16:35:24,249][85176] Updated weights for policy 0, policy_version 39282 (0.0007) +[2023-10-11 16:35:24,619][85176] Updated weights for policy 0, policy_version 39292 (0.0007) +[2023-10-11 16:35:25,556][85175] Updated weights for policy 1, policy_version 39850 (0.0008) +[2023-10-11 16:35:25,916][85175] Updated weights for policy 1, policy_version 39860 (0.0009) +[2023-10-11 16:35:26,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 81035264. Throughput: 0: 1656.4, 1: 1700.8. Samples: 20268590. Policy #0 lag: (min: 16.0, avg: 36.3, max: 48.0) +[2023-10-11 16:35:26,063][84230] Avg episode reward: [(0, '7.750'), (1, '8.060')] +[2023-10-11 16:35:26,288][85175] Updated weights for policy 1, policy_version 39870 (0.0008) +[2023-10-11 16:35:28,423][85176] Updated weights for policy 0, policy_version 39302 (0.0007) +[2023-10-11 16:35:28,800][85176] Updated weights for policy 0, policy_version 39312 (0.0009) +[2023-10-11 16:35:29,176][85176] Updated weights for policy 0, policy_version 39322 (0.0009) +[2023-10-11 16:35:30,408][85175] Updated weights for policy 1, policy_version 39880 (0.0007) +[2023-10-11 16:35:30,781][85175] Updated weights for policy 1, policy_version 39890 (0.0007) +[2023-10-11 16:35:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 81100800. Throughput: 0: 1674.8, 1: 1691.2. Samples: 20288694. 
Policy #0 lag: (min: 16.0, avg: 36.3, max: 48.0) +[2023-10-11 16:35:31,063][84230] Avg episode reward: [(0, '7.600'), (1, '7.320')] +[2023-10-11 16:35:31,147][85175] Updated weights for policy 1, policy_version 39900 (0.0007) +[2023-10-11 16:35:33,096][85176] Updated weights for policy 0, policy_version 39332 (0.0010) +[2023-10-11 16:35:33,476][85176] Updated weights for policy 0, policy_version 39342 (0.0010) +[2023-10-11 16:35:33,849][85176] Updated weights for policy 0, policy_version 39352 (0.0009) +[2023-10-11 16:35:35,260][85175] Updated weights for policy 1, policy_version 39910 (0.0009) +[2023-10-11 16:35:35,634][85175] Updated weights for policy 1, policy_version 39920 (0.0010) +[2023-10-11 16:35:35,990][85175] Updated weights for policy 1, policy_version 39930 (0.0009) +[2023-10-11 16:35:36,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 81166336. Throughput: 0: 1668.8, 1: 1701.4. Samples: 20299150. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:35:36,063][84230] Avg episode reward: [(0, '7.430'), (1, '7.030')] +[2023-10-11 16:35:38,027][85176] Updated weights for policy 0, policy_version 39362 (0.0009) +[2023-10-11 16:35:38,388][85176] Updated weights for policy 0, policy_version 39372 (0.0008) +[2023-10-11 16:35:38,768][85176] Updated weights for policy 0, policy_version 39382 (0.0009) +[2023-10-11 16:35:39,133][85176] Updated weights for policy 0, policy_version 39392 (0.0010) +[2023-10-11 16:35:40,200][85175] Updated weights for policy 1, policy_version 39940 (0.0009) +[2023-10-11 16:35:40,571][85175] Updated weights for policy 1, policy_version 39950 (0.0008) +[2023-10-11 16:35:40,948][85175] Updated weights for policy 1, policy_version 39960 (0.0009) +[2023-10-11 16:35:41,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 81231872. Throughput: 0: 1673.3, 1: 1695.5. Samples: 20319104. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:35:41,063][84230] Avg episode reward: [(0, '7.530'), (1, '7.540')] +[2023-10-11 16:35:43,316][85176] Updated weights for policy 0, policy_version 39402 (0.0008) +[2023-10-11 16:35:43,684][85176] Updated weights for policy 0, policy_version 39412 (0.0009) +[2023-10-11 16:35:44,054][85176] Updated weights for policy 0, policy_version 39422 (0.0011) +[2023-10-11 16:35:44,779][85175] Updated weights for policy 1, policy_version 39970 (0.0008) +[2023-10-11 16:35:45,148][85175] Updated weights for policy 1, policy_version 39980 (0.0009) +[2023-10-11 16:35:45,527][85175] Updated weights for policy 1, policy_version 39990 (0.0011) +[2023-10-11 16:35:45,893][85175] Updated weights for policy 1, policy_version 40000 (0.0010) +[2023-10-11 16:35:46,063][84230] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 81330176. Throughput: 0: 1688.2, 1: 1675.4. Samples: 20339210. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:35:46,063][84230] Avg episode reward: [(0, '8.120'), (1, '7.960')] +[2023-10-11 16:35:48,081][85176] Updated weights for policy 0, policy_version 39432 (0.0009) +[2023-10-11 16:35:48,454][85176] Updated weights for policy 0, policy_version 39442 (0.0007) +[2023-10-11 16:35:48,828][85176] Updated weights for policy 0, policy_version 39452 (0.0007) +[2023-10-11 16:35:49,957][85175] Updated weights for policy 1, policy_version 40010 (0.0007) +[2023-10-11 16:35:50,315][85175] Updated weights for policy 1, policy_version 40020 (0.0009) +[2023-10-11 16:35:50,680][85175] Updated weights for policy 1, policy_version 40030 (0.0009) +[2023-10-11 16:35:51,063][84230] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 81395712. Throughput: 0: 1673.5, 1: 1694.3. Samples: 20349648. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:35:51,063][84230] Avg episode reward: [(0, '8.020'), (1, '7.900')] +[2023-10-11 16:35:52,862][85176] Updated weights for policy 0, policy_version 39462 (0.0008) +[2023-10-11 16:35:53,236][85176] Updated weights for policy 0, policy_version 39472 (0.0008) +[2023-10-11 16:35:53,618][85176] Updated weights for policy 0, policy_version 39482 (0.0008) +[2023-10-11 16:35:54,561][85175] Updated weights for policy 1, policy_version 40040 (0.0007) +[2023-10-11 16:35:54,929][85175] Updated weights for policy 1, policy_version 40050 (0.0007) +[2023-10-11 16:35:55,295][85175] Updated weights for policy 1, policy_version 40060 (0.0008) +[2023-10-11 16:35:56,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 81461248. Throughput: 0: 1678.1, 1: 1689.9. Samples: 20369750. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:35:56,063][84230] Avg episode reward: [(0, '8.040'), (1, '7.730')] +[2023-10-11 16:35:57,739][85176] Updated weights for policy 0, policy_version 39492 (0.0007) +[2023-10-11 16:35:58,116][85176] Updated weights for policy 0, policy_version 39502 (0.0007) +[2023-10-11 16:35:58,492][85176] Updated weights for policy 0, policy_version 39512 (0.0007) +[2023-10-11 16:35:59,390][85175] Updated weights for policy 1, policy_version 40070 (0.0007) +[2023-10-11 16:35:59,750][85175] Updated weights for policy 1, policy_version 40080 (0.0007) +[2023-10-11 16:36:00,126][85175] Updated weights for policy 1, policy_version 40090 (0.0009) +[2023-10-11 16:36:01,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 81526784. Throughput: 0: 1688.8, 1: 1667.9. Samples: 20389520. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:36:01,063][84230] Avg episode reward: [(0, '7.140'), (1, '7.310')] +[2023-10-11 16:36:02,555][85176] Updated weights for policy 0, policy_version 39522 (0.0009) +[2023-10-11 16:36:02,932][85176] Updated weights for policy 0, policy_version 39532 (0.0007) +[2023-10-11 16:36:03,311][85176] Updated weights for policy 0, policy_version 39542 (0.0009) +[2023-10-11 16:36:03,677][85176] Updated weights for policy 0, policy_version 39552 (0.0008) +[2023-10-11 16:36:04,314][85175] Updated weights for policy 1, policy_version 40100 (0.0009) +[2023-10-11 16:36:04,731][85175] Updated weights for policy 1, policy_version 40110 (0.0007) +[2023-10-11 16:36:05,099][85175] Updated weights for policy 1, policy_version 40120 (0.0007) +[2023-10-11 16:36:06,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 81592320. Throughput: 0: 1662.7, 1: 1699.0. Samples: 20400000. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-10-11 16:36:06,064][84230] Avg episode reward: [(0, '7.430'), (1, '7.670')] +[2023-10-11 16:36:07,948][85176] Updated weights for policy 0, policy_version 39562 (0.0009) +[2023-10-11 16:36:08,324][85176] Updated weights for policy 0, policy_version 39572 (0.0010) +[2023-10-11 16:36:08,696][85176] Updated weights for policy 0, policy_version 39582 (0.0010) +[2023-10-11 16:36:09,043][85175] Updated weights for policy 1, policy_version 40130 (0.0009) +[2023-10-11 16:36:09,407][85175] Updated weights for policy 1, policy_version 40140 (0.0009) +[2023-10-11 16:36:09,783][85175] Updated weights for policy 1, policy_version 40150 (0.0007) +[2023-10-11 16:36:10,152][85175] Updated weights for policy 1, policy_version 40160 (0.0009) +[2023-10-11 16:36:11,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 81657856. Throughput: 0: 1675.5, 1: 1682.7. Samples: 20419706. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-10-11 16:36:11,063][84230] Avg episode reward: [(0, '8.170'), (1, '7.450')] +[2023-10-11 16:36:12,624][85176] Updated weights for policy 0, policy_version 39592 (0.0008) +[2023-10-11 16:36:13,002][85176] Updated weights for policy 0, policy_version 39602 (0.0009) +[2023-10-11 16:36:13,373][85176] Updated weights for policy 0, policy_version 39612 (0.0009) +[2023-10-11 16:36:14,172][85175] Updated weights for policy 1, policy_version 40170 (0.0010) +[2023-10-11 16:36:14,547][85175] Updated weights for policy 1, policy_version 40180 (0.0008) +[2023-10-11 16:36:14,911][85175] Updated weights for policy 1, policy_version 40190 (0.0008) +[2023-10-11 16:36:16,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 81723392. Throughput: 0: 1676.4, 1: 1677.9. Samples: 20439636. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-10-11 16:36:16,064][84230] Avg episode reward: [(0, '8.300'), (1, '7.780')] +[2023-10-11 16:36:16,077][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000040192_41156608.pth... +[2023-10-11 16:36:16,077][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000039616_40566784.pth... 
+[2023-10-11 16:36:16,106][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000038592_39518208.pth +[2023-10-11 16:36:16,117][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000038080_38993920.pth +[2023-10-11 16:36:17,533][85176] Updated weights for policy 0, policy_version 39622 (0.0008) +[2023-10-11 16:36:17,906][85176] Updated weights for policy 0, policy_version 39632 (0.0007) +[2023-10-11 16:36:18,277][85176] Updated weights for policy 0, policy_version 39642 (0.0009) +[2023-10-11 16:36:19,015][85175] Updated weights for policy 1, policy_version 40200 (0.0008) +[2023-10-11 16:36:19,387][85175] Updated weights for policy 1, policy_version 40210 (0.0008) +[2023-10-11 16:36:19,748][85175] Updated weights for policy 1, policy_version 40220 (0.0008) +[2023-10-11 16:36:21,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 81788928. Throughput: 0: 1654.5, 1: 1698.4. Samples: 20450032. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-10-11 16:36:21,064][84230] Avg episode reward: [(0, '7.860'), (1, '7.870')] +[2023-10-11 16:36:22,131][85176] Updated weights for policy 0, policy_version 39652 (0.0008) +[2023-10-11 16:36:22,501][85176] Updated weights for policy 0, policy_version 39662 (0.0007) +[2023-10-11 16:36:22,874][85176] Updated weights for policy 0, policy_version 39672 (0.0007) +[2023-10-11 16:36:23,743][85175] Updated weights for policy 1, policy_version 40230 (0.0010) +[2023-10-11 16:36:24,104][85175] Updated weights for policy 1, policy_version 40240 (0.0008) +[2023-10-11 16:36:24,476][85175] Updated weights for policy 1, policy_version 40250 (0.0008) +[2023-10-11 16:36:26,063][84230] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 81854464. Throughput: 0: 1675.3, 1: 1681.6. Samples: 20470164. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-10-11 16:36:26,063][84230] Avg episode reward: [(0, '7.750'), (1, '7.350')] +[2023-10-11 16:36:26,952][85176] Updated weights for policy 0, policy_version 39682 (0.0007) +[2023-10-11 16:36:27,337][85176] Updated weights for policy 0, policy_version 39692 (0.0007) +[2023-10-11 16:36:27,699][85176] Updated weights for policy 0, policy_version 39702 (0.0007) +[2023-10-11 16:36:28,082][85176] Updated weights for policy 0, policy_version 39712 (0.0007) +[2023-10-11 16:36:28,331][85175] Updated weights for policy 1, policy_version 40260 (0.0008) +[2023-10-11 16:36:28,708][85175] Updated weights for policy 1, policy_version 40270 (0.0007) +[2023-10-11 16:36:29,071][85175] Updated weights for policy 1, policy_version 40280 (0.0009) +[2023-10-11 16:36:31,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 81920000. Throughput: 0: 1676.5, 1: 1695.6. Samples: 20490958. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) +[2023-10-11 16:36:31,063][84230] Avg episode reward: [(0, '7.450'), (1, '6.970')] +[2023-10-11 16:36:32,173][85176] Updated weights for policy 0, policy_version 39722 (0.0007) +[2023-10-11 16:36:32,557][85176] Updated weights for policy 0, policy_version 39732 (0.0008) +[2023-10-11 16:36:32,941][85176] Updated weights for policy 0, policy_version 39742 (0.0010) +[2023-10-11 16:36:33,151][85175] Updated weights for policy 1, policy_version 40290 (0.0008) +[2023-10-11 16:36:33,526][85175] Updated weights for policy 1, policy_version 40300 (0.0010) +[2023-10-11 16:36:33,892][85175] Updated weights for policy 1, policy_version 40310 (0.0009) +[2023-10-11 16:36:34,260][85175] Updated weights for policy 1, policy_version 40320 (0.0008) +[2023-10-11 16:36:36,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 81985536. Throughput: 0: 1664.5, 1: 1694.6. Samples: 20500806. 
Policy #0 lag: (min: 12.0, avg: 18.4, max: 44.0) +[2023-10-11 16:36:36,063][84230] Avg episode reward: [(0, '7.590'), (1, '7.550')] +[2023-10-11 16:36:37,016][85176] Updated weights for policy 0, policy_version 39752 (0.0008) +[2023-10-11 16:36:37,381][85176] Updated weights for policy 0, policy_version 39762 (0.0007) +[2023-10-11 16:36:37,754][85176] Updated weights for policy 0, policy_version 39772 (0.0007) +[2023-10-11 16:36:38,295][85175] Updated weights for policy 1, policy_version 40330 (0.0009) +[2023-10-11 16:36:38,664][85175] Updated weights for policy 1, policy_version 40340 (0.0007) +[2023-10-11 16:36:39,042][85175] Updated weights for policy 1, policy_version 40350 (0.0010) +[2023-10-11 16:36:41,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 82051072. Throughput: 0: 1681.7, 1: 1676.8. Samples: 20520884. Policy #0 lag: (min: 12.0, avg: 18.4, max: 44.0) +[2023-10-11 16:36:41,063][84230] Avg episode reward: [(0, '7.740'), (1, '8.060')] +[2023-10-11 16:36:41,773][85176] Updated weights for policy 0, policy_version 39782 (0.0008) +[2023-10-11 16:36:42,142][85176] Updated weights for policy 0, policy_version 39792 (0.0009) +[2023-10-11 16:36:42,518][85176] Updated weights for policy 0, policy_version 39802 (0.0008) +[2023-10-11 16:36:42,868][85175] Updated weights for policy 1, policy_version 40360 (0.0010) +[2023-10-11 16:36:43,233][85175] Updated weights for policy 1, policy_version 40370 (0.0009) +[2023-10-11 16:36:43,603][85175] Updated weights for policy 1, policy_version 40380 (0.0008) +[2023-10-11 16:36:46,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 82116608. Throughput: 0: 1681.9, 1: 1704.8. Samples: 20541924. 
Policy #0 lag: (min: 12.0, avg: 18.4, max: 44.0) +[2023-10-11 16:36:46,063][84230] Avg episode reward: [(0, '8.010'), (1, '7.960')] +[2023-10-11 16:36:46,569][85176] Updated weights for policy 0, policy_version 39812 (0.0009) +[2023-10-11 16:36:46,939][85176] Updated weights for policy 0, policy_version 39822 (0.0007) +[2023-10-11 16:36:47,317][85176] Updated weights for policy 0, policy_version 39832 (0.0008) +[2023-10-11 16:36:47,801][85175] Updated weights for policy 1, policy_version 40390 (0.0009) +[2023-10-11 16:36:48,176][85175] Updated weights for policy 1, policy_version 40400 (0.0010) +[2023-10-11 16:36:48,547][85175] Updated weights for policy 1, policy_version 40410 (0.0008) +[2023-10-11 16:36:51,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 82182144. Throughput: 0: 1680.0, 1: 1679.4. Samples: 20551176. Policy #0 lag: (min: 12.0, avg: 18.4, max: 44.0) +[2023-10-11 16:36:51,064][84230] Avg episode reward: [(0, '7.860'), (1, '7.350')] +[2023-10-11 16:36:51,531][85176] Updated weights for policy 0, policy_version 39842 (0.0010) +[2023-10-11 16:36:51,897][85176] Updated weights for policy 0, policy_version 39852 (0.0008) +[2023-10-11 16:36:52,281][85176] Updated weights for policy 0, policy_version 39862 (0.0009) +[2023-10-11 16:36:52,646][85176] Updated weights for policy 0, policy_version 39872 (0.0008) +[2023-10-11 16:36:52,655][85175] Updated weights for policy 1, policy_version 40420 (0.0009) +[2023-10-11 16:36:53,029][85175] Updated weights for policy 1, policy_version 40430 (0.0008) +[2023-10-11 16:36:53,395][85175] Updated weights for policy 1, policy_version 40440 (0.0010) +[2023-10-11 16:36:56,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 82247680. Throughput: 0: 1683.5, 1: 1692.9. Samples: 20571642. 
Policy #0 lag: (min: 12.0, avg: 18.4, max: 44.0) +[2023-10-11 16:36:56,063][84230] Avg episode reward: [(0, '7.560'), (1, '7.130')] +[2023-10-11 16:36:56,615][85176] Updated weights for policy 0, policy_version 39882 (0.0010) +[2023-10-11 16:36:56,986][85176] Updated weights for policy 0, policy_version 39892 (0.0010) +[2023-10-11 16:36:57,360][85176] Updated weights for policy 0, policy_version 39902 (0.0008) +[2023-10-11 16:36:57,483][85175] Updated weights for policy 1, policy_version 40450 (0.0008) +[2023-10-11 16:36:57,905][85175] Updated weights for policy 1, policy_version 40460 (0.0007) +[2023-10-11 16:36:58,274][85175] Updated weights for policy 1, policy_version 40470 (0.0007) +[2023-10-11 16:36:58,638][85175] Updated weights for policy 1, policy_version 40480 (0.0010) +[2023-10-11 16:37:01,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 82313216. Throughput: 0: 1689.0, 1: 1709.3. Samples: 20592560. Policy #0 lag: (min: 12.0, avg: 18.4, max: 44.0) +[2023-10-11 16:37:01,064][84230] Avg episode reward: [(0, '7.600'), (1, '7.580')] +[2023-10-11 16:37:01,518][85176] Updated weights for policy 0, policy_version 39912 (0.0007) +[2023-10-11 16:37:01,894][85176] Updated weights for policy 0, policy_version 39922 (0.0007) +[2023-10-11 16:37:02,264][85176] Updated weights for policy 0, policy_version 39932 (0.0009) +[2023-10-11 16:37:02,478][85175] Updated weights for policy 1, policy_version 40490 (0.0007) +[2023-10-11 16:37:02,837][85175] Updated weights for policy 1, policy_version 40500 (0.0008) +[2023-10-11 16:37:03,201][85175] Updated weights for policy 1, policy_version 40510 (0.0010) +[2023-10-11 16:37:06,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 82378752. Throughput: 0: 1687.5, 1: 1683.2. Samples: 20601714. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-11 16:37:06,064][84230] Avg episode reward: [(0, '7.600'), (1, '7.740')] +[2023-10-11 16:37:06,363][85176] Updated weights for policy 0, policy_version 39942 (0.0007) +[2023-10-11 16:37:06,727][85176] Updated weights for policy 0, policy_version 39952 (0.0008) +[2023-10-11 16:37:07,095][85176] Updated weights for policy 0, policy_version 39962 (0.0008) +[2023-10-11 16:37:07,177][85175] Updated weights for policy 1, policy_version 40520 (0.0007) +[2023-10-11 16:37:07,541][85175] Updated weights for policy 1, policy_version 40530 (0.0008) +[2023-10-11 16:37:07,914][85175] Updated weights for policy 1, policy_version 40540 (0.0008) +[2023-10-11 16:37:11,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 82444288. Throughput: 0: 1681.9, 1: 1704.4. Samples: 20622550. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-11 16:37:11,063][84230] Avg episode reward: [(0, '7.750'), (1, '7.790')] +[2023-10-11 16:37:11,263][85176] Updated weights for policy 0, policy_version 39972 (0.0009) +[2023-10-11 16:37:11,640][85176] Updated weights for policy 0, policy_version 39982 (0.0008) +[2023-10-11 16:37:11,727][85175] Updated weights for policy 1, policy_version 40550 (0.0008) +[2023-10-11 16:37:12,023][85176] Updated weights for policy 0, policy_version 39992 (0.0008) +[2023-10-11 16:37:12,094][85175] Updated weights for policy 1, policy_version 40560 (0.0007) +[2023-10-11 16:37:12,456][85175] Updated weights for policy 1, policy_version 40570 (0.0009) +[2023-10-11 16:37:16,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 82509824. Throughput: 0: 1675.2, 1: 1704.0. Samples: 20643024. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-11 16:37:16,064][84230] Avg episode reward: [(0, '7.600'), (1, '7.530')] +[2023-10-11 16:37:16,147][85176] Updated weights for policy 0, policy_version 40002 (0.0009) +[2023-10-11 16:37:16,502][85176] Updated weights for policy 0, policy_version 40012 (0.0008) +[2023-10-11 16:37:16,630][85175] Updated weights for policy 1, policy_version 40580 (0.0009) +[2023-10-11 16:37:16,871][85176] Updated weights for policy 0, policy_version 40022 (0.0008) +[2023-10-11 16:37:16,998][85175] Updated weights for policy 1, policy_version 40590 (0.0009) +[2023-10-11 16:37:17,240][85176] Updated weights for policy 0, policy_version 40032 (0.0007) +[2023-10-11 16:37:17,365][85175] Updated weights for policy 1, policy_version 40600 (0.0009) +[2023-10-11 16:37:21,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 82575360. Throughput: 0: 1675.7, 1: 1685.5. Samples: 20652062. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-11 16:37:21,064][84230] Avg episode reward: [(0, '7.570'), (1, '7.320')] +[2023-10-11 16:37:21,366][85176] Updated weights for policy 0, policy_version 40042 (0.0008) +[2023-10-11 16:37:21,523][85175] Updated weights for policy 1, policy_version 40610 (0.0007) +[2023-10-11 16:37:21,738][85176] Updated weights for policy 0, policy_version 40052 (0.0008) +[2023-10-11 16:37:21,885][85175] Updated weights for policy 1, policy_version 40620 (0.0008) +[2023-10-11 16:37:22,111][85176] Updated weights for policy 0, policy_version 40062 (0.0009) +[2023-10-11 16:37:22,256][85175] Updated weights for policy 1, policy_version 40630 (0.0008) +[2023-10-11 16:37:22,623][85175] Updated weights for policy 1, policy_version 40640 (0.0008) +[2023-10-11 16:37:26,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 82640896. Throughput: 0: 1666.7, 1: 1701.8. Samples: 20672468. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-11 16:37:26,064][84230] Avg episode reward: [(0, '7.570'), (1, '7.740')] +[2023-10-11 16:37:26,260][85176] Updated weights for policy 0, policy_version 40072 (0.0009) +[2023-10-11 16:37:26,630][85176] Updated weights for policy 0, policy_version 40082 (0.0010) +[2023-10-11 16:37:26,650][85175] Updated weights for policy 1, policy_version 40650 (0.0007) +[2023-10-11 16:37:26,994][85176] Updated weights for policy 0, policy_version 40092 (0.0010) +[2023-10-11 16:37:27,023][85175] Updated weights for policy 1, policy_version 40660 (0.0008) +[2023-10-11 16:37:27,386][85175] Updated weights for policy 1, policy_version 40670 (0.0008) +[2023-10-11 16:37:31,039][85176] Updated weights for policy 0, policy_version 40102 (0.0009) +[2023-10-11 16:37:31,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 82706432. Throughput: 0: 1660.5, 1: 1698.6. Samples: 20693086. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-11 16:37:31,063][84230] Avg episode reward: [(0, '7.780'), (1, '7.770')] +[2023-10-11 16:37:31,414][85176] Updated weights for policy 0, policy_version 40112 (0.0008) +[2023-10-11 16:37:31,536][85175] Updated weights for policy 1, policy_version 40680 (0.0009) +[2023-10-11 16:37:31,783][85176] Updated weights for policy 0, policy_version 40122 (0.0007) +[2023-10-11 16:37:31,904][85175] Updated weights for policy 1, policy_version 40690 (0.0008) +[2023-10-11 16:37:32,264][85175] Updated weights for policy 1, policy_version 40700 (0.0009) +[2023-10-11 16:37:35,898][85176] Updated weights for policy 0, policy_version 40132 (0.0007) +[2023-10-11 16:37:36,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 82771968. Throughput: 0: 1660.9, 1: 1694.4. Samples: 20702164. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-11 16:37:36,063][84230] Avg episode reward: [(0, '7.780'), (1, '7.550')] +[2023-10-11 16:37:36,259][85175] Updated weights for policy 1, policy_version 40710 (0.0007) +[2023-10-11 16:37:36,279][85176] Updated weights for policy 0, policy_version 40142 (0.0007) +[2023-10-11 16:37:36,629][85175] Updated weights for policy 1, policy_version 40720 (0.0009) +[2023-10-11 16:37:36,646][85176] Updated weights for policy 0, policy_version 40152 (0.0009) +[2023-10-11 16:37:36,996][85175] Updated weights for policy 1, policy_version 40730 (0.0007) +[2023-10-11 16:37:40,721][85176] Updated weights for policy 0, policy_version 40162 (0.0008) +[2023-10-11 16:37:40,986][85175] Updated weights for policy 1, policy_version 40740 (0.0008) +[2023-10-11 16:37:41,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 82837504. Throughput: 0: 1662.5, 1: 1698.2. Samples: 20722872. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-11 16:37:41,063][84230] Avg episode reward: [(0, '7.880'), (1, '7.710')] +[2023-10-11 16:37:41,095][85176] Updated weights for policy 0, policy_version 40172 (0.0008) +[2023-10-11 16:37:41,351][85175] Updated weights for policy 1, policy_version 40750 (0.0009) +[2023-10-11 16:37:41,472][85176] Updated weights for policy 0, policy_version 40182 (0.0007) +[2023-10-11 16:37:41,725][85175] Updated weights for policy 1, policy_version 40760 (0.0009) +[2023-10-11 16:37:41,843][85176] Updated weights for policy 0, policy_version 40192 (0.0007) +[2023-10-11 16:37:45,928][85176] Updated weights for policy 0, policy_version 40202 (0.0009) +[2023-10-11 16:37:45,928][85175] Updated weights for policy 1, policy_version 40770 (0.0009) +[2023-10-11 16:37:46,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 82903040. Throughput: 0: 1655.6, 1: 1692.4. Samples: 20743220. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-11 16:37:46,064][84230] Avg episode reward: [(0, '7.730'), (1, '7.540')] +[2023-10-11 16:37:46,298][85176] Updated weights for policy 0, policy_version 40212 (0.0008) +[2023-10-11 16:37:46,334][85175] Updated weights for policy 1, policy_version 40780 (0.0010) +[2023-10-11 16:37:46,666][85176] Updated weights for policy 0, policy_version 40222 (0.0009) +[2023-10-11 16:37:46,712][85175] Updated weights for policy 1, policy_version 40790 (0.0008) +[2023-10-11 16:37:47,082][85175] Updated weights for policy 1, policy_version 40800 (0.0008) +[2023-10-11 16:37:50,832][85176] Updated weights for policy 0, policy_version 40232 (0.0008) +[2023-10-11 16:37:51,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 82968576. Throughput: 0: 1656.3, 1: 1681.8. Samples: 20751930. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-11 16:37:51,064][84230] Avg episode reward: [(0, '7.870'), (1, '7.170')] +[2023-10-11 16:37:51,193][85175] Updated weights for policy 1, policy_version 40810 (0.0007) +[2023-10-11 16:37:51,209][85176] Updated weights for policy 0, policy_version 40242 (0.0009) +[2023-10-11 16:37:51,558][85175] Updated weights for policy 1, policy_version 40820 (0.0008) +[2023-10-11 16:37:51,578][85176] Updated weights for policy 0, policy_version 40252 (0.0007) +[2023-10-11 16:37:51,929][85175] Updated weights for policy 1, policy_version 40830 (0.0010) +[2023-10-11 16:37:55,703][85176] Updated weights for policy 0, policy_version 40262 (0.0007) +[2023-10-11 16:37:55,881][85175] Updated weights for policy 1, policy_version 40840 (0.0008) +[2023-10-11 16:37:56,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 83034112. Throughput: 0: 1654.2, 1: 1679.7. Samples: 20772576. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-11 16:37:56,063][84230] Avg episode reward: [(0, '8.170'), (1, '7.460')] +[2023-10-11 16:37:56,074][85176] Updated weights for policy 0, policy_version 40272 (0.0009) +[2023-10-11 16:37:56,257][85175] Updated weights for policy 1, policy_version 40850 (0.0008) +[2023-10-11 16:37:56,457][85176] Updated weights for policy 0, policy_version 40282 (0.0009) +[2023-10-11 16:37:56,617][85175] Updated weights for policy 1, policy_version 40860 (0.0007) +[2023-10-11 16:38:00,584][85176] Updated weights for policy 0, policy_version 40292 (0.0007) +[2023-10-11 16:38:00,853][85175] Updated weights for policy 1, policy_version 40870 (0.0007) +[2023-10-11 16:38:00,956][85176] Updated weights for policy 0, policy_version 40302 (0.0007) +[2023-10-11 16:38:01,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 83099648. Throughput: 0: 1654.7, 1: 1679.8. Samples: 20793076. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) +[2023-10-11 16:38:01,063][84230] Avg episode reward: [(0, '8.310'), (1, '7.830')] +[2023-10-11 16:38:01,229][85175] Updated weights for policy 1, policy_version 40880 (0.0009) +[2023-10-11 16:38:01,323][85176] Updated weights for policy 0, policy_version 40312 (0.0008) +[2023-10-11 16:38:01,604][85175] Updated weights for policy 1, policy_version 40890 (0.0009) +[2023-10-11 16:38:05,452][85176] Updated weights for policy 0, policy_version 40322 (0.0008) +[2023-10-11 16:38:05,652][85175] Updated weights for policy 1, policy_version 40900 (0.0008) +[2023-10-11 16:38:05,829][85176] Updated weights for policy 0, policy_version 40332 (0.0008) +[2023-10-11 16:38:06,031][85175] Updated weights for policy 1, policy_version 40910 (0.0010) +[2023-10-11 16:38:06,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 83165184. Throughput: 0: 1656.0, 1: 1681.2. Samples: 20802232. 
Policy #0 lag: (min: 0.0, avg: 23.4, max: 32.0) +[2023-10-11 16:38:06,063][84230] Avg episode reward: [(0, '7.660'), (1, '7.800')] +[2023-10-11 16:38:06,210][85176] Updated weights for policy 0, policy_version 40342 (0.0009) +[2023-10-11 16:38:06,398][85175] Updated weights for policy 1, policy_version 40920 (0.0008) +[2023-10-11 16:38:06,585][85176] Updated weights for policy 0, policy_version 40352 (0.0009) +[2023-10-11 16:38:10,464][85175] Updated weights for policy 1, policy_version 40930 (0.0008) +[2023-10-11 16:38:10,809][85176] Updated weights for policy 0, policy_version 40362 (0.0007) +[2023-10-11 16:38:10,829][85175] Updated weights for policy 1, policy_version 40940 (0.0007) +[2023-10-11 16:38:11,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 83230720. Throughput: 0: 1661.1, 1: 1678.9. Samples: 20822766. Policy #0 lag: (min: 0.0, avg: 23.4, max: 32.0) +[2023-10-11 16:38:11,063][84230] Avg episode reward: [(0, '7.170'), (1, '7.550')] +[2023-10-11 16:38:11,185][85176] Updated weights for policy 0, policy_version 40372 (0.0009) +[2023-10-11 16:38:11,188][85175] Updated weights for policy 1, policy_version 40950 (0.0008) +[2023-10-11 16:38:11,555][85175] Updated weights for policy 1, policy_version 40960 (0.0008) +[2023-10-11 16:38:11,556][85176] Updated weights for policy 0, policy_version 40382 (0.0008) +[2023-10-11 16:38:15,569][85175] Updated weights for policy 1, policy_version 40970 (0.0008) +[2023-10-11 16:38:15,584][85176] Updated weights for policy 0, policy_version 40392 (0.0008) +[2023-10-11 16:38:15,943][85175] Updated weights for policy 1, policy_version 40980 (0.0008) +[2023-10-11 16:38:15,952][85176] Updated weights for policy 0, policy_version 40402 (0.0008) +[2023-10-11 16:38:16,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 83296256. Throughput: 0: 1655.9, 1: 1664.1. Samples: 20842488. 
Policy #0 lag: (min: 0.0, avg: 23.4, max: 32.0) +[2023-10-11 16:38:16,063][84230] Avg episode reward: [(0, '7.240'), (1, '7.990')] +[2023-10-11 16:38:16,310][85175] Updated weights for policy 1, policy_version 40990 (0.0009) +[2023-10-11 16:38:16,325][85176] Updated weights for policy 0, policy_version 40412 (0.0009) +[2023-10-11 16:38:16,383][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000040992_41975808.pth... +[2023-10-11 16:38:16,411][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000039392_40337408.pth +[2023-10-11 16:38:16,464][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000040416_41385984.pth... +[2023-10-11 16:38:16,502][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000038848_39780352.pth +[2023-10-11 16:38:20,477][85175] Updated weights for policy 1, policy_version 41000 (0.0007) +[2023-10-11 16:38:20,505][85176] Updated weights for policy 0, policy_version 40422 (0.0008) +[2023-10-11 16:38:20,843][85175] Updated weights for policy 1, policy_version 41010 (0.0009) +[2023-10-11 16:38:20,869][85176] Updated weights for policy 0, policy_version 40432 (0.0009) +[2023-10-11 16:38:21,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 83361792. Throughput: 0: 1660.8, 1: 1670.7. Samples: 20852080. 
Policy #0 lag: (min: 0.0, avg: 23.4, max: 32.0) +[2023-10-11 16:38:21,063][84230] Avg episode reward: [(0, '7.710'), (1, '7.510')] +[2023-10-11 16:38:21,208][85175] Updated weights for policy 1, policy_version 41020 (0.0008) +[2023-10-11 16:38:21,243][85176] Updated weights for policy 0, policy_version 40442 (0.0007) +[2023-10-11 16:38:25,295][85176] Updated weights for policy 0, policy_version 40452 (0.0009) +[2023-10-11 16:38:25,363][85175] Updated weights for policy 1, policy_version 41030 (0.0007) +[2023-10-11 16:38:25,677][85176] Updated weights for policy 0, policy_version 40462 (0.0009) +[2023-10-11 16:38:25,725][85175] Updated weights for policy 1, policy_version 41040 (0.0008) +[2023-10-11 16:38:26,044][85176] Updated weights for policy 0, policy_version 40472 (0.0007) +[2023-10-11 16:38:26,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 83427328. Throughput: 0: 1662.8, 1: 1667.5. Samples: 20872738. Policy #0 lag: (min: 0.0, avg: 23.4, max: 32.0) +[2023-10-11 16:38:26,063][84230] Avg episode reward: [(0, '8.010'), (1, '7.420')] +[2023-10-11 16:38:26,105][85175] Updated weights for policy 1, policy_version 41050 (0.0007) +[2023-10-11 16:38:30,077][85175] Updated weights for policy 1, policy_version 41060 (0.0008) +[2023-10-11 16:38:30,269][85176] Updated weights for policy 0, policy_version 40482 (0.0007) +[2023-10-11 16:38:30,438][85175] Updated weights for policy 1, policy_version 41070 (0.0009) +[2023-10-11 16:38:30,640][85176] Updated weights for policy 0, policy_version 40492 (0.0007) +[2023-10-11 16:38:30,813][85175] Updated weights for policy 1, policy_version 41080 (0.0009) +[2023-10-11 16:38:31,013][85176] Updated weights for policy 0, policy_version 40502 (0.0007) +[2023-10-11 16:38:31,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 83492864. Throughput: 0: 1658.5, 1: 1658.3. Samples: 20892474. 
Policy #0 lag: (min: 0.0, avg: 23.4, max: 32.0) +[2023-10-11 16:38:31,063][84230] Avg episode reward: [(0, '8.140'), (1, '7.610')] +[2023-10-11 16:38:31,379][85176] Updated weights for policy 0, policy_version 40512 (0.0008) +[2023-10-11 16:38:35,041][85175] Updated weights for policy 1, policy_version 41090 (0.0008) +[2023-10-11 16:38:35,439][85175] Updated weights for policy 1, policy_version 41100 (0.0008) +[2023-10-11 16:38:35,641][85176] Updated weights for policy 0, policy_version 40522 (0.0010) +[2023-10-11 16:38:35,811][85175] Updated weights for policy 1, policy_version 41110 (0.0008) +[2023-10-11 16:38:36,013][85176] Updated weights for policy 0, policy_version 40532 (0.0007) +[2023-10-11 16:38:36,070][84230] Fps is (10 sec: 13097.0, 60 sec: 13105.5, 300 sec: 13329.0). Total num frames: 83558400. Throughput: 0: 1663.9, 1: 1675.7. Samples: 20902238. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:38:36,071][84230] Avg episode reward: [(0, '7.690'), (1, '8.480')] +[2023-10-11 16:38:36,171][85175] Updated weights for policy 1, policy_version 41120 (0.0008) +[2023-10-11 16:38:36,387][85176] Updated weights for policy 0, policy_version 40542 (0.0008) +[2023-10-11 16:38:40,185][85175] Updated weights for policy 1, policy_version 41130 (0.0009) +[2023-10-11 16:38:40,527][85176] Updated weights for policy 0, policy_version 40552 (0.0010) +[2023-10-11 16:38:40,545][85175] Updated weights for policy 1, policy_version 41140 (0.0008) +[2023-10-11 16:38:40,904][85176] Updated weights for policy 0, policy_version 40562 (0.0008) +[2023-10-11 16:38:40,909][85175] Updated weights for policy 1, policy_version 41150 (0.0009) +[2023-10-11 16:38:41,062][84230] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 83656704. Throughput: 0: 1659.0, 1: 1676.0. Samples: 20922650. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:38:41,063][84230] Avg episode reward: [(0, '7.450'), (1, '7.960')] +[2023-10-11 16:38:41,284][85176] Updated weights for policy 0, policy_version 40572 (0.0009) +[2023-10-11 16:38:44,998][85175] Updated weights for policy 1, policy_version 41160 (0.0009) +[2023-10-11 16:38:45,359][85175] Updated weights for policy 1, policy_version 41170 (0.0009) +[2023-10-11 16:38:45,443][85176] Updated weights for policy 0, policy_version 40582 (0.0009) +[2023-10-11 16:38:45,735][85175] Updated weights for policy 1, policy_version 41180 (0.0008) +[2023-10-11 16:38:45,820][85176] Updated weights for policy 0, policy_version 40592 (0.0009) +[2023-10-11 16:38:46,063][84230] Fps is (10 sec: 16396.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 83722240. Throughput: 0: 1652.8, 1: 1658.9. Samples: 20942104. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:38:46,064][84230] Avg episode reward: [(0, '7.590'), (1, '7.410')] +[2023-10-11 16:38:46,196][85176] Updated weights for policy 0, policy_version 40602 (0.0008) +[2023-10-11 16:38:49,832][85175] Updated weights for policy 1, policy_version 41190 (0.0007) +[2023-10-11 16:38:50,197][85175] Updated weights for policy 1, policy_version 41200 (0.0008) +[2023-10-11 16:38:50,259][85176] Updated weights for policy 0, policy_version 40612 (0.0008) +[2023-10-11 16:38:50,569][85175] Updated weights for policy 1, policy_version 41210 (0.0007) +[2023-10-11 16:38:50,629][85176] Updated weights for policy 0, policy_version 40622 (0.0008) +[2023-10-11 16:38:51,012][85176] Updated weights for policy 0, policy_version 40632 (0.0009) +[2023-10-11 16:38:51,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 83787776. Throughput: 0: 1660.1, 1: 1677.9. Samples: 20952442. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:38:51,063][84230] Avg episode reward: [(0, '8.020'), (1, '7.610')] +[2023-10-11 16:38:54,366][85175] Updated weights for policy 1, policy_version 41220 (0.0007) +[2023-10-11 16:38:54,731][85175] Updated weights for policy 1, policy_version 41230 (0.0009) +[2023-10-11 16:38:55,085][85176] Updated weights for policy 0, policy_version 40642 (0.0011) +[2023-10-11 16:38:55,095][85175] Updated weights for policy 1, policy_version 41240 (0.0008) +[2023-10-11 16:38:55,468][85176] Updated weights for policy 0, policy_version 40652 (0.0008) +[2023-10-11 16:38:55,849][85176] Updated weights for policy 0, policy_version 40662 (0.0009) +[2023-10-11 16:38:56,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 83853312. Throughput: 0: 1658.4, 1: 1677.8. Samples: 20972896. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:38:56,063][84230] Avg episode reward: [(0, '7.970'), (1, '7.870')] +[2023-10-11 16:38:56,215][85176] Updated weights for policy 0, policy_version 40672 (0.0007) +[2023-10-11 16:38:59,154][85175] Updated weights for policy 1, policy_version 41250 (0.0007) +[2023-10-11 16:38:59,521][85175] Updated weights for policy 1, policy_version 41260 (0.0007) +[2023-10-11 16:38:59,884][85175] Updated weights for policy 1, policy_version 41270 (0.0008) +[2023-10-11 16:39:00,250][85175] Updated weights for policy 1, policy_version 41280 (0.0009) +[2023-10-11 16:39:00,289][85176] Updated weights for policy 0, policy_version 40682 (0.0008) +[2023-10-11 16:39:00,665][85176] Updated weights for policy 0, policy_version 40692 (0.0010) +[2023-10-11 16:39:01,039][85176] Updated weights for policy 0, policy_version 40702 (0.0011) +[2023-10-11 16:39:01,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 83918848. Throughput: 0: 1651.3, 1: 1667.1. Samples: 20991816. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:39:01,063][84230] Avg episode reward: [(0, '7.540'), (1, '7.740')] +[2023-10-11 16:39:04,357][85175] Updated weights for policy 1, policy_version 41290 (0.0009) +[2023-10-11 16:39:04,737][85175] Updated weights for policy 1, policy_version 41300 (0.0009) +[2023-10-11 16:39:05,101][85175] Updated weights for policy 1, policy_version 41310 (0.0008) +[2023-10-11 16:39:05,185][85176] Updated weights for policy 0, policy_version 40712 (0.0007) +[2023-10-11 16:39:05,571][85176] Updated weights for policy 0, policy_version 40722 (0.0011) +[2023-10-11 16:39:05,927][85176] Updated weights for policy 0, policy_version 40732 (0.0008) +[2023-10-11 16:39:06,063][84230] Fps is (10 sec: 13106.7, 60 sec: 13653.2, 300 sec: 13329.3). Total num frames: 83984384. Throughput: 0: 1659.5, 1: 1691.9. Samples: 21002894. Policy #0 lag: (min: 31.0, avg: 36.9, max: 63.0) +[2023-10-11 16:39:06,064][84230] Avg episode reward: [(0, '7.330'), (1, '7.550')] +[2023-10-11 16:39:09,021][85175] Updated weights for policy 1, policy_version 41320 (0.0010) +[2023-10-11 16:39:09,388][85175] Updated weights for policy 1, policy_version 41330 (0.0010) +[2023-10-11 16:39:09,754][85175] Updated weights for policy 1, policy_version 41340 (0.0008) +[2023-10-11 16:39:09,930][85176] Updated weights for policy 0, policy_version 40742 (0.0008) +[2023-10-11 16:39:10,300][85176] Updated weights for policy 0, policy_version 40752 (0.0009) +[2023-10-11 16:39:10,670][85176] Updated weights for policy 0, policy_version 40762 (0.0007) +[2023-10-11 16:39:11,063][84230] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 84082688. Throughput: 0: 1657.8, 1: 1678.8. Samples: 21022886. 
Policy #0 lag: (min: 31.0, avg: 36.9, max: 63.0) +[2023-10-11 16:39:11,064][84230] Avg episode reward: [(0, '7.270'), (1, '7.000')] +[2023-10-11 16:39:13,770][85175] Updated weights for policy 1, policy_version 41350 (0.0010) +[2023-10-11 16:39:14,134][85175] Updated weights for policy 1, policy_version 41360 (0.0012) +[2023-10-11 16:39:14,499][85175] Updated weights for policy 1, policy_version 41370 (0.0011) +[2023-10-11 16:39:14,733][85176] Updated weights for policy 0, policy_version 40772 (0.0009) +[2023-10-11 16:39:15,115][85176] Updated weights for policy 0, policy_version 40782 (0.0007) +[2023-10-11 16:39:15,497][85176] Updated weights for policy 0, policy_version 40792 (0.0008) +[2023-10-11 16:39:16,063][84230] Fps is (10 sec: 16384.2, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 84148224. Throughput: 0: 1644.4, 1: 1682.6. Samples: 21042192. Policy #0 lag: (min: 31.0, avg: 36.9, max: 63.0) +[2023-10-11 16:39:16,064][84230] Avg episode reward: [(0, '7.320'), (1, '7.610')] +[2023-10-11 16:39:18,698][85175] Updated weights for policy 1, policy_version 41380 (0.0008) +[2023-10-11 16:39:19,068][85175] Updated weights for policy 1, policy_version 41390 (0.0008) +[2023-10-11 16:39:19,440][85175] Updated weights for policy 1, policy_version 41400 (0.0008) +[2023-10-11 16:39:19,559][85176] Updated weights for policy 0, policy_version 40802 (0.0007) +[2023-10-11 16:39:19,930][85176] Updated weights for policy 0, policy_version 40812 (0.0008) +[2023-10-11 16:39:20,303][85176] Updated weights for policy 0, policy_version 40822 (0.0009) +[2023-10-11 16:39:20,672][85176] Updated weights for policy 0, policy_version 40832 (0.0008) +[2023-10-11 16:39:21,063][84230] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 84213760. Throughput: 0: 1663.6, 1: 1699.6. Samples: 21053556. 
Policy #0 lag: (min: 31.0, avg: 36.9, max: 63.0) +[2023-10-11 16:39:21,063][84230] Avg episode reward: [(0, '8.070'), (1, '7.870')] +[2023-10-11 16:39:23,342][85175] Updated weights for policy 1, policy_version 41410 (0.0009) +[2023-10-11 16:39:23,709][85175] Updated weights for policy 1, policy_version 41420 (0.0007) +[2023-10-11 16:39:24,072][85175] Updated weights for policy 1, policy_version 41430 (0.0010) +[2023-10-11 16:39:24,441][85175] Updated weights for policy 1, policy_version 41440 (0.0009) +[2023-10-11 16:39:24,758][85176] Updated weights for policy 0, policy_version 40842 (0.0010) +[2023-10-11 16:39:25,135][85176] Updated weights for policy 0, policy_version 40852 (0.0009) +[2023-10-11 16:39:25,505][85176] Updated weights for policy 0, policy_version 40862 (0.0008) +[2023-10-11 16:39:26,063][84230] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 84279296. Throughput: 0: 1664.3, 1: 1676.0. Samples: 21072962. Policy #0 lag: (min: 31.0, avg: 36.9, max: 63.0) +[2023-10-11 16:39:26,063][84230] Avg episode reward: [(0, '7.660'), (1, '7.230')] +[2023-10-11 16:39:28,581][85175] Updated weights for policy 1, policy_version 41450 (0.0009) +[2023-10-11 16:39:28,950][85175] Updated weights for policy 1, policy_version 41460 (0.0009) +[2023-10-11 16:39:29,315][85175] Updated weights for policy 1, policy_version 41470 (0.0007) +[2023-10-11 16:39:29,561][85176] Updated weights for policy 0, policy_version 40872 (0.0009) +[2023-10-11 16:39:29,932][85176] Updated weights for policy 0, policy_version 40882 (0.0011) +[2023-10-11 16:39:30,310][85176] Updated weights for policy 0, policy_version 40892 (0.0008) +[2023-10-11 16:39:31,062][84230] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 84344832. Throughput: 0: 1651.9, 1: 1696.9. Samples: 21092798. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 33.0) +[2023-10-11 16:39:31,063][84230] Avg episode reward: [(0, '7.460'), (1, '7.230')] +[2023-10-11 16:39:33,244][85175] Updated weights for policy 1, policy_version 41480 (0.0008) +[2023-10-11 16:39:33,620][85175] Updated weights for policy 1, policy_version 41490 (0.0007) +[2023-10-11 16:39:33,984][85175] Updated weights for policy 1, policy_version 41500 (0.0007) +[2023-10-11 16:39:34,469][85176] Updated weights for policy 0, policy_version 40902 (0.0008) +[2023-10-11 16:39:34,848][85176] Updated weights for policy 0, policy_version 40912 (0.0008) +[2023-10-11 16:39:35,230][85176] Updated weights for policy 0, policy_version 40922 (0.0009) +[2023-10-11 16:39:36,062][84230] Fps is (10 sec: 13107.3, 60 sec: 14201.3, 300 sec: 13440.4). Total num frames: 84410368. Throughput: 0: 1671.1, 1: 1690.1. Samples: 21103696. Policy #0 lag: (min: 1.0, avg: 9.8, max: 33.0) +[2023-10-11 16:39:36,063][84230] Avg episode reward: [(0, '7.070'), (1, '7.780')] +[2023-10-11 16:39:37,870][85175] Updated weights for policy 1, policy_version 41510 (0.0008) +[2023-10-11 16:39:38,237][85175] Updated weights for policy 1, policy_version 41520 (0.0007) +[2023-10-11 16:39:38,607][85175] Updated weights for policy 1, policy_version 41530 (0.0008) +[2023-10-11 16:39:39,265][85176] Updated weights for policy 0, policy_version 40932 (0.0009) +[2023-10-11 16:39:39,643][85176] Updated weights for policy 0, policy_version 40942 (0.0009) +[2023-10-11 16:39:40,007][85176] Updated weights for policy 0, policy_version 40952 (0.0008) +[2023-10-11 16:39:41,063][84230] Fps is (10 sec: 13106.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 84475904. Throughput: 0: 1664.9, 1: 1681.6. Samples: 21123490. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 33.0) +[2023-10-11 16:39:41,064][84230] Avg episode reward: [(0, '7.510'), (1, '8.000')] +[2023-10-11 16:39:42,760][85175] Updated weights for policy 1, policy_version 41540 (0.0009) +[2023-10-11 16:39:43,133][85175] Updated weights for policy 1, policy_version 41550 (0.0010) +[2023-10-11 16:39:43,497][85175] Updated weights for policy 1, policy_version 41560 (0.0009) +[2023-10-11 16:39:44,069][85176] Updated weights for policy 0, policy_version 40962 (0.0009) +[2023-10-11 16:39:44,467][85176] Updated weights for policy 0, policy_version 40972 (0.0007) +[2023-10-11 16:39:44,829][85176] Updated weights for policy 0, policy_version 40982 (0.0008) +[2023-10-11 16:39:45,198][85176] Updated weights for policy 0, policy_version 40992 (0.0007) +[2023-10-11 16:39:46,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 84541440. Throughput: 0: 1665.6, 1: 1703.9. Samples: 21143442. Policy #0 lag: (min: 1.0, avg: 9.8, max: 33.0) +[2023-10-11 16:39:46,064][84230] Avg episode reward: [(0, '7.750'), (1, '7.670')] +[2023-10-11 16:39:47,553][85175] Updated weights for policy 1, policy_version 41570 (0.0010) +[2023-10-11 16:39:47,916][85175] Updated weights for policy 1, policy_version 41580 (0.0009) +[2023-10-11 16:39:48,288][85175] Updated weights for policy 1, policy_version 41590 (0.0008) +[2023-10-11 16:39:48,658][85175] Updated weights for policy 1, policy_version 41600 (0.0009) +[2023-10-11 16:39:49,158][85176] Updated weights for policy 0, policy_version 41002 (0.0008) +[2023-10-11 16:39:49,522][85176] Updated weights for policy 0, policy_version 41012 (0.0009) +[2023-10-11 16:39:49,891][85176] Updated weights for policy 0, policy_version 41022 (0.0008) +[2023-10-11 16:39:51,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 84606976. Throughput: 0: 1682.6, 1: 1674.2. Samples: 21153950. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 33.0) +[2023-10-11 16:39:51,064][84230] Avg episode reward: [(0, '7.690'), (1, '7.220')] +[2023-10-11 16:39:52,949][85175] Updated weights for policy 1, policy_version 41610 (0.0009) +[2023-10-11 16:39:53,315][85175] Updated weights for policy 1, policy_version 41620 (0.0008) +[2023-10-11 16:39:53,675][85175] Updated weights for policy 1, policy_version 41630 (0.0009) +[2023-10-11 16:39:53,696][85176] Updated weights for policy 0, policy_version 41032 (0.0008) +[2023-10-11 16:39:54,079][85176] Updated weights for policy 0, policy_version 41042 (0.0008) +[2023-10-11 16:39:54,448][85176] Updated weights for policy 0, policy_version 41052 (0.0010) +[2023-10-11 16:39:56,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 84672512. Throughput: 0: 1662.4, 1: 1678.5. Samples: 21173226. Policy #0 lag: (min: 1.0, avg: 9.8, max: 33.0) +[2023-10-11 16:39:56,063][84230] Avg episode reward: [(0, '7.470'), (1, '7.670')] +[2023-10-11 16:39:57,616][85175] Updated weights for policy 1, policy_version 41640 (0.0007) +[2023-10-11 16:39:57,987][85175] Updated weights for policy 1, policy_version 41650 (0.0010) +[2023-10-11 16:39:58,359][85175] Updated weights for policy 1, policy_version 41660 (0.0008) +[2023-10-11 16:39:58,386][85176] Updated weights for policy 0, policy_version 41062 (0.0008) +[2023-10-11 16:39:58,758][85176] Updated weights for policy 0, policy_version 41072 (0.0011) +[2023-10-11 16:39:59,129][85176] Updated weights for policy 0, policy_version 41082 (0.0010) +[2023-10-11 16:40:01,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 84738048. Throughput: 0: 1683.5, 1: 1696.5. Samples: 21194290. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 16:40:01,064][84230] Avg episode reward: [(0, '7.500'), (1, '9.000')] +[2023-10-11 16:40:01,076][85000] Saving new best policy, reward=9.000! 
+[2023-10-11 16:40:02,410][85175] Updated weights for policy 1, policy_version 41670 (0.0008) +[2023-10-11 16:40:02,789][85175] Updated weights for policy 1, policy_version 41680 (0.0008) +[2023-10-11 16:40:03,154][85175] Updated weights for policy 1, policy_version 41690 (0.0009) +[2023-10-11 16:40:03,231][85176] Updated weights for policy 0, policy_version 41092 (0.0010) +[2023-10-11 16:40:03,609][85176] Updated weights for policy 0, policy_version 41102 (0.0009) +[2023-10-11 16:40:03,970][85176] Updated weights for policy 0, policy_version 41112 (0.0011) +[2023-10-11 16:40:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 84803584. Throughput: 0: 1678.9, 1: 1669.5. Samples: 21204234. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 16:40:06,063][84230] Avg episode reward: [(0, '7.920'), (1, '9.150')] +[2023-10-11 16:40:06,064][85000] Saving new best policy, reward=9.150! +[2023-10-11 16:40:07,290][85175] Updated weights for policy 1, policy_version 41700 (0.0009) +[2023-10-11 16:40:07,660][85175] Updated weights for policy 1, policy_version 41710 (0.0009) +[2023-10-11 16:40:08,037][85175] Updated weights for policy 1, policy_version 41720 (0.0009) +[2023-10-11 16:40:08,161][85176] Updated weights for policy 0, policy_version 41122 (0.0008) +[2023-10-11 16:40:08,539][85176] Updated weights for policy 0, policy_version 41132 (0.0008) +[2023-10-11 16:40:08,921][85176] Updated weights for policy 0, policy_version 41142 (0.0010) +[2023-10-11 16:40:09,291][85176] Updated weights for policy 0, policy_version 41152 (0.0009) +[2023-10-11 16:40:11,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 84869120. Throughput: 0: 1667.9, 1: 1689.7. Samples: 21224054. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 16:40:11,064][84230] Avg episode reward: [(0, '8.010'), (1, '8.400')] +[2023-10-11 16:40:12,190][85175] Updated weights for policy 1, policy_version 41730 (0.0009) +[2023-10-11 16:40:12,561][85175] Updated weights for policy 1, policy_version 41740 (0.0008) +[2023-10-11 16:40:12,930][85175] Updated weights for policy 1, policy_version 41750 (0.0007) +[2023-10-11 16:40:13,291][85175] Updated weights for policy 1, policy_version 41760 (0.0007) +[2023-10-11 16:40:13,362][85176] Updated weights for policy 0, policy_version 41162 (0.0008) +[2023-10-11 16:40:13,743][85176] Updated weights for policy 0, policy_version 41172 (0.0009) +[2023-10-11 16:40:14,109][85176] Updated weights for policy 0, policy_version 41182 (0.0011) +[2023-10-11 16:40:16,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 84934656. Throughput: 0: 1687.5, 1: 1687.9. Samples: 21244690. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 16:40:16,063][84230] Avg episode reward: [(0, '7.930'), (1, '7.550')] +[2023-10-11 16:40:16,070][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000041184_42172416.pth... +[2023-10-11 16:40:16,071][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000041760_42762240.pth... 
+[2023-10-11 16:40:16,105][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000040192_41156608.pth +[2023-10-11 16:40:16,114][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000039616_40566784.pth +[2023-10-11 16:40:17,409][85175] Updated weights for policy 1, policy_version 41770 (0.0008) +[2023-10-11 16:40:17,784][85175] Updated weights for policy 1, policy_version 41780 (0.0009) +[2023-10-11 16:40:18,160][85175] Updated weights for policy 1, policy_version 41790 (0.0009) +[2023-10-11 16:40:18,341][85176] Updated weights for policy 0, policy_version 41192 (0.0009) +[2023-10-11 16:40:18,715][85176] Updated weights for policy 0, policy_version 41202 (0.0009) +[2023-10-11 16:40:19,084][85176] Updated weights for policy 0, policy_version 41212 (0.0009) +[2023-10-11 16:40:21,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 85000192. Throughput: 0: 1674.2, 1: 1670.3. Samples: 21254196. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 16:40:21,063][84230] Avg episode reward: [(0, '7.690'), (1, '8.140')] +[2023-10-11 16:40:22,080][85175] Updated weights for policy 1, policy_version 41800 (0.0009) +[2023-10-11 16:40:22,443][85175] Updated weights for policy 1, policy_version 41810 (0.0009) +[2023-10-11 16:40:22,814][85175] Updated weights for policy 1, policy_version 41820 (0.0009) +[2023-10-11 16:40:23,037][85176] Updated weights for policy 0, policy_version 41222 (0.0009) +[2023-10-11 16:40:23,412][85176] Updated weights for policy 0, policy_version 41232 (0.0009) +[2023-10-11 16:40:23,776][85176] Updated weights for policy 0, policy_version 41242 (0.0007) +[2023-10-11 16:40:26,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 85065728. Throughput: 0: 1671.1, 1: 1683.0. Samples: 21274424. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 16:40:26,063][84230] Avg episode reward: [(0, '7.300'), (1, '8.770')] +[2023-10-11 16:40:26,860][85175] Updated weights for policy 1, policy_version 41830 (0.0010) +[2023-10-11 16:40:27,227][85175] Updated weights for policy 1, policy_version 41840 (0.0007) +[2023-10-11 16:40:27,590][85175] Updated weights for policy 1, policy_version 41850 (0.0007) +[2023-10-11 16:40:27,933][85176] Updated weights for policy 0, policy_version 41252 (0.0007) +[2023-10-11 16:40:28,310][85176] Updated weights for policy 0, policy_version 41262 (0.0008) +[2023-10-11 16:40:28,697][85176] Updated weights for policy 0, policy_version 41272 (0.0009) +[2023-10-11 16:40:31,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 85131264. Throughput: 0: 1685.5, 1: 1682.1. Samples: 21294984. Policy #0 lag: (min: 10.0, avg: 18.2, max: 42.0) +[2023-10-11 16:40:31,063][84230] Avg episode reward: [(0, '7.110'), (1, '9.000')] +[2023-10-11 16:40:31,627][85175] Updated weights for policy 1, policy_version 41860 (0.0008) +[2023-10-11 16:40:32,002][85175] Updated weights for policy 1, policy_version 41870 (0.0007) +[2023-10-11 16:40:32,377][85175] Updated weights for policy 1, policy_version 41880 (0.0008) +[2023-10-11 16:40:32,850][85176] Updated weights for policy 0, policy_version 41282 (0.0009) +[2023-10-11 16:40:33,227][85176] Updated weights for policy 0, policy_version 41292 (0.0007) +[2023-10-11 16:40:33,607][85176] Updated weights for policy 0, policy_version 41302 (0.0009) +[2023-10-11 16:40:33,974][85176] Updated weights for policy 0, policy_version 41312 (0.0008) +[2023-10-11 16:40:36,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 85196800. Throughput: 0: 1664.5, 1: 1680.9. Samples: 21304492. 
Policy #0 lag: (min: 10.0, avg: 18.2, max: 42.0) +[2023-10-11 16:40:36,064][84230] Avg episode reward: [(0, '7.560'), (1, '8.740')] +[2023-10-11 16:40:36,463][85175] Updated weights for policy 1, policy_version 41890 (0.0010) +[2023-10-11 16:40:36,831][85175] Updated weights for policy 1, policy_version 41900 (0.0011) +[2023-10-11 16:40:37,203][85175] Updated weights for policy 1, policy_version 41910 (0.0008) +[2023-10-11 16:40:37,572][85175] Updated weights for policy 1, policy_version 41920 (0.0009) +[2023-10-11 16:40:38,146][85176] Updated weights for policy 0, policy_version 41322 (0.0008) +[2023-10-11 16:40:38,528][85176] Updated weights for policy 0, policy_version 41332 (0.0009) +[2023-10-11 16:40:38,891][85176] Updated weights for policy 0, policy_version 41342 (0.0010) +[2023-10-11 16:40:41,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 85262336. Throughput: 0: 1673.8, 1: 1690.4. Samples: 21324618. Policy #0 lag: (min: 10.0, avg: 18.2, max: 42.0) +[2023-10-11 16:40:41,063][84230] Avg episode reward: [(0, '7.900'), (1, '7.750')] +[2023-10-11 16:40:41,668][85175] Updated weights for policy 1, policy_version 41930 (0.0008) +[2023-10-11 16:40:42,043][85175] Updated weights for policy 1, policy_version 41940 (0.0008) +[2023-10-11 16:40:42,421][85175] Updated weights for policy 1, policy_version 41950 (0.0010) +[2023-10-11 16:40:43,029][85176] Updated weights for policy 0, policy_version 41352 (0.0009) +[2023-10-11 16:40:43,400][85176] Updated weights for policy 0, policy_version 41362 (0.0009) +[2023-10-11 16:40:43,785][85176] Updated weights for policy 0, policy_version 41372 (0.0010) +[2023-10-11 16:40:46,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 85327872. Throughput: 0: 1677.6, 1: 1680.5. Samples: 21345408. 
Policy #0 lag: (min: 10.0, avg: 18.2, max: 42.0) +[2023-10-11 16:40:46,064][84230] Avg episode reward: [(0, '7.840'), (1, '7.640')] +[2023-10-11 16:40:46,457][85175] Updated weights for policy 1, policy_version 41960 (0.0010) +[2023-10-11 16:40:46,825][85175] Updated weights for policy 1, policy_version 41970 (0.0010) +[2023-10-11 16:40:47,196][85175] Updated weights for policy 1, policy_version 41980 (0.0010) +[2023-10-11 16:40:47,681][85176] Updated weights for policy 0, policy_version 41382 (0.0008) +[2023-10-11 16:40:48,058][85176] Updated weights for policy 0, policy_version 41392 (0.0010) +[2023-10-11 16:40:48,427][85176] Updated weights for policy 0, policy_version 41402 (0.0011) +[2023-10-11 16:40:51,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 85393408. Throughput: 0: 1662.9, 1: 1681.1. Samples: 21354714. Policy #0 lag: (min: 10.0, avg: 18.2, max: 42.0) +[2023-10-11 16:40:51,064][84230] Avg episode reward: [(0, '7.660'), (1, '7.640')] +[2023-10-11 16:40:51,314][85175] Updated weights for policy 1, policy_version 41990 (0.0008) +[2023-10-11 16:40:51,679][85175] Updated weights for policy 1, policy_version 42000 (0.0007) +[2023-10-11 16:40:52,048][85175] Updated weights for policy 1, policy_version 42010 (0.0007) +[2023-10-11 16:40:52,527][85176] Updated weights for policy 0, policy_version 41412 (0.0009) +[2023-10-11 16:40:52,904][85176] Updated weights for policy 0, policy_version 41422 (0.0011) +[2023-10-11 16:40:53,274][85176] Updated weights for policy 0, policy_version 41432 (0.0008) +[2023-10-11 16:40:56,007][85175] Updated weights for policy 1, policy_version 42020 (0.0009) +[2023-10-11 16:40:56,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 85458944. Throughput: 0: 1673.1, 1: 1686.1. Samples: 21375216. 
Policy #0 lag: (min: 10.0, avg: 18.2, max: 42.0) +[2023-10-11 16:40:56,063][84230] Avg episode reward: [(0, '7.540'), (1, '8.120')] +[2023-10-11 16:40:56,370][85175] Updated weights for policy 1, policy_version 42030 (0.0007) +[2023-10-11 16:40:56,741][85175] Updated weights for policy 1, policy_version 42040 (0.0008) +[2023-10-11 16:40:57,392][85176] Updated weights for policy 0, policy_version 41442 (0.0007) +[2023-10-11 16:40:57,754][85176] Updated weights for policy 0, policy_version 41452 (0.0010) +[2023-10-11 16:40:58,122][85176] Updated weights for policy 0, policy_version 41462 (0.0010) +[2023-10-11 16:40:58,500][85176] Updated weights for policy 0, policy_version 41472 (0.0010) +[2023-10-11 16:41:00,792][85175] Updated weights for policy 1, policy_version 42050 (0.0010) +[2023-10-11 16:41:01,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 85524480. Throughput: 0: 1670.1, 1: 1687.6. Samples: 21395788. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:41:01,063][84230] Avg episode reward: [(0, '7.720'), (1, '9.050')] +[2023-10-11 16:41:01,159][85175] Updated weights for policy 1, policy_version 42060 (0.0009) +[2023-10-11 16:41:01,526][85175] Updated weights for policy 1, policy_version 42070 (0.0009) +[2023-10-11 16:41:01,898][85175] Updated weights for policy 1, policy_version 42080 (0.0010) +[2023-10-11 16:41:02,634][85176] Updated weights for policy 0, policy_version 41482 (0.0008) +[2023-10-11 16:41:03,009][85176] Updated weights for policy 0, policy_version 41492 (0.0008) +[2023-10-11 16:41:03,368][85176] Updated weights for policy 0, policy_version 41502 (0.0009) +[2023-10-11 16:41:06,018][85175] Updated weights for policy 1, policy_version 42090 (0.0007) +[2023-10-11 16:41:06,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 85590016. Throughput: 0: 1656.8, 1: 1692.4. Samples: 21404912. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:41:06,063][84230] Avg episode reward: [(0, '7.720'), (1, '9.340')] +[2023-10-11 16:41:06,395][85175] Updated weights for policy 1, policy_version 42100 (0.0010) +[2023-10-11 16:41:06,749][85175] Updated weights for policy 1, policy_version 42110 (0.0009) +[2023-10-11 16:41:06,820][85000] Saving new best policy, reward=9.340! +[2023-10-11 16:41:07,438][85176] Updated weights for policy 0, policy_version 41512 (0.0009) +[2023-10-11 16:41:07,818][85176] Updated weights for policy 0, policy_version 41522 (0.0007) +[2023-10-11 16:41:08,197][85176] Updated weights for policy 0, policy_version 41532 (0.0010) +[2023-10-11 16:41:10,721][85175] Updated weights for policy 1, policy_version 42120 (0.0008) +[2023-10-11 16:41:11,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 85655552. Throughput: 0: 1664.7, 1: 1689.8. Samples: 21425376. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:41:11,064][84230] Avg episode reward: [(0, '8.010'), (1, '8.280')] +[2023-10-11 16:41:11,092][85175] Updated weights for policy 1, policy_version 42130 (0.0008) +[2023-10-11 16:41:11,454][85175] Updated weights for policy 1, policy_version 42140 (0.0007) +[2023-10-11 16:41:12,325][85176] Updated weights for policy 0, policy_version 41542 (0.0010) +[2023-10-11 16:41:12,688][85176] Updated weights for policy 0, policy_version 41552 (0.0009) +[2023-10-11 16:41:13,071][85176] Updated weights for policy 0, policy_version 41562 (0.0008) +[2023-10-11 16:41:15,476][85175] Updated weights for policy 1, policy_version 42150 (0.0007) +[2023-10-11 16:41:15,841][85175] Updated weights for policy 1, policy_version 42160 (0.0007) +[2023-10-11 16:41:16,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13329.4). Total num frames: 85721088. Throughput: 0: 1666.6, 1: 1686.6. Samples: 21445876. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:41:16,064][84230] Avg episode reward: [(0, '7.900'), (1, '8.610')] +[2023-10-11 16:41:16,216][85175] Updated weights for policy 1, policy_version 42170 (0.0007) +[2023-10-11 16:41:17,173][85176] Updated weights for policy 0, policy_version 41572 (0.0007) +[2023-10-11 16:41:17,542][85176] Updated weights for policy 0, policy_version 41582 (0.0008) +[2023-10-11 16:41:17,910][85176] Updated weights for policy 0, policy_version 41592 (0.0007) +[2023-10-11 16:41:20,276][85175] Updated weights for policy 1, policy_version 42180 (0.0009) +[2023-10-11 16:41:20,641][85175] Updated weights for policy 1, policy_version 42190 (0.0007) +[2023-10-11 16:41:21,009][85175] Updated weights for policy 1, policy_version 42200 (0.0007) +[2023-10-11 16:41:21,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 85786624. Throughput: 0: 1656.9, 1: 1690.3. Samples: 21455114. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:41:21,063][84230] Avg episode reward: [(0, '7.600'), (1, '8.040')] +[2023-10-11 16:41:22,106][85176] Updated weights for policy 0, policy_version 41602 (0.0007) +[2023-10-11 16:41:22,478][85176] Updated weights for policy 0, policy_version 41612 (0.0008) +[2023-10-11 16:41:22,845][85176] Updated weights for policy 0, policy_version 41622 (0.0007) +[2023-10-11 16:41:23,216][85176] Updated weights for policy 0, policy_version 41632 (0.0008) +[2023-10-11 16:41:24,965][85175] Updated weights for policy 1, policy_version 42210 (0.0008) +[2023-10-11 16:41:25,339][85175] Updated weights for policy 1, policy_version 42220 (0.0009) +[2023-10-11 16:41:25,705][85175] Updated weights for policy 1, policy_version 42230 (0.0009) +[2023-10-11 16:41:26,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 85852160. Throughput: 0: 1661.1, 1: 1692.8. Samples: 21475544. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:41:26,064][84230] Avg episode reward: [(0, '7.750'), (1, '7.550')] +[2023-10-11 16:41:26,071][85175] Updated weights for policy 1, policy_version 42240 (0.0008) +[2023-10-11 16:41:27,479][85176] Updated weights for policy 0, policy_version 41642 (0.0009) +[2023-10-11 16:41:27,862][85176] Updated weights for policy 0, policy_version 41652 (0.0011) +[2023-10-11 16:41:28,242][85176] Updated weights for policy 0, policy_version 41662 (0.0010) +[2023-10-11 16:41:30,313][85175] Updated weights for policy 1, policy_version 42250 (0.0009) +[2023-10-11 16:41:30,694][85175] Updated weights for policy 1, policy_version 42260 (0.0008) +[2023-10-11 16:41:31,051][85175] Updated weights for policy 1, policy_version 42270 (0.0007) +[2023-10-11 16:41:31,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 85917696. Throughput: 0: 1653.4, 1: 1678.8. Samples: 21495358. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:41:31,063][84230] Avg episode reward: [(0, '7.450'), (1, '7.670')] +[2023-10-11 16:41:32,415][85176] Updated weights for policy 0, policy_version 41672 (0.0010) +[2023-10-11 16:41:32,780][85176] Updated weights for policy 0, policy_version 41682 (0.0008) +[2023-10-11 16:41:33,155][85176] Updated weights for policy 0, policy_version 41692 (0.0008) +[2023-10-11 16:41:34,963][85175] Updated weights for policy 1, policy_version 42280 (0.0008) +[2023-10-11 16:41:35,327][85175] Updated weights for policy 1, policy_version 42290 (0.0012) +[2023-10-11 16:41:35,701][85175] Updated weights for policy 1, policy_version 42300 (0.0007) +[2023-10-11 16:41:36,063][84230] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 86016000. Throughput: 0: 1647.6, 1: 1695.7. Samples: 21505164. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:41:36,064][84230] Avg episode reward: [(0, '7.350'), (1, '8.720')] +[2023-10-11 16:41:37,305][85176] Updated weights for policy 0, policy_version 41702 (0.0007) +[2023-10-11 16:41:37,685][85176] Updated weights for policy 0, policy_version 41712 (0.0009) +[2023-10-11 16:41:38,058][85176] Updated weights for policy 0, policy_version 41722 (0.0008) +[2023-10-11 16:41:39,896][85175] Updated weights for policy 1, policy_version 42310 (0.0009) +[2023-10-11 16:41:40,272][85175] Updated weights for policy 1, policy_version 42320 (0.0010) +[2023-10-11 16:41:40,634][85175] Updated weights for policy 1, policy_version 42330 (0.0009) +[2023-10-11 16:41:41,063][84230] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 86081536. Throughput: 0: 1652.7, 1: 1693.9. Samples: 21525816. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:41:41,064][84230] Avg episode reward: [(0, '7.950'), (1, '9.720')] +[2023-10-11 16:41:41,065][85000] Saving new best policy, reward=9.720! +[2023-10-11 16:41:41,924][85176] Updated weights for policy 0, policy_version 41732 (0.0008) +[2023-10-11 16:41:42,288][85176] Updated weights for policy 0, policy_version 41742 (0.0009) +[2023-10-11 16:41:42,662][85176] Updated weights for policy 0, policy_version 41752 (0.0009) +[2023-10-11 16:41:44,723][85175] Updated weights for policy 1, policy_version 42340 (0.0009) +[2023-10-11 16:41:45,097][85175] Updated weights for policy 1, policy_version 42350 (0.0007) +[2023-10-11 16:41:45,461][85175] Updated weights for policy 1, policy_version 42360 (0.0009) +[2023-10-11 16:41:46,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 86147072. Throughput: 0: 1658.5, 1: 1671.1. Samples: 21545622. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:41:46,064][84230] Avg episode reward: [(0, '7.900'), (1, '9.240')] +[2023-10-11 16:41:46,896][85176] Updated weights for policy 0, policy_version 41762 (0.0010) +[2023-10-11 16:41:47,270][85176] Updated weights for policy 0, policy_version 41772 (0.0008) +[2023-10-11 16:41:47,630][85176] Updated weights for policy 0, policy_version 41782 (0.0008) +[2023-10-11 16:41:48,003][85176] Updated weights for policy 0, policy_version 41792 (0.0009) +[2023-10-11 16:41:49,338][85175] Updated weights for policy 1, policy_version 42370 (0.0008) +[2023-10-11 16:41:49,705][85175] Updated weights for policy 1, policy_version 42380 (0.0010) +[2023-10-11 16:41:50,068][85175] Updated weights for policy 1, policy_version 42390 (0.0007) +[2023-10-11 16:41:50,442][85175] Updated weights for policy 1, policy_version 42400 (0.0007) +[2023-10-11 16:41:51,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 86212608. Throughput: 0: 1655.3, 1: 1693.3. Samples: 21555600. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:41:51,063][84230] Avg episode reward: [(0, '7.750'), (1, '9.820')] +[2023-10-11 16:41:51,065][85000] Saving new best policy, reward=9.820! +[2023-10-11 16:41:52,294][85176] Updated weights for policy 0, policy_version 41802 (0.0009) +[2023-10-11 16:41:52,668][85176] Updated weights for policy 0, policy_version 41812 (0.0008) +[2023-10-11 16:41:53,043][85176] Updated weights for policy 0, policy_version 41822 (0.0008) +[2023-10-11 16:41:54,700][85175] Updated weights for policy 1, policy_version 42410 (0.0008) +[2023-10-11 16:41:55,082][85175] Updated weights for policy 1, policy_version 42420 (0.0008) +[2023-10-11 16:41:55,450][85175] Updated weights for policy 1, policy_version 42430 (0.0009) +[2023-10-11 16:41:56,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 86278144. Throughput: 0: 1656.9, 1: 1686.6. 
Samples: 21575834. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:41:56,063][84230] Avg episode reward: [(0, '7.720'), (1, '8.980')] +[2023-10-11 16:41:56,933][85176] Updated weights for policy 0, policy_version 41832 (0.0008) +[2023-10-11 16:41:57,301][85176] Updated weights for policy 0, policy_version 41842 (0.0009) +[2023-10-11 16:41:57,686][85176] Updated weights for policy 0, policy_version 41852 (0.0009) +[2023-10-11 16:41:59,283][85175] Updated weights for policy 1, policy_version 42440 (0.0011) +[2023-10-11 16:41:59,653][85175] Updated weights for policy 1, policy_version 42450 (0.0010) +[2023-10-11 16:42:00,031][85175] Updated weights for policy 1, policy_version 42460 (0.0010) +[2023-10-11 16:42:01,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 86343680. Throughput: 0: 1665.9, 1: 1669.1. Samples: 21595950. Policy #0 lag: (min: 31.0, avg: 37.0, max: 63.0) +[2023-10-11 16:42:01,063][84230] Avg episode reward: [(0, '7.750'), (1, '9.260')] +[2023-10-11 16:42:01,871][85176] Updated weights for policy 0, policy_version 41862 (0.0009) +[2023-10-11 16:42:02,248][85176] Updated weights for policy 0, policy_version 41872 (0.0007) +[2023-10-11 16:42:02,613][85176] Updated weights for policy 0, policy_version 41882 (0.0010) +[2023-10-11 16:42:04,017][85175] Updated weights for policy 1, policy_version 42470 (0.0008) +[2023-10-11 16:42:04,383][85175] Updated weights for policy 1, policy_version 42480 (0.0009) +[2023-10-11 16:42:04,749][85175] Updated weights for policy 1, policy_version 42490 (0.0007) +[2023-10-11 16:42:06,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 86409216. Throughput: 0: 1661.5, 1: 1695.9. Samples: 21606194. 
Policy #0 lag: (min: 31.0, avg: 37.0, max: 63.0) +[2023-10-11 16:42:06,064][84230] Avg episode reward: [(0, '7.900'), (1, '9.710')] +[2023-10-11 16:42:06,742][85176] Updated weights for policy 0, policy_version 41892 (0.0009) +[2023-10-11 16:42:07,118][85176] Updated weights for policy 0, policy_version 41902 (0.0009) +[2023-10-11 16:42:07,498][85176] Updated weights for policy 0, policy_version 41912 (0.0009) +[2023-10-11 16:42:08,736][85175] Updated weights for policy 1, policy_version 42500 (0.0009) +[2023-10-11 16:42:09,104][85175] Updated weights for policy 1, policy_version 42510 (0.0010) +[2023-10-11 16:42:09,467][85175] Updated weights for policy 1, policy_version 42520 (0.0009) +[2023-10-11 16:42:11,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 86474752. Throughput: 0: 1664.0, 1: 1679.2. Samples: 21625986. Policy #0 lag: (min: 31.0, avg: 37.0, max: 63.0) +[2023-10-11 16:42:11,063][84230] Avg episode reward: [(0, '8.190'), (1, '10.830')] +[2023-10-11 16:42:11,064][85000] Saving new best policy, reward=10.830! +[2023-10-11 16:42:11,553][85176] Updated weights for policy 0, policy_version 41922 (0.0010) +[2023-10-11 16:42:11,923][85176] Updated weights for policy 0, policy_version 41932 (0.0007) +[2023-10-11 16:42:12,295][85176] Updated weights for policy 0, policy_version 41942 (0.0007) +[2023-10-11 16:42:12,662][85176] Updated weights for policy 0, policy_version 41952 (0.0009) +[2023-10-11 16:42:13,615][85175] Updated weights for policy 1, policy_version 42530 (0.0007) +[2023-10-11 16:42:13,982][85175] Updated weights for policy 1, policy_version 42540 (0.0009) +[2023-10-11 16:42:14,352][85175] Updated weights for policy 1, policy_version 42550 (0.0007) +[2023-10-11 16:42:14,716][85175] Updated weights for policy 1, policy_version 42560 (0.0009) +[2023-10-11 16:42:16,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 86540288. Throughput: 0: 1670.0, 1: 1693.9. 
Samples: 21646732. Policy #0 lag: (min: 31.0, avg: 37.0, max: 63.0) +[2023-10-11 16:42:16,063][84230] Avg episode reward: [(0, '7.890'), (1, '11.240')] +[2023-10-11 16:42:16,075][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000042560_43581440.pth... +[2023-10-11 16:42:16,075][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000041952_42958848.pth... +[2023-10-11 16:42:16,112][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000040416_41385984.pth +[2023-10-11 16:42:16,115][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000040992_41975808.pth +[2023-10-11 16:42:16,120][85000] Saving new best policy, reward=11.240! +[2023-10-11 16:42:16,744][85176] Updated weights for policy 0, policy_version 41962 (0.0009) +[2023-10-11 16:42:17,118][85176] Updated weights for policy 0, policy_version 41972 (0.0009) +[2023-10-11 16:42:17,488][85176] Updated weights for policy 0, policy_version 41982 (0.0009) +[2023-10-11 16:42:18,707][85175] Updated weights for policy 1, policy_version 42570 (0.0008) +[2023-10-11 16:42:19,071][85175] Updated weights for policy 1, policy_version 42580 (0.0010) +[2023-10-11 16:42:19,434][85175] Updated weights for policy 1, policy_version 42590 (0.0009) +[2023-10-11 16:42:21,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 86605824. Throughput: 0: 1670.6, 1: 1700.3. Samples: 21656854. Policy #0 lag: (min: 31.0, avg: 37.0, max: 63.0) +[2023-10-11 16:42:21,063][84230] Avg episode reward: [(0, '7.450'), (1, '11.860')] +[2023-10-11 16:42:21,064][85000] Saving new best policy, reward=11.860! 
+[2023-10-11 16:42:21,442][85176] Updated weights for policy 0, policy_version 41992 (0.0008) +[2023-10-11 16:42:21,808][85176] Updated weights for policy 0, policy_version 42002 (0.0007) +[2023-10-11 16:42:22,188][85176] Updated weights for policy 0, policy_version 42012 (0.0008) +[2023-10-11 16:42:23,338][85175] Updated weights for policy 1, policy_version 42600 (0.0008) +[2023-10-11 16:42:23,716][85175] Updated weights for policy 1, policy_version 42610 (0.0010) +[2023-10-11 16:42:24,072][85175] Updated weights for policy 1, policy_version 42620 (0.0010) +[2023-10-11 16:42:26,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 86671360. Throughput: 0: 1672.1, 1: 1677.8. Samples: 21676562. Policy #0 lag: (min: 31.0, avg: 37.0, max: 63.0) +[2023-10-11 16:42:26,063][84230] Avg episode reward: [(0, '7.450'), (1, '14.510')] +[2023-10-11 16:42:26,064][85000] Saving new best policy, reward=14.510! +[2023-10-11 16:42:26,321][85176] Updated weights for policy 0, policy_version 42022 (0.0008) +[2023-10-11 16:42:26,685][85176] Updated weights for policy 0, policy_version 42032 (0.0008) +[2023-10-11 16:42:27,058][85176] Updated weights for policy 0, policy_version 42042 (0.0007) +[2023-10-11 16:42:28,077][85175] Updated weights for policy 1, policy_version 42630 (0.0008) +[2023-10-11 16:42:28,443][85175] Updated weights for policy 1, policy_version 42640 (0.0008) +[2023-10-11 16:42:28,804][85175] Updated weights for policy 1, policy_version 42650 (0.0008) +[2023-10-11 16:42:31,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 86736896. Throughput: 0: 1670.9, 1: 1700.1. Samples: 21697314. 
Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) +[2023-10-11 16:42:31,063][84230] Avg episode reward: [(0, '7.300'), (1, '14.140')] +[2023-10-11 16:42:31,176][85176] Updated weights for policy 0, policy_version 42052 (0.0007) +[2023-10-11 16:42:31,537][85176] Updated weights for policy 0, policy_version 42062 (0.0010) +[2023-10-11 16:42:31,911][85176] Updated weights for policy 0, policy_version 42072 (0.0010) +[2023-10-11 16:42:32,821][85175] Updated weights for policy 1, policy_version 42660 (0.0010) +[2023-10-11 16:42:33,189][85175] Updated weights for policy 1, policy_version 42670 (0.0010) +[2023-10-11 16:42:33,559][85175] Updated weights for policy 1, policy_version 42680 (0.0010) +[2023-10-11 16:42:36,002][85176] Updated weights for policy 0, policy_version 42082 (0.0009) +[2023-10-11 16:42:36,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 86802432. Throughput: 0: 1675.0, 1: 1691.1. Samples: 21707072. Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) +[2023-10-11 16:42:36,063][84230] Avg episode reward: [(0, '7.750'), (1, '14.060')] +[2023-10-11 16:42:36,369][85176] Updated weights for policy 0, policy_version 42092 (0.0008) +[2023-10-11 16:42:36,740][85176] Updated weights for policy 0, policy_version 42102 (0.0011) +[2023-10-11 16:42:37,112][85176] Updated weights for policy 0, policy_version 42112 (0.0007) +[2023-10-11 16:42:37,517][85175] Updated weights for policy 1, policy_version 42690 (0.0007) +[2023-10-11 16:42:37,885][85175] Updated weights for policy 1, policy_version 42700 (0.0007) +[2023-10-11 16:42:38,256][85175] Updated weights for policy 1, policy_version 42710 (0.0010) +[2023-10-11 16:42:38,624][85175] Updated weights for policy 1, policy_version 42720 (0.0009) +[2023-10-11 16:42:41,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 86867968. Throughput: 0: 1677.1, 1: 1693.7. Samples: 21727520. 
Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) +[2023-10-11 16:42:41,064][84230] Avg episode reward: [(0, '7.740'), (1, '13.970')] +[2023-10-11 16:42:41,211][85176] Updated weights for policy 0, policy_version 42122 (0.0009) +[2023-10-11 16:42:41,576][85176] Updated weights for policy 0, policy_version 42132 (0.0010) +[2023-10-11 16:42:41,956][85176] Updated weights for policy 0, policy_version 42142 (0.0010) +[2023-10-11 16:42:42,688][85175] Updated weights for policy 1, policy_version 42730 (0.0009) +[2023-10-11 16:42:43,056][85175] Updated weights for policy 1, policy_version 42740 (0.0009) +[2023-10-11 16:42:43,435][85175] Updated weights for policy 1, policy_version 42750 (0.0008) +[2023-10-11 16:42:46,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 86933504. Throughput: 0: 1668.2, 1: 1713.8. Samples: 21748140. Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) +[2023-10-11 16:42:46,064][84230] Avg episode reward: [(0, '8.040'), (1, '13.920')] +[2023-10-11 16:42:46,180][85176] Updated weights for policy 0, policy_version 42152 (0.0011) +[2023-10-11 16:42:46,556][85176] Updated weights for policy 0, policy_version 42162 (0.0008) +[2023-10-11 16:42:46,929][85176] Updated weights for policy 0, policy_version 42172 (0.0007) +[2023-10-11 16:42:47,450][85175] Updated weights for policy 1, policy_version 42760 (0.0008) +[2023-10-11 16:42:47,814][85175] Updated weights for policy 1, policy_version 42770 (0.0009) +[2023-10-11 16:42:48,187][85175] Updated weights for policy 1, policy_version 42780 (0.0009) +[2023-10-11 16:42:51,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 86999040. Throughput: 0: 1673.8, 1: 1680.2. Samples: 21757122. Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) +[2023-10-11 16:42:51,063][84230] Avg episode reward: [(0, '7.730'), (1, '15.580')] +[2023-10-11 16:42:51,064][85000] Saving new best policy, reward=15.580! 
+[2023-10-11 16:42:51,088][85176] Updated weights for policy 0, policy_version 42182 (0.0010) +[2023-10-11 16:42:51,460][85176] Updated weights for policy 0, policy_version 42192 (0.0011) +[2023-10-11 16:42:51,839][85176] Updated weights for policy 0, policy_version 42202 (0.0010) +[2023-10-11 16:42:52,240][85175] Updated weights for policy 1, policy_version 42790 (0.0008) +[2023-10-11 16:42:52,612][85175] Updated weights for policy 1, policy_version 42800 (0.0010) +[2023-10-11 16:42:52,985][85175] Updated weights for policy 1, policy_version 42810 (0.0007) +[2023-10-11 16:42:55,981][85176] Updated weights for policy 0, policy_version 42212 (0.0009) +[2023-10-11 16:42:56,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 87064576. Throughput: 0: 1676.0, 1: 1700.3. Samples: 21777920. Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) +[2023-10-11 16:42:56,063][84230] Avg episode reward: [(0, '7.880'), (1, '15.920')] +[2023-10-11 16:42:56,064][85000] Saving new best policy, reward=15.920! +[2023-10-11 16:42:56,355][85176] Updated weights for policy 0, policy_version 42222 (0.0008) +[2023-10-11 16:42:56,736][85176] Updated weights for policy 0, policy_version 42232 (0.0007) +[2023-10-11 16:42:57,058][85175] Updated weights for policy 1, policy_version 42820 (0.0008) +[2023-10-11 16:42:57,430][85175] Updated weights for policy 1, policy_version 42830 (0.0007) +[2023-10-11 16:42:57,805][85175] Updated weights for policy 1, policy_version 42840 (0.0009) +[2023-10-11 16:43:00,811][85176] Updated weights for policy 0, policy_version 42242 (0.0007) +[2023-10-11 16:43:01,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 87130112. Throughput: 0: 1675.2, 1: 1697.9. Samples: 21798520. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 16:43:01,063][84230] Avg episode reward: [(0, '7.720'), (1, '15.660')] +[2023-10-11 16:43:01,204][85176] Updated weights for policy 0, policy_version 42252 (0.0008) +[2023-10-11 16:43:01,588][85176] Updated weights for policy 0, policy_version 42262 (0.0008) +[2023-10-11 16:43:01,960][85176] Updated weights for policy 0, policy_version 42272 (0.0008) +[2023-10-11 16:43:01,999][85175] Updated weights for policy 1, policy_version 42850 (0.0008) +[2023-10-11 16:43:02,372][85175] Updated weights for policy 1, policy_version 42860 (0.0008) +[2023-10-11 16:43:02,747][85175] Updated weights for policy 1, policy_version 42870 (0.0011) +[2023-10-11 16:43:03,102][85175] Updated weights for policy 1, policy_version 42880 (0.0009) +[2023-10-11 16:43:05,892][85176] Updated weights for policy 0, policy_version 42282 (0.0011) +[2023-10-11 16:43:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 87195648. Throughput: 0: 1674.9, 1: 1673.2. Samples: 21807518. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 16:43:06,063][84230] Avg episode reward: [(0, '7.870'), (1, '15.510')] +[2023-10-11 16:43:06,272][85176] Updated weights for policy 0, policy_version 42292 (0.0009) +[2023-10-11 16:43:06,641][85176] Updated weights for policy 0, policy_version 42302 (0.0011) +[2023-10-11 16:43:07,188][85175] Updated weights for policy 1, policy_version 42890 (0.0007) +[2023-10-11 16:43:07,560][85175] Updated weights for policy 1, policy_version 42900 (0.0007) +[2023-10-11 16:43:07,924][85175] Updated weights for policy 1, policy_version 42910 (0.0007) +[2023-10-11 16:43:10,713][85176] Updated weights for policy 0, policy_version 42312 (0.0010) +[2023-10-11 16:43:11,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 87261184. Throughput: 0: 1675.3, 1: 1700.6. Samples: 21828478. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 16:43:11,063][84230] Avg episode reward: [(0, '7.750'), (1, '16.970')] +[2023-10-11 16:43:11,064][85000] Saving new best policy, reward=16.970! +[2023-10-11 16:43:11,089][85176] Updated weights for policy 0, policy_version 42322 (0.0010) +[2023-10-11 16:43:11,462][85176] Updated weights for policy 0, policy_version 42332 (0.0011) +[2023-10-11 16:43:11,832][85175] Updated weights for policy 1, policy_version 42920 (0.0010) +[2023-10-11 16:43:12,196][85175] Updated weights for policy 1, policy_version 42930 (0.0007) +[2023-10-11 16:43:12,565][85175] Updated weights for policy 1, policy_version 42940 (0.0008) +[2023-10-11 16:43:15,384][85176] Updated weights for policy 0, policy_version 42342 (0.0008) +[2023-10-11 16:43:15,751][85176] Updated weights for policy 0, policy_version 42352 (0.0009) +[2023-10-11 16:43:16,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 87326720. Throughput: 0: 1664.9, 1: 1703.2. Samples: 21848878. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 16:43:16,063][84230] Avg episode reward: [(0, '7.720'), (1, '15.380')] +[2023-10-11 16:43:16,118][85176] Updated weights for policy 0, policy_version 42362 (0.0007) +[2023-10-11 16:43:16,623][85175] Updated weights for policy 1, policy_version 42950 (0.0009) +[2023-10-11 16:43:16,991][85175] Updated weights for policy 1, policy_version 42960 (0.0009) +[2023-10-11 16:43:17,359][85175] Updated weights for policy 1, policy_version 42970 (0.0009) +[2023-10-11 16:43:20,232][85176] Updated weights for policy 0, policy_version 42372 (0.0008) +[2023-10-11 16:43:20,598][85176] Updated weights for policy 0, policy_version 42382 (0.0010) +[2023-10-11 16:43:20,969][85176] Updated weights for policy 0, policy_version 42392 (0.0010) +[2023-10-11 16:43:21,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 87392256. Throughput: 0: 1676.7, 1: 1691.9. 
Samples: 21858660. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 16:43:21,064][84230] Avg episode reward: [(0, '7.570'), (1, '14.110')] +[2023-10-11 16:43:21,213][85175] Updated weights for policy 1, policy_version 42980 (0.0009) +[2023-10-11 16:43:21,584][85175] Updated weights for policy 1, policy_version 42990 (0.0007) +[2023-10-11 16:43:21,958][85175] Updated weights for policy 1, policy_version 43000 (0.0011) +[2023-10-11 16:43:24,979][85176] Updated weights for policy 0, policy_version 42402 (0.0008) +[2023-10-11 16:43:25,353][85176] Updated weights for policy 0, policy_version 42412 (0.0009) +[2023-10-11 16:43:25,727][85176] Updated weights for policy 0, policy_version 42422 (0.0007) +[2023-10-11 16:43:25,997][85175] Updated weights for policy 1, policy_version 43010 (0.0008) +[2023-10-11 16:43:26,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 87457792. Throughput: 0: 1679.5, 1: 1698.1. Samples: 21879508. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 16:43:26,063][84230] Avg episode reward: [(0, '7.750'), (1, '17.220')] +[2023-10-11 16:43:26,092][85176] Updated weights for policy 0, policy_version 42432 (0.0007) +[2023-10-11 16:43:26,366][85175] Updated weights for policy 1, policy_version 43020 (0.0007) +[2023-10-11 16:43:26,730][85175] Updated weights for policy 1, policy_version 43030 (0.0009) +[2023-10-11 16:43:27,096][85000] Saving new best policy, reward=17.220! +[2023-10-11 16:43:27,098][85175] Updated weights for policy 1, policy_version 43040 (0.0010) +[2023-10-11 16:43:30,121][85176] Updated weights for policy 0, policy_version 42442 (0.0008) +[2023-10-11 16:43:30,485][85176] Updated weights for policy 0, policy_version 42452 (0.0008) +[2023-10-11 16:43:30,856][85176] Updated weights for policy 0, policy_version 42462 (0.0009) +[2023-10-11 16:43:31,063][84230] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 13551.9). Total num frames: 87556096. 
Throughput: 0: 1658.5, 1: 1700.2. Samples: 21899280. Policy #0 lag: (min: 27.0, avg: 53.8, max: 56.0) +[2023-10-11 16:43:31,063][84230] Avg episode reward: [(0, '7.600'), (1, '20.540')] +[2023-10-11 16:43:31,121][85175] Updated weights for policy 1, policy_version 43050 (0.0008) +[2023-10-11 16:43:31,485][85175] Updated weights for policy 1, policy_version 43060 (0.0007) +[2023-10-11 16:43:31,847][85175] Updated weights for policy 1, policy_version 43070 (0.0007) +[2023-10-11 16:43:31,918][85000] Saving new best policy, reward=20.540! +[2023-10-11 16:43:35,017][85176] Updated weights for policy 0, policy_version 42472 (0.0011) +[2023-10-11 16:43:35,388][85176] Updated weights for policy 0, policy_version 42482 (0.0008) +[2023-10-11 16:43:35,725][85175] Updated weights for policy 1, policy_version 43080 (0.0007) +[2023-10-11 16:43:35,764][85176] Updated weights for policy 0, policy_version 42492 (0.0009) +[2023-10-11 16:43:36,062][84230] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 87621632. Throughput: 0: 1675.8, 1: 1699.6. Samples: 21909012. Policy #0 lag: (min: 27.0, avg: 53.8, max: 56.0) +[2023-10-11 16:43:36,063][84230] Avg episode reward: [(0, '7.600'), (1, '21.260')] +[2023-10-11 16:43:36,092][85175] Updated weights for policy 1, policy_version 43090 (0.0007) +[2023-10-11 16:43:36,471][85175] Updated weights for policy 1, policy_version 43100 (0.0010) +[2023-10-11 16:43:36,613][85000] Saving new best policy, reward=21.260! 
+[2023-10-11 16:43:39,894][85176] Updated weights for policy 0, policy_version 42502 (0.0009) +[2023-10-11 16:43:40,273][85176] Updated weights for policy 0, policy_version 42512 (0.0011) +[2023-10-11 16:43:40,543][85175] Updated weights for policy 1, policy_version 43110 (0.0008) +[2023-10-11 16:43:40,641][85176] Updated weights for policy 0, policy_version 42522 (0.0008) +[2023-10-11 16:43:40,909][85175] Updated weights for policy 1, policy_version 43120 (0.0007) +[2023-10-11 16:43:41,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 87687168. Throughput: 0: 1674.2, 1: 1705.2. Samples: 21929990. Policy #0 lag: (min: 27.0, avg: 53.8, max: 56.0) +[2023-10-11 16:43:41,063][84230] Avg episode reward: [(0, '7.880'), (1, '20.880')] +[2023-10-11 16:43:41,285][85175] Updated weights for policy 1, policy_version 43130 (0.0010) +[2023-10-11 16:43:44,916][85176] Updated weights for policy 0, policy_version 42532 (0.0008) +[2023-10-11 16:43:45,287][85176] Updated weights for policy 0, policy_version 42542 (0.0008) +[2023-10-11 16:43:45,393][85175] Updated weights for policy 1, policy_version 43140 (0.0009) +[2023-10-11 16:43:45,665][85176] Updated weights for policy 0, policy_version 42552 (0.0008) +[2023-10-11 16:43:45,766][85175] Updated weights for policy 1, policy_version 43150 (0.0007) +[2023-10-11 16:43:46,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 87752704. Throughput: 0: 1652.4, 1: 1703.2. Samples: 21949524. 
Policy #0 lag: (min: 27.0, avg: 53.8, max: 56.0) +[2023-10-11 16:43:46,063][84230] Avg episode reward: [(0, '7.990'), (1, '21.230')] +[2023-10-11 16:43:46,134][85175] Updated weights for policy 1, policy_version 43160 (0.0007) +[2023-10-11 16:43:49,710][85176] Updated weights for policy 0, policy_version 42562 (0.0008) +[2023-10-11 16:43:50,087][85176] Updated weights for policy 0, policy_version 42572 (0.0009) +[2023-10-11 16:43:50,119][85175] Updated weights for policy 1, policy_version 43170 (0.0009) +[2023-10-11 16:43:50,461][85176] Updated weights for policy 0, policy_version 42582 (0.0008) +[2023-10-11 16:43:50,485][85175] Updated weights for policy 1, policy_version 43180 (0.0009) +[2023-10-11 16:43:50,829][85176] Updated weights for policy 0, policy_version 42592 (0.0009) +[2023-10-11 16:43:50,860][85175] Updated weights for policy 1, policy_version 43190 (0.0007) +[2023-10-11 16:43:51,063][84230] Fps is (10 sec: 13106.7, 60 sec: 13653.2, 300 sec: 13440.4). Total num frames: 87818240. Throughput: 0: 1677.8, 1: 1710.4. Samples: 21959988. Policy #0 lag: (min: 27.0, avg: 53.8, max: 56.0) +[2023-10-11 16:43:51,064][84230] Avg episode reward: [(0, '8.160'), (1, '17.630')] +[2023-10-11 16:43:51,225][85175] Updated weights for policy 1, policy_version 43200 (0.0008) +[2023-10-11 16:43:55,032][85176] Updated weights for policy 0, policy_version 42602 (0.0009) +[2023-10-11 16:43:55,361][85175] Updated weights for policy 1, policy_version 43210 (0.0007) +[2023-10-11 16:43:55,397][85176] Updated weights for policy 0, policy_version 42612 (0.0009) +[2023-10-11 16:43:55,724][85175] Updated weights for policy 1, policy_version 43220 (0.0008) +[2023-10-11 16:43:55,768][85176] Updated weights for policy 0, policy_version 42622 (0.0010) +[2023-10-11 16:43:56,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 87883776. Throughput: 0: 1671.5, 1: 1707.3. Samples: 21980524. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:43:56,063][84230] Avg episode reward: [(0, '7.740'), (1, '19.160')] +[2023-10-11 16:43:56,092][85175] Updated weights for policy 1, policy_version 43230 (0.0007) +[2023-10-11 16:43:59,819][85176] Updated weights for policy 0, policy_version 42632 (0.0011) +[2023-10-11 16:44:00,187][85176] Updated weights for policy 0, policy_version 42642 (0.0007) +[2023-10-11 16:44:00,228][85175] Updated weights for policy 1, policy_version 43240 (0.0008) +[2023-10-11 16:44:00,556][85176] Updated weights for policy 0, policy_version 42652 (0.0008) +[2023-10-11 16:44:00,594][85175] Updated weights for policy 1, policy_version 43250 (0.0007) +[2023-10-11 16:44:00,954][85175] Updated weights for policy 1, policy_version 43260 (0.0007) +[2023-10-11 16:44:01,063][84230] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.5). Total num frames: 87949312. Throughput: 0: 1655.9, 1: 1690.0. Samples: 21999440. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:44:01,063][84230] Avg episode reward: [(0, '7.410'), (1, '18.700')] +[2023-10-11 16:44:04,869][85176] Updated weights for policy 0, policy_version 42662 (0.0007) +[2023-10-11 16:44:05,054][85175] Updated weights for policy 1, policy_version 43270 (0.0008) +[2023-10-11 16:44:05,242][85176] Updated weights for policy 0, policy_version 42672 (0.0007) +[2023-10-11 16:44:05,415][85175] Updated weights for policy 1, policy_version 43280 (0.0009) +[2023-10-11 16:44:05,619][85176] Updated weights for policy 0, policy_version 42682 (0.0008) +[2023-10-11 16:44:05,778][85175] Updated weights for policy 1, policy_version 43290 (0.0008) +[2023-10-11 16:44:06,063][84230] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 88047616. Throughput: 0: 1660.9, 1: 1702.2. Samples: 22010000. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:44:06,063][84230] Avg episode reward: [(0, '7.720'), (1, '19.790')] +[2023-10-11 16:44:09,735][85176] Updated weights for policy 0, policy_version 42692 (0.0008) +[2023-10-11 16:44:09,768][85175] Updated weights for policy 1, policy_version 43300 (0.0010) +[2023-10-11 16:44:10,114][85176] Updated weights for policy 0, policy_version 42702 (0.0008) +[2023-10-11 16:44:10,142][85175] Updated weights for policy 1, policy_version 43310 (0.0008) +[2023-10-11 16:44:10,475][85176] Updated weights for policy 0, policy_version 42712 (0.0008) +[2023-10-11 16:44:10,497][85175] Updated weights for policy 1, policy_version 43320 (0.0007) +[2023-10-11 16:44:11,063][84230] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 88113152. Throughput: 0: 1653.9, 1: 1701.4. Samples: 22030496. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:44:11,064][84230] Avg episode reward: [(0, '7.630'), (1, '21.310')] +[2023-10-11 16:44:11,066][85000] Saving new best policy, reward=21.310! +[2023-10-11 16:44:14,501][85175] Updated weights for policy 1, policy_version 43330 (0.0008) +[2023-10-11 16:44:14,644][85176] Updated weights for policy 0, policy_version 42722 (0.0008) +[2023-10-11 16:44:14,870][85175] Updated weights for policy 1, policy_version 43340 (0.0007) +[2023-10-11 16:44:15,009][85176] Updated weights for policy 0, policy_version 42732 (0.0007) +[2023-10-11 16:44:15,241][85175] Updated weights for policy 1, policy_version 43350 (0.0008) +[2023-10-11 16:44:15,391][85176] Updated weights for policy 0, policy_version 42742 (0.0009) +[2023-10-11 16:44:15,602][85175] Updated weights for policy 1, policy_version 43360 (0.0008) +[2023-10-11 16:44:15,766][85176] Updated weights for policy 0, policy_version 42752 (0.0010) +[2023-10-11 16:44:16,063][84230] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 88178688. Throughput: 0: 1650.3, 1: 1678.6. 
Samples: 22049082. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:44:16,064][84230] Avg episode reward: [(0, '7.780'), (1, '22.860')] +[2023-10-11 16:44:16,077][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000042752_43778048.pth... +[2023-10-11 16:44:16,077][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000043360_44400640.pth... +[2023-10-11 16:44:16,113][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000041760_42762240.pth +[2023-10-11 16:44:16,117][85000] Saving new best policy, reward=22.860! +[2023-10-11 16:44:16,118][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000041184_42172416.pth +[2023-10-11 16:44:19,690][85176] Updated weights for policy 0, policy_version 42762 (0.0007) +[2023-10-11 16:44:19,759][85175] Updated weights for policy 1, policy_version 43370 (0.0007) +[2023-10-11 16:44:20,065][85176] Updated weights for policy 0, policy_version 42772 (0.0007) +[2023-10-11 16:44:20,133][85175] Updated weights for policy 1, policy_version 43380 (0.0008) +[2023-10-11 16:44:20,423][85176] Updated weights for policy 0, policy_version 42782 (0.0010) +[2023-10-11 16:44:20,501][85175] Updated weights for policy 1, policy_version 43390 (0.0009) +[2023-10-11 16:44:21,063][84230] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 88244224. Throughput: 0: 1657.7, 1: 1705.8. Samples: 22060372. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:44:21,064][84230] Avg episode reward: [(0, '7.380'), (1, '23.120')] +[2023-10-11 16:44:21,065][85000] Saving new best policy, reward=23.120! 
+[2023-10-11 16:44:24,536][85176] Updated weights for policy 0, policy_version 42792 (0.0007) +[2023-10-11 16:44:24,556][85175] Updated weights for policy 1, policy_version 43400 (0.0009) +[2023-10-11 16:44:24,904][85176] Updated weights for policy 0, policy_version 42802 (0.0007) +[2023-10-11 16:44:24,931][85175] Updated weights for policy 1, policy_version 43410 (0.0008) +[2023-10-11 16:44:25,284][85176] Updated weights for policy 0, policy_version 42812 (0.0008) +[2023-10-11 16:44:25,294][85175] Updated weights for policy 1, policy_version 43420 (0.0008) +[2023-10-11 16:44:26,063][84230] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 88309760. Throughput: 0: 1655.3, 1: 1684.7. Samples: 22080292. Policy #0 lag: (min: 31.0, avg: 31.5, max: 46.0) +[2023-10-11 16:44:26,064][84230] Avg episode reward: [(0, '7.830'), (1, '25.900')] +[2023-10-11 16:44:26,065][85000] Saving new best policy, reward=25.900! +[2023-10-11 16:44:29,196][85175] Updated weights for policy 1, policy_version 43430 (0.0008) +[2023-10-11 16:44:29,232][85176] Updated weights for policy 0, policy_version 42822 (0.0007) +[2023-10-11 16:44:29,562][85175] Updated weights for policy 1, policy_version 43440 (0.0007) +[2023-10-11 16:44:29,613][85176] Updated weights for policy 0, policy_version 42832 (0.0009) +[2023-10-11 16:44:29,930][85175] Updated weights for policy 1, policy_version 43450 (0.0008) +[2023-10-11 16:44:29,979][85176] Updated weights for policy 0, policy_version 42842 (0.0008) +[2023-10-11 16:44:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 88375296. Throughput: 0: 1655.8, 1: 1670.1. Samples: 22099190. Policy #0 lag: (min: 31.0, avg: 31.5, max: 46.0) +[2023-10-11 16:44:31,064][84230] Avg episode reward: [(0, '7.970'), (1, '26.020')] +[2023-10-11 16:44:31,075][85000] Saving new best policy, reward=26.020! 
+[2023-10-11 16:44:34,024][85175] Updated weights for policy 1, policy_version 43460 (0.0007) +[2023-10-11 16:44:34,041][85176] Updated weights for policy 0, policy_version 42852 (0.0008) +[2023-10-11 16:44:34,382][85175] Updated weights for policy 1, policy_version 43470 (0.0009) +[2023-10-11 16:44:34,415][85176] Updated weights for policy 0, policy_version 42862 (0.0008) +[2023-10-11 16:44:34,752][85175] Updated weights for policy 1, policy_version 43480 (0.0007) +[2023-10-11 16:44:34,791][85176] Updated weights for policy 0, policy_version 42872 (0.0009) +[2023-10-11 16:44:36,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 88440832. Throughput: 0: 1662.6, 1: 1689.9. Samples: 22110850. Policy #0 lag: (min: 31.0, avg: 31.5, max: 46.0) +[2023-10-11 16:44:36,063][84230] Avg episode reward: [(0, '7.530'), (1, '25.130')] +[2023-10-11 16:44:38,802][85175] Updated weights for policy 1, policy_version 43490 (0.0007) +[2023-10-11 16:44:39,025][85176] Updated weights for policy 0, policy_version 42882 (0.0007) +[2023-10-11 16:44:39,183][85175] Updated weights for policy 1, policy_version 43500 (0.0008) +[2023-10-11 16:44:39,437][85176] Updated weights for policy 0, policy_version 42892 (0.0007) +[2023-10-11 16:44:39,550][85175] Updated weights for policy 1, policy_version 43510 (0.0007) +[2023-10-11 16:44:39,800][85176] Updated weights for policy 0, policy_version 42902 (0.0008) +[2023-10-11 16:44:39,909][85175] Updated weights for policy 1, policy_version 43520 (0.0009) +[2023-10-11 16:44:40,176][85176] Updated weights for policy 0, policy_version 42912 (0.0008) +[2023-10-11 16:44:41,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 88506368. Throughput: 0: 1645.7, 1: 1671.9. Samples: 22129816. 
Policy #0 lag: (min: 31.0, avg: 31.5, max: 46.0) +[2023-10-11 16:44:41,063][84230] Avg episode reward: [(0, '7.750'), (1, '21.750')] +[2023-10-11 16:44:43,867][85175] Updated weights for policy 1, policy_version 43530 (0.0008) +[2023-10-11 16:44:44,231][85175] Updated weights for policy 1, policy_version 43540 (0.0007) +[2023-10-11 16:44:44,339][85176] Updated weights for policy 0, policy_version 42922 (0.0009) +[2023-10-11 16:44:44,598][85175] Updated weights for policy 1, policy_version 43550 (0.0007) +[2023-10-11 16:44:44,705][85176] Updated weights for policy 0, policy_version 42932 (0.0010) +[2023-10-11 16:44:45,083][85176] Updated weights for policy 0, policy_version 42942 (0.0009) +[2023-10-11 16:44:46,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 88571904. Throughput: 0: 1653.7, 1: 1684.2. Samples: 22149646. Policy #0 lag: (min: 31.0, avg: 31.5, max: 46.0) +[2023-10-11 16:44:46,064][84230] Avg episode reward: [(0, '7.900'), (1, '23.120')] +[2023-10-11 16:44:48,555][85175] Updated weights for policy 1, policy_version 43560 (0.0007) +[2023-10-11 16:44:48,926][85175] Updated weights for policy 1, policy_version 43570 (0.0008) +[2023-10-11 16:44:49,061][85176] Updated weights for policy 0, policy_version 42952 (0.0008) +[2023-10-11 16:44:49,287][85175] Updated weights for policy 1, policy_version 43580 (0.0009) +[2023-10-11 16:44:49,435][85176] Updated weights for policy 0, policy_version 42962 (0.0008) +[2023-10-11 16:44:49,801][85176] Updated weights for policy 0, policy_version 42972 (0.0010) +[2023-10-11 16:44:51,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 88637440. Throughput: 0: 1664.7, 1: 1692.1. Samples: 22161056. Policy #0 lag: (min: 31.0, avg: 31.5, max: 46.0) +[2023-10-11 16:44:51,063][84230] Avg episode reward: [(0, '7.380'), (1, '26.200')] +[2023-10-11 16:44:51,064][85000] Saving new best policy, reward=26.200! 
+[2023-10-11 16:44:53,438][85175] Updated weights for policy 1, policy_version 43590 (0.0009) +[2023-10-11 16:44:53,811][85175] Updated weights for policy 1, policy_version 43600 (0.0008) +[2023-10-11 16:44:54,011][85176] Updated weights for policy 0, policy_version 42982 (0.0010) +[2023-10-11 16:44:54,179][85175] Updated weights for policy 1, policy_version 43610 (0.0008) +[2023-10-11 16:44:54,373][85176] Updated weights for policy 0, policy_version 42992 (0.0009) +[2023-10-11 16:44:54,757][85176] Updated weights for policy 0, policy_version 43002 (0.0010) +[2023-10-11 16:44:56,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 88702976. Throughput: 0: 1645.4, 1: 1665.6. Samples: 22179492. Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) +[2023-10-11 16:44:56,063][84230] Avg episode reward: [(0, '7.490'), (1, '25.130')] +[2023-10-11 16:44:58,131][85175] Updated weights for policy 1, policy_version 43620 (0.0007) +[2023-10-11 16:44:58,493][85175] Updated weights for policy 1, policy_version 43630 (0.0010) +[2023-10-11 16:44:58,866][85175] Updated weights for policy 1, policy_version 43640 (0.0010) +[2023-10-11 16:44:59,083][85176] Updated weights for policy 0, policy_version 43012 (0.0007) +[2023-10-11 16:44:59,450][85176] Updated weights for policy 0, policy_version 43022 (0.0008) +[2023-10-11 16:44:59,832][85176] Updated weights for policy 0, policy_version 43032 (0.0008) +[2023-10-11 16:45:01,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 88768512. Throughput: 0: 1656.2, 1: 1694.7. Samples: 22199872. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) +[2023-10-11 16:45:01,063][84230] Avg episode reward: [(0, '8.010'), (1, '26.050')] +[2023-10-11 16:45:02,956][85175] Updated weights for policy 1, policy_version 43650 (0.0009) +[2023-10-11 16:45:03,324][85175] Updated weights for policy 1, policy_version 43660 (0.0007) +[2023-10-11 16:45:03,700][85175] Updated weights for policy 1, policy_version 43670 (0.0008) +[2023-10-11 16:45:04,065][85176] Updated weights for policy 0, policy_version 43042 (0.0008) +[2023-10-11 16:45:04,065][85175] Updated weights for policy 1, policy_version 43680 (0.0009) +[2023-10-11 16:45:04,438][85176] Updated weights for policy 0, policy_version 43052 (0.0009) +[2023-10-11 16:45:04,808][85176] Updated weights for policy 0, policy_version 43062 (0.0007) +[2023-10-11 16:45:05,181][85176] Updated weights for policy 0, policy_version 43072 (0.0007) +[2023-10-11 16:45:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 88834048. Throughput: 0: 1659.7, 1: 1690.5. Samples: 22211134. Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) +[2023-10-11 16:45:06,063][84230] Avg episode reward: [(0, '7.890'), (1, '27.940')] +[2023-10-11 16:45:06,064][85000] Saving new best policy, reward=27.940! +[2023-10-11 16:45:08,192][85175] Updated weights for policy 1, policy_version 43690 (0.0007) +[2023-10-11 16:45:08,561][85175] Updated weights for policy 1, policy_version 43700 (0.0009) +[2023-10-11 16:45:08,925][85175] Updated weights for policy 1, policy_version 43710 (0.0008) +[2023-10-11 16:45:09,332][85176] Updated weights for policy 0, policy_version 43082 (0.0009) +[2023-10-11 16:45:09,704][85176] Updated weights for policy 0, policy_version 43092 (0.0008) +[2023-10-11 16:45:10,081][85176] Updated weights for policy 0, policy_version 43102 (0.0008) +[2023-10-11 16:45:11,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 88899584. Throughput: 0: 1648.8, 1: 1684.5. 
Samples: 22230294. Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) +[2023-10-11 16:45:11,064][84230] Avg episode reward: [(0, '7.620'), (1, '28.430')] +[2023-10-11 16:45:11,065][85000] Saving new best policy, reward=28.430! +[2023-10-11 16:45:12,984][85175] Updated weights for policy 1, policy_version 43720 (0.0008) +[2023-10-11 16:45:13,354][85175] Updated weights for policy 1, policy_version 43730 (0.0007) +[2023-10-11 16:45:13,719][85175] Updated weights for policy 1, policy_version 43740 (0.0010) +[2023-10-11 16:45:14,208][85176] Updated weights for policy 0, policy_version 43112 (0.0008) +[2023-10-11 16:45:14,580][85176] Updated weights for policy 0, policy_version 43122 (0.0008) +[2023-10-11 16:45:14,935][85176] Updated weights for policy 0, policy_version 43132 (0.0009) +[2023-10-11 16:45:16,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 88965120. Throughput: 0: 1654.0, 1: 1702.9. Samples: 22250250. Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) +[2023-10-11 16:45:16,064][84230] Avg episode reward: [(0, '7.350'), (1, '25.560')] +[2023-10-11 16:45:17,650][85175] Updated weights for policy 1, policy_version 43750 (0.0008) +[2023-10-11 16:45:18,016][85175] Updated weights for policy 1, policy_version 43760 (0.0009) +[2023-10-11 16:45:18,388][85175] Updated weights for policy 1, policy_version 43770 (0.0007) +[2023-10-11 16:45:19,098][85176] Updated weights for policy 0, policy_version 43142 (0.0008) +[2023-10-11 16:45:19,472][85176] Updated weights for policy 0, policy_version 43152 (0.0009) +[2023-10-11 16:45:19,845][85176] Updated weights for policy 0, policy_version 43162 (0.0009) +[2023-10-11 16:45:21,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 89030656. Throughput: 0: 1651.4, 1: 1682.2. Samples: 22260862. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) +[2023-10-11 16:45:21,063][84230] Avg episode reward: [(0, '7.120'), (1, '22.260')] +[2023-10-11 16:45:22,401][85175] Updated weights for policy 1, policy_version 43780 (0.0009) +[2023-10-11 16:45:22,767][85175] Updated weights for policy 1, policy_version 43790 (0.0007) +[2023-10-11 16:45:23,145][85175] Updated weights for policy 1, policy_version 43800 (0.0009) +[2023-10-11 16:45:23,959][85176] Updated weights for policy 0, policy_version 43172 (0.0009) +[2023-10-11 16:45:24,334][85176] Updated weights for policy 0, policy_version 43182 (0.0010) +[2023-10-11 16:45:24,707][85176] Updated weights for policy 0, policy_version 43192 (0.0007) +[2023-10-11 16:45:26,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 89096192. Throughput: 0: 1652.0, 1: 1694.6. Samples: 22280414. Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) +[2023-10-11 16:45:26,063][84230] Avg episode reward: [(0, '7.600'), (1, '23.520')] +[2023-10-11 16:45:27,165][85175] Updated weights for policy 1, policy_version 43810 (0.0009) +[2023-10-11 16:45:27,537][85175] Updated weights for policy 1, policy_version 43820 (0.0009) +[2023-10-11 16:45:27,905][85175] Updated weights for policy 1, policy_version 43830 (0.0008) +[2023-10-11 16:45:28,262][85175] Updated weights for policy 1, policy_version 43840 (0.0009) +[2023-10-11 16:45:28,709][85176] Updated weights for policy 0, policy_version 43202 (0.0008) +[2023-10-11 16:45:29,087][85176] Updated weights for policy 0, policy_version 43212 (0.0011) +[2023-10-11 16:45:29,452][85176] Updated weights for policy 0, policy_version 43222 (0.0008) +[2023-10-11 16:45:29,832][85176] Updated weights for policy 0, policy_version 43232 (0.0009) +[2023-10-11 16:45:31,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 89161728. Throughput: 0: 1659.3, 1: 1697.1. Samples: 22300682. 
Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) +[2023-10-11 16:45:31,064][84230] Avg episode reward: [(0, '7.860'), (1, '28.380')] +[2023-10-11 16:45:32,286][85175] Updated weights for policy 1, policy_version 43850 (0.0011) +[2023-10-11 16:45:32,662][85175] Updated weights for policy 1, policy_version 43860 (0.0011) +[2023-10-11 16:45:33,022][85175] Updated weights for policy 1, policy_version 43870 (0.0011) +[2023-10-11 16:45:33,738][85176] Updated weights for policy 0, policy_version 43242 (0.0008) +[2023-10-11 16:45:34,109][85176] Updated weights for policy 0, policy_version 43252 (0.0010) +[2023-10-11 16:45:34,477][85176] Updated weights for policy 0, policy_version 43262 (0.0010) +[2023-10-11 16:45:36,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 89227264. Throughput: 0: 1654.5, 1: 1671.0. Samples: 22310706. Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) +[2023-10-11 16:45:36,063][84230] Avg episode reward: [(0, '8.000'), (1, '27.900')] +[2023-10-11 16:45:37,093][85175] Updated weights for policy 1, policy_version 43880 (0.0008) +[2023-10-11 16:45:37,452][85175] Updated weights for policy 1, policy_version 43890 (0.0008) +[2023-10-11 16:45:37,834][85175] Updated weights for policy 1, policy_version 43900 (0.0009) +[2023-10-11 16:45:38,484][85176] Updated weights for policy 0, policy_version 43272 (0.0007) +[2023-10-11 16:45:38,849][85176] Updated weights for policy 0, policy_version 43282 (0.0007) +[2023-10-11 16:45:39,221][85176] Updated weights for policy 0, policy_version 43292 (0.0007) +[2023-10-11 16:45:41,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 89292800. Throughput: 0: 1659.2, 1: 1700.1. Samples: 22330664. 
Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) +[2023-10-11 16:45:41,063][84230] Avg episode reward: [(0, '7.890'), (1, '28.170')] +[2023-10-11 16:45:41,819][85175] Updated weights for policy 1, policy_version 43910 (0.0008) +[2023-10-11 16:45:42,192][85175] Updated weights for policy 1, policy_version 43920 (0.0009) +[2023-10-11 16:45:42,557][85175] Updated weights for policy 1, policy_version 43930 (0.0009) +[2023-10-11 16:45:43,303][85176] Updated weights for policy 0, policy_version 43302 (0.0008) +[2023-10-11 16:45:43,677][85176] Updated weights for policy 0, policy_version 43312 (0.0008) +[2023-10-11 16:45:44,044][85176] Updated weights for policy 0, policy_version 43322 (0.0008) +[2023-10-11 16:45:46,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 89358336. Throughput: 0: 1672.5, 1: 1693.9. Samples: 22351360. Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) +[2023-10-11 16:45:46,063][84230] Avg episode reward: [(0, '8.000'), (1, '24.970')] +[2023-10-11 16:45:46,590][85175] Updated weights for policy 1, policy_version 43940 (0.0009) +[2023-10-11 16:45:46,957][85175] Updated weights for policy 1, policy_version 43950 (0.0010) +[2023-10-11 16:45:47,322][85175] Updated weights for policy 1, policy_version 43960 (0.0007) +[2023-10-11 16:45:48,197][85176] Updated weights for policy 0, policy_version 43332 (0.0008) +[2023-10-11 16:45:48,573][85176] Updated weights for policy 0, policy_version 43342 (0.0009) +[2023-10-11 16:45:48,934][85176] Updated weights for policy 0, policy_version 43352 (0.0011) +[2023-10-11 16:45:51,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 89423872. Throughput: 0: 1659.9, 1: 1675.9. Samples: 22361246. 
Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) +[2023-10-11 16:45:51,063][84230] Avg episode reward: [(0, '8.010'), (1, '25.320')] +[2023-10-11 16:45:51,243][85175] Updated weights for policy 1, policy_version 43970 (0.0007) +[2023-10-11 16:45:51,604][85175] Updated weights for policy 1, policy_version 43980 (0.0007) +[2023-10-11 16:45:51,972][85175] Updated weights for policy 1, policy_version 43990 (0.0008) +[2023-10-11 16:45:52,333][85175] Updated weights for policy 1, policy_version 44000 (0.0009) +[2023-10-11 16:45:52,835][85176] Updated weights for policy 0, policy_version 43362 (0.0009) +[2023-10-11 16:45:53,217][85176] Updated weights for policy 0, policy_version 43372 (0.0008) +[2023-10-11 16:45:53,590][85176] Updated weights for policy 0, policy_version 43382 (0.0008) +[2023-10-11 16:45:53,963][85176] Updated weights for policy 0, policy_version 43392 (0.0008) +[2023-10-11 16:45:56,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 89489408. Throughput: 0: 1658.6, 1: 1698.8. Samples: 22381378. Policy #0 lag: (min: 31.0, avg: 39.1, max: 63.0) +[2023-10-11 16:45:56,064][84230] Avg episode reward: [(0, '8.020'), (1, '26.380')] +[2023-10-11 16:45:56,445][85175] Updated weights for policy 1, policy_version 44010 (0.0007) +[2023-10-11 16:45:56,820][85175] Updated weights for policy 1, policy_version 44020 (0.0008) +[2023-10-11 16:45:57,199][85175] Updated weights for policy 1, policy_version 44030 (0.0009) +[2023-10-11 16:45:58,064][85176] Updated weights for policy 0, policy_version 43402 (0.0010) +[2023-10-11 16:45:58,447][85176] Updated weights for policy 0, policy_version 43412 (0.0010) +[2023-10-11 16:45:58,822][85176] Updated weights for policy 0, policy_version 43422 (0.0009) +[2023-10-11 16:46:01,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 89554944. Throughput: 0: 1671.4, 1: 1697.4. Samples: 22401848. 
Policy #0 lag: (min: 31.0, avg: 39.1, max: 63.0) +[2023-10-11 16:46:01,063][84230] Avg episode reward: [(0, '7.120'), (1, '29.110')] +[2023-10-11 16:46:01,260][85175] Updated weights for policy 1, policy_version 44040 (0.0009) +[2023-10-11 16:46:01,641][85175] Updated weights for policy 1, policy_version 44050 (0.0008) +[2023-10-11 16:46:02,004][85175] Updated weights for policy 1, policy_version 44060 (0.0008) +[2023-10-11 16:46:02,150][85000] Saving new best policy, reward=29.110! +[2023-10-11 16:46:02,849][85176] Updated weights for policy 0, policy_version 43432 (0.0008) +[2023-10-11 16:46:03,216][85176] Updated weights for policy 0, policy_version 43442 (0.0008) +[2023-10-11 16:46:03,589][85176] Updated weights for policy 0, policy_version 43452 (0.0007) +[2023-10-11 16:46:06,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 89620480. Throughput: 0: 1652.8, 1: 1690.7. Samples: 22411322. Policy #0 lag: (min: 31.0, avg: 39.1, max: 63.0) +[2023-10-11 16:46:06,064][84230] Avg episode reward: [(0, '7.420'), (1, '28.070')] +[2023-10-11 16:46:06,071][85175] Updated weights for policy 1, policy_version 44070 (0.0009) +[2023-10-11 16:46:06,451][85175] Updated weights for policy 1, policy_version 44080 (0.0010) +[2023-10-11 16:46:06,825][85175] Updated weights for policy 1, policy_version 44090 (0.0009) +[2023-10-11 16:46:07,690][85176] Updated weights for policy 0, policy_version 43462 (0.0009) +[2023-10-11 16:46:08,073][85176] Updated weights for policy 0, policy_version 43472 (0.0011) +[2023-10-11 16:46:08,444][85176] Updated weights for policy 0, policy_version 43482 (0.0009) +[2023-10-11 16:46:11,039][85175] Updated weights for policy 1, policy_version 44100 (0.0008) +[2023-10-11 16:46:11,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 89686016. Throughput: 0: 1666.0, 1: 1696.0. Samples: 22431708. 
Policy #0 lag: (min: 31.0, avg: 39.1, max: 63.0) +[2023-10-11 16:46:11,063][84230] Avg episode reward: [(0, '7.250'), (1, '29.760')] +[2023-10-11 16:46:11,400][85175] Updated weights for policy 1, policy_version 44110 (0.0009) +[2023-10-11 16:46:11,766][85175] Updated weights for policy 1, policy_version 44120 (0.0008) +[2023-10-11 16:46:12,057][85000] Saving new best policy, reward=29.760! +[2023-10-11 16:46:12,694][85176] Updated weights for policy 0, policy_version 43492 (0.0009) +[2023-10-11 16:46:13,089][85176] Updated weights for policy 0, policy_version 43502 (0.0008) +[2023-10-11 16:46:13,467][85176] Updated weights for policy 0, policy_version 43512 (0.0009) +[2023-10-11 16:46:15,800][85175] Updated weights for policy 1, policy_version 44130 (0.0009) +[2023-10-11 16:46:16,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 89751552. Throughput: 0: 1672.2, 1: 1692.5. Samples: 22452094. Policy #0 lag: (min: 31.0, avg: 39.1, max: 63.0) +[2023-10-11 16:46:16,063][84230] Avg episode reward: [(0, '7.600'), (1, '29.320')] +[2023-10-11 16:46:16,073][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000043520_44564480.pth... +[2023-10-11 16:46:16,112][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000041952_42958848.pth +[2023-10-11 16:46:16,167][85175] Updated weights for policy 1, policy_version 44140 (0.0007) +[2023-10-11 16:46:16,534][85175] Updated weights for policy 1, policy_version 44150 (0.0007) +[2023-10-11 16:46:16,906][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000044160_45219840.pth... 
+[2023-10-11 16:46:16,907][85175] Updated weights for policy 1, policy_version 44160 (0.0008) +[2023-10-11 16:46:16,934][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000042560_43581440.pth +[2023-10-11 16:46:17,676][85176] Updated weights for policy 0, policy_version 43522 (0.0007) +[2023-10-11 16:46:18,059][85176] Updated weights for policy 0, policy_version 43532 (0.0008) +[2023-10-11 16:46:18,417][85176] Updated weights for policy 0, policy_version 43542 (0.0008) +[2023-10-11 16:46:18,792][85176] Updated weights for policy 0, policy_version 43552 (0.0008) +[2023-10-11 16:46:20,913][85175] Updated weights for policy 1, policy_version 44170 (0.0008) +[2023-10-11 16:46:21,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 89817088. Throughput: 0: 1651.6, 1: 1700.5. Samples: 22461548. Policy #0 lag: (min: 31.0, avg: 39.1, max: 63.0) +[2023-10-11 16:46:21,063][84230] Avg episode reward: [(0, '8.050'), (1, '27.000')] +[2023-10-11 16:46:21,288][85175] Updated weights for policy 1, policy_version 44180 (0.0011) +[2023-10-11 16:46:21,656][85175] Updated weights for policy 1, policy_version 44190 (0.0008) +[2023-10-11 16:46:23,008][85176] Updated weights for policy 0, policy_version 43562 (0.0007) +[2023-10-11 16:46:23,388][85176] Updated weights for policy 0, policy_version 43572 (0.0007) +[2023-10-11 16:46:23,763][85176] Updated weights for policy 0, policy_version 43582 (0.0007) +[2023-10-11 16:46:25,457][85175] Updated weights for policy 1, policy_version 44200 (0.0009) +[2023-10-11 16:46:25,813][85175] Updated weights for policy 1, policy_version 44210 (0.0010) +[2023-10-11 16:46:26,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 89882624. Throughput: 0: 1663.5, 1: 1703.3. Samples: 22482170. 
Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) +[2023-10-11 16:46:26,063][84230] Avg episode reward: [(0, '8.160'), (1, '26.160')] +[2023-10-11 16:46:26,182][85175] Updated weights for policy 1, policy_version 44220 (0.0009) +[2023-10-11 16:46:27,750][85176] Updated weights for policy 0, policy_version 43592 (0.0007) +[2023-10-11 16:46:28,123][85176] Updated weights for policy 0, policy_version 43602 (0.0007) +[2023-10-11 16:46:28,495][85176] Updated weights for policy 0, policy_version 43612 (0.0008) +[2023-10-11 16:46:30,369][85175] Updated weights for policy 1, policy_version 44230 (0.0011) +[2023-10-11 16:46:30,731][85175] Updated weights for policy 1, policy_version 44240 (0.0009) +[2023-10-11 16:46:31,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 89948160. Throughput: 0: 1665.1, 1: 1694.6. Samples: 22502546. Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) +[2023-10-11 16:46:31,063][84230] Avg episode reward: [(0, '7.840'), (1, '25.410')] +[2023-10-11 16:46:31,108][85175] Updated weights for policy 1, policy_version 44250 (0.0009) +[2023-10-11 16:46:32,765][85176] Updated weights for policy 0, policy_version 43622 (0.0008) +[2023-10-11 16:46:33,135][85176] Updated weights for policy 0, policy_version 43632 (0.0009) +[2023-10-11 16:46:33,506][85176] Updated weights for policy 0, policy_version 43642 (0.0008) +[2023-10-11 16:46:35,152][85175] Updated weights for policy 1, policy_version 44260 (0.0009) +[2023-10-11 16:46:35,520][85175] Updated weights for policy 1, policy_version 44270 (0.0009) +[2023-10-11 16:46:35,896][85175] Updated weights for policy 1, policy_version 44280 (0.0007) +[2023-10-11 16:46:36,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 90013696. Throughput: 0: 1653.8, 1: 1702.8. Samples: 22512292. 
Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) +[2023-10-11 16:46:36,064][84230] Avg episode reward: [(0, '7.430'), (1, '28.790')] +[2023-10-11 16:46:37,560][85176] Updated weights for policy 0, policy_version 43652 (0.0007) +[2023-10-11 16:46:37,925][85176] Updated weights for policy 0, policy_version 43662 (0.0007) +[2023-10-11 16:46:38,301][85176] Updated weights for policy 0, policy_version 43672 (0.0007) +[2023-10-11 16:46:39,958][85175] Updated weights for policy 1, policy_version 44290 (0.0009) +[2023-10-11 16:46:40,321][85175] Updated weights for policy 1, policy_version 44300 (0.0009) +[2023-10-11 16:46:40,702][85175] Updated weights for policy 1, policy_version 44310 (0.0008) +[2023-10-11 16:46:41,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 90079232. Throughput: 0: 1666.2, 1: 1700.4. Samples: 22532874. Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) +[2023-10-11 16:46:41,064][85175] Updated weights for policy 1, policy_version 44320 (0.0009) +[2023-10-11 16:46:41,064][84230] Avg episode reward: [(0, '7.600'), (1, '28.320')] +[2023-10-11 16:46:42,388][85176] Updated weights for policy 0, policy_version 43682 (0.0007) +[2023-10-11 16:46:42,760][85176] Updated weights for policy 0, policy_version 43692 (0.0008) +[2023-10-11 16:46:43,134][85176] Updated weights for policy 0, policy_version 43702 (0.0008) +[2023-10-11 16:46:43,505][85176] Updated weights for policy 0, policy_version 43712 (0.0008) +[2023-10-11 16:46:45,176][85175] Updated weights for policy 1, policy_version 44330 (0.0007) +[2023-10-11 16:46:45,552][85175] Updated weights for policy 1, policy_version 44340 (0.0007) +[2023-10-11 16:46:45,920][85175] Updated weights for policy 1, policy_version 44350 (0.0008) +[2023-10-11 16:46:46,063][84230] Fps is (10 sec: 16383.8, 60 sec: 13653.2, 300 sec: 13440.4). Total num frames: 90177536. Throughput: 0: 1670.3, 1: 1686.0. Samples: 22552886. 
Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) +[2023-10-11 16:46:46,064][84230] Avg episode reward: [(0, '7.600'), (1, '25.180')] +[2023-10-11 16:46:47,705][85176] Updated weights for policy 0, policy_version 43722 (0.0007) +[2023-10-11 16:46:48,073][85176] Updated weights for policy 0, policy_version 43732 (0.0008) +[2023-10-11 16:46:48,442][85176] Updated weights for policy 0, policy_version 43742 (0.0008) +[2023-10-11 16:46:49,845][85175] Updated weights for policy 1, policy_version 44360 (0.0008) +[2023-10-11 16:46:50,217][85175] Updated weights for policy 1, policy_version 44370 (0.0009) +[2023-10-11 16:46:50,580][85175] Updated weights for policy 1, policy_version 44380 (0.0008) +[2023-10-11 16:46:51,062][84230] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 90243072. Throughput: 0: 1660.8, 1: 1703.3. Samples: 22562706. Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) +[2023-10-11 16:46:51,063][84230] Avg episode reward: [(0, '7.900'), (1, '28.070')] +[2023-10-11 16:46:52,441][85176] Updated weights for policy 0, policy_version 43752 (0.0008) +[2023-10-11 16:46:52,814][85176] Updated weights for policy 0, policy_version 43762 (0.0010) +[2023-10-11 16:46:53,197][85176] Updated weights for policy 0, policy_version 43772 (0.0010) +[2023-10-11 16:46:54,557][85175] Updated weights for policy 1, policy_version 44390 (0.0009) +[2023-10-11 16:46:54,927][85175] Updated weights for policy 1, policy_version 44400 (0.0008) +[2023-10-11 16:46:55,297][85175] Updated weights for policy 1, policy_version 44410 (0.0007) +[2023-10-11 16:46:56,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 90308608. Throughput: 0: 1667.6, 1: 1702.3. Samples: 22583356. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-10-11 16:46:56,064][84230] Avg episode reward: [(0, '7.880'), (1, '24.450')] +[2023-10-11 16:46:57,347][85176] Updated weights for policy 0, policy_version 43782 (0.0008) +[2023-10-11 16:46:57,729][85176] Updated weights for policy 0, policy_version 43792 (0.0007) +[2023-10-11 16:46:58,109][85176] Updated weights for policy 0, policy_version 43802 (0.0007) +[2023-10-11 16:46:59,297][85175] Updated weights for policy 1, policy_version 44420 (0.0010) +[2023-10-11 16:46:59,664][85175] Updated weights for policy 1, policy_version 44430 (0.0009) +[2023-10-11 16:47:00,034][85175] Updated weights for policy 1, policy_version 44440 (0.0007) +[2023-10-11 16:47:01,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 90374144. Throughput: 0: 1667.6, 1: 1682.8. Samples: 22602866. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-10-11 16:47:01,064][84230] Avg episode reward: [(0, '7.860'), (1, '29.840')] +[2023-10-11 16:47:01,077][85000] Saving new best policy, reward=29.840! +[2023-10-11 16:47:02,158][85176] Updated weights for policy 0, policy_version 43812 (0.0007) +[2023-10-11 16:47:02,535][85176] Updated weights for policy 0, policy_version 43822 (0.0008) +[2023-10-11 16:47:02,907][85176] Updated weights for policy 0, policy_version 43832 (0.0008) +[2023-10-11 16:47:03,987][85175] Updated weights for policy 1, policy_version 44450 (0.0008) +[2023-10-11 16:47:04,364][85175] Updated weights for policy 1, policy_version 44460 (0.0010) +[2023-10-11 16:47:04,728][85175] Updated weights for policy 1, policy_version 44470 (0.0010) +[2023-10-11 16:47:05,088][85175] Updated weights for policy 1, policy_version 44480 (0.0009) +[2023-10-11 16:47:06,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 90439680. Throughput: 0: 1662.7, 1: 1708.4. Samples: 22613250. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-10-11 16:47:06,063][84230] Avg episode reward: [(0, '7.580'), (1, '25.090')] +[2023-10-11 16:47:07,037][85176] Updated weights for policy 0, policy_version 43842 (0.0010) +[2023-10-11 16:47:07,419][85176] Updated weights for policy 0, policy_version 43852 (0.0009) +[2023-10-11 16:47:07,795][85176] Updated weights for policy 0, policy_version 43862 (0.0009) +[2023-10-11 16:47:08,171][85176] Updated weights for policy 0, policy_version 43872 (0.0008) +[2023-10-11 16:47:09,150][85175] Updated weights for policy 1, policy_version 44490 (0.0010) +[2023-10-11 16:47:09,511][85175] Updated weights for policy 1, policy_version 44500 (0.0009) +[2023-10-11 16:47:09,872][85175] Updated weights for policy 1, policy_version 44510 (0.0009) +[2023-10-11 16:47:11,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 90505216. Throughput: 0: 1669.1, 1: 1684.8. Samples: 22633098. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-10-11 16:47:11,063][84230] Avg episode reward: [(0, '7.600'), (1, '31.680')] +[2023-10-11 16:47:11,064][85000] Saving new best policy, reward=31.680! +[2023-10-11 16:47:12,272][85176] Updated weights for policy 0, policy_version 43882 (0.0009) +[2023-10-11 16:47:12,646][85176] Updated weights for policy 0, policy_version 43892 (0.0010) +[2023-10-11 16:47:13,015][85176] Updated weights for policy 0, policy_version 43902 (0.0008) +[2023-10-11 16:47:13,879][85175] Updated weights for policy 1, policy_version 44520 (0.0009) +[2023-10-11 16:47:14,248][85175] Updated weights for policy 1, policy_version 44530 (0.0007) +[2023-10-11 16:47:14,615][85175] Updated weights for policy 1, policy_version 44540 (0.0007) +[2023-10-11 16:47:16,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 90570752. Throughput: 0: 1666.3, 1: 1685.2. Samples: 22653364. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-10-11 16:47:16,064][84230] Avg episode reward: [(0, '7.600'), (1, '27.230')] +[2023-10-11 16:47:16,996][85176] Updated weights for policy 0, policy_version 43912 (0.0008) +[2023-10-11 16:47:17,363][85176] Updated weights for policy 0, policy_version 43922 (0.0007) +[2023-10-11 16:47:17,736][85176] Updated weights for policy 0, policy_version 43932 (0.0009) +[2023-10-11 16:47:18,510][85175] Updated weights for policy 1, policy_version 44550 (0.0009) +[2023-10-11 16:47:18,886][85175] Updated weights for policy 1, policy_version 44560 (0.0008) +[2023-10-11 16:47:19,258][85175] Updated weights for policy 1, policy_version 44570 (0.0009) +[2023-10-11 16:47:21,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 90636288. Throughput: 0: 1663.0, 1: 1702.7. Samples: 22663746. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) +[2023-10-11 16:47:21,063][84230] Avg episode reward: [(0, '7.750'), (1, '30.140')] +[2023-10-11 16:47:21,813][85176] Updated weights for policy 0, policy_version 43942 (0.0009) +[2023-10-11 16:47:22,193][85176] Updated weights for policy 0, policy_version 43952 (0.0009) +[2023-10-11 16:47:22,563][85176] Updated weights for policy 0, policy_version 43962 (0.0008) +[2023-10-11 16:47:23,189][85175] Updated weights for policy 1, policy_version 44580 (0.0009) +[2023-10-11 16:47:23,557][85175] Updated weights for policy 1, policy_version 44590 (0.0007) +[2023-10-11 16:47:23,925][85175] Updated weights for policy 1, policy_version 44600 (0.0009) +[2023-10-11 16:47:26,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 90701824. Throughput: 0: 1669.1, 1: 1677.5. Samples: 22683470. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:47:26,064][84230] Avg episode reward: [(0, '7.900'), (1, '25.870')] +[2023-10-11 16:47:26,874][85176] Updated weights for policy 0, policy_version 43972 (0.0009) +[2023-10-11 16:47:27,247][85176] Updated weights for policy 0, policy_version 43982 (0.0010) +[2023-10-11 16:47:27,634][85176] Updated weights for policy 0, policy_version 43992 (0.0010) +[2023-10-11 16:47:27,951][85175] Updated weights for policy 1, policy_version 44610 (0.0010) +[2023-10-11 16:47:28,323][85175] Updated weights for policy 1, policy_version 44620 (0.0009) +[2023-10-11 16:47:28,685][85175] Updated weights for policy 1, policy_version 44630 (0.0010) +[2023-10-11 16:47:29,061][85175] Updated weights for policy 1, policy_version 44640 (0.0009) +[2023-10-11 16:47:31,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 90767360. Throughput: 0: 1662.0, 1: 1699.8. Samples: 22704166. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:47:31,063][84230] Avg episode reward: [(0, '8.200'), (1, '29.490')] +[2023-10-11 16:47:31,805][85176] Updated weights for policy 0, policy_version 44002 (0.0009) +[2023-10-11 16:47:32,186][85176] Updated weights for policy 0, policy_version 44012 (0.0008) +[2023-10-11 16:47:32,554][85176] Updated weights for policy 0, policy_version 44022 (0.0009) +[2023-10-11 16:47:32,930][85176] Updated weights for policy 0, policy_version 44032 (0.0009) +[2023-10-11 16:47:33,127][85175] Updated weights for policy 1, policy_version 44650 (0.0009) +[2023-10-11 16:47:33,508][85175] Updated weights for policy 1, policy_version 44660 (0.0008) +[2023-10-11 16:47:33,882][85175] Updated weights for policy 1, policy_version 44670 (0.0007) +[2023-10-11 16:47:36,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 90832896. Throughput: 0: 1660.6, 1: 1690.3. Samples: 22713498. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:47:36,063][84230] Avg episode reward: [(0, '8.050'), (1, '28.940')] +[2023-10-11 16:47:36,812][85176] Updated weights for policy 0, policy_version 44042 (0.0010) +[2023-10-11 16:47:37,177][85176] Updated weights for policy 0, policy_version 44052 (0.0009) +[2023-10-11 16:47:37,545][85176] Updated weights for policy 0, policy_version 44062 (0.0009) +[2023-10-11 16:47:37,879][85175] Updated weights for policy 1, policy_version 44680 (0.0008) +[2023-10-11 16:47:38,256][85175] Updated weights for policy 1, policy_version 44690 (0.0009) +[2023-10-11 16:47:38,614][85175] Updated weights for policy 1, policy_version 44700 (0.0007) +[2023-10-11 16:47:41,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 90898432. Throughput: 0: 1663.4, 1: 1679.3. Samples: 22733778. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:47:41,064][84230] Avg episode reward: [(0, '7.450'), (1, '30.390')] +[2023-10-11 16:47:41,436][85176] Updated weights for policy 0, policy_version 44072 (0.0010) +[2023-10-11 16:47:41,820][85176] Updated weights for policy 0, policy_version 44082 (0.0010) +[2023-10-11 16:47:42,193][85176] Updated weights for policy 0, policy_version 44092 (0.0008) +[2023-10-11 16:47:42,700][85175] Updated weights for policy 1, policy_version 44710 (0.0008) +[2023-10-11 16:47:43,076][85175] Updated weights for policy 1, policy_version 44720 (0.0009) +[2023-10-11 16:47:43,440][85175] Updated weights for policy 1, policy_version 44730 (0.0009) +[2023-10-11 16:47:46,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 90963968. Throughput: 0: 1670.1, 1: 1698.6. Samples: 22754456. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:47:46,063][84230] Avg episode reward: [(0, '7.450'), (1, '28.980')] +[2023-10-11 16:47:46,333][85176] Updated weights for policy 0, policy_version 44102 (0.0010) +[2023-10-11 16:47:46,729][85176] Updated weights for policy 0, policy_version 44112 (0.0011) +[2023-10-11 16:47:47,096][85176] Updated weights for policy 0, policy_version 44122 (0.0007) +[2023-10-11 16:47:47,431][85175] Updated weights for policy 1, policy_version 44740 (0.0009) +[2023-10-11 16:47:47,809][85175] Updated weights for policy 1, policy_version 44750 (0.0008) +[2023-10-11 16:47:48,178][85175] Updated weights for policy 1, policy_version 44760 (0.0009) +[2023-10-11 16:47:51,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 91029504. Throughput: 0: 1665.6, 1: 1672.2. Samples: 22763452. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:47:51,063][84230] Avg episode reward: [(0, '7.860'), (1, '30.800')] +[2023-10-11 16:47:51,241][85176] Updated weights for policy 0, policy_version 44132 (0.0007) +[2023-10-11 16:47:51,628][85176] Updated weights for policy 0, policy_version 44142 (0.0009) +[2023-10-11 16:47:52,009][85176] Updated weights for policy 0, policy_version 44152 (0.0010) +[2023-10-11 16:47:52,096][85175] Updated weights for policy 1, policy_version 44770 (0.0008) +[2023-10-11 16:47:52,461][85175] Updated weights for policy 1, policy_version 44780 (0.0008) +[2023-10-11 16:47:52,833][85175] Updated weights for policy 1, policy_version 44790 (0.0008) +[2023-10-11 16:47:53,196][85175] Updated weights for policy 1, policy_version 44800 (0.0009) +[2023-10-11 16:47:56,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 91095040. Throughput: 0: 1669.7, 1: 1692.6. Samples: 22784402. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:47:56,063][84230] Avg episode reward: [(0, '8.010'), (1, '29.330')] +[2023-10-11 16:47:56,119][85176] Updated weights for policy 0, policy_version 44162 (0.0008) +[2023-10-11 16:47:56,491][85176] Updated weights for policy 0, policy_version 44172 (0.0008) +[2023-10-11 16:47:56,872][85176] Updated weights for policy 0, policy_version 44182 (0.0008) +[2023-10-11 16:47:57,190][85175] Updated weights for policy 1, policy_version 44810 (0.0009) +[2023-10-11 16:47:57,241][85176] Updated weights for policy 0, policy_version 44192 (0.0007) +[2023-10-11 16:47:57,553][85175] Updated weights for policy 1, policy_version 44820 (0.0008) +[2023-10-11 16:47:57,925][85175] Updated weights for policy 1, policy_version 44830 (0.0007) +[2023-10-11 16:48:01,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 91160576. Throughput: 0: 1669.4, 1: 1703.9. Samples: 22805164. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:48:01,064][84230] Avg episode reward: [(0, '8.200'), (1, '29.110')] +[2023-10-11 16:48:01,472][85176] Updated weights for policy 0, policy_version 44202 (0.0008) +[2023-10-11 16:48:01,851][85176] Updated weights for policy 0, policy_version 44212 (0.0009) +[2023-10-11 16:48:02,064][85175] Updated weights for policy 1, policy_version 44840 (0.0007) +[2023-10-11 16:48:02,218][85176] Updated weights for policy 0, policy_version 44222 (0.0008) +[2023-10-11 16:48:02,433][85175] Updated weights for policy 1, policy_version 44850 (0.0008) +[2023-10-11 16:48:02,791][85175] Updated weights for policy 1, policy_version 44860 (0.0011) +[2023-10-11 16:48:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 91226112. Throughput: 0: 1669.6, 1: 1673.9. Samples: 22814204. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:48:06,063][84230] Avg episode reward: [(0, '7.710'), (1, '29.180')] +[2023-10-11 16:48:06,423][85176] Updated weights for policy 0, policy_version 44232 (0.0008) +[2023-10-11 16:48:06,759][85175] Updated weights for policy 1, policy_version 44870 (0.0008) +[2023-10-11 16:48:06,798][85176] Updated weights for policy 0, policy_version 44242 (0.0008) +[2023-10-11 16:48:07,125][85175] Updated weights for policy 1, policy_version 44880 (0.0010) +[2023-10-11 16:48:07,167][85176] Updated weights for policy 0, policy_version 44252 (0.0007) +[2023-10-11 16:48:07,492][85175] Updated weights for policy 1, policy_version 44890 (0.0007) +[2023-10-11 16:48:11,063][84230] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 91291648. Throughput: 0: 1664.0, 1: 1700.6. Samples: 22834874. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:48:11,063][84230] Avg episode reward: [(0, '7.390'), (1, '32.100')] +[2023-10-11 16:48:11,064][85000] Saving new best policy, reward=32.100! +[2023-10-11 16:48:11,287][85176] Updated weights for policy 0, policy_version 44262 (0.0007) +[2023-10-11 16:48:11,624][85175] Updated weights for policy 1, policy_version 44900 (0.0007) +[2023-10-11 16:48:11,667][85176] Updated weights for policy 0, policy_version 44272 (0.0009) +[2023-10-11 16:48:11,996][85175] Updated weights for policy 1, policy_version 44910 (0.0008) +[2023-10-11 16:48:12,037][85176] Updated weights for policy 0, policy_version 44282 (0.0009) +[2023-10-11 16:48:12,355][85175] Updated weights for policy 1, policy_version 44920 (0.0008) +[2023-10-11 16:48:16,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 91357184. Throughput: 0: 1668.9, 1: 1693.3. Samples: 22855464. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:48:16,064][84230] Avg episode reward: [(0, '7.550'), (1, '34.330')] +[2023-10-11 16:48:16,075][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000044928_46006272.pth... +[2023-10-11 16:48:16,115][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000043360_44400640.pth +[2023-10-11 16:48:16,120][85000] Saving new best policy, reward=34.330! +[2023-10-11 16:48:16,137][85176] Updated weights for policy 0, policy_version 44292 (0.0008) +[2023-10-11 16:48:16,506][85176] Updated weights for policy 0, policy_version 44302 (0.0008) +[2023-10-11 16:48:16,535][85175] Updated weights for policy 1, policy_version 44930 (0.0009) +[2023-10-11 16:48:16,872][85176] Updated weights for policy 0, policy_version 44312 (0.0008) +[2023-10-11 16:48:16,903][85175] Updated weights for policy 1, policy_version 44940 (0.0007) +[2023-10-11 16:48:17,173][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000044320_45383680.pth... +[2023-10-11 16:48:17,207][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000042752_43778048.pth +[2023-10-11 16:48:17,271][85175] Updated weights for policy 1, policy_version 44950 (0.0008) +[2023-10-11 16:48:17,640][85175] Updated weights for policy 1, policy_version 44960 (0.0010) +[2023-10-11 16:48:20,894][85176] Updated weights for policy 0, policy_version 44322 (0.0009) +[2023-10-11 16:48:21,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 91422720. Throughput: 0: 1671.1, 1: 1685.5. Samples: 22864546. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:48:21,064][84230] Avg episode reward: [(0, '7.760'), (1, '33.340')] +[2023-10-11 16:48:21,273][85176] Updated weights for policy 0, policy_version 44332 (0.0007) +[2023-10-11 16:48:21,642][85176] Updated weights for policy 0, policy_version 44342 (0.0009) +[2023-10-11 16:48:21,809][85175] Updated weights for policy 1, policy_version 44970 (0.0008) +[2023-10-11 16:48:22,014][85176] Updated weights for policy 0, policy_version 44352 (0.0009) +[2023-10-11 16:48:22,176][85175] Updated weights for policy 1, policy_version 44980 (0.0008) +[2023-10-11 16:48:22,547][85175] Updated weights for policy 1, policy_version 44990 (0.0009) +[2023-10-11 16:48:26,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 91488256. Throughput: 0: 1665.0, 1: 1695.2. Samples: 22884984. Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) +[2023-10-11 16:48:26,064][84230] Avg episode reward: [(0, '7.650'), (1, '28.090')] +[2023-10-11 16:48:26,211][85176] Updated weights for policy 0, policy_version 44362 (0.0011) +[2023-10-11 16:48:26,489][85175] Updated weights for policy 1, policy_version 45000 (0.0008) +[2023-10-11 16:48:26,575][85176] Updated weights for policy 0, policy_version 44372 (0.0009) +[2023-10-11 16:48:26,859][85175] Updated weights for policy 1, policy_version 45010 (0.0008) +[2023-10-11 16:48:26,945][85176] Updated weights for policy 0, policy_version 44382 (0.0008) +[2023-10-11 16:48:27,215][85175] Updated weights for policy 1, policy_version 45020 (0.0009) +[2023-10-11 16:48:30,875][85176] Updated weights for policy 0, policy_version 44392 (0.0008) +[2023-10-11 16:48:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 91553792. Throughput: 0: 1659.5, 1: 1698.4. Samples: 22905562. 
Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) +[2023-10-11 16:48:31,064][84230] Avg episode reward: [(0, '7.690'), (1, '31.090')] +[2023-10-11 16:48:31,248][85176] Updated weights for policy 0, policy_version 44402 (0.0009) +[2023-10-11 16:48:31,348][85175] Updated weights for policy 1, policy_version 45030 (0.0008) +[2023-10-11 16:48:31,626][85176] Updated weights for policy 0, policy_version 44412 (0.0009) +[2023-10-11 16:48:31,710][85175] Updated weights for policy 1, policy_version 45040 (0.0009) +[2023-10-11 16:48:32,083][85175] Updated weights for policy 1, policy_version 45050 (0.0008) +[2023-10-11 16:48:35,858][85176] Updated weights for policy 0, policy_version 44422 (0.0007) +[2023-10-11 16:48:36,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 91619328. Throughput: 0: 1663.1, 1: 1691.5. Samples: 22914408. Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) +[2023-10-11 16:48:36,064][84230] Avg episode reward: [(0, '7.570'), (1, '29.560')] +[2023-10-11 16:48:36,161][85175] Updated weights for policy 1, policy_version 45060 (0.0008) +[2023-10-11 16:48:36,234][85176] Updated weights for policy 0, policy_version 44432 (0.0009) +[2023-10-11 16:48:36,524][85175] Updated weights for policy 1, policy_version 45070 (0.0008) +[2023-10-11 16:48:36,597][85176] Updated weights for policy 0, policy_version 44442 (0.0007) +[2023-10-11 16:48:36,895][85175] Updated weights for policy 1, policy_version 45080 (0.0010) +[2023-10-11 16:48:40,711][85176] Updated weights for policy 0, policy_version 44452 (0.0007) +[2023-10-11 16:48:41,014][85175] Updated weights for policy 1, policy_version 45090 (0.0011) +[2023-10-11 16:48:41,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 91684864. Throughput: 0: 1656.8, 1: 1689.8. Samples: 22935002. 
Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) +[2023-10-11 16:48:41,064][84230] Avg episode reward: [(0, '7.300'), (1, '32.850')] +[2023-10-11 16:48:41,079][85176] Updated weights for policy 0, policy_version 44462 (0.0008) +[2023-10-11 16:48:41,385][85175] Updated weights for policy 1, policy_version 45100 (0.0007) +[2023-10-11 16:48:41,454][85176] Updated weights for policy 0, policy_version 44472 (0.0007) +[2023-10-11 16:48:41,751][85175] Updated weights for policy 1, policy_version 45110 (0.0008) +[2023-10-11 16:48:42,118][85175] Updated weights for policy 1, policy_version 45120 (0.0009) +[2023-10-11 16:48:45,553][85176] Updated weights for policy 0, policy_version 44482 (0.0008) +[2023-10-11 16:48:45,926][85176] Updated weights for policy 0, policy_version 44492 (0.0007) +[2023-10-11 16:48:46,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 91750400. Throughput: 0: 1655.3, 1: 1689.6. Samples: 22955684. Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) +[2023-10-11 16:48:46,063][84230] Avg episode reward: [(0, '7.750'), (1, '27.190')] +[2023-10-11 16:48:46,100][85175] Updated weights for policy 1, policy_version 45130 (0.0008) +[2023-10-11 16:48:46,304][85176] Updated weights for policy 0, policy_version 44502 (0.0008) +[2023-10-11 16:48:46,457][85175] Updated weights for policy 1, policy_version 45140 (0.0008) +[2023-10-11 16:48:46,674][85176] Updated weights for policy 0, policy_version 44512 (0.0007) +[2023-10-11 16:48:46,833][85175] Updated weights for policy 1, policy_version 45150 (0.0009) +[2023-10-11 16:48:50,791][85175] Updated weights for policy 1, policy_version 45160 (0.0008) +[2023-10-11 16:48:50,918][85176] Updated weights for policy 0, policy_version 44522 (0.0008) +[2023-10-11 16:48:51,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 91815936. Throughput: 0: 1652.4, 1: 1694.7. Samples: 22964824. 
Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) +[2023-10-11 16:48:51,064][84230] Avg episode reward: [(0, '8.020'), (1, '32.080')] +[2023-10-11 16:48:51,150][85175] Updated weights for policy 1, policy_version 45170 (0.0008) +[2023-10-11 16:48:51,290][85176] Updated weights for policy 0, policy_version 44532 (0.0008) +[2023-10-11 16:48:51,514][85175] Updated weights for policy 1, policy_version 45180 (0.0009) +[2023-10-11 16:48:51,663][85176] Updated weights for policy 0, policy_version 44542 (0.0009) +[2023-10-11 16:48:55,590][85175] Updated weights for policy 1, policy_version 45190 (0.0007) +[2023-10-11 16:48:55,934][85176] Updated weights for policy 0, policy_version 44552 (0.0010) +[2023-10-11 16:48:55,962][85175] Updated weights for policy 1, policy_version 45200 (0.0008) +[2023-10-11 16:48:56,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 91881472. Throughput: 0: 1654.1, 1: 1688.3. Samples: 22985284. Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) +[2023-10-11 16:48:56,064][84230] Avg episode reward: [(0, '7.720'), (1, '27.440')] +[2023-10-11 16:48:56,313][85176] Updated weights for policy 0, policy_version 44562 (0.0007) +[2023-10-11 16:48:56,330][85175] Updated weights for policy 1, policy_version 45210 (0.0008) +[2023-10-11 16:48:56,693][85176] Updated weights for policy 0, policy_version 44572 (0.0008) +[2023-10-11 16:49:00,204][85175] Updated weights for policy 1, policy_version 45220 (0.0008) +[2023-10-11 16:49:00,575][85175] Updated weights for policy 1, policy_version 45230 (0.0007) +[2023-10-11 16:49:00,795][85176] Updated weights for policy 0, policy_version 44582 (0.0007) +[2023-10-11 16:49:00,950][85175] Updated weights for policy 1, policy_version 45240 (0.0008) +[2023-10-11 16:49:01,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 91947008. Throughput: 0: 1649.9, 1: 1686.0. Samples: 23005580. 
Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) +[2023-10-11 16:49:01,064][84230] Avg episode reward: [(0, '7.660'), (1, '27.710')] +[2023-10-11 16:49:01,174][85176] Updated weights for policy 0, policy_version 44592 (0.0008) +[2023-10-11 16:49:01,548][85176] Updated weights for policy 0, policy_version 44602 (0.0008) +[2023-10-11 16:49:05,065][85175] Updated weights for policy 1, policy_version 45250 (0.0007) +[2023-10-11 16:49:05,427][85175] Updated weights for policy 1, policy_version 45260 (0.0007) +[2023-10-11 16:49:05,769][85176] Updated weights for policy 0, policy_version 44612 (0.0007) +[2023-10-11 16:49:05,785][85175] Updated weights for policy 1, policy_version 45270 (0.0010) +[2023-10-11 16:49:06,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 92012544. Throughput: 0: 1652.4, 1: 1699.4. Samples: 23015378. Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) +[2023-10-11 16:49:06,063][84230] Avg episode reward: [(0, '7.550'), (1, '26.010')] +[2023-10-11 16:49:06,137][85176] Updated weights for policy 0, policy_version 44622 (0.0009) +[2023-10-11 16:49:06,151][85175] Updated weights for policy 1, policy_version 45280 (0.0010) +[2023-10-11 16:49:06,511][85176] Updated weights for policy 0, policy_version 44632 (0.0010) +[2023-10-11 16:49:10,420][85175] Updated weights for policy 1, policy_version 45290 (0.0011) +[2023-10-11 16:49:10,522][85176] Updated weights for policy 0, policy_version 44642 (0.0011) +[2023-10-11 16:49:10,796][85175] Updated weights for policy 1, policy_version 45300 (0.0008) +[2023-10-11 16:49:10,893][85176] Updated weights for policy 0, policy_version 44652 (0.0008) +[2023-10-11 16:49:11,062][84230] Fps is (10 sec: 13107.8, 60 sec: 13107.2, 300 sec: 13218.3). Total num frames: 92078080. Throughput: 0: 1655.1, 1: 1698.2. Samples: 23035882. 
Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) +[2023-10-11 16:49:11,063][84230] Avg episode reward: [(0, '7.450'), (1, '29.750')] +[2023-10-11 16:49:11,166][85175] Updated weights for policy 1, policy_version 45310 (0.0009) +[2023-10-11 16:49:11,267][85176] Updated weights for policy 0, policy_version 44662 (0.0007) +[2023-10-11 16:49:11,644][85176] Updated weights for policy 0, policy_version 44672 (0.0007) +[2023-10-11 16:49:15,116][85175] Updated weights for policy 1, policy_version 45320 (0.0008) +[2023-10-11 16:49:15,486][85175] Updated weights for policy 1, policy_version 45330 (0.0007) +[2023-10-11 16:49:15,780][85176] Updated weights for policy 0, policy_version 44682 (0.0008) +[2023-10-11 16:49:15,860][85175] Updated weights for policy 1, policy_version 45340 (0.0007) +[2023-10-11 16:49:16,063][84230] Fps is (10 sec: 16383.5, 60 sec: 13653.3, 300 sec: 13329.3). Total num frames: 92176384. Throughput: 0: 1652.3, 1: 1677.4. Samples: 23055400. Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) +[2023-10-11 16:49:16,064][84230] Avg episode reward: [(0, '7.670'), (1, '32.740')] +[2023-10-11 16:49:16,158][85176] Updated weights for policy 0, policy_version 44692 (0.0009) +[2023-10-11 16:49:16,518][85176] Updated weights for policy 0, policy_version 44702 (0.0008) +[2023-10-11 16:49:19,874][85175] Updated weights for policy 1, policy_version 45350 (0.0007) +[2023-10-11 16:49:20,233][85175] Updated weights for policy 1, policy_version 45360 (0.0007) +[2023-10-11 16:49:20,601][85175] Updated weights for policy 1, policy_version 45370 (0.0008) +[2023-10-11 16:49:20,850][85176] Updated weights for policy 0, policy_version 44712 (0.0009) +[2023-10-11 16:49:21,062][84230] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 92241920. Throughput: 0: 1657.4, 1: 1699.1. Samples: 23065450. 
Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) +[2023-10-11 16:49:21,063][84230] Avg episode reward: [(0, '8.160'), (1, '25.940')] +[2023-10-11 16:49:21,221][85176] Updated weights for policy 0, policy_version 44722 (0.0011) +[2023-10-11 16:49:21,608][85176] Updated weights for policy 0, policy_version 44732 (0.0009) +[2023-10-11 16:49:24,573][85175] Updated weights for policy 1, policy_version 45380 (0.0009) +[2023-10-11 16:49:24,944][85175] Updated weights for policy 1, policy_version 45390 (0.0008) +[2023-10-11 16:49:25,319][85175] Updated weights for policy 1, policy_version 45400 (0.0008) +[2023-10-11 16:49:25,671][85176] Updated weights for policy 0, policy_version 44742 (0.0008) +[2023-10-11 16:49:26,030][85176] Updated weights for policy 0, policy_version 44752 (0.0009) +[2023-10-11 16:49:26,063][84230] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 92307456. Throughput: 0: 1652.8, 1: 1699.6. Samples: 23085864. Policy #0 lag: (min: 28.0, avg: 36.7, max: 60.0) +[2023-10-11 16:49:26,063][84230] Avg episode reward: [(0, '8.000'), (1, '28.060')] +[2023-10-11 16:49:26,406][85176] Updated weights for policy 0, policy_version 44762 (0.0007) +[2023-10-11 16:49:29,266][85175] Updated weights for policy 1, policy_version 45410 (0.0009) +[2023-10-11 16:49:29,630][85175] Updated weights for policy 1, policy_version 45420 (0.0007) +[2023-10-11 16:49:29,990][85175] Updated weights for policy 1, policy_version 45430 (0.0007) +[2023-10-11 16:49:30,354][85175] Updated weights for policy 1, policy_version 45440 (0.0008) +[2023-10-11 16:49:30,356][85176] Updated weights for policy 0, policy_version 44772 (0.0008) +[2023-10-11 16:49:30,734][85176] Updated weights for policy 0, policy_version 44782 (0.0009) +[2023-10-11 16:49:31,063][84230] Fps is (10 sec: 13106.7, 60 sec: 13653.3, 300 sec: 13329.3). Total num frames: 92372992. Throughput: 0: 1650.5, 1: 1674.5. Samples: 23105310. 
Policy #0 lag: (min: 28.0, avg: 36.7, max: 60.0) +[2023-10-11 16:49:31,064][84230] Avg episode reward: [(0, '7.280'), (1, '25.610')] +[2023-10-11 16:49:31,110][85176] Updated weights for policy 0, policy_version 44792 (0.0010) +[2023-10-11 16:49:34,347][85175] Updated weights for policy 1, policy_version 45450 (0.0010) +[2023-10-11 16:49:34,715][85175] Updated weights for policy 1, policy_version 45460 (0.0009) +[2023-10-11 16:49:35,079][85175] Updated weights for policy 1, policy_version 45470 (0.0010) +[2023-10-11 16:49:35,174][85176] Updated weights for policy 0, policy_version 44802 (0.0009) +[2023-10-11 16:49:35,548][85176] Updated weights for policy 0, policy_version 44812 (0.0009) +[2023-10-11 16:49:35,926][85176] Updated weights for policy 0, policy_version 44822 (0.0009) +[2023-10-11 16:49:36,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 92438528. Throughput: 0: 1659.0, 1: 1701.7. Samples: 23116054. Policy #0 lag: (min: 28.0, avg: 36.7, max: 60.0) +[2023-10-11 16:49:36,063][84230] Avg episode reward: [(0, '6.990'), (1, '30.020')] +[2023-10-11 16:49:36,291][85176] Updated weights for policy 0, policy_version 44832 (0.0010) +[2023-10-11 16:49:39,153][85175] Updated weights for policy 1, policy_version 45480 (0.0008) +[2023-10-11 16:49:39,521][85175] Updated weights for policy 1, policy_version 45490 (0.0008) +[2023-10-11 16:49:39,889][85175] Updated weights for policy 1, policy_version 45500 (0.0009) +[2023-10-11 16:49:40,422][85176] Updated weights for policy 0, policy_version 44842 (0.0008) +[2023-10-11 16:49:40,799][85176] Updated weights for policy 0, policy_version 44852 (0.0008) +[2023-10-11 16:49:41,063][84230] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 92504064. Throughput: 0: 1662.4, 1: 1688.2. Samples: 23136062. 
Policy #0 lag: (min: 28.0, avg: 36.7, max: 60.0) +[2023-10-11 16:49:41,063][84230] Avg episode reward: [(0, '7.570'), (1, '27.070')] +[2023-10-11 16:49:41,178][85176] Updated weights for policy 0, policy_version 44862 (0.0008) +[2023-10-11 16:49:43,940][85175] Updated weights for policy 1, policy_version 45510 (0.0008) +[2023-10-11 16:49:44,296][85175] Updated weights for policy 1, policy_version 45520 (0.0010) +[2023-10-11 16:49:44,664][85175] Updated weights for policy 1, policy_version 45530 (0.0010) +[2023-10-11 16:49:45,332][85176] Updated weights for policy 0, policy_version 44872 (0.0009) +[2023-10-11 16:49:45,709][85176] Updated weights for policy 0, policy_version 44882 (0.0008) +[2023-10-11 16:49:46,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 92569600. Throughput: 0: 1655.3, 1: 1680.5. Samples: 23155686. Policy #0 lag: (min: 28.0, avg: 36.7, max: 60.0) +[2023-10-11 16:49:46,063][84230] Avg episode reward: [(0, '8.000'), (1, '28.660')] +[2023-10-11 16:49:46,081][85176] Updated weights for policy 0, policy_version 44892 (0.0007) +[2023-10-11 16:49:48,727][85175] Updated weights for policy 1, policy_version 45540 (0.0009) +[2023-10-11 16:49:49,086][85175] Updated weights for policy 1, policy_version 45550 (0.0007) +[2023-10-11 16:49:49,459][85175] Updated weights for policy 1, policy_version 45560 (0.0009) +[2023-10-11 16:49:50,187][85176] Updated weights for policy 0, policy_version 44902 (0.0008) +[2023-10-11 16:49:50,561][85176] Updated weights for policy 0, policy_version 44912 (0.0008) +[2023-10-11 16:49:50,936][85176] Updated weights for policy 0, policy_version 44922 (0.0009) +[2023-10-11 16:49:51,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 92635136. Throughput: 0: 1662.5, 1: 1694.8. Samples: 23166458. 
Policy #0 lag: (min: 28.0, avg: 36.7, max: 60.0) +[2023-10-11 16:49:51,063][84230] Avg episode reward: [(0, '7.970'), (1, '27.980')] +[2023-10-11 16:49:53,301][85175] Updated weights for policy 1, policy_version 45570 (0.0010) +[2023-10-11 16:49:53,662][85175] Updated weights for policy 1, policy_version 45580 (0.0007) +[2023-10-11 16:49:54,033][85175] Updated weights for policy 1, policy_version 45590 (0.0010) +[2023-10-11 16:49:54,399][85175] Updated weights for policy 1, policy_version 45600 (0.0009) +[2023-10-11 16:49:54,962][85176] Updated weights for policy 0, policy_version 44932 (0.0007) +[2023-10-11 16:49:55,345][85176] Updated weights for policy 0, policy_version 44942 (0.0010) +[2023-10-11 16:49:55,722][85176] Updated weights for policy 0, policy_version 44952 (0.0010) +[2023-10-11 16:49:56,063][84230] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 92733440. Throughput: 0: 1663.0, 1: 1674.8. Samples: 23186082. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:49:56,063][84230] Avg episode reward: [(0, '7.540'), (1, '27.270')] +[2023-10-11 16:49:58,671][85175] Updated weights for policy 1, policy_version 45610 (0.0007) +[2023-10-11 16:49:59,038][85175] Updated weights for policy 1, policy_version 45620 (0.0009) +[2023-10-11 16:49:59,398][85175] Updated weights for policy 1, policy_version 45630 (0.0009) +[2023-10-11 16:49:59,647][85176] Updated weights for policy 0, policy_version 44962 (0.0008) +[2023-10-11 16:50:00,025][85176] Updated weights for policy 0, policy_version 44972 (0.0009) +[2023-10-11 16:50:00,382][85176] Updated weights for policy 0, policy_version 44982 (0.0011) +[2023-10-11 16:50:00,750][85176] Updated weights for policy 0, policy_version 44992 (0.0010) +[2023-10-11 16:50:01,063][84230] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 92798976. Throughput: 0: 1647.6, 1: 1692.8. Samples: 23205718. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:50:01,063][84230] Avg episode reward: [(0, '7.000'), (1, '30.660')] +[2023-10-11 16:50:03,471][85175] Updated weights for policy 1, policy_version 45640 (0.0010) +[2023-10-11 16:50:03,844][85175] Updated weights for policy 1, policy_version 45650 (0.0009) +[2023-10-11 16:50:04,219][85175] Updated weights for policy 1, policy_version 45660 (0.0008) +[2023-10-11 16:50:04,818][85176] Updated weights for policy 0, policy_version 45002 (0.0008) +[2023-10-11 16:50:05,188][85176] Updated weights for policy 0, policy_version 45012 (0.0009) +[2023-10-11 16:50:05,560][85176] Updated weights for policy 0, policy_version 45022 (0.0007) +[2023-10-11 16:50:06,063][84230] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 92864512. Throughput: 0: 1667.4, 1: 1693.9. Samples: 23216708. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:50:06,064][84230] Avg episode reward: [(0, '7.280'), (1, '31.170')] +[2023-10-11 16:50:08,171][85175] Updated weights for policy 1, policy_version 45670 (0.0009) +[2023-10-11 16:50:08,537][85175] Updated weights for policy 1, policy_version 45680 (0.0011) +[2023-10-11 16:50:08,912][85175] Updated weights for policy 1, policy_version 45690 (0.0011) +[2023-10-11 16:50:09,577][85176] Updated weights for policy 0, policy_version 45032 (0.0008) +[2023-10-11 16:50:09,956][85176] Updated weights for policy 0, policy_version 45042 (0.0007) +[2023-10-11 16:50:10,332][85176] Updated weights for policy 0, policy_version 45052 (0.0007) +[2023-10-11 16:50:11,063][84230] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 92930048. Throughput: 0: 1669.1, 1: 1672.2. Samples: 23236220. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:50:11,064][84230] Avg episode reward: [(0, '8.470'), (1, '29.750')] +[2023-10-11 16:50:12,814][85175] Updated weights for policy 1, policy_version 45700 (0.0009) +[2023-10-11 16:50:13,173][85175] Updated weights for policy 1, policy_version 45710 (0.0007) +[2023-10-11 16:50:13,544][85175] Updated weights for policy 1, policy_version 45720 (0.0008) +[2023-10-11 16:50:14,497][85176] Updated weights for policy 0, policy_version 45062 (0.0008) +[2023-10-11 16:50:14,863][85176] Updated weights for policy 0, policy_version 45072 (0.0007) +[2023-10-11 16:50:15,246][85176] Updated weights for policy 0, policy_version 45082 (0.0010) +[2023-10-11 16:50:16,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 92995584. Throughput: 0: 1654.3, 1: 1703.7. Samples: 23256422. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:50:16,064][84230] Avg episode reward: [(0, '7.890'), (1, '32.360')] +[2023-10-11 16:50:16,077][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000045728_46825472.pth... +[2023-10-11 16:50:16,077][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000045088_46170112.pth... 
+[2023-10-11 16:50:16,112][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000044160_45219840.pth +[2023-10-11 16:50:16,113][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000043520_44564480.pth +[2023-10-11 16:50:17,518][85175] Updated weights for policy 1, policy_version 45730 (0.0007) +[2023-10-11 16:50:17,888][85175] Updated weights for policy 1, policy_version 45740 (0.0008) +[2023-10-11 16:50:18,261][85175] Updated weights for policy 1, policy_version 45750 (0.0007) +[2023-10-11 16:50:18,625][85175] Updated weights for policy 1, policy_version 45760 (0.0009) +[2023-10-11 16:50:19,219][85176] Updated weights for policy 0, policy_version 45092 (0.0010) +[2023-10-11 16:50:19,588][85176] Updated weights for policy 0, policy_version 45102 (0.0008) +[2023-10-11 16:50:19,972][85176] Updated weights for policy 0, policy_version 45112 (0.0009) +[2023-10-11 16:50:21,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 93061120. Throughput: 0: 1673.5, 1: 1677.6. Samples: 23266854. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:50:21,064][84230] Avg episode reward: [(0, '7.350'), (1, '29.290')] +[2023-10-11 16:50:22,695][85175] Updated weights for policy 1, policy_version 45770 (0.0008) +[2023-10-11 16:50:23,068][85175] Updated weights for policy 1, policy_version 45780 (0.0008) +[2023-10-11 16:50:23,443][85175] Updated weights for policy 1, policy_version 45790 (0.0008) +[2023-10-11 16:50:24,013][85176] Updated weights for policy 0, policy_version 45122 (0.0007) +[2023-10-11 16:50:24,386][85176] Updated weights for policy 0, policy_version 45132 (0.0007) +[2023-10-11 16:50:24,754][85176] Updated weights for policy 0, policy_version 45142 (0.0007) +[2023-10-11 16:50:25,128][85176] Updated weights for policy 0, policy_version 45152 (0.0007) +[2023-10-11 16:50:26,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 93126656. 
Throughput: 0: 1660.1, 1: 1691.2. Samples: 23286872. Policy #0 lag: (min: 31.0, avg: 35.4, max: 63.0) +[2023-10-11 16:50:26,063][84230] Avg episode reward: [(0, '7.310'), (1, '28.440')] +[2023-10-11 16:50:27,387][85175] Updated weights for policy 1, policy_version 45800 (0.0009) +[2023-10-11 16:50:27,751][85175] Updated weights for policy 1, policy_version 45810 (0.0010) +[2023-10-11 16:50:28,117][85175] Updated weights for policy 1, policy_version 45820 (0.0007) +[2023-10-11 16:50:29,140][85176] Updated weights for policy 0, policy_version 45162 (0.0008) +[2023-10-11 16:50:29,512][85176] Updated weights for policy 0, policy_version 45172 (0.0009) +[2023-10-11 16:50:29,875][85176] Updated weights for policy 0, policy_version 45182 (0.0011) +[2023-10-11 16:50:31,063][84230] Fps is (10 sec: 13106.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 93192192. Throughput: 0: 1659.0, 1: 1707.0. Samples: 23307162. Policy #0 lag: (min: 31.0, avg: 35.4, max: 63.0) +[2023-10-11 16:50:31,064][84230] Avg episode reward: [(0, '7.940'), (1, '26.730')] +[2023-10-11 16:50:32,257][85175] Updated weights for policy 1, policy_version 45830 (0.0009) +[2023-10-11 16:50:32,621][85175] Updated weights for policy 1, policy_version 45840 (0.0011) +[2023-10-11 16:50:33,000][85175] Updated weights for policy 1, policy_version 45850 (0.0011) +[2023-10-11 16:50:33,921][85176] Updated weights for policy 0, policy_version 45192 (0.0010) +[2023-10-11 16:50:34,290][85176] Updated weights for policy 0, policy_version 45202 (0.0009) +[2023-10-11 16:50:34,662][85176] Updated weights for policy 0, policy_version 45212 (0.0010) +[2023-10-11 16:50:36,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 93257728. Throughput: 0: 1680.7, 1: 1677.8. Samples: 23317590. 
Policy #0 lag: (min: 31.0, avg: 35.4, max: 63.0) +[2023-10-11 16:50:36,063][84230] Avg episode reward: [(0, '8.010'), (1, '28.290')] +[2023-10-11 16:50:37,100][85175] Updated weights for policy 1, policy_version 45860 (0.0009) +[2023-10-11 16:50:37,468][85175] Updated weights for policy 1, policy_version 45870 (0.0008) +[2023-10-11 16:50:37,839][85175] Updated weights for policy 1, policy_version 45880 (0.0007) +[2023-10-11 16:50:38,991][85176] Updated weights for policy 0, policy_version 45222 (0.0009) +[2023-10-11 16:50:39,356][85176] Updated weights for policy 0, policy_version 45232 (0.0010) +[2023-10-11 16:50:39,724][85176] Updated weights for policy 0, policy_version 45242 (0.0009) +[2023-10-11 16:50:41,063][84230] Fps is (10 sec: 13107.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 93323264. Throughput: 0: 1659.9, 1: 1701.1. Samples: 23337326. Policy #0 lag: (min: 31.0, avg: 35.4, max: 63.0) +[2023-10-11 16:50:41,064][84230] Avg episode reward: [(0, '7.740'), (1, '26.800')] +[2023-10-11 16:50:41,778][85175] Updated weights for policy 1, policy_version 45890 (0.0007) +[2023-10-11 16:50:42,157][85175] Updated weights for policy 1, policy_version 45900 (0.0008) +[2023-10-11 16:50:42,516][85175] Updated weights for policy 1, policy_version 45910 (0.0007) +[2023-10-11 16:50:42,878][85175] Updated weights for policy 1, policy_version 45920 (0.0008) +[2023-10-11 16:50:43,742][85176] Updated weights for policy 0, policy_version 45252 (0.0009) +[2023-10-11 16:50:44,119][85176] Updated weights for policy 0, policy_version 45262 (0.0008) +[2023-10-11 16:50:44,496][85176] Updated weights for policy 0, policy_version 45272 (0.0007) +[2023-10-11 16:50:46,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 93388800. Throughput: 0: 1674.8, 1: 1710.5. Samples: 23358058. 
Policy #0 lag: (min: 31.0, avg: 35.4, max: 63.0) +[2023-10-11 16:50:46,064][84230] Avg episode reward: [(0, '7.450'), (1, '28.600')] +[2023-10-11 16:50:46,986][85175] Updated weights for policy 1, policy_version 45930 (0.0008) +[2023-10-11 16:50:47,356][85175] Updated weights for policy 1, policy_version 45940 (0.0008) +[2023-10-11 16:50:47,730][85175] Updated weights for policy 1, policy_version 45950 (0.0009) +[2023-10-11 16:50:48,721][85176] Updated weights for policy 0, policy_version 45282 (0.0008) +[2023-10-11 16:50:49,097][85176] Updated weights for policy 0, policy_version 45292 (0.0007) +[2023-10-11 16:50:49,474][85176] Updated weights for policy 0, policy_version 45302 (0.0009) +[2023-10-11 16:50:49,835][85176] Updated weights for policy 0, policy_version 45312 (0.0009) +[2023-10-11 16:50:51,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 93454336. Throughput: 0: 1680.5, 1: 1689.0. Samples: 23368334. Policy #0 lag: (min: 31.0, avg: 35.4, max: 63.0) +[2023-10-11 16:50:51,063][84230] Avg episode reward: [(0, '7.280'), (1, '28.710')] +[2023-10-11 16:50:51,565][85175] Updated weights for policy 1, policy_version 45960 (0.0008) +[2023-10-11 16:50:51,927][85175] Updated weights for policy 1, policy_version 45970 (0.0008) +[2023-10-11 16:50:52,296][85175] Updated weights for policy 1, policy_version 45980 (0.0008) +[2023-10-11 16:50:54,043][85176] Updated weights for policy 0, policy_version 45322 (0.0012) +[2023-10-11 16:50:54,414][85176] Updated weights for policy 0, policy_version 45332 (0.0010) +[2023-10-11 16:50:54,794][85176] Updated weights for policy 0, policy_version 45342 (0.0008) +[2023-10-11 16:50:56,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 93519872. Throughput: 0: 1665.6, 1: 1715.9. Samples: 23388384. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:50:56,063][84230] Avg episode reward: [(0, '7.590'), (1, '28.560')] +[2023-10-11 16:50:56,124][85175] Updated weights for policy 1, policy_version 45990 (0.0007) +[2023-10-11 16:50:56,481][85175] Updated weights for policy 1, policy_version 46000 (0.0009) +[2023-10-11 16:50:56,851][85175] Updated weights for policy 1, policy_version 46010 (0.0010) +[2023-10-11 16:50:58,854][85176] Updated weights for policy 0, policy_version 45352 (0.0009) +[2023-10-11 16:50:59,228][85176] Updated weights for policy 0, policy_version 45362 (0.0007) +[2023-10-11 16:50:59,600][85176] Updated weights for policy 0, policy_version 45372 (0.0009) +[2023-10-11 16:51:00,961][85175] Updated weights for policy 1, policy_version 46020 (0.0007) +[2023-10-11 16:51:01,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 93585408. Throughput: 0: 1674.8, 1: 1711.4. Samples: 23408800. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:51:01,063][84230] Avg episode reward: [(0, '7.600'), (1, '26.540')] +[2023-10-11 16:51:01,325][85175] Updated weights for policy 1, policy_version 46030 (0.0007) +[2023-10-11 16:51:01,699][85175] Updated weights for policy 1, policy_version 46040 (0.0007) +[2023-10-11 16:51:03,739][85176] Updated weights for policy 0, policy_version 45382 (0.0008) +[2023-10-11 16:51:04,105][85176] Updated weights for policy 0, policy_version 45392 (0.0008) +[2023-10-11 16:51:04,480][85176] Updated weights for policy 0, policy_version 45402 (0.0009) +[2023-10-11 16:51:05,733][85175] Updated weights for policy 1, policy_version 46050 (0.0007) +[2023-10-11 16:51:06,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 93650944. Throughput: 0: 1673.1, 1: 1706.1. Samples: 23418918. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:51:06,063][84230] Avg episode reward: [(0, '8.010'), (1, '29.450')] +[2023-10-11 16:51:06,112][85175] Updated weights for policy 1, policy_version 46060 (0.0007) +[2023-10-11 16:51:06,472][85175] Updated weights for policy 1, policy_version 46070 (0.0009) +[2023-10-11 16:51:06,840][85175] Updated weights for policy 1, policy_version 46080 (0.0009) +[2023-10-11 16:51:08,520][85176] Updated weights for policy 0, policy_version 45412 (0.0011) +[2023-10-11 16:51:08,895][85176] Updated weights for policy 0, policy_version 45422 (0.0010) +[2023-10-11 16:51:09,270][85176] Updated weights for policy 0, policy_version 45432 (0.0010) +[2023-10-11 16:51:10,962][85175] Updated weights for policy 1, policy_version 46090 (0.0008) +[2023-10-11 16:51:11,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 93716480. Throughput: 0: 1659.8, 1: 1709.0. Samples: 23438470. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:51:11,064][84230] Avg episode reward: [(0, '7.820'), (1, '32.450')] +[2023-10-11 16:51:11,323][85175] Updated weights for policy 1, policy_version 46100 (0.0009) +[2023-10-11 16:51:11,696][85175] Updated weights for policy 1, policy_version 46110 (0.0008) +[2023-10-11 16:51:13,424][85176] Updated weights for policy 0, policy_version 45442 (0.0007) +[2023-10-11 16:51:13,787][85176] Updated weights for policy 0, policy_version 45452 (0.0009) +[2023-10-11 16:51:14,154][85176] Updated weights for policy 0, policy_version 45462 (0.0010) +[2023-10-11 16:51:14,527][85176] Updated weights for policy 0, policy_version 45472 (0.0008) +[2023-10-11 16:51:15,683][85175] Updated weights for policy 1, policy_version 46120 (0.0008) +[2023-10-11 16:51:16,044][85175] Updated weights for policy 1, policy_version 46130 (0.0008) +[2023-10-11 16:51:16,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 93782016. 
Throughput: 0: 1664.0, 1: 1705.4. Samples: 23458784. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:51:16,064][84230] Avg episode reward: [(0, '7.930'), (1, '34.070')] +[2023-10-11 16:51:16,416][85175] Updated weights for policy 1, policy_version 46140 (0.0008) +[2023-10-11 16:51:18,708][85176] Updated weights for policy 0, policy_version 45482 (0.0008) +[2023-10-11 16:51:19,086][85176] Updated weights for policy 0, policy_version 45492 (0.0008) +[2023-10-11 16:51:19,452][85176] Updated weights for policy 0, policy_version 45502 (0.0009) +[2023-10-11 16:51:20,382][85175] Updated weights for policy 1, policy_version 46150 (0.0009) +[2023-10-11 16:51:20,756][85175] Updated weights for policy 1, policy_version 46160 (0.0008) +[2023-10-11 16:51:21,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 93847552. Throughput: 0: 1653.9, 1: 1710.3. Samples: 23468978. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:51:21,063][84230] Avg episode reward: [(0, '7.600'), (1, '32.580')] +[2023-10-11 16:51:21,122][85175] Updated weights for policy 1, policy_version 46170 (0.0009) +[2023-10-11 16:51:23,483][85176] Updated weights for policy 0, policy_version 45512 (0.0010) +[2023-10-11 16:51:23,849][85176] Updated weights for policy 0, policy_version 45522 (0.0010) +[2023-10-11 16:51:24,222][85176] Updated weights for policy 0, policy_version 45532 (0.0008) +[2023-10-11 16:51:25,171][85175] Updated weights for policy 1, policy_version 46180 (0.0008) +[2023-10-11 16:51:25,540][85175] Updated weights for policy 1, policy_version 46190 (0.0007) +[2023-10-11 16:51:25,898][85175] Updated weights for policy 1, policy_version 46200 (0.0009) +[2023-10-11 16:51:26,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 93913088. Throughput: 0: 1654.8, 1: 1713.3. Samples: 23488888. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:51:26,063][84230] Avg episode reward: [(0, '7.270'), (1, '29.530')] +[2023-10-11 16:51:28,090][85176] Updated weights for policy 0, policy_version 45542 (0.0007) +[2023-10-11 16:51:28,478][85176] Updated weights for policy 0, policy_version 45552 (0.0008) +[2023-10-11 16:51:28,847][85176] Updated weights for policy 0, policy_version 45562 (0.0007) +[2023-10-11 16:51:29,896][85175] Updated weights for policy 1, policy_version 46210 (0.0009) +[2023-10-11 16:51:30,268][85175] Updated weights for policy 1, policy_version 46220 (0.0009) +[2023-10-11 16:51:30,629][85175] Updated weights for policy 1, policy_version 46230 (0.0008) +[2023-10-11 16:51:30,994][85175] Updated weights for policy 1, policy_version 46240 (0.0007) +[2023-10-11 16:51:31,063][84230] Fps is (10 sec: 16383.7, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 94011392. Throughput: 0: 1664.2, 1: 1692.7. Samples: 23509118. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:51:31,064][84230] Avg episode reward: [(0, '7.590'), (1, '33.410')] +[2023-10-11 16:51:32,955][85176] Updated weights for policy 0, policy_version 45572 (0.0010) +[2023-10-11 16:51:33,332][85176] Updated weights for policy 0, policy_version 45582 (0.0007) +[2023-10-11 16:51:33,697][85176] Updated weights for policy 0, policy_version 45592 (0.0007) +[2023-10-11 16:51:35,093][85175] Updated weights for policy 1, policy_version 46250 (0.0009) +[2023-10-11 16:51:35,467][85175] Updated weights for policy 1, policy_version 46260 (0.0007) +[2023-10-11 16:51:35,836][85175] Updated weights for policy 1, policy_version 46270 (0.0007) +[2023-10-11 16:51:36,063][84230] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 94076928. Throughput: 0: 1650.7, 1: 1714.3. Samples: 23519760. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:51:36,064][84230] Avg episode reward: [(0, '7.860'), (1, '31.380')] +[2023-10-11 16:51:37,838][85176] Updated weights for policy 0, policy_version 45602 (0.0010) +[2023-10-11 16:51:38,201][85176] Updated weights for policy 0, policy_version 45612 (0.0007) +[2023-10-11 16:51:38,575][85176] Updated weights for policy 0, policy_version 45622 (0.0008) +[2023-10-11 16:51:38,950][85176] Updated weights for policy 0, policy_version 45632 (0.0008) +[2023-10-11 16:51:39,891][85175] Updated weights for policy 1, policy_version 46280 (0.0007) +[2023-10-11 16:51:40,254][85175] Updated weights for policy 1, policy_version 46290 (0.0008) +[2023-10-11 16:51:40,627][85175] Updated weights for policy 1, policy_version 46300 (0.0007) +[2023-10-11 16:51:41,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 94142464. Throughput: 0: 1659.6, 1: 1703.6. Samples: 23539730. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:51:41,063][84230] Avg episode reward: [(0, '8.160'), (1, '33.470')] +[2023-10-11 16:51:43,255][85176] Updated weights for policy 0, policy_version 45642 (0.0007) +[2023-10-11 16:51:43,638][85176] Updated weights for policy 0, policy_version 45652 (0.0007) +[2023-10-11 16:51:44,016][85176] Updated weights for policy 0, policy_version 45662 (0.0008) +[2023-10-11 16:51:44,673][85175] Updated weights for policy 1, policy_version 46310 (0.0008) +[2023-10-11 16:51:45,036][85175] Updated weights for policy 1, policy_version 46320 (0.0008) +[2023-10-11 16:51:45,397][85175] Updated weights for policy 1, policy_version 46330 (0.0009) +[2023-10-11 16:51:46,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 94208000. Throughput: 0: 1665.0, 1: 1675.4. Samples: 23559118. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:51:46,063][84230] Avg episode reward: [(0, '8.350'), (1, '29.980')] +[2023-10-11 16:51:48,240][85176] Updated weights for policy 0, policy_version 45672 (0.0010) +[2023-10-11 16:51:48,618][85176] Updated weights for policy 0, policy_version 45682 (0.0009) +[2023-10-11 16:51:48,989][85176] Updated weights for policy 0, policy_version 45692 (0.0009) +[2023-10-11 16:51:49,372][85175] Updated weights for policy 1, policy_version 46340 (0.0010) +[2023-10-11 16:51:49,737][85175] Updated weights for policy 1, policy_version 46350 (0.0007) +[2023-10-11 16:51:50,098][85175] Updated weights for policy 1, policy_version 46360 (0.0007) +[2023-10-11 16:51:51,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 94273536. Throughput: 0: 1654.5, 1: 1700.7. Samples: 23569902. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:51:51,063][84230] Avg episode reward: [(0, '7.150'), (1, '31.120')] +[2023-10-11 16:51:52,940][85176] Updated weights for policy 0, policy_version 45702 (0.0008) +[2023-10-11 16:51:53,312][85176] Updated weights for policy 0, policy_version 45712 (0.0008) +[2023-10-11 16:51:53,689][85176] Updated weights for policy 0, policy_version 45722 (0.0008) +[2023-10-11 16:51:54,049][85175] Updated weights for policy 1, policy_version 46370 (0.0007) +[2023-10-11 16:51:54,414][85175] Updated weights for policy 1, policy_version 46380 (0.0008) +[2023-10-11 16:51:54,782][85175] Updated weights for policy 1, policy_version 46390 (0.0008) +[2023-10-11 16:51:55,148][85175] Updated weights for policy 1, policy_version 46400 (0.0009) +[2023-10-11 16:51:56,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 94339072. Throughput: 0: 1668.2, 1: 1691.1. Samples: 23589638. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:51:56,063][84230] Avg episode reward: [(0, '7.000'), (1, '30.500')] +[2023-10-11 16:51:57,843][85176] Updated weights for policy 0, policy_version 45732 (0.0008) +[2023-10-11 16:51:58,221][85176] Updated weights for policy 0, policy_version 45742 (0.0009) +[2023-10-11 16:51:58,596][85176] Updated weights for policy 0, policy_version 45752 (0.0009) +[2023-10-11 16:51:59,326][85175] Updated weights for policy 1, policy_version 46410 (0.0008) +[2023-10-11 16:51:59,689][85175] Updated weights for policy 1, policy_version 46420 (0.0007) +[2023-10-11 16:52:00,067][85175] Updated weights for policy 1, policy_version 46430 (0.0007) +[2023-10-11 16:52:01,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 94404608. Throughput: 0: 1678.2, 1: 1673.6. Samples: 23609618. Policy #0 lag: (min: 31.0, avg: 50.4, max: 63.0) +[2023-10-11 16:52:01,063][84230] Avg episode reward: [(0, '7.220'), (1, '32.380')] +[2023-10-11 16:52:02,587][85176] Updated weights for policy 0, policy_version 45762 (0.0008) +[2023-10-11 16:52:02,954][85176] Updated weights for policy 0, policy_version 45772 (0.0007) +[2023-10-11 16:52:03,332][85176] Updated weights for policy 0, policy_version 45782 (0.0007) +[2023-10-11 16:52:03,711][85176] Updated weights for policy 0, policy_version 45792 (0.0010) +[2023-10-11 16:52:04,034][85175] Updated weights for policy 1, policy_version 46440 (0.0009) +[2023-10-11 16:52:04,403][85175] Updated weights for policy 1, policy_version 46450 (0.0011) +[2023-10-11 16:52:04,768][85175] Updated weights for policy 1, policy_version 46460 (0.0007) +[2023-10-11 16:52:06,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 94470144. Throughput: 0: 1657.7, 1: 1702.6. Samples: 23620194. 
Policy #0 lag: (min: 31.0, avg: 50.4, max: 63.0) +[2023-10-11 16:52:06,064][84230] Avg episode reward: [(0, '7.940'), (1, '31.210')] +[2023-10-11 16:52:07,766][85176] Updated weights for policy 0, policy_version 45802 (0.0009) +[2023-10-11 16:52:08,145][85176] Updated weights for policy 0, policy_version 45812 (0.0009) +[2023-10-11 16:52:08,524][85176] Updated weights for policy 0, policy_version 45822 (0.0007) +[2023-10-11 16:52:08,789][85175] Updated weights for policy 1, policy_version 46470 (0.0008) +[2023-10-11 16:52:09,152][85175] Updated weights for policy 1, policy_version 46480 (0.0008) +[2023-10-11 16:52:09,524][85175] Updated weights for policy 1, policy_version 46490 (0.0010) +[2023-10-11 16:52:11,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 94535680. Throughput: 0: 1670.3, 1: 1681.5. Samples: 23639718. Policy #0 lag: (min: 31.0, avg: 50.4, max: 63.0) +[2023-10-11 16:52:11,064][84230] Avg episode reward: [(0, '8.070'), (1, '29.670')] +[2023-10-11 16:52:12,633][85176] Updated weights for policy 0, policy_version 45832 (0.0008) +[2023-10-11 16:52:13,014][85176] Updated weights for policy 0, policy_version 45842 (0.0008) +[2023-10-11 16:52:13,392][85176] Updated weights for policy 0, policy_version 45852 (0.0008) +[2023-10-11 16:52:13,551][85175] Updated weights for policy 1, policy_version 46500 (0.0009) +[2023-10-11 16:52:13,912][85175] Updated weights for policy 1, policy_version 46510 (0.0009) +[2023-10-11 16:52:14,280][85175] Updated weights for policy 1, policy_version 46520 (0.0007) +[2023-10-11 16:52:16,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 94601216. Throughput: 0: 1668.8, 1: 1689.0. Samples: 23660218. 
Policy #0 lag: (min: 31.0, avg: 50.4, max: 63.0) +[2023-10-11 16:52:16,063][84230] Avg episode reward: [(0, '7.710'), (1, '30.690')] +[2023-10-11 16:52:16,073][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000045856_46956544.pth... +[2023-10-11 16:52:16,074][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000046528_47644672.pth... +[2023-10-11 16:52:16,110][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000044928_46006272.pth +[2023-10-11 16:52:16,111][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000044320_45383680.pth +[2023-10-11 16:52:17,418][85176] Updated weights for policy 0, policy_version 45862 (0.0007) +[2023-10-11 16:52:17,799][85176] Updated weights for policy 0, policy_version 45872 (0.0010) +[2023-10-11 16:52:18,170][85176] Updated weights for policy 0, policy_version 45882 (0.0008) +[2023-10-11 16:52:18,347][85175] Updated weights for policy 1, policy_version 46530 (0.0007) +[2023-10-11 16:52:18,728][85175] Updated weights for policy 1, policy_version 46540 (0.0007) +[2023-10-11 16:52:19,093][85175] Updated weights for policy 1, policy_version 46550 (0.0009) +[2023-10-11 16:52:19,462][85175] Updated weights for policy 1, policy_version 46560 (0.0009) +[2023-10-11 16:52:21,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 94666752. Throughput: 0: 1657.8, 1: 1690.9. Samples: 23670452. 
Policy #0 lag: (min: 31.0, avg: 50.4, max: 63.0) +[2023-10-11 16:52:21,064][84230] Avg episode reward: [(0, '7.000'), (1, '27.900')] +[2023-10-11 16:52:22,218][85176] Updated weights for policy 0, policy_version 45892 (0.0008) +[2023-10-11 16:52:22,588][85176] Updated weights for policy 0, policy_version 45902 (0.0007) +[2023-10-11 16:52:22,960][85176] Updated weights for policy 0, policy_version 45912 (0.0007) +[2023-10-11 16:52:23,511][85175] Updated weights for policy 1, policy_version 46570 (0.0007) +[2023-10-11 16:52:23,879][85175] Updated weights for policy 1, policy_version 46580 (0.0009) +[2023-10-11 16:52:24,243][85175] Updated weights for policy 1, policy_version 46590 (0.0010) +[2023-10-11 16:52:26,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 94732288. Throughput: 0: 1675.7, 1: 1670.2. Samples: 23690296. Policy #0 lag: (min: 31.0, avg: 50.4, max: 63.0) +[2023-10-11 16:52:26,063][84230] Avg episode reward: [(0, '7.090'), (1, '30.560')] +[2023-10-11 16:52:26,994][85176] Updated weights for policy 0, policy_version 45922 (0.0008) +[2023-10-11 16:52:27,376][85176] Updated weights for policy 0, policy_version 45932 (0.0009) +[2023-10-11 16:52:27,741][85176] Updated weights for policy 0, policy_version 45942 (0.0009) +[2023-10-11 16:52:28,122][85176] Updated weights for policy 0, policy_version 45952 (0.0010) +[2023-10-11 16:52:28,188][85175] Updated weights for policy 1, policy_version 46600 (0.0008) +[2023-10-11 16:52:28,553][85175] Updated weights for policy 1, policy_version 46610 (0.0007) +[2023-10-11 16:52:28,922][85175] Updated weights for policy 1, policy_version 46620 (0.0008) +[2023-10-11 16:52:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 94797824. Throughput: 0: 1681.4, 1: 1696.4. Samples: 23711122. 
Policy #0 lag: (min: 31.0, avg: 50.4, max: 63.0) +[2023-10-11 16:52:31,064][84230] Avg episode reward: [(0, '8.400'), (1, '28.880')] +[2023-10-11 16:52:32,337][85176] Updated weights for policy 0, policy_version 45962 (0.0010) +[2023-10-11 16:52:32,711][85176] Updated weights for policy 0, policy_version 45972 (0.0007) +[2023-10-11 16:52:33,067][85175] Updated weights for policy 1, policy_version 46630 (0.0008) +[2023-10-11 16:52:33,085][85176] Updated weights for policy 0, policy_version 45982 (0.0007) +[2023-10-11 16:52:33,429][85175] Updated weights for policy 1, policy_version 46640 (0.0007) +[2023-10-11 16:52:33,794][85175] Updated weights for policy 1, policy_version 46650 (0.0008) +[2023-10-11 16:52:36,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 94863360. Throughput: 0: 1662.0, 1: 1686.1. Samples: 23720564. Policy #0 lag: (min: 1.0, avg: 16.3, max: 33.0) +[2023-10-11 16:52:36,063][84230] Avg episode reward: [(0, '8.420'), (1, '31.780')] +[2023-10-11 16:52:37,181][85176] Updated weights for policy 0, policy_version 45992 (0.0007) +[2023-10-11 16:52:37,560][85176] Updated weights for policy 0, policy_version 46002 (0.0008) +[2023-10-11 16:52:37,759][85175] Updated weights for policy 1, policy_version 46660 (0.0009) +[2023-10-11 16:52:37,923][85176] Updated weights for policy 0, policy_version 46012 (0.0008) +[2023-10-11 16:52:38,129][85175] Updated weights for policy 1, policy_version 46670 (0.0009) +[2023-10-11 16:52:38,499][85175] Updated weights for policy 1, policy_version 46680 (0.0009) +[2023-10-11 16:52:41,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 94928896. Throughput: 0: 1676.6, 1: 1681.9. Samples: 23740770. 
Policy #0 lag: (min: 1.0, avg: 16.3, max: 33.0) +[2023-10-11 16:52:41,064][84230] Avg episode reward: [(0, '7.260'), (1, '28.560')] +[2023-10-11 16:52:41,938][85176] Updated weights for policy 0, policy_version 46022 (0.0008) +[2023-10-11 16:52:42,314][85176] Updated weights for policy 0, policy_version 46032 (0.0008) +[2023-10-11 16:52:42,457][85175] Updated weights for policy 1, policy_version 46690 (0.0008) +[2023-10-11 16:52:42,689][85176] Updated weights for policy 0, policy_version 46042 (0.0009) +[2023-10-11 16:52:42,823][85175] Updated weights for policy 1, policy_version 46700 (0.0007) +[2023-10-11 16:52:43,191][85175] Updated weights for policy 1, policy_version 46710 (0.0009) +[2023-10-11 16:52:43,557][85175] Updated weights for policy 1, policy_version 46720 (0.0009) +[2023-10-11 16:52:46,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 94994432. Throughput: 0: 1675.7, 1: 1701.8. Samples: 23761604. Policy #0 lag: (min: 1.0, avg: 16.3, max: 33.0) +[2023-10-11 16:52:46,064][84230] Avg episode reward: [(0, '6.790'), (1, '31.520')] +[2023-10-11 16:52:46,678][85176] Updated weights for policy 0, policy_version 46052 (0.0010) +[2023-10-11 16:52:47,062][85176] Updated weights for policy 0, policy_version 46062 (0.0008) +[2023-10-11 16:52:47,437][85176] Updated weights for policy 0, policy_version 46072 (0.0008) +[2023-10-11 16:52:47,603][85175] Updated weights for policy 1, policy_version 46730 (0.0007) +[2023-10-11 16:52:47,971][85175] Updated weights for policy 1, policy_version 46740 (0.0010) +[2023-10-11 16:52:48,347][85175] Updated weights for policy 1, policy_version 46750 (0.0009) +[2023-10-11 16:52:51,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 95059968. Throughput: 0: 1673.5, 1: 1672.0. Samples: 23770744. 
Policy #0 lag: (min: 1.0, avg: 16.3, max: 33.0) +[2023-10-11 16:52:51,064][84230] Avg episode reward: [(0, '7.320'), (1, '30.390')] +[2023-10-11 16:52:51,458][85176] Updated weights for policy 0, policy_version 46082 (0.0008) +[2023-10-11 16:52:51,838][85176] Updated weights for policy 0, policy_version 46092 (0.0011) +[2023-10-11 16:52:52,216][85176] Updated weights for policy 0, policy_version 46102 (0.0007) +[2023-10-11 16:52:52,354][85175] Updated weights for policy 1, policy_version 46760 (0.0007) +[2023-10-11 16:52:52,584][85176] Updated weights for policy 0, policy_version 46112 (0.0007) +[2023-10-11 16:52:52,724][85175] Updated weights for policy 1, policy_version 46770 (0.0007) +[2023-10-11 16:52:53,095][85175] Updated weights for policy 1, policy_version 46780 (0.0007) +[2023-10-11 16:52:56,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.5). Total num frames: 95125504. Throughput: 0: 1682.7, 1: 1691.4. Samples: 23791552. Policy #0 lag: (min: 1.0, avg: 16.3, max: 33.0) +[2023-10-11 16:52:56,063][84230] Avg episode reward: [(0, '8.160'), (1, '34.690')] +[2023-10-11 16:52:56,064][85000] Saving new best policy, reward=34.690! +[2023-10-11 16:52:56,559][85176] Updated weights for policy 0, policy_version 46122 (0.0008) +[2023-10-11 16:52:56,933][85176] Updated weights for policy 0, policy_version 46132 (0.0007) +[2023-10-11 16:52:57,118][85175] Updated weights for policy 1, policy_version 46790 (0.0008) +[2023-10-11 16:52:57,305][85176] Updated weights for policy 0, policy_version 46142 (0.0007) +[2023-10-11 16:52:57,478][85175] Updated weights for policy 1, policy_version 46800 (0.0007) +[2023-10-11 16:52:57,846][85175] Updated weights for policy 1, policy_version 46810 (0.0007) +[2023-10-11 16:53:01,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 95191040. Throughput: 0: 1679.7, 1: 1700.8. Samples: 23812344. 
Policy #0 lag: (min: 1.0, avg: 16.3, max: 33.0) +[2023-10-11 16:53:01,063][84230] Avg episode reward: [(0, '8.170'), (1, '32.310')] +[2023-10-11 16:53:01,617][85176] Updated weights for policy 0, policy_version 46152 (0.0008) +[2023-10-11 16:53:01,831][85175] Updated weights for policy 1, policy_version 46820 (0.0008) +[2023-10-11 16:53:01,985][85176] Updated weights for policy 0, policy_version 46162 (0.0008) +[2023-10-11 16:53:02,212][85175] Updated weights for policy 1, policy_version 46830 (0.0008) +[2023-10-11 16:53:02,346][85176] Updated weights for policy 0, policy_version 46172 (0.0009) +[2023-10-11 16:53:02,577][85175] Updated weights for policy 1, policy_version 46840 (0.0007) +[2023-10-11 16:53:06,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 95256576. Throughput: 0: 1676.0, 1: 1680.2. Samples: 23821478. Policy #0 lag: (min: 0.0, avg: 22.6, max: 32.0) +[2023-10-11 16:53:06,064][84230] Avg episode reward: [(0, '7.260'), (1, '31.800')] +[2023-10-11 16:53:06,284][85176] Updated weights for policy 0, policy_version 46182 (0.0007) +[2023-10-11 16:53:06,595][85175] Updated weights for policy 1, policy_version 46850 (0.0008) +[2023-10-11 16:53:06,651][85176] Updated weights for policy 0, policy_version 46192 (0.0008) +[2023-10-11 16:53:06,963][85175] Updated weights for policy 1, policy_version 46860 (0.0007) +[2023-10-11 16:53:07,017][85176] Updated weights for policy 0, policy_version 46202 (0.0008) +[2023-10-11 16:53:07,326][85175] Updated weights for policy 1, policy_version 46870 (0.0010) +[2023-10-11 16:53:07,696][85175] Updated weights for policy 1, policy_version 46880 (0.0008) +[2023-10-11 16:53:11,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 95322112. Throughput: 0: 1667.5, 1: 1703.5. Samples: 23841990. 
Policy #0 lag: (min: 0.0, avg: 22.6, max: 32.0) +[2023-10-11 16:53:11,064][84230] Avg episode reward: [(0, '7.270'), (1, '30.050')] +[2023-10-11 16:53:11,140][85176] Updated weights for policy 0, policy_version 46212 (0.0007) +[2023-10-11 16:53:11,516][85176] Updated weights for policy 0, policy_version 46222 (0.0008) +[2023-10-11 16:53:11,821][85175] Updated weights for policy 1, policy_version 46890 (0.0008) +[2023-10-11 16:53:11,891][85176] Updated weights for policy 0, policy_version 46232 (0.0008) +[2023-10-11 16:53:12,179][85175] Updated weights for policy 1, policy_version 46900 (0.0007) +[2023-10-11 16:53:12,547][85175] Updated weights for policy 1, policy_version 46910 (0.0007) +[2023-10-11 16:53:15,991][85176] Updated weights for policy 0, policy_version 46242 (0.0010) +[2023-10-11 16:53:16,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 95387648. Throughput: 0: 1667.1, 1: 1699.6. Samples: 23862622. Policy #0 lag: (min: 0.0, avg: 22.6, max: 32.0) +[2023-10-11 16:53:16,064][84230] Avg episode reward: [(0, '7.870'), (1, '31.870')] +[2023-10-11 16:53:16,365][85176] Updated weights for policy 0, policy_version 46252 (0.0008) +[2023-10-11 16:53:16,575][85175] Updated weights for policy 1, policy_version 46920 (0.0008) +[2023-10-11 16:53:16,734][85176] Updated weights for policy 0, policy_version 46262 (0.0007) +[2023-10-11 16:53:16,946][85175] Updated weights for policy 1, policy_version 46930 (0.0008) +[2023-10-11 16:53:17,104][85176] Updated weights for policy 0, policy_version 46272 (0.0008) +[2023-10-11 16:53:17,310][85175] Updated weights for policy 1, policy_version 46940 (0.0008) +[2023-10-11 16:53:21,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 95453184. Throughput: 0: 1669.7, 1: 1688.8. Samples: 23871698. 
Policy #0 lag: (min: 0.0, avg: 22.6, max: 32.0) +[2023-10-11 16:53:21,063][84230] Avg episode reward: [(0, '7.470'), (1, '32.190')] +[2023-10-11 16:53:21,439][85175] Updated weights for policy 1, policy_version 46950 (0.0009) +[2023-10-11 16:53:21,494][85176] Updated weights for policy 0, policy_version 46282 (0.0009) +[2023-10-11 16:53:21,800][85175] Updated weights for policy 1, policy_version 46960 (0.0007) +[2023-10-11 16:53:21,864][85176] Updated weights for policy 0, policy_version 46292 (0.0008) +[2023-10-11 16:53:22,167][85175] Updated weights for policy 1, policy_version 46970 (0.0007) +[2023-10-11 16:53:22,234][85176] Updated weights for policy 0, policy_version 46302 (0.0009) +[2023-10-11 16:53:25,985][85175] Updated weights for policy 1, policy_version 46980 (0.0009) +[2023-10-11 16:53:26,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 95518720. Throughput: 0: 1662.9, 1: 1701.0. Samples: 23892144. Policy #0 lag: (min: 0.0, avg: 22.6, max: 32.0) +[2023-10-11 16:53:26,063][84230] Avg episode reward: [(0, '7.730'), (1, '32.540')] +[2023-10-11 16:53:26,343][85175] Updated weights for policy 1, policy_version 46990 (0.0009) +[2023-10-11 16:53:26,362][85176] Updated weights for policy 0, policy_version 46312 (0.0009) +[2023-10-11 16:53:26,720][85175] Updated weights for policy 1, policy_version 47000 (0.0007) +[2023-10-11 16:53:26,742][85176] Updated weights for policy 0, policy_version 46322 (0.0008) +[2023-10-11 16:53:27,122][85176] Updated weights for policy 0, policy_version 46332 (0.0008) +[2023-10-11 16:53:30,885][85175] Updated weights for policy 1, policy_version 47010 (0.0007) +[2023-10-11 16:53:31,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 95584256. Throughput: 0: 1662.1, 1: 1702.2. Samples: 23912994. 
Policy #0 lag: (min: 0.0, avg: 22.6, max: 32.0) +[2023-10-11 16:53:31,063][84230] Avg episode reward: [(0, '8.010'), (1, '32.060')] +[2023-10-11 16:53:31,118][85176] Updated weights for policy 0, policy_version 46342 (0.0010) +[2023-10-11 16:53:31,255][85175] Updated weights for policy 1, policy_version 47020 (0.0008) +[2023-10-11 16:53:31,487][85176] Updated weights for policy 0, policy_version 46352 (0.0009) +[2023-10-11 16:53:31,610][85175] Updated weights for policy 1, policy_version 47030 (0.0007) +[2023-10-11 16:53:31,868][85176] Updated weights for policy 0, policy_version 46362 (0.0008) +[2023-10-11 16:53:31,976][85175] Updated weights for policy 1, policy_version 47040 (0.0007) +[2023-10-11 16:53:35,915][85176] Updated weights for policy 0, policy_version 46372 (0.0007) +[2023-10-11 16:53:36,041][85175] Updated weights for policy 1, policy_version 47050 (0.0010) +[2023-10-11 16:53:36,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 95649792. Throughput: 0: 1661.4, 1: 1702.2. Samples: 23922104. Policy #0 lag: (min: 31.0, avg: 36.0, max: 63.0) +[2023-10-11 16:53:36,063][84230] Avg episode reward: [(0, '7.970'), (1, '30.520')] +[2023-10-11 16:53:36,293][85176] Updated weights for policy 0, policy_version 46382 (0.0008) +[2023-10-11 16:53:36,403][85175] Updated weights for policy 1, policy_version 47060 (0.0008) +[2023-10-11 16:53:36,666][85176] Updated weights for policy 0, policy_version 46392 (0.0009) +[2023-10-11 16:53:36,767][85175] Updated weights for policy 1, policy_version 47070 (0.0008) +[2023-10-11 16:53:40,718][85176] Updated weights for policy 0, policy_version 46402 (0.0007) +[2023-10-11 16:53:40,739][85175] Updated weights for policy 1, policy_version 47080 (0.0008) +[2023-10-11 16:53:41,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 95715328. Throughput: 0: 1664.3, 1: 1700.8. Samples: 23942984. 
Policy #0 lag: (min: 31.0, avg: 36.0, max: 63.0) +[2023-10-11 16:53:41,063][84230] Avg episode reward: [(0, '7.520'), (1, '30.720')] +[2023-10-11 16:53:41,086][85176] Updated weights for policy 0, policy_version 46412 (0.0008) +[2023-10-11 16:53:41,109][85175] Updated weights for policy 1, policy_version 47090 (0.0008) +[2023-10-11 16:53:41,456][85176] Updated weights for policy 0, policy_version 46422 (0.0009) +[2023-10-11 16:53:41,478][85175] Updated weights for policy 1, policy_version 47100 (0.0008) +[2023-10-11 16:53:41,826][85176] Updated weights for policy 0, policy_version 46432 (0.0009) +[2023-10-11 16:53:45,517][85175] Updated weights for policy 1, policy_version 47110 (0.0008) +[2023-10-11 16:53:45,878][85176] Updated weights for policy 0, policy_version 46442 (0.0007) +[2023-10-11 16:53:45,890][85175] Updated weights for policy 1, policy_version 47120 (0.0008) +[2023-10-11 16:53:46,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 95780864. Throughput: 0: 1662.2, 1: 1692.2. Samples: 23963294. Policy #0 lag: (min: 31.0, avg: 36.0, max: 63.0) +[2023-10-11 16:53:46,063][84230] Avg episode reward: [(0, '7.270'), (1, '31.000')] +[2023-10-11 16:53:46,253][85176] Updated weights for policy 0, policy_version 46452 (0.0007) +[2023-10-11 16:53:46,257][85175] Updated weights for policy 1, policy_version 47130 (0.0008) +[2023-10-11 16:53:46,619][85176] Updated weights for policy 0, policy_version 46462 (0.0008) +[2023-10-11 16:53:50,430][85175] Updated weights for policy 1, policy_version 47140 (0.0009) +[2023-10-11 16:53:50,799][85175] Updated weights for policy 1, policy_version 47150 (0.0008) +[2023-10-11 16:53:50,914][85176] Updated weights for policy 0, policy_version 46472 (0.0009) +[2023-10-11 16:53:51,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 95846400. Throughput: 0: 1660.0, 1: 1695.4. Samples: 23972470. 
Policy #0 lag: (min: 31.0, avg: 36.0, max: 63.0) +[2023-10-11 16:53:51,063][84230] Avg episode reward: [(0, '7.680'), (1, '33.800')] +[2023-10-11 16:53:51,165][85175] Updated weights for policy 1, policy_version 47160 (0.0007) +[2023-10-11 16:53:51,281][85176] Updated weights for policy 0, policy_version 46482 (0.0009) +[2023-10-11 16:53:51,653][85176] Updated weights for policy 0, policy_version 46492 (0.0009) +[2023-10-11 16:53:55,537][85175] Updated weights for policy 1, policy_version 47170 (0.0009) +[2023-10-11 16:53:55,904][85175] Updated weights for policy 1, policy_version 47180 (0.0009) +[2023-10-11 16:53:56,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 95911936. Throughput: 0: 1648.7, 1: 1685.8. Samples: 23992042. Policy #0 lag: (min: 31.0, avg: 36.0, max: 63.0) +[2023-10-11 16:53:56,063][84230] Avg episode reward: [(0, '8.310'), (1, '30.070')] +[2023-10-11 16:53:56,178][85176] Updated weights for policy 0, policy_version 46502 (0.0009) +[2023-10-11 16:53:56,274][85175] Updated weights for policy 1, policy_version 47190 (0.0008) +[2023-10-11 16:53:56,547][85176] Updated weights for policy 0, policy_version 46512 (0.0007) +[2023-10-11 16:53:56,644][85175] Updated weights for policy 1, policy_version 47200 (0.0008) +[2023-10-11 16:53:56,921][85176] Updated weights for policy 0, policy_version 46522 (0.0009) +[2023-10-11 16:54:00,837][85175] Updated weights for policy 1, policy_version 47210 (0.0008) +[2023-10-11 16:54:01,057][85176] Updated weights for policy 0, policy_version 46532 (0.0008) +[2023-10-11 16:54:01,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 95977472. Throughput: 0: 1641.5, 1: 1676.3. Samples: 24011922. 
Policy #0 lag: (min: 31.0, avg: 36.0, max: 63.0) +[2023-10-11 16:54:01,063][84230] Avg episode reward: [(0, '7.750'), (1, '30.600')] +[2023-10-11 16:54:01,206][85175] Updated weights for policy 1, policy_version 47220 (0.0007) +[2023-10-11 16:54:01,427][85176] Updated weights for policy 0, policy_version 46542 (0.0007) +[2023-10-11 16:54:01,571][85175] Updated weights for policy 1, policy_version 47230 (0.0008) +[2023-10-11 16:54:01,793][85176] Updated weights for policy 0, policy_version 46552 (0.0007) +[2023-10-11 16:54:05,551][85175] Updated weights for policy 1, policy_version 47240 (0.0009) +[2023-10-11 16:54:05,926][85175] Updated weights for policy 1, policy_version 47250 (0.0008) +[2023-10-11 16:54:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 96043008. Throughput: 0: 1642.3, 1: 1674.0. Samples: 24020932. Policy #0 lag: (min: 31.0, avg: 36.0, max: 63.0) +[2023-10-11 16:54:06,063][84230] Avg episode reward: [(0, '7.110'), (1, '31.050')] +[2023-10-11 16:54:06,107][85176] Updated weights for policy 0, policy_version 46562 (0.0009) +[2023-10-11 16:54:06,284][85175] Updated weights for policy 1, policy_version 47260 (0.0007) +[2023-10-11 16:54:06,509][85176] Updated weights for policy 0, policy_version 46572 (0.0009) +[2023-10-11 16:54:06,884][85176] Updated weights for policy 0, policy_version 46582 (0.0008) +[2023-10-11 16:54:07,262][85176] Updated weights for policy 0, policy_version 46592 (0.0009) +[2023-10-11 16:54:10,244][85175] Updated weights for policy 1, policy_version 47270 (0.0010) +[2023-10-11 16:54:10,608][85175] Updated weights for policy 1, policy_version 47280 (0.0009) +[2023-10-11 16:54:10,981][85175] Updated weights for policy 1, policy_version 47290 (0.0007) +[2023-10-11 16:54:11,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 96108544. Throughput: 0: 1641.8, 1: 1677.8. Samples: 24041526. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 36.0) +[2023-10-11 16:54:11,063][84230] Avg episode reward: [(0, '7.390'), (1, '32.480')] +[2023-10-11 16:54:11,294][85176] Updated weights for policy 0, policy_version 46602 (0.0008) +[2023-10-11 16:54:11,665][85176] Updated weights for policy 0, policy_version 46612 (0.0010) +[2023-10-11 16:54:12,040][85176] Updated weights for policy 0, policy_version 46622 (0.0008) +[2023-10-11 16:54:15,061][85175] Updated weights for policy 1, policy_version 47300 (0.0009) +[2023-10-11 16:54:15,419][85175] Updated weights for policy 1, policy_version 47310 (0.0007) +[2023-10-11 16:54:15,790][85175] Updated weights for policy 1, policy_version 47320 (0.0007) +[2023-10-11 16:54:16,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 96174080. Throughput: 0: 1646.6, 1: 1658.7. Samples: 24061732. Policy #0 lag: (min: 31.0, avg: 31.1, max: 36.0) +[2023-10-11 16:54:16,063][84230] Avg episode reward: [(0, '7.830'), (1, '32.540')] +[2023-10-11 16:54:16,074][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000047328_48463872.pth... +[2023-10-11 16:54:16,080][85176] Updated weights for policy 0, policy_version 46632 (0.0007) +[2023-10-11 16:54:16,112][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000045728_46825472.pth +[2023-10-11 16:54:16,117][85000] Saving a milestone ./train_atari/atari_frostbite_APPO/checkpoint_p1/milestones/checkpoint_000047328_48463872.pth +[2023-10-11 16:54:16,450][85176] Updated weights for policy 0, policy_version 46642 (0.0008) +[2023-10-11 16:54:16,830][85176] Updated weights for policy 0, policy_version 46652 (0.0008) +[2023-10-11 16:54:16,977][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000046656_47775744.pth... 
+[2023-10-11 16:54:17,016][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000045088_46170112.pth +[2023-10-11 16:54:17,021][84801] Saving a milestone ./train_atari/atari_frostbite_APPO/checkpoint_p0/milestones/checkpoint_000046656_47775744.pth +[2023-10-11 16:54:19,882][85175] Updated weights for policy 1, policy_version 47330 (0.0008) +[2023-10-11 16:54:20,238][85175] Updated weights for policy 1, policy_version 47340 (0.0008) +[2023-10-11 16:54:20,610][85175] Updated weights for policy 1, policy_version 47350 (0.0008) +[2023-10-11 16:54:20,764][85176] Updated weights for policy 0, policy_version 46662 (0.0008) +[2023-10-11 16:54:20,976][85175] Updated weights for policy 1, policy_version 47360 (0.0008) +[2023-10-11 16:54:21,062][84230] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 96272384. Throughput: 0: 1648.1, 1: 1670.6. Samples: 24071446. Policy #0 lag: (min: 31.0, avg: 31.1, max: 36.0) +[2023-10-11 16:54:21,063][84230] Avg episode reward: [(0, '7.990'), (1, '34.880')] +[2023-10-11 16:54:21,064][85000] Saving new best policy, reward=34.880! +[2023-10-11 16:54:21,130][85176] Updated weights for policy 0, policy_version 46672 (0.0009) +[2023-10-11 16:54:21,514][85176] Updated weights for policy 0, policy_version 46682 (0.0010) +[2023-10-11 16:54:24,984][85175] Updated weights for policy 1, policy_version 47370 (0.0009) +[2023-10-11 16:54:25,351][85175] Updated weights for policy 1, policy_version 47380 (0.0010) +[2023-10-11 16:54:25,710][85175] Updated weights for policy 1, policy_version 47390 (0.0008) +[2023-10-11 16:54:25,712][85176] Updated weights for policy 0, policy_version 46692 (0.0009) +[2023-10-11 16:54:26,063][84230] Fps is (10 sec: 16384.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 96337920. Throughput: 0: 1645.9, 1: 1671.7. Samples: 24092274. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 36.0) +[2023-10-11 16:54:26,063][84230] Avg episode reward: [(0, '8.050'), (1, '35.390')] +[2023-10-11 16:54:26,064][85000] Saving new best policy, reward=35.390! +[2023-10-11 16:54:26,089][85176] Updated weights for policy 0, policy_version 46702 (0.0009) +[2023-10-11 16:54:26,461][85176] Updated weights for policy 0, policy_version 46712 (0.0007) +[2023-10-11 16:54:29,640][85175] Updated weights for policy 1, policy_version 47400 (0.0008) +[2023-10-11 16:54:30,014][85175] Updated weights for policy 1, policy_version 47410 (0.0007) +[2023-10-11 16:54:30,380][85175] Updated weights for policy 1, policy_version 47420 (0.0009) +[2023-10-11 16:54:30,620][85176] Updated weights for policy 0, policy_version 46722 (0.0008) +[2023-10-11 16:54:30,992][85176] Updated weights for policy 0, policy_version 46732 (0.0007) +[2023-10-11 16:54:31,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 96403456. Throughput: 0: 1644.0, 1: 1656.3. Samples: 24111808. Policy #0 lag: (min: 31.0, avg: 31.1, max: 36.0) +[2023-10-11 16:54:31,064][84230] Avg episode reward: [(0, '7.410'), (1, '34.940')] +[2023-10-11 16:54:31,369][85176] Updated weights for policy 0, policy_version 46742 (0.0007) +[2023-10-11 16:54:31,743][85176] Updated weights for policy 0, policy_version 46752 (0.0009) +[2023-10-11 16:54:34,308][85175] Updated weights for policy 1, policy_version 47430 (0.0009) +[2023-10-11 16:54:34,672][85175] Updated weights for policy 1, policy_version 47440 (0.0007) +[2023-10-11 16:54:35,039][85175] Updated weights for policy 1, policy_version 47450 (0.0008) +[2023-10-11 16:54:35,893][85176] Updated weights for policy 0, policy_version 46762 (0.0007) +[2023-10-11 16:54:36,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 96468992. Throughput: 0: 1644.6, 1: 1683.5. Samples: 24122236. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 36.0) +[2023-10-11 16:54:36,063][84230] Avg episode reward: [(0, '7.260'), (1, '30.400')] +[2023-10-11 16:54:36,262][85176] Updated weights for policy 0, policy_version 46772 (0.0009) +[2023-10-11 16:54:36,640][85176] Updated weights for policy 0, policy_version 46782 (0.0008) +[2023-10-11 16:54:39,249][85175] Updated weights for policy 1, policy_version 47460 (0.0008) +[2023-10-11 16:54:39,623][85175] Updated weights for policy 1, policy_version 47470 (0.0008) +[2023-10-11 16:54:39,989][85175] Updated weights for policy 1, policy_version 47480 (0.0009) +[2023-10-11 16:54:40,818][85176] Updated weights for policy 0, policy_version 46792 (0.0010) +[2023-10-11 16:54:41,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 96534528. Throughput: 0: 1656.8, 1: 1682.8. Samples: 24142324. Policy #0 lag: (min: 31.0, avg: 31.1, max: 36.0) +[2023-10-11 16:54:41,063][84230] Avg episode reward: [(0, '8.200'), (1, '30.500')] +[2023-10-11 16:54:41,187][85176] Updated weights for policy 0, policy_version 46802 (0.0010) +[2023-10-11 16:54:41,563][85176] Updated weights for policy 0, policy_version 46812 (0.0009) +[2023-10-11 16:54:43,945][85175] Updated weights for policy 1, policy_version 47490 (0.0008) +[2023-10-11 16:54:44,308][85175] Updated weights for policy 1, policy_version 47500 (0.0009) +[2023-10-11 16:54:44,674][85175] Updated weights for policy 1, policy_version 47510 (0.0009) +[2023-10-11 16:54:45,046][85175] Updated weights for policy 1, policy_version 47520 (0.0010) +[2023-10-11 16:54:45,698][85176] Updated weights for policy 0, policy_version 46822 (0.0008) +[2023-10-11 16:54:46,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 96600064. Throughput: 0: 1661.7, 1: 1675.7. Samples: 24162104. 
Policy #0 lag: (min: 31.0, avg: 38.2, max: 63.0) +[2023-10-11 16:54:46,064][84230] Avg episode reward: [(0, '8.200'), (1, '31.750')] +[2023-10-11 16:54:46,069][85176] Updated weights for policy 0, policy_version 46832 (0.0007) +[2023-10-11 16:54:46,443][85176] Updated weights for policy 0, policy_version 46842 (0.0007) +[2023-10-11 16:54:49,289][85175] Updated weights for policy 1, policy_version 47530 (0.0007) +[2023-10-11 16:54:49,655][85175] Updated weights for policy 1, policy_version 47540 (0.0008) +[2023-10-11 16:54:50,034][85175] Updated weights for policy 1, policy_version 47550 (0.0010) +[2023-10-11 16:54:50,613][85176] Updated weights for policy 0, policy_version 46852 (0.0009) +[2023-10-11 16:54:50,985][85176] Updated weights for policy 0, policy_version 46862 (0.0010) +[2023-10-11 16:54:51,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 96665600. Throughput: 0: 1665.3, 1: 1704.5. Samples: 24172576. Policy #0 lag: (min: 31.0, avg: 38.2, max: 63.0) +[2023-10-11 16:54:51,064][84230] Avg episode reward: [(0, '7.750'), (1, '33.550')] +[2023-10-11 16:54:51,353][85176] Updated weights for policy 0, policy_version 46872 (0.0010) +[2023-10-11 16:54:53,977][85175] Updated weights for policy 1, policy_version 47560 (0.0009) +[2023-10-11 16:54:54,346][85175] Updated weights for policy 1, policy_version 47570 (0.0007) +[2023-10-11 16:54:54,712][85175] Updated weights for policy 1, policy_version 47580 (0.0011) +[2023-10-11 16:54:55,463][85176] Updated weights for policy 0, policy_version 46882 (0.0011) +[2023-10-11 16:54:55,833][85176] Updated weights for policy 0, policy_version 46892 (0.0011) +[2023-10-11 16:54:56,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 96731136. Throughput: 0: 1665.5, 1: 1679.0. Samples: 24192026. 
Policy #0 lag: (min: 31.0, avg: 38.2, max: 63.0) +[2023-10-11 16:54:56,063][84230] Avg episode reward: [(0, '6.850'), (1, '35.690')] +[2023-10-11 16:54:56,064][85000] Saving new best policy, reward=35.690! +[2023-10-11 16:54:56,202][85176] Updated weights for policy 0, policy_version 46902 (0.0012) +[2023-10-11 16:54:56,575][85176] Updated weights for policy 0, policy_version 46912 (0.0009) +[2023-10-11 16:54:58,809][85175] Updated weights for policy 1, policy_version 47590 (0.0011) +[2023-10-11 16:54:59,176][85175] Updated weights for policy 1, policy_version 47600 (0.0010) +[2023-10-11 16:54:59,543][85175] Updated weights for policy 1, policy_version 47610 (0.0010) +[2023-10-11 16:55:00,817][85176] Updated weights for policy 0, policy_version 46922 (0.0011) +[2023-10-11 16:55:01,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 96796672. Throughput: 0: 1653.2, 1: 1686.9. Samples: 24212040. Policy #0 lag: (min: 31.0, avg: 38.2, max: 63.0) +[2023-10-11 16:55:01,063][84230] Avg episode reward: [(0, '7.450'), (1, '34.120')] +[2023-10-11 16:55:01,187][85176] Updated weights for policy 0, policy_version 46932 (0.0009) +[2023-10-11 16:55:01,563][85176] Updated weights for policy 0, policy_version 46942 (0.0010) +[2023-10-11 16:55:03,460][85175] Updated weights for policy 1, policy_version 47620 (0.0009) +[2023-10-11 16:55:03,828][85175] Updated weights for policy 1, policy_version 47630 (0.0009) +[2023-10-11 16:55:04,188][85175] Updated weights for policy 1, policy_version 47640 (0.0009) +[2023-10-11 16:55:05,777][85176] Updated weights for policy 0, policy_version 46952 (0.0007) +[2023-10-11 16:55:06,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 96862208. Throughput: 0: 1653.4, 1: 1699.1. Samples: 24222310. 
Policy #0 lag: (min: 31.0, avg: 38.2, max: 63.0) +[2023-10-11 16:55:06,063][84230] Avg episode reward: [(0, '8.120'), (1, '35.740')] +[2023-10-11 16:55:06,064][85000] Saving new best policy, reward=35.740! +[2023-10-11 16:55:06,154][85176] Updated weights for policy 0, policy_version 46962 (0.0007) +[2023-10-11 16:55:06,523][85176] Updated weights for policy 0, policy_version 46972 (0.0009) +[2023-10-11 16:55:08,350][85175] Updated weights for policy 1, policy_version 47650 (0.0009) +[2023-10-11 16:55:08,715][85175] Updated weights for policy 1, policy_version 47660 (0.0011) +[2023-10-11 16:55:09,080][85175] Updated weights for policy 1, policy_version 47670 (0.0012) +[2023-10-11 16:55:09,448][85175] Updated weights for policy 1, policy_version 47680 (0.0009) +[2023-10-11 16:55:10,514][85176] Updated weights for policy 0, policy_version 46982 (0.0009) +[2023-10-11 16:55:10,887][85176] Updated weights for policy 0, policy_version 46992 (0.0007) +[2023-10-11 16:55:11,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 96927744. Throughput: 0: 1653.1, 1: 1674.0. Samples: 24241996. 
Policy #0 lag: (min: 31.0, avg: 38.2, max: 63.0) +[2023-10-11 16:55:11,063][84230] Avg episode reward: [(0, '7.970'), (1, '34.010')] +[2023-10-11 16:55:11,262][85176] Updated weights for policy 0, policy_version 47002 (0.0007) +[2023-10-11 16:55:13,310][85175] Updated weights for policy 1, policy_version 47690 (0.0008) +[2023-10-11 16:55:13,677][85175] Updated weights for policy 1, policy_version 47700 (0.0010) +[2023-10-11 16:55:14,044][85175] Updated weights for policy 1, policy_version 47710 (0.0009) +[2023-10-11 16:55:15,276][85176] Updated weights for policy 0, policy_version 47012 (0.0007) +[2023-10-11 16:55:15,640][85176] Updated weights for policy 0, policy_version 47022 (0.0008) +[2023-10-11 16:55:16,023][85176] Updated weights for policy 0, policy_version 47032 (0.0009) +[2023-10-11 16:55:16,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 96993280. Throughput: 0: 1648.9, 1: 1703.7. Samples: 24262674. Policy #0 lag: (min: 31.0, avg: 38.2, max: 63.0) +[2023-10-11 16:55:16,064][84230] Avg episode reward: [(0, '7.900'), (1, '37.220')] +[2023-10-11 16:55:16,072][85000] Saving new best policy, reward=37.220! +[2023-10-11 16:55:17,865][85175] Updated weights for policy 1, policy_version 47720 (0.0007) +[2023-10-11 16:55:18,227][85175] Updated weights for policy 1, policy_version 47730 (0.0007) +[2023-10-11 16:55:18,596][85175] Updated weights for policy 1, policy_version 47740 (0.0008) +[2023-10-11 16:55:20,026][85176] Updated weights for policy 0, policy_version 47042 (0.0008) +[2023-10-11 16:55:20,404][85176] Updated weights for policy 0, policy_version 47052 (0.0008) +[2023-10-11 16:55:20,777][85176] Updated weights for policy 0, policy_version 47062 (0.0007) +[2023-10-11 16:55:21,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 97058816. Throughput: 0: 1661.4, 1: 1679.7. Samples: 24272586. 
Policy #0 lag: (min: 31.0, avg: 34.4, max: 63.0) +[2023-10-11 16:55:21,063][84230] Avg episode reward: [(0, '7.450'), (1, '33.970')] +[2023-10-11 16:55:21,141][85176] Updated weights for policy 0, policy_version 47072 (0.0007) +[2023-10-11 16:55:22,642][85175] Updated weights for policy 1, policy_version 47750 (0.0008) +[2023-10-11 16:55:23,015][85175] Updated weights for policy 1, policy_version 47760 (0.0008) +[2023-10-11 16:55:23,390][85175] Updated weights for policy 1, policy_version 47770 (0.0007) +[2023-10-11 16:55:25,126][85176] Updated weights for policy 0, policy_version 47082 (0.0007) +[2023-10-11 16:55:25,493][85176] Updated weights for policy 0, policy_version 47092 (0.0008) +[2023-10-11 16:55:25,872][85176] Updated weights for policy 0, policy_version 47102 (0.0009) +[2023-10-11 16:55:26,063][84230] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.5). Total num frames: 97157120. Throughput: 0: 1663.8, 1: 1691.5. Samples: 24293312. Policy #0 lag: (min: 31.0, avg: 34.4, max: 63.0) +[2023-10-11 16:55:26,064][84230] Avg episode reward: [(0, '7.450'), (1, '32.760')] +[2023-10-11 16:55:27,491][85175] Updated weights for policy 1, policy_version 47780 (0.0009) +[2023-10-11 16:55:27,860][85175] Updated weights for policy 1, policy_version 47790 (0.0007) +[2023-10-11 16:55:28,232][85175] Updated weights for policy 1, policy_version 47800 (0.0008) +[2023-10-11 16:55:30,021][85176] Updated weights for policy 0, policy_version 47112 (0.0011) +[2023-10-11 16:55:30,388][85176] Updated weights for policy 0, policy_version 47122 (0.0010) +[2023-10-11 16:55:30,769][85176] Updated weights for policy 0, policy_version 47132 (0.0009) +[2023-10-11 16:55:31,063][84230] Fps is (10 sec: 16383.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 97222656. Throughput: 0: 1647.2, 1: 1714.4. Samples: 24313374. 
Policy #0 lag: (min: 31.0, avg: 34.4, max: 63.0) +[2023-10-11 16:55:31,064][84230] Avg episode reward: [(0, '7.900'), (1, '30.900')] +[2023-10-11 16:55:32,258][85175] Updated weights for policy 1, policy_version 47810 (0.0007) +[2023-10-11 16:55:32,631][85175] Updated weights for policy 1, policy_version 47820 (0.0010) +[2023-10-11 16:55:33,004][85175] Updated weights for policy 1, policy_version 47830 (0.0007) +[2023-10-11 16:55:33,375][85175] Updated weights for policy 1, policy_version 47840 (0.0008) +[2023-10-11 16:55:34,710][85176] Updated weights for policy 0, policy_version 47142 (0.0011) +[2023-10-11 16:55:35,085][85176] Updated weights for policy 0, policy_version 47152 (0.0008) +[2023-10-11 16:55:35,455][85176] Updated weights for policy 0, policy_version 47162 (0.0008) +[2023-10-11 16:55:36,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 97288192. Throughput: 0: 1669.0, 1: 1685.5. Samples: 24323530. Policy #0 lag: (min: 31.0, avg: 34.4, max: 63.0) +[2023-10-11 16:55:36,064][84230] Avg episode reward: [(0, '7.600'), (1, '32.640')] +[2023-10-11 16:55:37,508][85175] Updated weights for policy 1, policy_version 47850 (0.0008) +[2023-10-11 16:55:37,872][85175] Updated weights for policy 1, policy_version 47860 (0.0010) +[2023-10-11 16:55:38,238][85175] Updated weights for policy 1, policy_version 47870 (0.0011) +[2023-10-11 16:55:39,610][85176] Updated weights for policy 0, policy_version 47172 (0.0011) +[2023-10-11 16:55:39,986][85176] Updated weights for policy 0, policy_version 47182 (0.0011) +[2023-10-11 16:55:40,357][85176] Updated weights for policy 0, policy_version 47192 (0.0010) +[2023-10-11 16:55:41,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 97353728. Throughput: 0: 1671.0, 1: 1702.4. Samples: 24343828. 
Policy #0 lag: (min: 31.0, avg: 34.4, max: 63.0) +[2023-10-11 16:55:41,064][84230] Avg episode reward: [(0, '8.020'), (1, '32.120')] +[2023-10-11 16:55:42,189][85175] Updated weights for policy 1, policy_version 47880 (0.0010) +[2023-10-11 16:55:42,543][85175] Updated weights for policy 1, policy_version 47890 (0.0010) +[2023-10-11 16:55:42,912][85175] Updated weights for policy 1, policy_version 47900 (0.0009) +[2023-10-11 16:55:44,724][85176] Updated weights for policy 0, policy_version 47202 (0.0008) +[2023-10-11 16:55:45,106][85176] Updated weights for policy 0, policy_version 47212 (0.0008) +[2023-10-11 16:55:45,489][85176] Updated weights for policy 0, policy_version 47222 (0.0008) +[2023-10-11 16:55:45,852][85176] Updated weights for policy 0, policy_version 47232 (0.0009) +[2023-10-11 16:55:46,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 97419264. Throughput: 0: 1659.8, 1: 1713.5. Samples: 24363842. Policy #0 lag: (min: 31.0, avg: 34.4, max: 63.0) +[2023-10-11 16:55:46,064][84230] Avg episode reward: [(0, '7.870'), (1, '37.520')] +[2023-10-11 16:55:46,074][85000] Saving new best policy, reward=37.520! +[2023-10-11 16:55:47,032][85175] Updated weights for policy 1, policy_version 47910 (0.0008) +[2023-10-11 16:55:47,408][85175] Updated weights for policy 1, policy_version 47920 (0.0008) +[2023-10-11 16:55:47,773][85175] Updated weights for policy 1, policy_version 47930 (0.0010) +[2023-10-11 16:55:50,051][85176] Updated weights for policy 0, policy_version 47242 (0.0008) +[2023-10-11 16:55:50,419][85176] Updated weights for policy 0, policy_version 47252 (0.0007) +[2023-10-11 16:55:50,793][85176] Updated weights for policy 0, policy_version 47262 (0.0009) +[2023-10-11 16:55:51,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 97484800. Throughput: 0: 1679.1, 1: 1690.6. Samples: 24373944. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-11 16:55:51,063][84230] Avg episode reward: [(0, '7.600'), (1, '37.260')] +[2023-10-11 16:55:51,773][85175] Updated weights for policy 1, policy_version 47940 (0.0009) +[2023-10-11 16:55:52,142][85175] Updated weights for policy 1, policy_version 47950 (0.0008) +[2023-10-11 16:55:52,503][85175] Updated weights for policy 1, policy_version 47960 (0.0008) +[2023-10-11 16:55:54,752][85176] Updated weights for policy 0, policy_version 47272 (0.0010) +[2023-10-11 16:55:55,124][85176] Updated weights for policy 0, policy_version 47282 (0.0008) +[2023-10-11 16:55:55,504][85176] Updated weights for policy 0, policy_version 47292 (0.0008) +[2023-10-11 16:55:56,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 97550336. Throughput: 0: 1676.4, 1: 1718.7. Samples: 24394776. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-11 16:55:56,064][84230] Avg episode reward: [(0, '7.530'), (1, '35.720')] +[2023-10-11 16:55:56,442][85175] Updated weights for policy 1, policy_version 47970 (0.0008) +[2023-10-11 16:55:56,803][85175] Updated weights for policy 1, policy_version 47980 (0.0008) +[2023-10-11 16:55:57,171][85175] Updated weights for policy 1, policy_version 47990 (0.0009) +[2023-10-11 16:55:57,538][85175] Updated weights for policy 1, policy_version 48000 (0.0008) +[2023-10-11 16:55:59,399][85176] Updated weights for policy 0, policy_version 47302 (0.0008) +[2023-10-11 16:55:59,772][85176] Updated weights for policy 0, policy_version 47312 (0.0008) +[2023-10-11 16:56:00,145][85176] Updated weights for policy 0, policy_version 47322 (0.0009) +[2023-10-11 16:56:01,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 97615872. Throughput: 0: 1662.9, 1: 1715.3. Samples: 24414690. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-11 16:56:01,063][84230] Avg episode reward: [(0, '7.350'), (1, '35.340')] +[2023-10-11 16:56:01,540][85175] Updated weights for policy 1, policy_version 48010 (0.0007) +[2023-10-11 16:56:01,911][85175] Updated weights for policy 1, policy_version 48020 (0.0007) +[2023-10-11 16:56:02,276][85175] Updated weights for policy 1, policy_version 48030 (0.0008) +[2023-10-11 16:56:04,272][85176] Updated weights for policy 0, policy_version 47332 (0.0008) +[2023-10-11 16:56:04,658][85176] Updated weights for policy 0, policy_version 47342 (0.0008) +[2023-10-11 16:56:05,029][85176] Updated weights for policy 0, policy_version 47352 (0.0008) +[2023-10-11 16:56:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 97681408. Throughput: 0: 1678.7, 1: 1710.0. Samples: 24425078. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-11 16:56:06,063][84230] Avg episode reward: [(0, '7.640'), (1, '31.620')] +[2023-10-11 16:56:06,292][85175] Updated weights for policy 1, policy_version 48040 (0.0009) +[2023-10-11 16:56:06,663][85175] Updated weights for policy 1, policy_version 48050 (0.0008) +[2023-10-11 16:56:07,022][85175] Updated weights for policy 1, policy_version 48060 (0.0008) +[2023-10-11 16:56:09,258][85176] Updated weights for policy 0, policy_version 47362 (0.0008) +[2023-10-11 16:56:09,638][85176] Updated weights for policy 0, policy_version 47372 (0.0009) +[2023-10-11 16:56:10,007][85176] Updated weights for policy 0, policy_version 47382 (0.0008) +[2023-10-11 16:56:10,368][85176] Updated weights for policy 0, policy_version 47392 (0.0010) +[2023-10-11 16:56:10,985][85175] Updated weights for policy 1, policy_version 48070 (0.0008) +[2023-10-11 16:56:11,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 97746944. Throughput: 0: 1664.4, 1: 1712.9. Samples: 24445292. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-11 16:56:11,063][84230] Avg episode reward: [(0, '8.020'), (1, '36.560')] +[2023-10-11 16:56:11,346][85175] Updated weights for policy 1, policy_version 48080 (0.0009) +[2023-10-11 16:56:11,712][85175] Updated weights for policy 1, policy_version 48090 (0.0010) +[2023-10-11 16:56:14,374][85176] Updated weights for policy 0, policy_version 47402 (0.0009) +[2023-10-11 16:56:14,749][85176] Updated weights for policy 0, policy_version 47412 (0.0008) +[2023-10-11 16:56:15,115][85176] Updated weights for policy 0, policy_version 47422 (0.0009) +[2023-10-11 16:56:15,765][85175] Updated weights for policy 1, policy_version 48100 (0.0007) +[2023-10-11 16:56:16,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 97812480. Throughput: 0: 1664.7, 1: 1706.6. Samples: 24465082. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-11 16:56:16,063][84230] Avg episode reward: [(0, '7.690'), (1, '33.140')] +[2023-10-11 16:56:16,070][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000047424_48562176.pth... +[2023-10-11 16:56:16,112][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000045856_46956544.pth +[2023-10-11 16:56:16,130][85175] Updated weights for policy 1, policy_version 48110 (0.0007) +[2023-10-11 16:56:16,495][85175] Updated weights for policy 1, policy_version 48120 (0.0011) +[2023-10-11 16:56:16,783][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000048128_49283072.pth... 
+[2023-10-11 16:56:16,822][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000046528_47644672.pth +[2023-10-11 16:56:18,962][85176] Updated weights for policy 0, policy_version 47432 (0.0007) +[2023-10-11 16:56:19,338][85176] Updated weights for policy 0, policy_version 47442 (0.0010) +[2023-10-11 16:56:19,713][85176] Updated weights for policy 0, policy_version 47452 (0.0007) +[2023-10-11 16:56:20,448][85175] Updated weights for policy 1, policy_version 48130 (0.0010) +[2023-10-11 16:56:20,811][85175] Updated weights for policy 1, policy_version 48140 (0.0007) +[2023-10-11 16:56:21,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 97878016. Throughput: 0: 1671.7, 1: 1705.3. Samples: 24475492. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) +[2023-10-11 16:56:21,063][84230] Avg episode reward: [(0, '7.220'), (1, '35.920')] +[2023-10-11 16:56:21,183][85175] Updated weights for policy 1, policy_version 48150 (0.0009) +[2023-10-11 16:56:21,549][85175] Updated weights for policy 1, policy_version 48160 (0.0008) +[2023-10-11 16:56:23,780][85176] Updated weights for policy 0, policy_version 47462 (0.0007) +[2023-10-11 16:56:24,157][85176] Updated weights for policy 0, policy_version 47472 (0.0007) +[2023-10-11 16:56:24,527][85176] Updated weights for policy 0, policy_version 47482 (0.0009) +[2023-10-11 16:56:25,689][85175] Updated weights for policy 1, policy_version 48170 (0.0009) +[2023-10-11 16:56:26,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 97943552. Throughput: 0: 1653.3, 1: 1715.3. Samples: 24495414. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-11 16:56:26,063][84230] Avg episode reward: [(0, '7.450'), (1, '32.080')] +[2023-10-11 16:56:26,065][85175] Updated weights for policy 1, policy_version 48180 (0.0008) +[2023-10-11 16:56:26,429][85175] Updated weights for policy 1, policy_version 48190 (0.0007) +[2023-10-11 16:56:28,734][85176] Updated weights for policy 0, policy_version 47492 (0.0010) +[2023-10-11 16:56:29,118][85176] Updated weights for policy 0, policy_version 47502 (0.0010) +[2023-10-11 16:56:29,481][85176] Updated weights for policy 0, policy_version 47512 (0.0008) +[2023-10-11 16:56:30,496][85175] Updated weights for policy 1, policy_version 48200 (0.0007) +[2023-10-11 16:56:30,865][85175] Updated weights for policy 1, policy_version 48210 (0.0007) +[2023-10-11 16:56:31,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 98009088. Throughput: 0: 1664.2, 1: 1700.8. Samples: 24515270. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-11 16:56:31,064][84230] Avg episode reward: [(0, '7.900'), (1, '34.670')] +[2023-10-11 16:56:31,241][85175] Updated weights for policy 1, policy_version 48220 (0.0009) +[2023-10-11 16:56:33,819][85176] Updated weights for policy 0, policy_version 47522 (0.0007) +[2023-10-11 16:56:34,218][85176] Updated weights for policy 0, policy_version 47532 (0.0008) +[2023-10-11 16:56:34,585][85176] Updated weights for policy 0, policy_version 47542 (0.0007) +[2023-10-11 16:56:34,960][85176] Updated weights for policy 0, policy_version 47552 (0.0010) +[2023-10-11 16:56:35,252][85175] Updated weights for policy 1, policy_version 48230 (0.0011) +[2023-10-11 16:56:35,634][85175] Updated weights for policy 1, policy_version 48240 (0.0007) +[2023-10-11 16:56:35,993][85175] Updated weights for policy 1, policy_version 48250 (0.0007) +[2023-10-11 16:56:36,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 98074624. 
Throughput: 0: 1675.2, 1: 1703.0. Samples: 24525962. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-11 16:56:36,063][84230] Avg episode reward: [(0, '8.010'), (1, '29.910')] +[2023-10-11 16:56:39,028][85176] Updated weights for policy 0, policy_version 47562 (0.0010) +[2023-10-11 16:56:39,400][85176] Updated weights for policy 0, policy_version 47572 (0.0010) +[2023-10-11 16:56:39,769][85176] Updated weights for policy 0, policy_version 47582 (0.0010) +[2023-10-11 16:56:40,090][85175] Updated weights for policy 1, policy_version 48260 (0.0007) +[2023-10-11 16:56:40,450][85175] Updated weights for policy 1, policy_version 48270 (0.0008) +[2023-10-11 16:56:40,822][85175] Updated weights for policy 1, policy_version 48280 (0.0007) +[2023-10-11 16:56:41,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 98140160. Throughput: 0: 1653.4, 1: 1702.3. Samples: 24545784. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-11 16:56:41,064][84230] Avg episode reward: [(0, '7.710'), (1, '33.660')] +[2023-10-11 16:56:43,700][85176] Updated weights for policy 0, policy_version 47592 (0.0008) +[2023-10-11 16:56:44,078][85176] Updated weights for policy 0, policy_version 47602 (0.0010) +[2023-10-11 16:56:44,461][85176] Updated weights for policy 0, policy_version 47612 (0.0008) +[2023-10-11 16:56:44,769][85175] Updated weights for policy 1, policy_version 48290 (0.0007) +[2023-10-11 16:56:45,143][85175] Updated weights for policy 1, policy_version 48300 (0.0007) +[2023-10-11 16:56:45,499][85175] Updated weights for policy 1, policy_version 48310 (0.0008) +[2023-10-11 16:56:45,863][85175] Updated weights for policy 1, policy_version 48320 (0.0008) +[2023-10-11 16:56:46,063][84230] Fps is (10 sec: 16383.8, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 98238464. Throughput: 0: 1668.4, 1: 1680.8. Samples: 24565404. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-11 16:56:46,063][84230] Avg episode reward: [(0, '7.600'), (1, '29.470')] +[2023-10-11 16:56:48,430][85176] Updated weights for policy 0, policy_version 47622 (0.0007) +[2023-10-11 16:56:48,794][85176] Updated weights for policy 0, policy_version 47632 (0.0008) +[2023-10-11 16:56:49,171][85176] Updated weights for policy 0, policy_version 47642 (0.0007) +[2023-10-11 16:56:49,844][85175] Updated weights for policy 1, policy_version 48330 (0.0007) +[2023-10-11 16:56:50,202][85175] Updated weights for policy 1, policy_version 48340 (0.0009) +[2023-10-11 16:56:50,569][85175] Updated weights for policy 1, policy_version 48350 (0.0009) +[2023-10-11 16:56:51,063][84230] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 98304000. Throughput: 0: 1663.5, 1: 1698.4. Samples: 24576362. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-11 16:56:51,064][84230] Avg episode reward: [(0, '7.150'), (1, '35.660')] +[2023-10-11 16:56:53,413][85176] Updated weights for policy 0, policy_version 47652 (0.0009) +[2023-10-11 16:56:53,783][85176] Updated weights for policy 0, policy_version 47662 (0.0010) +[2023-10-11 16:56:54,159][85176] Updated weights for policy 0, policy_version 47672 (0.0011) +[2023-10-11 16:56:54,687][85175] Updated weights for policy 1, policy_version 48360 (0.0010) +[2023-10-11 16:56:55,055][85175] Updated weights for policy 1, policy_version 48370 (0.0007) +[2023-10-11 16:56:55,435][85175] Updated weights for policy 1, policy_version 48380 (0.0007) +[2023-10-11 16:56:56,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 98369536. Throughput: 0: 1658.2, 1: 1693.2. Samples: 24596102. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-11 16:56:56,063][84230] Avg episode reward: [(0, '7.300'), (1, '33.710')] +[2023-10-11 16:56:58,357][85176] Updated weights for policy 0, policy_version 47682 (0.0008) +[2023-10-11 16:56:58,731][85176] Updated weights for policy 0, policy_version 47692 (0.0009) +[2023-10-11 16:56:59,099][85176] Updated weights for policy 0, policy_version 47702 (0.0009) +[2023-10-11 16:56:59,237][85175] Updated weights for policy 1, policy_version 48390 (0.0007) +[2023-10-11 16:56:59,468][85176] Updated weights for policy 0, policy_version 47712 (0.0009) +[2023-10-11 16:56:59,597][85175] Updated weights for policy 1, policy_version 48400 (0.0007) +[2023-10-11 16:56:59,978][85175] Updated weights for policy 1, policy_version 48410 (0.0007) +[2023-10-11 16:57:01,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 98435072. Throughput: 0: 1678.6, 1: 1675.2. Samples: 24616002. Policy #0 lag: (min: 9.0, avg: 21.4, max: 41.0) +[2023-10-11 16:57:01,063][84230] Avg episode reward: [(0, '7.570'), (1, '39.380')] +[2023-10-11 16:57:01,070][85000] Saving new best policy, reward=39.380! +[2023-10-11 16:57:03,511][85176] Updated weights for policy 0, policy_version 47722 (0.0010) +[2023-10-11 16:57:03,884][85176] Updated weights for policy 0, policy_version 47732 (0.0007) +[2023-10-11 16:57:03,947][85175] Updated weights for policy 1, policy_version 48420 (0.0007) +[2023-10-11 16:57:04,259][85176] Updated weights for policy 0, policy_version 47742 (0.0009) +[2023-10-11 16:57:04,309][85175] Updated weights for policy 1, policy_version 48430 (0.0008) +[2023-10-11 16:57:04,681][85175] Updated weights for policy 1, policy_version 48440 (0.0009) +[2023-10-11 16:57:06,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 98500608. Throughput: 0: 1665.7, 1: 1707.6. Samples: 24627292. 
Policy #0 lag: (min: 9.0, avg: 21.4, max: 41.0) +[2023-10-11 16:57:06,064][84230] Avg episode reward: [(0, '8.140'), (1, '34.090')] +[2023-10-11 16:57:08,153][85176] Updated weights for policy 0, policy_version 47752 (0.0008) +[2023-10-11 16:57:08,521][85176] Updated weights for policy 0, policy_version 47762 (0.0010) +[2023-10-11 16:57:08,709][85175] Updated weights for policy 1, policy_version 48450 (0.0009) +[2023-10-11 16:57:08,900][85176] Updated weights for policy 0, policy_version 47772 (0.0009) +[2023-10-11 16:57:09,071][85175] Updated weights for policy 1, policy_version 48460 (0.0009) +[2023-10-11 16:57:09,446][85175] Updated weights for policy 1, policy_version 48470 (0.0010) +[2023-10-11 16:57:09,802][85175] Updated weights for policy 1, policy_version 48480 (0.0010) +[2023-10-11 16:57:11,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 98566144. Throughput: 0: 1670.6, 1: 1687.5. Samples: 24646530. Policy #0 lag: (min: 9.0, avg: 21.4, max: 41.0) +[2023-10-11 16:57:11,064][84230] Avg episode reward: [(0, '8.020'), (1, '38.250')] +[2023-10-11 16:57:13,001][85176] Updated weights for policy 0, policy_version 47782 (0.0008) +[2023-10-11 16:57:13,361][85176] Updated weights for policy 0, policy_version 47792 (0.0009) +[2023-10-11 16:57:13,739][85176] Updated weights for policy 0, policy_version 47802 (0.0010) +[2023-10-11 16:57:13,990][85175] Updated weights for policy 1, policy_version 48490 (0.0008) +[2023-10-11 16:57:14,363][85175] Updated weights for policy 1, policy_version 48500 (0.0007) +[2023-10-11 16:57:14,724][85175] Updated weights for policy 1, policy_version 48510 (0.0009) +[2023-10-11 16:57:16,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 98631680. Throughput: 0: 1677.4, 1: 1687.6. Samples: 24666696. 
Policy #0 lag: (min: 9.0, avg: 21.4, max: 41.0) +[2023-10-11 16:57:16,063][84230] Avg episode reward: [(0, '7.560'), (1, '31.950')] +[2023-10-11 16:57:17,913][85176] Updated weights for policy 0, policy_version 47812 (0.0009) +[2023-10-11 16:57:18,285][85176] Updated weights for policy 0, policy_version 47822 (0.0007) +[2023-10-11 16:57:18,661][85176] Updated weights for policy 0, policy_version 47832 (0.0007) +[2023-10-11 16:57:18,821][85175] Updated weights for policy 1, policy_version 48520 (0.0008) +[2023-10-11 16:57:19,187][85175] Updated weights for policy 1, policy_version 48530 (0.0007) +[2023-10-11 16:57:19,550][85175] Updated weights for policy 1, policy_version 48540 (0.0008) +[2023-10-11 16:57:21,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 98697216. Throughput: 0: 1656.8, 1: 1709.9. Samples: 24677468. Policy #0 lag: (min: 9.0, avg: 21.4, max: 41.0) +[2023-10-11 16:57:21,064][84230] Avg episode reward: [(0, '7.560'), (1, '36.370')] +[2023-10-11 16:57:22,775][85176] Updated weights for policy 0, policy_version 47842 (0.0009) +[2023-10-11 16:57:23,148][85176] Updated weights for policy 0, policy_version 47852 (0.0008) +[2023-10-11 16:57:23,520][85176] Updated weights for policy 0, policy_version 47862 (0.0009) +[2023-10-11 16:57:23,653][85175] Updated weights for policy 1, policy_version 48550 (0.0008) +[2023-10-11 16:57:23,882][85176] Updated weights for policy 0, policy_version 47872 (0.0007) +[2023-10-11 16:57:24,012][85175] Updated weights for policy 1, policy_version 48560 (0.0008) +[2023-10-11 16:57:24,384][85175] Updated weights for policy 1, policy_version 48570 (0.0009) +[2023-10-11 16:57:26,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 98762752. Throughput: 0: 1671.9, 1: 1680.0. Samples: 24696618. 
Policy #0 lag: (min: 9.0, avg: 21.4, max: 41.0) +[2023-10-11 16:57:26,063][84230] Avg episode reward: [(0, '7.450'), (1, '34.400')] +[2023-10-11 16:57:28,001][85176] Updated weights for policy 0, policy_version 47882 (0.0007) +[2023-10-11 16:57:28,155][85175] Updated weights for policy 1, policy_version 48580 (0.0009) +[2023-10-11 16:57:28,366][85176] Updated weights for policy 0, policy_version 47892 (0.0008) +[2023-10-11 16:57:28,518][85175] Updated weights for policy 1, policy_version 48590 (0.0008) +[2023-10-11 16:57:28,743][85176] Updated weights for policy 0, policy_version 47902 (0.0008) +[2023-10-11 16:57:28,888][85175] Updated weights for policy 1, policy_version 48600 (0.0007) +[2023-10-11 16:57:31,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 98828288. Throughput: 0: 1679.2, 1: 1698.1. Samples: 24717382. Policy #0 lag: (min: 9.0, avg: 21.4, max: 41.0) +[2023-10-11 16:57:31,063][84230] Avg episode reward: [(0, '7.750'), (1, '37.400')] +[2023-10-11 16:57:32,667][85176] Updated weights for policy 0, policy_version 47912 (0.0008) +[2023-10-11 16:57:32,953][85175] Updated weights for policy 1, policy_version 48610 (0.0008) +[2023-10-11 16:57:33,035][85176] Updated weights for policy 0, policy_version 47922 (0.0008) +[2023-10-11 16:57:33,324][85175] Updated weights for policy 1, policy_version 48620 (0.0008) +[2023-10-11 16:57:33,409][85176] Updated weights for policy 0, policy_version 47932 (0.0007) +[2023-10-11 16:57:33,678][85175] Updated weights for policy 1, policy_version 48630 (0.0008) +[2023-10-11 16:57:34,049][85175] Updated weights for policy 1, policy_version 48640 (0.0009) +[2023-10-11 16:57:36,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 98893824. Throughput: 0: 1657.7, 1: 1691.1. Samples: 24727056. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:57:36,063][84230] Avg episode reward: [(0, '7.870'), (1, '33.310')] +[2023-10-11 16:57:37,595][85176] Updated weights for policy 0, policy_version 47942 (0.0008) +[2023-10-11 16:57:37,982][85176] Updated weights for policy 0, policy_version 47952 (0.0009) +[2023-10-11 16:57:38,093][85175] Updated weights for policy 1, policy_version 48650 (0.0008) +[2023-10-11 16:57:38,354][85176] Updated weights for policy 0, policy_version 47962 (0.0008) +[2023-10-11 16:57:38,466][85175] Updated weights for policy 1, policy_version 48660 (0.0008) +[2023-10-11 16:57:38,833][85175] Updated weights for policy 1, policy_version 48670 (0.0008) +[2023-10-11 16:57:41,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 98959360. Throughput: 0: 1672.6, 1: 1677.9. Samples: 24746872. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:57:41,064][84230] Avg episode reward: [(0, '7.720'), (1, '34.590')] +[2023-10-11 16:57:42,497][85176] Updated weights for policy 0, policy_version 47972 (0.0007) +[2023-10-11 16:57:42,863][85176] Updated weights for policy 0, policy_version 47982 (0.0008) +[2023-10-11 16:57:42,911][85175] Updated weights for policy 1, policy_version 48680 (0.0008) +[2023-10-11 16:57:43,242][85176] Updated weights for policy 0, policy_version 47992 (0.0009) +[2023-10-11 16:57:43,281][85175] Updated weights for policy 1, policy_version 48690 (0.0008) +[2023-10-11 16:57:43,651][85175] Updated weights for policy 1, policy_version 48700 (0.0008) +[2023-10-11 16:57:46,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 99024896. Throughput: 0: 1668.4, 1: 1703.9. Samples: 24767758. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:57:46,063][84230] Avg episode reward: [(0, '8.050'), (1, '35.460')] +[2023-10-11 16:57:47,384][85176] Updated weights for policy 0, policy_version 48002 (0.0009) +[2023-10-11 16:57:47,649][85175] Updated weights for policy 1, policy_version 48710 (0.0007) +[2023-10-11 16:57:47,764][85176] Updated weights for policy 0, policy_version 48012 (0.0008) +[2023-10-11 16:57:48,013][85175] Updated weights for policy 1, policy_version 48720 (0.0008) +[2023-10-11 16:57:48,134][85176] Updated weights for policy 0, policy_version 48022 (0.0009) +[2023-10-11 16:57:48,373][85175] Updated weights for policy 1, policy_version 48730 (0.0007) +[2023-10-11 16:57:48,503][85176] Updated weights for policy 0, policy_version 48032 (0.0008) +[2023-10-11 16:57:51,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 99090432. Throughput: 0: 1648.6, 1: 1676.0. Samples: 24776896. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:57:51,063][84230] Avg episode reward: [(0, '7.600'), (1, '35.270')] +[2023-10-11 16:57:52,548][85175] Updated weights for policy 1, policy_version 48740 (0.0010) +[2023-10-11 16:57:52,641][85176] Updated weights for policy 0, policy_version 48042 (0.0009) +[2023-10-11 16:57:52,905][85175] Updated weights for policy 1, policy_version 48750 (0.0008) +[2023-10-11 16:57:53,013][85176] Updated weights for policy 0, policy_version 48052 (0.0007) +[2023-10-11 16:57:53,274][85175] Updated weights for policy 1, policy_version 48760 (0.0009) +[2023-10-11 16:57:53,385][85176] Updated weights for policy 0, policy_version 48062 (0.0009) +[2023-10-11 16:57:56,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 99155968. Throughput: 0: 1664.7, 1: 1682.3. Samples: 24797144. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:57:56,064][84230] Avg episode reward: [(0, '7.100'), (1, '35.940')] +[2023-10-11 16:57:57,353][85175] Updated weights for policy 1, policy_version 48770 (0.0008) +[2023-10-11 16:57:57,471][85176] Updated weights for policy 0, policy_version 48072 (0.0010) +[2023-10-11 16:57:57,719][85175] Updated weights for policy 1, policy_version 48780 (0.0008) +[2023-10-11 16:57:57,844][85176] Updated weights for policy 0, policy_version 48082 (0.0008) +[2023-10-11 16:57:58,087][85175] Updated weights for policy 1, policy_version 48790 (0.0007) +[2023-10-11 16:57:58,208][85176] Updated weights for policy 0, policy_version 48092 (0.0008) +[2023-10-11 16:57:58,446][85175] Updated weights for policy 1, policy_version 48800 (0.0007) +[2023-10-11 16:58:01,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 99221504. Throughput: 0: 1669.5, 1: 1693.5. Samples: 24818030. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:58:01,063][84230] Avg episode reward: [(0, '7.550'), (1, '33.880')] +[2023-10-11 16:58:02,303][85176] Updated weights for policy 0, policy_version 48102 (0.0007) +[2023-10-11 16:58:02,584][85175] Updated weights for policy 1, policy_version 48810 (0.0009) +[2023-10-11 16:58:02,663][85176] Updated weights for policy 0, policy_version 48112 (0.0009) +[2023-10-11 16:58:02,953][85175] Updated weights for policy 1, policy_version 48820 (0.0007) +[2023-10-11 16:58:03,041][85176] Updated weights for policy 0, policy_version 48122 (0.0009) +[2023-10-11 16:58:03,322][85175] Updated weights for policy 1, policy_version 48830 (0.0007) +[2023-10-11 16:58:06,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 99287040. Throughput: 0: 1659.6, 1: 1665.0. Samples: 24827072. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:58:06,063][84230] Avg episode reward: [(0, '7.900'), (1, '33.780')] +[2023-10-11 16:58:07,170][85175] Updated weights for policy 1, policy_version 48840 (0.0008) +[2023-10-11 16:58:07,413][85176] Updated weights for policy 0, policy_version 48132 (0.0007) +[2023-10-11 16:58:07,542][85175] Updated weights for policy 1, policy_version 48850 (0.0007) +[2023-10-11 16:58:07,798][85176] Updated weights for policy 0, policy_version 48142 (0.0007) +[2023-10-11 16:58:07,909][85175] Updated weights for policy 1, policy_version 48860 (0.0007) +[2023-10-11 16:58:08,168][85176] Updated weights for policy 0, policy_version 48152 (0.0008) +[2023-10-11 16:58:11,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 99352576. Throughput: 0: 1659.6, 1: 1697.0. Samples: 24847664. Policy #0 lag: (min: 10.0, avg: 12.8, max: 42.0) +[2023-10-11 16:58:11,064][84230] Avg episode reward: [(0, '8.170'), (1, '33.030')] +[2023-10-11 16:58:11,932][85175] Updated weights for policy 1, policy_version 48870 (0.0008) +[2023-10-11 16:58:12,124][85176] Updated weights for policy 0, policy_version 48162 (0.0008) +[2023-10-11 16:58:12,298][85175] Updated weights for policy 1, policy_version 48880 (0.0007) +[2023-10-11 16:58:12,491][85176] Updated weights for policy 0, policy_version 48172 (0.0007) +[2023-10-11 16:58:12,664][85175] Updated weights for policy 1, policy_version 48890 (0.0007) +[2023-10-11 16:58:12,862][85176] Updated weights for policy 0, policy_version 48182 (0.0008) +[2023-10-11 16:58:13,237][85176] Updated weights for policy 0, policy_version 48192 (0.0007) +[2023-10-11 16:58:16,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 99418112. Throughput: 0: 1662.1, 1: 1700.0. Samples: 24868678. 
Policy #0 lag: (min: 10.0, avg: 12.8, max: 42.0) +[2023-10-11 16:58:16,063][84230] Avg episode reward: [(0, '8.620'), (1, '37.080')] +[2023-10-11 16:58:16,074][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000048896_50069504.pth... +[2023-10-11 16:58:16,074][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000048192_49348608.pth... +[2023-10-11 16:58:16,108][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000047328_48463872.pth +[2023-10-11 16:58:16,109][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000046656_47775744.pth +[2023-10-11 16:58:16,653][85175] Updated weights for policy 1, policy_version 48900 (0.0008) +[2023-10-11 16:58:17,029][85175] Updated weights for policy 1, policy_version 48910 (0.0009) +[2023-10-11 16:58:17,246][85176] Updated weights for policy 0, policy_version 48202 (0.0008) +[2023-10-11 16:58:17,393][85175] Updated weights for policy 1, policy_version 48920 (0.0008) +[2023-10-11 16:58:17,611][85176] Updated weights for policy 0, policy_version 48212 (0.0008) +[2023-10-11 16:58:17,987][85176] Updated weights for policy 0, policy_version 48222 (0.0009) +[2023-10-11 16:58:21,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 99483648. Throughput: 0: 1660.3, 1: 1691.5. Samples: 24877888. 
Policy #0 lag: (min: 10.0, avg: 12.8, max: 42.0) +[2023-10-11 16:58:21,063][84230] Avg episode reward: [(0, '7.880'), (1, '35.640')] +[2023-10-11 16:58:21,414][85175] Updated weights for policy 1, policy_version 48930 (0.0009) +[2023-10-11 16:58:21,785][85175] Updated weights for policy 1, policy_version 48940 (0.0009) +[2023-10-11 16:58:22,154][85175] Updated weights for policy 1, policy_version 48950 (0.0008) +[2023-10-11 16:58:22,185][85176] Updated weights for policy 0, policy_version 48232 (0.0008) +[2023-10-11 16:58:22,520][85175] Updated weights for policy 1, policy_version 48960 (0.0009) +[2023-10-11 16:58:22,554][85176] Updated weights for policy 0, policy_version 48242 (0.0011) +[2023-10-11 16:58:22,929][85176] Updated weights for policy 0, policy_version 48252 (0.0008) +[2023-10-11 16:58:26,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 99549184. Throughput: 0: 1663.4, 1: 1706.9. Samples: 24898532. Policy #0 lag: (min: 10.0, avg: 12.8, max: 42.0) +[2023-10-11 16:58:26,063][84230] Avg episode reward: [(0, '6.700'), (1, '37.890')] +[2023-10-11 16:58:26,511][85175] Updated weights for policy 1, policy_version 48970 (0.0007) +[2023-10-11 16:58:26,869][85175] Updated weights for policy 1, policy_version 48980 (0.0008) +[2023-10-11 16:58:26,923][85176] Updated weights for policy 0, policy_version 48262 (0.0008) +[2023-10-11 16:58:27,232][85175] Updated weights for policy 1, policy_version 48990 (0.0008) +[2023-10-11 16:58:27,298][85176] Updated weights for policy 0, policy_version 48272 (0.0008) +[2023-10-11 16:58:27,675][85176] Updated weights for policy 0, policy_version 48282 (0.0009) +[2023-10-11 16:58:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 99614720. Throughput: 0: 1666.7, 1: 1706.4. Samples: 24919548. 
Policy #0 lag: (min: 10.0, avg: 12.8, max: 42.0) +[2023-10-11 16:58:31,063][84230] Avg episode reward: [(0, '6.780'), (1, '34.840')] +[2023-10-11 16:58:31,087][85175] Updated weights for policy 1, policy_version 49000 (0.0010) +[2023-10-11 16:58:31,459][85175] Updated weights for policy 1, policy_version 49010 (0.0008) +[2023-10-11 16:58:31,624][85176] Updated weights for policy 0, policy_version 48292 (0.0009) +[2023-10-11 16:58:31,822][85175] Updated weights for policy 1, policy_version 49020 (0.0008) +[2023-10-11 16:58:32,009][85176] Updated weights for policy 0, policy_version 48302 (0.0007) +[2023-10-11 16:58:32,380][85176] Updated weights for policy 0, policy_version 48312 (0.0007) +[2023-10-11 16:58:35,794][85175] Updated weights for policy 1, policy_version 49030 (0.0007) +[2023-10-11 16:58:36,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 99680256. Throughput: 0: 1667.9, 1: 1704.8. Samples: 24928666. Policy #0 lag: (min: 10.0, avg: 12.8, max: 42.0) +[2023-10-11 16:58:36,063][84230] Avg episode reward: [(0, '7.620'), (1, '37.800')] +[2023-10-11 16:58:36,164][85175] Updated weights for policy 1, policy_version 49040 (0.0008) +[2023-10-11 16:58:36,348][85176] Updated weights for policy 0, policy_version 48322 (0.0007) +[2023-10-11 16:58:36,542][85175] Updated weights for policy 1, policy_version 49050 (0.0008) +[2023-10-11 16:58:36,719][85176] Updated weights for policy 0, policy_version 48332 (0.0009) +[2023-10-11 16:58:37,100][85176] Updated weights for policy 0, policy_version 48342 (0.0009) +[2023-10-11 16:58:37,465][85176] Updated weights for policy 0, policy_version 48352 (0.0009) +[2023-10-11 16:58:40,521][85175] Updated weights for policy 1, policy_version 49060 (0.0008) +[2023-10-11 16:58:40,892][85175] Updated weights for policy 1, policy_version 49070 (0.0008) +[2023-10-11 16:58:41,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 99745792. 
Throughput: 0: 1666.0, 1: 1720.0. Samples: 24949512. Policy #0 lag: (min: 10.0, avg: 12.8, max: 42.0) +[2023-10-11 16:58:41,063][84230] Avg episode reward: [(0, '8.500'), (1, '35.870')] +[2023-10-11 16:58:41,262][85175] Updated weights for policy 1, policy_version 49080 (0.0008) +[2023-10-11 16:58:41,735][85176] Updated weights for policy 0, policy_version 48362 (0.0009) +[2023-10-11 16:58:42,108][85176] Updated weights for policy 0, policy_version 48372 (0.0009) +[2023-10-11 16:58:42,486][85176] Updated weights for policy 0, policy_version 48382 (0.0008) +[2023-10-11 16:58:45,265][85175] Updated weights for policy 1, policy_version 49090 (0.0009) +[2023-10-11 16:58:45,639][85175] Updated weights for policy 1, policy_version 49100 (0.0008) +[2023-10-11 16:58:46,018][85175] Updated weights for policy 1, policy_version 49110 (0.0007) +[2023-10-11 16:58:46,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 99811328. Throughput: 0: 1658.6, 1: 1710.3. Samples: 24969628. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:58:46,064][84230] Avg episode reward: [(0, '7.880'), (1, '36.830')] +[2023-10-11 16:58:46,381][85175] Updated weights for policy 1, policy_version 49120 (0.0009) +[2023-10-11 16:58:46,660][85176] Updated weights for policy 0, policy_version 48392 (0.0008) +[2023-10-11 16:58:47,035][85176] Updated weights for policy 0, policy_version 48402 (0.0008) +[2023-10-11 16:58:47,408][85176] Updated weights for policy 0, policy_version 48412 (0.0009) +[2023-10-11 16:58:50,615][85175] Updated weights for policy 1, policy_version 49130 (0.0009) +[2023-10-11 16:58:50,982][85175] Updated weights for policy 1, policy_version 49140 (0.0009) +[2023-10-11 16:58:51,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 99876864. Throughput: 0: 1657.3, 1: 1720.1. Samples: 24979056. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:58:51,063][84230] Avg episode reward: [(0, '7.510'), (1, '37.670')] +[2023-10-11 16:58:51,356][85175] Updated weights for policy 1, policy_version 49150 (0.0009) +[2023-10-11 16:58:51,649][85176] Updated weights for policy 0, policy_version 48422 (0.0010) +[2023-10-11 16:58:52,024][85176] Updated weights for policy 0, policy_version 48432 (0.0008) +[2023-10-11 16:58:52,397][85176] Updated weights for policy 0, policy_version 48442 (0.0007) +[2023-10-11 16:58:55,423][85175] Updated weights for policy 1, policy_version 49160 (0.0007) +[2023-10-11 16:58:55,789][85175] Updated weights for policy 1, policy_version 49170 (0.0009) +[2023-10-11 16:58:56,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 99942400. Throughput: 0: 1667.7, 1: 1707.3. Samples: 24999534. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:58:56,063][84230] Avg episode reward: [(0, '7.350'), (1, '34.650')] +[2023-10-11 16:58:56,156][85175] Updated weights for policy 1, policy_version 49180 (0.0008) +[2023-10-11 16:58:56,574][85176] Updated weights for policy 0, policy_version 48452 (0.0008) +[2023-10-11 16:58:56,959][85176] Updated weights for policy 0, policy_version 48462 (0.0008) +[2023-10-11 16:58:57,328][85176] Updated weights for policy 0, policy_version 48472 (0.0010) +[2023-10-11 16:59:00,170][85175] Updated weights for policy 1, policy_version 49190 (0.0007) +[2023-10-11 16:59:00,542][85175] Updated weights for policy 1, policy_version 49200 (0.0009) +[2023-10-11 16:59:00,912][85175] Updated weights for policy 1, policy_version 49210 (0.0009) +[2023-10-11 16:59:01,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 100007936. Throughput: 0: 1663.0, 1: 1688.6. Samples: 25019498. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:59:01,063][84230] Avg episode reward: [(0, '7.420'), (1, '35.720')] +[2023-10-11 16:59:01,424][85176] Updated weights for policy 0, policy_version 48482 (0.0007) +[2023-10-11 16:59:01,796][85176] Updated weights for policy 0, policy_version 48492 (0.0008) +[2023-10-11 16:59:02,165][85176] Updated weights for policy 0, policy_version 48502 (0.0009) +[2023-10-11 16:59:02,540][85176] Updated weights for policy 0, policy_version 48512 (0.0008) +[2023-10-11 16:59:05,038][85175] Updated weights for policy 1, policy_version 49220 (0.0007) +[2023-10-11 16:59:05,416][85175] Updated weights for policy 1, policy_version 49230 (0.0007) +[2023-10-11 16:59:05,791][85175] Updated weights for policy 1, policy_version 49240 (0.0008) +[2023-10-11 16:59:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 100073472. Throughput: 0: 1660.5, 1: 1699.1. Samples: 25029072. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:59:06,063][84230] Avg episode reward: [(0, '8.050'), (1, '33.590')] +[2023-10-11 16:59:06,719][85176] Updated weights for policy 0, policy_version 48522 (0.0008) +[2023-10-11 16:59:07,099][85176] Updated weights for policy 0, policy_version 48532 (0.0010) +[2023-10-11 16:59:07,481][85176] Updated weights for policy 0, policy_version 48542 (0.0007) +[2023-10-11 16:59:09,850][85175] Updated weights for policy 1, policy_version 49250 (0.0009) +[2023-10-11 16:59:10,213][85175] Updated weights for policy 1, policy_version 49260 (0.0007) +[2023-10-11 16:59:10,588][85175] Updated weights for policy 1, policy_version 49270 (0.0010) +[2023-10-11 16:59:10,959][85175] Updated weights for policy 1, policy_version 49280 (0.0007) +[2023-10-11 16:59:11,063][84230] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 100171776. Throughput: 0: 1660.8, 1: 1702.2. Samples: 25049868. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:59:11,064][84230] Avg episode reward: [(0, '8.320'), (1, '35.560')] +[2023-10-11 16:59:11,459][85176] Updated weights for policy 0, policy_version 48552 (0.0007) +[2023-10-11 16:59:11,831][85176] Updated weights for policy 0, policy_version 48562 (0.0007) +[2023-10-11 16:59:12,198][85176] Updated weights for policy 0, policy_version 48572 (0.0007) +[2023-10-11 16:59:15,051][85175] Updated weights for policy 1, policy_version 49290 (0.0007) +[2023-10-11 16:59:15,426][85175] Updated weights for policy 1, policy_version 49300 (0.0010) +[2023-10-11 16:59:15,791][85175] Updated weights for policy 1, policy_version 49310 (0.0008) +[2023-10-11 16:59:16,063][84230] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 100237312. Throughput: 0: 1661.4, 1: 1673.6. Samples: 25069624. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:59:16,063][84230] Avg episode reward: [(0, '7.570'), (1, '36.570')] +[2023-10-11 16:59:16,212][85176] Updated weights for policy 0, policy_version 48582 (0.0008) +[2023-10-11 16:59:16,585][85176] Updated weights for policy 0, policy_version 48592 (0.0009) +[2023-10-11 16:59:16,961][85176] Updated weights for policy 0, policy_version 48602 (0.0008) +[2023-10-11 16:59:19,740][85175] Updated weights for policy 1, policy_version 49320 (0.0008) +[2023-10-11 16:59:20,109][85175] Updated weights for policy 1, policy_version 49330 (0.0008) +[2023-10-11 16:59:20,472][85175] Updated weights for policy 1, policy_version 49340 (0.0010) +[2023-10-11 16:59:21,049][85176] Updated weights for policy 0, policy_version 48612 (0.0010) +[2023-10-11 16:59:21,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 100302848. Throughput: 0: 1664.6, 1: 1696.4. Samples: 25079914. 
Policy #0 lag: (min: 26.0, avg: 26.4, max: 40.0) +[2023-10-11 16:59:21,064][84230] Avg episode reward: [(0, '7.150'), (1, '38.320')] +[2023-10-11 16:59:21,426][85176] Updated weights for policy 0, policy_version 48622 (0.0007) +[2023-10-11 16:59:21,799][85176] Updated weights for policy 0, policy_version 48632 (0.0008) +[2023-10-11 16:59:24,453][85175] Updated weights for policy 1, policy_version 49350 (0.0009) +[2023-10-11 16:59:24,809][85175] Updated weights for policy 1, policy_version 49360 (0.0008) +[2023-10-11 16:59:25,180][85175] Updated weights for policy 1, policy_version 49370 (0.0009) +[2023-10-11 16:59:25,883][85176] Updated weights for policy 0, policy_version 48642 (0.0008) +[2023-10-11 16:59:26,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 100368384. Throughput: 0: 1668.7, 1: 1685.2. Samples: 25100438. Policy #0 lag: (min: 26.0, avg: 26.4, max: 40.0) +[2023-10-11 16:59:26,063][84230] Avg episode reward: [(0, '7.260'), (1, '38.820')] +[2023-10-11 16:59:26,266][85176] Updated weights for policy 0, policy_version 48652 (0.0008) +[2023-10-11 16:59:26,635][85176] Updated weights for policy 0, policy_version 48662 (0.0009) +[2023-10-11 16:59:27,009][85176] Updated weights for policy 0, policy_version 48672 (0.0008) +[2023-10-11 16:59:29,228][85175] Updated weights for policy 1, policy_version 49380 (0.0010) +[2023-10-11 16:59:29,603][85175] Updated weights for policy 1, policy_version 49390 (0.0009) +[2023-10-11 16:59:29,964][85175] Updated weights for policy 1, policy_version 49400 (0.0007) +[2023-10-11 16:59:31,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 100433920. Throughput: 0: 1668.1, 1: 1676.3. Samples: 25120126. 
Policy #0 lag: (min: 26.0, avg: 26.4, max: 40.0) +[2023-10-11 16:59:31,063][84230] Avg episode reward: [(0, '7.970'), (1, '37.820')] +[2023-10-11 16:59:31,122][85176] Updated weights for policy 0, policy_version 48682 (0.0007) +[2023-10-11 16:59:31,493][85176] Updated weights for policy 0, policy_version 48692 (0.0008) +[2023-10-11 16:59:31,858][85176] Updated weights for policy 0, policy_version 48702 (0.0008) +[2023-10-11 16:59:33,777][85175] Updated weights for policy 1, policy_version 49410 (0.0008) +[2023-10-11 16:59:34,146][85175] Updated weights for policy 1, policy_version 49420 (0.0010) +[2023-10-11 16:59:34,515][85175] Updated weights for policy 1, policy_version 49430 (0.0007) +[2023-10-11 16:59:34,887][85175] Updated weights for policy 1, policy_version 49440 (0.0008) +[2023-10-11 16:59:36,033][85176] Updated weights for policy 0, policy_version 48712 (0.0008) +[2023-10-11 16:59:36,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 100499456. Throughput: 0: 1667.8, 1: 1704.7. Samples: 25130818. Policy #0 lag: (min: 26.0, avg: 26.4, max: 40.0) +[2023-10-11 16:59:36,064][84230] Avg episode reward: [(0, '8.010'), (1, '37.950')] +[2023-10-11 16:59:36,404][85176] Updated weights for policy 0, policy_version 48722 (0.0008) +[2023-10-11 16:59:36,779][85176] Updated weights for policy 0, policy_version 48732 (0.0008) +[2023-10-11 16:59:38,878][85175] Updated weights for policy 1, policy_version 49450 (0.0011) +[2023-10-11 16:59:39,237][85175] Updated weights for policy 1, policy_version 49460 (0.0010) +[2023-10-11 16:59:39,611][85175] Updated weights for policy 1, policy_version 49470 (0.0010) +[2023-10-11 16:59:40,768][85176] Updated weights for policy 0, policy_version 48742 (0.0008) +[2023-10-11 16:59:41,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 100564992. Throughput: 0: 1666.8, 1: 1685.5. Samples: 25150386. 
Policy #0 lag: (min: 26.0, avg: 26.4, max: 40.0) +[2023-10-11 16:59:41,064][84230] Avg episode reward: [(0, '7.750'), (1, '36.400')] +[2023-10-11 16:59:41,139][85176] Updated weights for policy 0, policy_version 48752 (0.0009) +[2023-10-11 16:59:41,509][85176] Updated weights for policy 0, policy_version 48762 (0.0007) +[2023-10-11 16:59:43,688][85175] Updated weights for policy 1, policy_version 49480 (0.0008) +[2023-10-11 16:59:44,048][85175] Updated weights for policy 1, policy_version 49490 (0.0008) +[2023-10-11 16:59:44,422][85175] Updated weights for policy 1, policy_version 49500 (0.0007) +[2023-10-11 16:59:45,647][85176] Updated weights for policy 0, policy_version 48772 (0.0007) +[2023-10-11 16:59:46,025][85176] Updated weights for policy 0, policy_version 48782 (0.0007) +[2023-10-11 16:59:46,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 100630528. Throughput: 0: 1664.4, 1: 1694.8. Samples: 25170662. Policy #0 lag: (min: 26.0, avg: 26.4, max: 40.0) +[2023-10-11 16:59:46,063][84230] Avg episode reward: [(0, '7.900'), (1, '37.160')] +[2023-10-11 16:59:46,409][85176] Updated weights for policy 0, policy_version 48792 (0.0009) +[2023-10-11 16:59:48,515][85175] Updated weights for policy 1, policy_version 49510 (0.0009) +[2023-10-11 16:59:48,886][85175] Updated weights for policy 1, policy_version 49520 (0.0008) +[2023-10-11 16:59:49,252][85175] Updated weights for policy 1, policy_version 49530 (0.0008) +[2023-10-11 16:59:50,597][85176] Updated weights for policy 0, policy_version 48802 (0.0009) +[2023-10-11 16:59:50,975][85176] Updated weights for policy 0, policy_version 48812 (0.0007) +[2023-10-11 16:59:51,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 100696064. Throughput: 0: 1666.2, 1: 1705.4. Samples: 25180794. 
Policy #0 lag: (min: 26.0, avg: 26.4, max: 40.0) +[2023-10-11 16:59:51,063][84230] Avg episode reward: [(0, '7.900'), (1, '35.440')] +[2023-10-11 16:59:51,353][85176] Updated weights for policy 0, policy_version 48822 (0.0010) +[2023-10-11 16:59:51,722][85176] Updated weights for policy 0, policy_version 48832 (0.0010) +[2023-10-11 16:59:53,425][85175] Updated weights for policy 1, policy_version 49540 (0.0009) +[2023-10-11 16:59:53,784][85175] Updated weights for policy 1, policy_version 49550 (0.0008) +[2023-10-11 16:59:54,152][85175] Updated weights for policy 1, policy_version 49560 (0.0010) +[2023-10-11 16:59:55,931][85176] Updated weights for policy 0, policy_version 48842 (0.0009) +[2023-10-11 16:59:56,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 100761600. Throughput: 0: 1663.6, 1: 1678.2. Samples: 25200252. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 16:59:56,064][84230] Avg episode reward: [(0, '7.300'), (1, '36.190')] +[2023-10-11 16:59:56,300][85176] Updated weights for policy 0, policy_version 48852 (0.0009) +[2023-10-11 16:59:56,675][85176] Updated weights for policy 0, policy_version 48862 (0.0008) +[2023-10-11 16:59:58,003][85175] Updated weights for policy 1, policy_version 49570 (0.0009) +[2023-10-11 16:59:58,376][85175] Updated weights for policy 1, policy_version 49580 (0.0007) +[2023-10-11 16:59:58,743][85175] Updated weights for policy 1, policy_version 49590 (0.0009) +[2023-10-11 16:59:59,105][85175] Updated weights for policy 1, policy_version 49600 (0.0010) +[2023-10-11 17:00:00,627][85176] Updated weights for policy 0, policy_version 48872 (0.0008) +[2023-10-11 17:00:00,992][85176] Updated weights for policy 0, policy_version 48882 (0.0007) +[2023-10-11 17:00:01,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 100827136. Throughput: 0: 1659.6, 1: 1706.3. Samples: 25221088. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:00:01,063][84230] Avg episode reward: [(0, '8.030'), (1, '33.510')] +[2023-10-11 17:00:01,372][85176] Updated weights for policy 0, policy_version 48892 (0.0008) +[2023-10-11 17:00:02,986][85175] Updated weights for policy 1, policy_version 49610 (0.0009) +[2023-10-11 17:00:03,356][85175] Updated weights for policy 1, policy_version 49620 (0.0009) +[2023-10-11 17:00:03,720][85175] Updated weights for policy 1, policy_version 49630 (0.0009) +[2023-10-11 17:00:05,421][85176] Updated weights for policy 0, policy_version 48902 (0.0008) +[2023-10-11 17:00:05,793][85176] Updated weights for policy 0, policy_version 48912 (0.0007) +[2023-10-11 17:00:06,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 100892672. Throughput: 0: 1665.9, 1: 1692.0. Samples: 25231016. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:00:06,063][84230] Avg episode reward: [(0, '8.480'), (1, '34.850')] +[2023-10-11 17:00:06,160][85176] Updated weights for policy 0, policy_version 48922 (0.0010) +[2023-10-11 17:00:07,713][85175] Updated weights for policy 1, policy_version 49640 (0.0010) +[2023-10-11 17:00:08,087][85175] Updated weights for policy 1, policy_version 49650 (0.0009) +[2023-10-11 17:00:08,454][85175] Updated weights for policy 1, policy_version 49660 (0.0009) +[2023-10-11 17:00:10,326][85176] Updated weights for policy 0, policy_version 48932 (0.0007) +[2023-10-11 17:00:10,701][85176] Updated weights for policy 0, policy_version 48942 (0.0007) +[2023-10-11 17:00:11,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 100958208. Throughput: 0: 1662.4, 1: 1688.1. Samples: 25251212. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:00:11,064][84230] Avg episode reward: [(0, '7.750'), (1, '36.490')] +[2023-10-11 17:00:11,076][85176] Updated weights for policy 0, policy_version 48952 (0.0009) +[2023-10-11 17:00:12,565][85175] Updated weights for policy 1, policy_version 49670 (0.0008) +[2023-10-11 17:00:12,930][85175] Updated weights for policy 1, policy_version 49680 (0.0008) +[2023-10-11 17:00:13,298][85175] Updated weights for policy 1, policy_version 49690 (0.0011) +[2023-10-11 17:00:15,363][85176] Updated weights for policy 0, policy_version 48962 (0.0008) +[2023-10-11 17:00:15,739][85176] Updated weights for policy 0, policy_version 48972 (0.0009) +[2023-10-11 17:00:16,063][84230] Fps is (10 sec: 13106.7, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 101023744. Throughput: 0: 1654.6, 1: 1706.3. Samples: 25271366. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:00:16,064][84230] Avg episode reward: [(0, '8.050'), (1, '35.610')] +[2023-10-11 17:00:16,075][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000049696_50888704.pth... +[2023-10-11 17:00:16,113][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000048128_49283072.pth +[2023-10-11 17:00:16,118][85176] Updated weights for policy 0, policy_version 48982 (0.0008) +[2023-10-11 17:00:16,496][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000048992_50167808.pth... 
+[2023-10-11 17:00:16,500][85176] Updated weights for policy 0, policy_version 48992 (0.0008) +[2023-10-11 17:00:16,526][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000047424_48562176.pth +[2023-10-11 17:00:17,360][85175] Updated weights for policy 1, policy_version 49700 (0.0009) +[2023-10-11 17:00:17,732][85175] Updated weights for policy 1, policy_version 49710 (0.0010) +[2023-10-11 17:00:18,095][85175] Updated weights for policy 1, policy_version 49720 (0.0010) +[2023-10-11 17:00:20,542][85176] Updated weights for policy 0, policy_version 49002 (0.0009) +[2023-10-11 17:00:20,916][85176] Updated weights for policy 0, policy_version 49012 (0.0009) +[2023-10-11 17:00:21,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 101089280. Throughput: 0: 1662.1, 1: 1669.1. Samples: 25280720. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:00:21,063][84230] Avg episode reward: [(0, '8.350'), (1, '35.920')] +[2023-10-11 17:00:21,291][85176] Updated weights for policy 0, policy_version 49022 (0.0010) +[2023-10-11 17:00:22,150][85175] Updated weights for policy 1, policy_version 49730 (0.0010) +[2023-10-11 17:00:22,523][85175] Updated weights for policy 1, policy_version 49740 (0.0008) +[2023-10-11 17:00:22,896][85175] Updated weights for policy 1, policy_version 49750 (0.0008) +[2023-10-11 17:00:23,258][85175] Updated weights for policy 1, policy_version 49760 (0.0008) +[2023-10-11 17:00:25,427][85176] Updated weights for policy 0, policy_version 49032 (0.0010) +[2023-10-11 17:00:25,804][85176] Updated weights for policy 0, policy_version 49042 (0.0008) +[2023-10-11 17:00:26,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.1, 300 sec: 13329.4). Total num frames: 101154816. Throughput: 0: 1661.7, 1: 1696.4. Samples: 25301504. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:00:26,064][84230] Avg episode reward: [(0, '7.900'), (1, '35.550')] +[2023-10-11 17:00:26,179][85176] Updated weights for policy 0, policy_version 49052 (0.0009) +[2023-10-11 17:00:27,443][85175] Updated weights for policy 1, policy_version 49770 (0.0010) +[2023-10-11 17:00:27,826][85175] Updated weights for policy 1, policy_version 49780 (0.0008) +[2023-10-11 17:00:28,199][85175] Updated weights for policy 1, policy_version 49790 (0.0010) +[2023-10-11 17:00:30,233][85176] Updated weights for policy 0, policy_version 49062 (0.0011) +[2023-10-11 17:00:30,625][85176] Updated weights for policy 0, policy_version 49072 (0.0010) +[2023-10-11 17:00:30,993][85176] Updated weights for policy 0, policy_version 49082 (0.0010) +[2023-10-11 17:00:31,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 101220352. Throughput: 0: 1655.0, 1: 1702.3. Samples: 25321744. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-11 17:00:31,064][84230] Avg episode reward: [(0, '7.300'), (1, '37.950')] +[2023-10-11 17:00:32,099][85175] Updated weights for policy 1, policy_version 49800 (0.0009) +[2023-10-11 17:00:32,477][85175] Updated weights for policy 1, policy_version 49810 (0.0007) +[2023-10-11 17:00:32,848][85175] Updated weights for policy 1, policy_version 49820 (0.0008) +[2023-10-11 17:00:34,970][85176] Updated weights for policy 0, policy_version 49092 (0.0010) +[2023-10-11 17:00:35,344][85176] Updated weights for policy 0, policy_version 49102 (0.0010) +[2023-10-11 17:00:35,726][85176] Updated weights for policy 0, policy_version 49112 (0.0009) +[2023-10-11 17:00:36,063][84230] Fps is (10 sec: 16384.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 101318656. Throughput: 0: 1670.0, 1: 1677.8. Samples: 25331448. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-11 17:00:36,063][84230] Avg episode reward: [(0, '8.030'), (1, '37.060')] +[2023-10-11 17:00:36,815][85175] Updated weights for policy 1, policy_version 49830 (0.0008) +[2023-10-11 17:00:37,187][85175] Updated weights for policy 1, policy_version 49840 (0.0008) +[2023-10-11 17:00:37,556][85175] Updated weights for policy 1, policy_version 49850 (0.0007) +[2023-10-11 17:00:39,671][85176] Updated weights for policy 0, policy_version 49122 (0.0008) +[2023-10-11 17:00:40,052][85176] Updated weights for policy 0, policy_version 49132 (0.0007) +[2023-10-11 17:00:40,422][85176] Updated weights for policy 0, policy_version 49142 (0.0007) +[2023-10-11 17:00:40,790][85176] Updated weights for policy 0, policy_version 49152 (0.0007) +[2023-10-11 17:00:41,063][84230] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 101384192. Throughput: 0: 1672.9, 1: 1706.0. Samples: 25352302. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-11 17:00:41,064][84230] Avg episode reward: [(0, '8.470'), (1, '37.310')] +[2023-10-11 17:00:41,450][85175] Updated weights for policy 1, policy_version 49860 (0.0007) +[2023-10-11 17:00:41,823][85175] Updated weights for policy 1, policy_version 49870 (0.0010) +[2023-10-11 17:00:42,194][85175] Updated weights for policy 1, policy_version 49880 (0.0009) +[2023-10-11 17:00:44,918][85176] Updated weights for policy 0, policy_version 49162 (0.0011) +[2023-10-11 17:00:45,295][85176] Updated weights for policy 0, policy_version 49172 (0.0008) +[2023-10-11 17:00:45,660][85176] Updated weights for policy 0, policy_version 49182 (0.0007) +[2023-10-11 17:00:46,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 101449728. Throughput: 0: 1648.2, 1: 1706.2. Samples: 25372036. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-11 17:00:46,064][84230] Avg episode reward: [(0, '8.000'), (1, '34.870')] +[2023-10-11 17:00:46,247][85175] Updated weights for policy 1, policy_version 49890 (0.0008) +[2023-10-11 17:00:46,613][85175] Updated weights for policy 1, policy_version 49900 (0.0010) +[2023-10-11 17:00:46,979][85175] Updated weights for policy 1, policy_version 49910 (0.0008) +[2023-10-11 17:00:47,352][85175] Updated weights for policy 1, policy_version 49920 (0.0008) +[2023-10-11 17:00:49,848][85176] Updated weights for policy 0, policy_version 49192 (0.0009) +[2023-10-11 17:00:50,221][85176] Updated weights for policy 0, policy_version 49202 (0.0009) +[2023-10-11 17:00:50,604][85176] Updated weights for policy 0, policy_version 49212 (0.0010) +[2023-10-11 17:00:51,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 101515264. Throughput: 0: 1662.3, 1: 1692.9. Samples: 25382002. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-11 17:00:51,064][84230] Avg episode reward: [(0, '7.730'), (1, '37.900')] +[2023-10-11 17:00:51,361][85175] Updated weights for policy 1, policy_version 49930 (0.0008) +[2023-10-11 17:00:51,716][85175] Updated weights for policy 1, policy_version 49940 (0.0008) +[2023-10-11 17:00:52,078][85175] Updated weights for policy 1, policy_version 49950 (0.0010) +[2023-10-11 17:00:54,911][85176] Updated weights for policy 0, policy_version 49222 (0.0010) +[2023-10-11 17:00:55,298][85176] Updated weights for policy 0, policy_version 49232 (0.0009) +[2023-10-11 17:00:55,667][85176] Updated weights for policy 0, policy_version 49242 (0.0008) +[2023-10-11 17:00:55,989][85175] Updated weights for policy 1, policy_version 49960 (0.0007) +[2023-10-11 17:00:56,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 101580800. Throughput: 0: 1658.6, 1: 1702.2. Samples: 25402448. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) +[2023-10-11 17:00:56,063][84230] Avg episode reward: [(0, '8.130'), (1, '34.760')] +[2023-10-11 17:00:56,363][85175] Updated weights for policy 1, policy_version 49970 (0.0008) +[2023-10-11 17:00:56,725][85175] Updated weights for policy 1, policy_version 49980 (0.0011) +[2023-10-11 17:00:59,842][85176] Updated weights for policy 0, policy_version 49252 (0.0010) +[2023-10-11 17:01:00,212][85176] Updated weights for policy 0, policy_version 49262 (0.0009) +[2023-10-11 17:01:00,597][85176] Updated weights for policy 0, policy_version 49272 (0.0007) +[2023-10-11 17:01:00,908][85175] Updated weights for policy 1, policy_version 49990 (0.0009) +[2023-10-11 17:01:01,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 101646336. Throughput: 0: 1647.8, 1: 1702.8. Samples: 25422144. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-11 17:01:01,063][84230] Avg episode reward: [(0, '8.160'), (1, '35.970')] +[2023-10-11 17:01:01,271][85175] Updated weights for policy 1, policy_version 50000 (0.0008) +[2023-10-11 17:01:01,648][85175] Updated weights for policy 1, policy_version 50010 (0.0008) +[2023-10-11 17:01:04,728][85176] Updated weights for policy 0, policy_version 49282 (0.0007) +[2023-10-11 17:01:05,094][85176] Updated weights for policy 0, policy_version 49292 (0.0007) +[2023-10-11 17:01:05,472][85176] Updated weights for policy 0, policy_version 49302 (0.0009) +[2023-10-11 17:01:05,735][85175] Updated weights for policy 1, policy_version 50020 (0.0009) +[2023-10-11 17:01:05,843][85176] Updated weights for policy 0, policy_version 49312 (0.0009) +[2023-10-11 17:01:06,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 101711872. Throughput: 0: 1662.4, 1: 1703.7. Samples: 25432196. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-11 17:01:06,063][84230] Avg episode reward: [(0, '7.740'), (1, '35.230')] +[2023-10-11 17:01:06,095][85175] Updated weights for policy 1, policy_version 50030 (0.0010) +[2023-10-11 17:01:06,458][85175] Updated weights for policy 1, policy_version 50040 (0.0011) +[2023-10-11 17:01:09,772][85176] Updated weights for policy 0, policy_version 49322 (0.0010) +[2023-10-11 17:01:10,145][85176] Updated weights for policy 0, policy_version 49332 (0.0008) +[2023-10-11 17:01:10,520][85176] Updated weights for policy 0, policy_version 49342 (0.0009) +[2023-10-11 17:01:10,550][85175] Updated weights for policy 1, policy_version 50050 (0.0009) +[2023-10-11 17:01:10,915][85175] Updated weights for policy 1, policy_version 50060 (0.0009) +[2023-10-11 17:01:11,063][84230] Fps is (10 sec: 13106.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 101777408. Throughput: 0: 1661.4, 1: 1700.9. Samples: 25452808. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-11 17:01:11,064][84230] Avg episode reward: [(0, '8.050'), (1, '35.750')] +[2023-10-11 17:01:11,289][85175] Updated weights for policy 1, policy_version 50070 (0.0011) +[2023-10-11 17:01:11,664][85175] Updated weights for policy 1, policy_version 50080 (0.0008) +[2023-10-11 17:01:14,720][85176] Updated weights for policy 0, policy_version 49352 (0.0009) +[2023-10-11 17:01:15,102][85176] Updated weights for policy 0, policy_version 49362 (0.0010) +[2023-10-11 17:01:15,473][85176] Updated weights for policy 0, policy_version 49372 (0.0007) +[2023-10-11 17:01:15,839][85175] Updated weights for policy 1, policy_version 50090 (0.0007) +[2023-10-11 17:01:16,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 101842944. Throughput: 0: 1646.9, 1: 1697.2. Samples: 25472226. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-11 17:01:16,063][84230] Avg episode reward: [(0, '8.050'), (1, '38.730')] +[2023-10-11 17:01:16,214][85175] Updated weights for policy 1, policy_version 50100 (0.0007) +[2023-10-11 17:01:16,579][85175] Updated weights for policy 1, policy_version 50110 (0.0008) +[2023-10-11 17:01:19,493][85176] Updated weights for policy 0, policy_version 49382 (0.0009) +[2023-10-11 17:01:19,872][85176] Updated weights for policy 0, policy_version 49392 (0.0008) +[2023-10-11 17:01:20,246][85176] Updated weights for policy 0, policy_version 49402 (0.0009) +[2023-10-11 17:01:20,598][85175] Updated weights for policy 1, policy_version 50120 (0.0007) +[2023-10-11 17:01:20,956][85175] Updated weights for policy 1, policy_version 50130 (0.0008) +[2023-10-11 17:01:21,063][84230] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 101908480. Throughput: 0: 1664.4, 1: 1699.2. Samples: 25482812. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-11 17:01:21,063][84230] Avg episode reward: [(0, '7.430'), (1, '34.620')] +[2023-10-11 17:01:21,332][85175] Updated weights for policy 1, policy_version 50140 (0.0008) +[2023-10-11 17:01:24,486][85176] Updated weights for policy 0, policy_version 49412 (0.0009) +[2023-10-11 17:01:24,863][85176] Updated weights for policy 0, policy_version 49422 (0.0007) +[2023-10-11 17:01:25,175][85175] Updated weights for policy 1, policy_version 50150 (0.0009) +[2023-10-11 17:01:25,236][85176] Updated weights for policy 0, policy_version 49432 (0.0007) +[2023-10-11 17:01:25,534][85175] Updated weights for policy 1, policy_version 50160 (0.0008) +[2023-10-11 17:01:25,907][85175] Updated weights for policy 1, policy_version 50170 (0.0010) +[2023-10-11 17:01:26,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 101974016. Throughput: 0: 1650.0, 1: 1699.3. Samples: 25503020. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-11 17:01:26,064][84230] Avg episode reward: [(0, '7.430'), (1, '37.720')] +[2023-10-11 17:01:29,395][85176] Updated weights for policy 0, policy_version 49442 (0.0009) +[2023-10-11 17:01:29,752][85176] Updated weights for policy 0, policy_version 49452 (0.0009) +[2023-10-11 17:01:30,128][85176] Updated weights for policy 0, policy_version 49462 (0.0010) +[2023-10-11 17:01:30,143][85175] Updated weights for policy 1, policy_version 50180 (0.0010) +[2023-10-11 17:01:30,498][85176] Updated weights for policy 0, policy_version 49472 (0.0008) +[2023-10-11 17:01:30,521][85175] Updated weights for policy 1, policy_version 50190 (0.0008) +[2023-10-11 17:01:30,877][85175] Updated weights for policy 1, policy_version 50200 (0.0008) +[2023-10-11 17:01:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 102039552. Throughput: 0: 1653.1, 1: 1681.3. Samples: 25522082. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-11 17:01:31,063][84230] Avg episode reward: [(0, '7.900'), (1, '33.980')] +[2023-10-11 17:01:34,539][85176] Updated weights for policy 0, policy_version 49482 (0.0007) +[2023-10-11 17:01:34,653][85175] Updated weights for policy 1, policy_version 50210 (0.0008) +[2023-10-11 17:01:34,916][85176] Updated weights for policy 0, policy_version 49492 (0.0008) +[2023-10-11 17:01:35,021][85175] Updated weights for policy 1, policy_version 50220 (0.0009) +[2023-10-11 17:01:35,290][85176] Updated weights for policy 0, policy_version 49502 (0.0007) +[2023-10-11 17:01:35,387][85175] Updated weights for policy 1, policy_version 50230 (0.0008) +[2023-10-11 17:01:35,767][85175] Updated weights for policy 1, policy_version 50240 (0.0011) +[2023-10-11 17:01:36,062][84230] Fps is (10 sec: 16384.4, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 102137856. Throughput: 0: 1665.4, 1: 1699.8. Samples: 25533438. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-11 17:01:36,063][84230] Avg episode reward: [(0, '7.900'), (1, '39.420')] +[2023-10-11 17:01:36,064][85000] Saving new best policy, reward=39.420! +[2023-10-11 17:01:39,207][85176] Updated weights for policy 0, policy_version 49512 (0.0008) +[2023-10-11 17:01:39,582][85176] Updated weights for policy 0, policy_version 49522 (0.0009) +[2023-10-11 17:01:39,949][85176] Updated weights for policy 0, policy_version 49532 (0.0008) +[2023-10-11 17:01:39,964][85175] Updated weights for policy 1, policy_version 50250 (0.0008) +[2023-10-11 17:01:40,336][85175] Updated weights for policy 1, policy_version 50260 (0.0010) +[2023-10-11 17:01:40,703][85175] Updated weights for policy 1, policy_version 50270 (0.0010) +[2023-10-11 17:01:41,063][84230] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 102203392. Throughput: 0: 1656.1, 1: 1699.7. Samples: 25553460. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-11 17:01:41,064][84230] Avg episode reward: [(0, '7.760'), (1, '37.380')] +[2023-10-11 17:01:44,249][85176] Updated weights for policy 0, policy_version 49542 (0.0009) +[2023-10-11 17:01:44,629][85176] Updated weights for policy 0, policy_version 49552 (0.0007) +[2023-10-11 17:01:44,683][85175] Updated weights for policy 1, policy_version 50280 (0.0010) +[2023-10-11 17:01:44,994][85176] Updated weights for policy 0, policy_version 49562 (0.0007) +[2023-10-11 17:01:45,044][85175] Updated weights for policy 1, policy_version 50290 (0.0008) +[2023-10-11 17:01:45,419][85175] Updated weights for policy 1, policy_version 50300 (0.0007) +[2023-10-11 17:01:46,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 102268928. Throughput: 0: 1663.8, 1: 1679.8. Samples: 25572606. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-11 17:01:46,064][84230] Avg episode reward: [(0, '7.460'), (1, '43.660')] +[2023-10-11 17:01:46,075][85000] Saving new best policy, reward=43.660! +[2023-10-11 17:01:49,102][85176] Updated weights for policy 0, policy_version 49572 (0.0007) +[2023-10-11 17:01:49,480][85176] Updated weights for policy 0, policy_version 49582 (0.0008) +[2023-10-11 17:01:49,533][85175] Updated weights for policy 1, policy_version 50310 (0.0009) +[2023-10-11 17:01:49,846][85176] Updated weights for policy 0, policy_version 49592 (0.0008) +[2023-10-11 17:01:49,897][85175] Updated weights for policy 1, policy_version 50320 (0.0009) +[2023-10-11 17:01:50,274][85175] Updated weights for policy 1, policy_version 50330 (0.0008) +[2023-10-11 17:01:51,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 102334464. Throughput: 0: 1670.3, 1: 1703.3. Samples: 25584006. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-11 17:01:51,063][84230] Avg episode reward: [(0, '7.750'), (1, '37.930')] +[2023-10-11 17:01:53,915][85176] Updated weights for policy 0, policy_version 49602 (0.0008) +[2023-10-11 17:01:54,062][85175] Updated weights for policy 1, policy_version 50340 (0.0008) +[2023-10-11 17:01:54,285][85176] Updated weights for policy 0, policy_version 49612 (0.0010) +[2023-10-11 17:01:54,429][85175] Updated weights for policy 1, policy_version 50350 (0.0009) +[2023-10-11 17:01:54,652][85176] Updated weights for policy 0, policy_version 49622 (0.0008) +[2023-10-11 17:01:54,783][85175] Updated weights for policy 1, policy_version 50360 (0.0009) +[2023-10-11 17:01:55,025][85176] Updated weights for policy 0, policy_version 49632 (0.0009) +[2023-10-11 17:01:56,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 102400000. Throughput: 0: 1650.9, 1: 1688.9. Samples: 25603100. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-11 17:01:56,063][84230] Avg episode reward: [(0, '8.050'), (1, '41.230')] +[2023-10-11 17:01:58,727][85175] Updated weights for policy 1, policy_version 50370 (0.0009) +[2023-10-11 17:01:59,088][85175] Updated weights for policy 1, policy_version 50380 (0.0008) +[2023-10-11 17:01:59,251][85176] Updated weights for policy 0, policy_version 49642 (0.0007) +[2023-10-11 17:01:59,450][85175] Updated weights for policy 1, policy_version 50390 (0.0007) +[2023-10-11 17:01:59,622][85176] Updated weights for policy 0, policy_version 49652 (0.0008) +[2023-10-11 17:01:59,816][85175] Updated weights for policy 1, policy_version 50400 (0.0007) +[2023-10-11 17:01:59,991][85176] Updated weights for policy 0, policy_version 49662 (0.0007) +[2023-10-11 17:02:01,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 102465536. Throughput: 0: 1660.6, 1: 1675.8. Samples: 25622364. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-11 17:02:01,063][84230] Avg episode reward: [(0, '7.600'), (1, '35.780')] +[2023-10-11 17:02:04,076][85175] Updated weights for policy 1, policy_version 50410 (0.0007) +[2023-10-11 17:02:04,227][85176] Updated weights for policy 0, policy_version 49672 (0.0008) +[2023-10-11 17:02:04,452][85175] Updated weights for policy 1, policy_version 50420 (0.0007) +[2023-10-11 17:02:04,597][85176] Updated weights for policy 0, policy_version 49682 (0.0008) +[2023-10-11 17:02:04,807][85175] Updated weights for policy 1, policy_version 50430 (0.0008) +[2023-10-11 17:02:04,963][85176] Updated weights for policy 0, policy_version 49692 (0.0007) +[2023-10-11 17:02:06,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 102531072. Throughput: 0: 1659.4, 1: 1703.1. Samples: 25634124. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) +[2023-10-11 17:02:06,064][84230] Avg episode reward: [(0, '7.900'), (1, '39.100')] +[2023-10-11 17:02:08,866][85175] Updated weights for policy 1, policy_version 50440 (0.0010) +[2023-10-11 17:02:09,165][85176] Updated weights for policy 0, policy_version 49702 (0.0007) +[2023-10-11 17:02:09,241][85175] Updated weights for policy 1, policy_version 50450 (0.0008) +[2023-10-11 17:02:09,531][85176] Updated weights for policy 0, policy_version 49712 (0.0010) +[2023-10-11 17:02:09,612][85175] Updated weights for policy 1, policy_version 50460 (0.0008) +[2023-10-11 17:02:09,910][85176] Updated weights for policy 0, policy_version 49722 (0.0007) +[2023-10-11 17:02:11,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 102596608. Throughput: 0: 1653.5, 1: 1676.5. Samples: 25652868. Policy #0 lag: (min: 14.0, avg: 21.8, max: 46.0) +[2023-10-11 17:02:11,063][84230] Avg episode reward: [(0, '7.580'), (1, '37.230')] +[2023-10-11 17:02:13,569][85175] Updated weights for policy 1, policy_version 50470 (0.0009) +[2023-10-11 17:02:13,925][85175] Updated weights for policy 1, policy_version 50480 (0.0008) +[2023-10-11 17:02:13,991][85176] Updated weights for policy 0, policy_version 49732 (0.0009) +[2023-10-11 17:02:14,288][85175] Updated weights for policy 1, policy_version 50490 (0.0009) +[2023-10-11 17:02:14,360][85176] Updated weights for policy 0, policy_version 49742 (0.0008) +[2023-10-11 17:02:14,736][85176] Updated weights for policy 0, policy_version 49752 (0.0009) +[2023-10-11 17:02:16,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 102662144. Throughput: 0: 1665.5, 1: 1690.5. Samples: 25673102. 
Policy #0 lag: (min: 14.0, avg: 21.8, max: 46.0) +[2023-10-11 17:02:16,063][84230] Avg episode reward: [(0, '7.700'), (1, '40.120')] +[2023-10-11 17:02:16,073][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000050496_51707904.pth... +[2023-10-11 17:02:16,073][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000049760_50954240.pth... +[2023-10-11 17:02:16,105][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000048192_49348608.pth +[2023-10-11 17:02:16,112][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000048896_50069504.pth +[2023-10-11 17:02:18,373][85175] Updated weights for policy 1, policy_version 50500 (0.0009) +[2023-10-11 17:02:18,599][85176] Updated weights for policy 0, policy_version 49762 (0.0010) +[2023-10-11 17:02:18,740][85175] Updated weights for policy 1, policy_version 50510 (0.0007) +[2023-10-11 17:02:18,962][85176] Updated weights for policy 0, policy_version 49772 (0.0008) +[2023-10-11 17:02:19,115][85175] Updated weights for policy 1, policy_version 50520 (0.0007) +[2023-10-11 17:02:19,349][85176] Updated weights for policy 0, policy_version 49782 (0.0007) +[2023-10-11 17:02:19,711][85176] Updated weights for policy 0, policy_version 49792 (0.0009) +[2023-10-11 17:02:21,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 102727680. Throughput: 0: 1656.8, 1: 1695.0. Samples: 25684270. 
Policy #0 lag: (min: 14.0, avg: 21.8, max: 46.0) +[2023-10-11 17:02:21,064][84230] Avg episode reward: [(0, '7.780'), (1, '39.040')] +[2023-10-11 17:02:23,078][85175] Updated weights for policy 1, policy_version 50530 (0.0008) +[2023-10-11 17:02:23,455][85175] Updated weights for policy 1, policy_version 50540 (0.0008) +[2023-10-11 17:02:23,830][85175] Updated weights for policy 1, policy_version 50550 (0.0009) +[2023-10-11 17:02:23,999][85176] Updated weights for policy 0, policy_version 49802 (0.0008) +[2023-10-11 17:02:24,192][85175] Updated weights for policy 1, policy_version 50560 (0.0009) +[2023-10-11 17:02:24,368][85176] Updated weights for policy 0, policy_version 49812 (0.0007) +[2023-10-11 17:02:24,752][85176] Updated weights for policy 0, policy_version 49822 (0.0007) +[2023-10-11 17:02:26,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 102793216. Throughput: 0: 1647.2, 1: 1675.0. Samples: 25702960. Policy #0 lag: (min: 14.0, avg: 21.8, max: 46.0) +[2023-10-11 17:02:26,064][84230] Avg episode reward: [(0, '7.830'), (1, '39.480')] +[2023-10-11 17:02:28,054][85175] Updated weights for policy 1, policy_version 50570 (0.0010) +[2023-10-11 17:02:28,429][85175] Updated weights for policy 1, policy_version 50580 (0.0008) +[2023-10-11 17:02:28,797][85175] Updated weights for policy 1, policy_version 50590 (0.0009) +[2023-10-11 17:02:29,107][85176] Updated weights for policy 0, policy_version 49832 (0.0009) +[2023-10-11 17:02:29,482][85176] Updated weights for policy 0, policy_version 49842 (0.0008) +[2023-10-11 17:02:29,860][85176] Updated weights for policy 0, policy_version 49852 (0.0007) +[2023-10-11 17:02:31,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 102858752. Throughput: 0: 1650.4, 1: 1702.8. Samples: 25723498. 
Policy #0 lag: (min: 14.0, avg: 21.8, max: 46.0) +[2023-10-11 17:02:31,063][84230] Avg episode reward: [(0, '7.730'), (1, '38.290')] +[2023-10-11 17:02:32,812][85175] Updated weights for policy 1, policy_version 50600 (0.0008) +[2023-10-11 17:02:33,174][85175] Updated weights for policy 1, policy_version 50610 (0.0010) +[2023-10-11 17:02:33,543][85175] Updated weights for policy 1, policy_version 50620 (0.0011) +[2023-10-11 17:02:33,947][85176] Updated weights for policy 0, policy_version 49862 (0.0009) +[2023-10-11 17:02:34,311][85176] Updated weights for policy 0, policy_version 49872 (0.0008) +[2023-10-11 17:02:34,691][85176] Updated weights for policy 0, policy_version 49882 (0.0007) +[2023-10-11 17:02:36,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 102924288. Throughput: 0: 1653.2, 1: 1682.6. Samples: 25734120. Policy #0 lag: (min: 14.0, avg: 21.8, max: 46.0) +[2023-10-11 17:02:36,063][84230] Avg episode reward: [(0, '7.570'), (1, '37.510')] +[2023-10-11 17:02:37,636][85175] Updated weights for policy 1, policy_version 50630 (0.0009) +[2023-10-11 17:02:38,008][85175] Updated weights for policy 1, policy_version 50640 (0.0009) +[2023-10-11 17:02:38,381][85175] Updated weights for policy 1, policy_version 50650 (0.0010) +[2023-10-11 17:02:38,596][85176] Updated weights for policy 0, policy_version 49892 (0.0008) +[2023-10-11 17:02:38,972][85176] Updated weights for policy 0, policy_version 49902 (0.0009) +[2023-10-11 17:02:39,334][85176] Updated weights for policy 0, policy_version 49912 (0.0008) +[2023-10-11 17:02:41,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 102989824. Throughput: 0: 1650.4, 1: 1694.1. Samples: 25753602. 
Policy #0 lag: (min: 14.0, avg: 21.8, max: 46.0) +[2023-10-11 17:02:41,064][84230] Avg episode reward: [(0, '7.570'), (1, '37.940')] +[2023-10-11 17:02:42,457][85175] Updated weights for policy 1, policy_version 50660 (0.0008) +[2023-10-11 17:02:42,829][85175] Updated weights for policy 1, policy_version 50670 (0.0007) +[2023-10-11 17:02:43,195][85175] Updated weights for policy 1, policy_version 50680 (0.0007) +[2023-10-11 17:02:43,414][85176] Updated weights for policy 0, policy_version 49922 (0.0007) +[2023-10-11 17:02:43,776][85176] Updated weights for policy 0, policy_version 49932 (0.0009) +[2023-10-11 17:02:44,151][85176] Updated weights for policy 0, policy_version 49942 (0.0009) +[2023-10-11 17:02:44,515][85176] Updated weights for policy 0, policy_version 49952 (0.0007) +[2023-10-11 17:02:46,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 103055360. Throughput: 0: 1666.2, 1: 1711.0. Samples: 25774336. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:02:46,063][84230] Avg episode reward: [(0, '7.350'), (1, '39.770')] +[2023-10-11 17:02:47,305][85175] Updated weights for policy 1, policy_version 50690 (0.0010) +[2023-10-11 17:02:47,674][85175] Updated weights for policy 1, policy_version 50700 (0.0009) +[2023-10-11 17:02:48,036][85175] Updated weights for policy 1, policy_version 50710 (0.0009) +[2023-10-11 17:02:48,409][85175] Updated weights for policy 1, policy_version 50720 (0.0008) +[2023-10-11 17:02:48,528][85176] Updated weights for policy 0, policy_version 49962 (0.0009) +[2023-10-11 17:02:48,905][85176] Updated weights for policy 0, policy_version 49972 (0.0008) +[2023-10-11 17:02:49,285][85176] Updated weights for policy 0, policy_version 49982 (0.0007) +[2023-10-11 17:02:51,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 103120896. Throughput: 0: 1655.6, 1: 1679.0. Samples: 25784182. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:02:51,064][84230] Avg episode reward: [(0, '7.840'), (1, '39.230')] +[2023-10-11 17:02:52,292][85175] Updated weights for policy 1, policy_version 50730 (0.0007) +[2023-10-11 17:02:52,662][85175] Updated weights for policy 1, policy_version 50740 (0.0008) +[2023-10-11 17:02:53,030][85175] Updated weights for policy 1, policy_version 50750 (0.0009) +[2023-10-11 17:02:53,533][85176] Updated weights for policy 0, policy_version 49992 (0.0008) +[2023-10-11 17:02:53,897][85176] Updated weights for policy 0, policy_version 50002 (0.0007) +[2023-10-11 17:02:54,268][85176] Updated weights for policy 0, policy_version 50012 (0.0009) +[2023-10-11 17:02:56,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 103186432. Throughput: 0: 1654.2, 1: 1704.6. Samples: 25804014. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:02:56,063][84230] Avg episode reward: [(0, '7.860'), (1, '37.890')] +[2023-10-11 17:02:57,177][85175] Updated weights for policy 1, policy_version 50760 (0.0009) +[2023-10-11 17:02:57,554][85175] Updated weights for policy 1, policy_version 50770 (0.0010) +[2023-10-11 17:02:57,922][85175] Updated weights for policy 1, policy_version 50780 (0.0008) +[2023-10-11 17:02:58,291][85176] Updated weights for policy 0, policy_version 50022 (0.0009) +[2023-10-11 17:02:58,655][85176] Updated weights for policy 0, policy_version 50032 (0.0010) +[2023-10-11 17:02:59,030][85176] Updated weights for policy 0, policy_version 50042 (0.0010) +[2023-10-11 17:03:01,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 103251968. Throughput: 0: 1664.5, 1: 1699.9. Samples: 25824500. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:03:01,064][84230] Avg episode reward: [(0, '7.540'), (1, '37.910')] +[2023-10-11 17:03:01,832][85175] Updated weights for policy 1, policy_version 50790 (0.0010) +[2023-10-11 17:03:02,196][85175] Updated weights for policy 1, policy_version 50800 (0.0010) +[2023-10-11 17:03:02,562][85175] Updated weights for policy 1, policy_version 50810 (0.0008) +[2023-10-11 17:03:03,153][85176] Updated weights for policy 0, policy_version 50052 (0.0009) +[2023-10-11 17:03:03,523][85176] Updated weights for policy 0, policy_version 50062 (0.0009) +[2023-10-11 17:03:03,894][85176] Updated weights for policy 0, policy_version 50072 (0.0008) +[2023-10-11 17:03:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 103317504. Throughput: 0: 1654.3, 1: 1681.7. Samples: 25834390. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:03:06,063][84230] Avg episode reward: [(0, '7.480'), (1, '37.080')] +[2023-10-11 17:03:06,532][85175] Updated weights for policy 1, policy_version 50820 (0.0008) +[2023-10-11 17:03:06,896][85175] Updated weights for policy 1, policy_version 50830 (0.0010) +[2023-10-11 17:03:07,254][85175] Updated weights for policy 1, policy_version 50840 (0.0008) +[2023-10-11 17:03:07,900][85176] Updated weights for policy 0, policy_version 50082 (0.0011) +[2023-10-11 17:03:08,268][85176] Updated weights for policy 0, policy_version 50092 (0.0007) +[2023-10-11 17:03:08,642][85176] Updated weights for policy 0, policy_version 50102 (0.0009) +[2023-10-11 17:03:09,017][85176] Updated weights for policy 0, policy_version 50112 (0.0008) +[2023-10-11 17:03:11,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 103383040. Throughput: 0: 1665.6, 1: 1705.8. Samples: 25854670. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:03:11,063][84230] Avg episode reward: [(0, '7.840'), (1, '36.950')] +[2023-10-11 17:03:11,179][85175] Updated weights for policy 1, policy_version 50850 (0.0008) +[2023-10-11 17:03:11,538][85175] Updated weights for policy 1, policy_version 50860 (0.0007) +[2023-10-11 17:03:11,909][85175] Updated weights for policy 1, policy_version 50870 (0.0008) +[2023-10-11 17:03:12,275][85175] Updated weights for policy 1, policy_version 50880 (0.0008) +[2023-10-11 17:03:13,071][85176] Updated weights for policy 0, policy_version 50122 (0.0010) +[2023-10-11 17:03:13,436][85176] Updated weights for policy 0, policy_version 50132 (0.0010) +[2023-10-11 17:03:13,814][85176] Updated weights for policy 0, policy_version 50142 (0.0007) +[2023-10-11 17:03:16,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 103448576. Throughput: 0: 1678.7, 1: 1701.6. Samples: 25875614. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:03:16,064][84230] Avg episode reward: [(0, '8.030'), (1, '36.640')] +[2023-10-11 17:03:16,288][85175] Updated weights for policy 1, policy_version 50890 (0.0010) +[2023-10-11 17:03:16,654][85175] Updated weights for policy 1, policy_version 50900 (0.0009) +[2023-10-11 17:03:17,026][85175] Updated weights for policy 1, policy_version 50910 (0.0009) +[2023-10-11 17:03:17,788][85176] Updated weights for policy 0, policy_version 50152 (0.0007) +[2023-10-11 17:03:18,167][85176] Updated weights for policy 0, policy_version 50162 (0.0007) +[2023-10-11 17:03:18,542][85176] Updated weights for policy 0, policy_version 50172 (0.0007) +[2023-10-11 17:03:21,003][85175] Updated weights for policy 1, policy_version 50920 (0.0009) +[2023-10-11 17:03:21,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 103514112. Throughput: 0: 1653.1, 1: 1698.8. Samples: 25884956. 
Policy #0 lag: (min: 26.0, avg: 26.1, max: 31.0) +[2023-10-11 17:03:21,063][84230] Avg episode reward: [(0, '8.030'), (1, '38.770')] +[2023-10-11 17:03:21,378][85175] Updated weights for policy 1, policy_version 50930 (0.0007) +[2023-10-11 17:03:21,752][85175] Updated weights for policy 1, policy_version 50940 (0.0008) +[2023-10-11 17:03:22,658][85176] Updated weights for policy 0, policy_version 50182 (0.0010) +[2023-10-11 17:03:23,020][85176] Updated weights for policy 0, policy_version 50192 (0.0011) +[2023-10-11 17:03:23,399][85176] Updated weights for policy 0, policy_version 50202 (0.0011) +[2023-10-11 17:03:25,784][85175] Updated weights for policy 1, policy_version 50950 (0.0008) +[2023-10-11 17:03:26,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 103579648. Throughput: 0: 1670.0, 1: 1706.2. Samples: 25905534. Policy #0 lag: (min: 26.0, avg: 26.1, max: 31.0) +[2023-10-11 17:03:26,064][84230] Avg episode reward: [(0, '7.900'), (1, '35.650')] +[2023-10-11 17:03:26,144][85175] Updated weights for policy 1, policy_version 50960 (0.0011) +[2023-10-11 17:03:26,518][85175] Updated weights for policy 1, policy_version 50970 (0.0007) +[2023-10-11 17:03:27,785][85176] Updated weights for policy 0, policy_version 50212 (0.0008) +[2023-10-11 17:03:28,156][85176] Updated weights for policy 0, policy_version 50222 (0.0008) +[2023-10-11 17:03:28,523][85176] Updated weights for policy 0, policy_version 50232 (0.0011) +[2023-10-11 17:03:30,633][85175] Updated weights for policy 1, policy_version 50980 (0.0007) +[2023-10-11 17:03:31,001][85175] Updated weights for policy 1, policy_version 50990 (0.0009) +[2023-10-11 17:03:31,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 103645184. Throughput: 0: 1666.2, 1: 1707.4. Samples: 25926148. 
Policy #0 lag: (min: 26.0, avg: 26.1, max: 31.0) +[2023-10-11 17:03:31,063][84230] Avg episode reward: [(0, '7.600'), (1, '37.580')] +[2023-10-11 17:03:31,375][85175] Updated weights for policy 1, policy_version 51000 (0.0007) +[2023-10-11 17:03:32,604][85176] Updated weights for policy 0, policy_version 50242 (0.0007) +[2023-10-11 17:03:32,984][85176] Updated weights for policy 0, policy_version 50252 (0.0010) +[2023-10-11 17:03:33,353][85176] Updated weights for policy 0, policy_version 50262 (0.0009) +[2023-10-11 17:03:33,729][85176] Updated weights for policy 0, policy_version 50272 (0.0007) +[2023-10-11 17:03:35,304][85175] Updated weights for policy 1, policy_version 51010 (0.0007) +[2023-10-11 17:03:35,677][85175] Updated weights for policy 1, policy_version 51020 (0.0008) +[2023-10-11 17:03:36,046][85175] Updated weights for policy 1, policy_version 51030 (0.0010) +[2023-10-11 17:03:36,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 103710720. Throughput: 0: 1651.4, 1: 1712.4. Samples: 25935554. Policy #0 lag: (min: 26.0, avg: 26.1, max: 31.0) +[2023-10-11 17:03:36,063][84230] Avg episode reward: [(0, '8.240'), (1, '34.320')] +[2023-10-11 17:03:36,416][85175] Updated weights for policy 1, policy_version 51040 (0.0008) +[2023-10-11 17:03:37,892][85176] Updated weights for policy 0, policy_version 50282 (0.0008) +[2023-10-11 17:03:38,269][85176] Updated weights for policy 0, policy_version 50292 (0.0009) +[2023-10-11 17:03:38,633][85176] Updated weights for policy 0, policy_version 50302 (0.0008) +[2023-10-11 17:03:40,423][85175] Updated weights for policy 1, policy_version 51050 (0.0009) +[2023-10-11 17:03:40,787][85175] Updated weights for policy 1, policy_version 51060 (0.0008) +[2023-10-11 17:03:41,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 103776256. Throughput: 0: 1662.5, 1: 1715.2. Samples: 25956008. 
Policy #0 lag: (min: 26.0, avg: 26.1, max: 31.0) +[2023-10-11 17:03:41,064][84230] Avg episode reward: [(0, '8.230'), (1, '39.430')] +[2023-10-11 17:03:41,154][85175] Updated weights for policy 1, policy_version 51070 (0.0009) +[2023-10-11 17:03:42,735][85176] Updated weights for policy 0, policy_version 50312 (0.0010) +[2023-10-11 17:03:43,114][85176] Updated weights for policy 0, policy_version 50322 (0.0007) +[2023-10-11 17:03:43,483][85176] Updated weights for policy 0, policy_version 50332 (0.0008) +[2023-10-11 17:03:45,159][85175] Updated weights for policy 1, policy_version 51080 (0.0007) +[2023-10-11 17:03:45,534][85175] Updated weights for policy 1, policy_version 51090 (0.0009) +[2023-10-11 17:03:45,900][85175] Updated weights for policy 1, policy_version 51100 (0.0007) +[2023-10-11 17:03:46,062][84230] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 103874560. Throughput: 0: 1668.1, 1: 1702.9. Samples: 25976192. Policy #0 lag: (min: 26.0, avg: 26.1, max: 31.0) +[2023-10-11 17:03:46,063][84230] Avg episode reward: [(0, '8.430'), (1, '35.440')] +[2023-10-11 17:03:47,784][85176] Updated weights for policy 0, policy_version 50342 (0.0010) +[2023-10-11 17:03:48,173][85176] Updated weights for policy 0, policy_version 50352 (0.0010) +[2023-10-11 17:03:48,545][85176] Updated weights for policy 0, policy_version 50362 (0.0007) +[2023-10-11 17:03:49,847][85175] Updated weights for policy 1, policy_version 51110 (0.0009) +[2023-10-11 17:03:50,218][85175] Updated weights for policy 1, policy_version 51120 (0.0009) +[2023-10-11 17:03:50,592][85175] Updated weights for policy 1, policy_version 51130 (0.0009) +[2023-10-11 17:03:51,063][84230] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 103940096. Throughput: 0: 1653.4, 1: 1716.4. Samples: 25986034. 
Policy #0 lag: (min: 26.0, avg: 26.1, max: 31.0) +[2023-10-11 17:03:51,064][84230] Avg episode reward: [(0, '8.740'), (1, '38.840')] +[2023-10-11 17:03:52,786][85176] Updated weights for policy 0, policy_version 50372 (0.0008) +[2023-10-11 17:03:53,163][85176] Updated weights for policy 0, policy_version 50382 (0.0007) +[2023-10-11 17:03:53,542][85176] Updated weights for policy 0, policy_version 50392 (0.0009) +[2023-10-11 17:03:54,589][85175] Updated weights for policy 1, policy_version 51140 (0.0008) +[2023-10-11 17:03:54,952][85175] Updated weights for policy 1, policy_version 51150 (0.0007) +[2023-10-11 17:03:55,313][85175] Updated weights for policy 1, policy_version 51160 (0.0007) +[2023-10-11 17:03:56,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 104005632. Throughput: 0: 1658.1, 1: 1710.1. Samples: 26006240. Policy #0 lag: (min: 31.0, avg: 33.2, max: 63.0) +[2023-10-11 17:03:56,064][84230] Avg episode reward: [(0, '8.180'), (1, '34.380')] +[2023-10-11 17:03:57,383][85176] Updated weights for policy 0, policy_version 50402 (0.0007) +[2023-10-11 17:03:57,758][85176] Updated weights for policy 0, policy_version 50412 (0.0009) +[2023-10-11 17:03:58,129][85176] Updated weights for policy 0, policy_version 50422 (0.0008) +[2023-10-11 17:03:58,508][85176] Updated weights for policy 0, policy_version 50432 (0.0007) +[2023-10-11 17:03:59,320][85175] Updated weights for policy 1, policy_version 51170 (0.0008) +[2023-10-11 17:03:59,684][85175] Updated weights for policy 1, policy_version 51180 (0.0008) +[2023-10-11 17:04:00,054][85175] Updated weights for policy 1, policy_version 51190 (0.0009) +[2023-10-11 17:04:00,420][85175] Updated weights for policy 1, policy_version 51200 (0.0010) +[2023-10-11 17:04:01,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 104071168. Throughput: 0: 1657.7, 1: 1678.6. Samples: 26025746. 
Policy #0 lag: (min: 31.0, avg: 33.2, max: 63.0) +[2023-10-11 17:04:01,063][84230] Avg episode reward: [(0, '7.880'), (1, '39.670')] +[2023-10-11 17:04:02,404][85176] Updated weights for policy 0, policy_version 50442 (0.0008) +[2023-10-11 17:04:02,773][85176] Updated weights for policy 0, policy_version 50452 (0.0011) +[2023-10-11 17:04:03,150][85176] Updated weights for policy 0, policy_version 50462 (0.0010) +[2023-10-11 17:04:04,582][85175] Updated weights for policy 1, policy_version 51210 (0.0007) +[2023-10-11 17:04:04,942][85175] Updated weights for policy 1, policy_version 51220 (0.0008) +[2023-10-11 17:04:05,310][85175] Updated weights for policy 1, policy_version 51230 (0.0007) +[2023-10-11 17:04:06,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 104136704. Throughput: 0: 1653.7, 1: 1702.6. Samples: 26035992. Policy #0 lag: (min: 31.0, avg: 33.2, max: 63.0) +[2023-10-11 17:04:06,063][84230] Avg episode reward: [(0, '8.470'), (1, '35.610')] +[2023-10-11 17:04:07,158][85176] Updated weights for policy 0, policy_version 50472 (0.0008) +[2023-10-11 17:04:07,524][85176] Updated weights for policy 0, policy_version 50482 (0.0008) +[2023-10-11 17:04:07,897][85176] Updated weights for policy 0, policy_version 50492 (0.0007) +[2023-10-11 17:04:09,355][85175] Updated weights for policy 1, policy_version 51240 (0.0007) +[2023-10-11 17:04:09,720][85175] Updated weights for policy 1, policy_version 51250 (0.0007) +[2023-10-11 17:04:10,089][85175] Updated weights for policy 1, policy_version 51260 (0.0007) +[2023-10-11 17:04:11,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 104202240. Throughput: 0: 1661.8, 1: 1687.4. Samples: 26056248. 
Policy #0 lag: (min: 31.0, avg: 33.2, max: 63.0) +[2023-10-11 17:04:11,063][84230] Avg episode reward: [(0, '8.440'), (1, '38.040')] +[2023-10-11 17:04:12,008][85176] Updated weights for policy 0, policy_version 50502 (0.0010) +[2023-10-11 17:04:12,389][85176] Updated weights for policy 0, policy_version 50512 (0.0008) +[2023-10-11 17:04:12,758][85176] Updated weights for policy 0, policy_version 50522 (0.0007) +[2023-10-11 17:04:14,142][85175] Updated weights for policy 1, policy_version 51270 (0.0009) +[2023-10-11 17:04:14,513][85175] Updated weights for policy 1, policy_version 51280 (0.0009) +[2023-10-11 17:04:14,881][85175] Updated weights for policy 1, policy_version 51290 (0.0008) +[2023-10-11 17:04:16,063][84230] Fps is (10 sec: 13106.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 104267776. Throughput: 0: 1668.4, 1: 1667.5. Samples: 26076266. Policy #0 lag: (min: 31.0, avg: 33.2, max: 63.0) +[2023-10-11 17:04:16,064][84230] Avg episode reward: [(0, '7.720'), (1, '34.430')] +[2023-10-11 17:04:16,076][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000050528_51740672.pth... +[2023-10-11 17:04:16,076][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000051296_52527104.pth... 
+[2023-10-11 17:04:16,112][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000048992_50167808.pth +[2023-10-11 17:04:16,113][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000049696_50888704.pth +[2023-10-11 17:04:16,879][85176] Updated weights for policy 0, policy_version 50532 (0.0007) +[2023-10-11 17:04:17,260][85176] Updated weights for policy 0, policy_version 50542 (0.0008) +[2023-10-11 17:04:17,635][85176] Updated weights for policy 0, policy_version 50552 (0.0008) +[2023-10-11 17:04:18,903][85175] Updated weights for policy 1, policy_version 51300 (0.0010) +[2023-10-11 17:04:19,268][85175] Updated weights for policy 1, policy_version 51310 (0.0010) +[2023-10-11 17:04:19,633][85175] Updated weights for policy 1, policy_version 51320 (0.0012) +[2023-10-11 17:04:21,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 104333312. Throughput: 0: 1664.9, 1: 1698.1. Samples: 26086892. Policy #0 lag: (min: 31.0, avg: 33.2, max: 63.0) +[2023-10-11 17:04:21,063][84230] Avg episode reward: [(0, '9.450'), (1, '38.050')] +[2023-10-11 17:04:21,064][84801] Saving new best policy, reward=9.450! 
+[2023-10-11 17:04:21,710][85176] Updated weights for policy 0, policy_version 50562 (0.0007) +[2023-10-11 17:04:22,076][85176] Updated weights for policy 0, policy_version 50572 (0.0009) +[2023-10-11 17:04:22,447][85176] Updated weights for policy 0, policy_version 50582 (0.0008) +[2023-10-11 17:04:22,826][85176] Updated weights for policy 0, policy_version 50592 (0.0010) +[2023-10-11 17:04:23,610][85175] Updated weights for policy 1, policy_version 51330 (0.0010) +[2023-10-11 17:04:23,986][85175] Updated weights for policy 1, policy_version 51340 (0.0007) +[2023-10-11 17:04:24,359][85175] Updated weights for policy 1, policy_version 51350 (0.0008) +[2023-10-11 17:04:24,727][85175] Updated weights for policy 1, policy_version 51360 (0.0008) +[2023-10-11 17:04:26,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 104398848. Throughput: 0: 1672.4, 1: 1671.7. Samples: 26106492. Policy #0 lag: (min: 31.0, avg: 33.2, max: 63.0) +[2023-10-11 17:04:26,063][84230] Avg episode reward: [(0, '9.150'), (1, '35.210')] +[2023-10-11 17:04:27,008][85176] Updated weights for policy 0, policy_version 50602 (0.0010) +[2023-10-11 17:04:27,373][85176] Updated weights for policy 0, policy_version 50612 (0.0011) +[2023-10-11 17:04:27,745][85176] Updated weights for policy 0, policy_version 50622 (0.0009) +[2023-10-11 17:04:28,751][85175] Updated weights for policy 1, policy_version 51370 (0.0011) +[2023-10-11 17:04:29,126][85175] Updated weights for policy 1, policy_version 51380 (0.0010) +[2023-10-11 17:04:29,486][85175] Updated weights for policy 1, policy_version 51390 (0.0011) +[2023-10-11 17:04:31,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 104464384. Throughput: 0: 1665.9, 1: 1685.4. Samples: 26127000. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:04:31,064][84230] Avg episode reward: [(0, '7.790'), (1, '40.120')] +[2023-10-11 17:04:31,900][85176] Updated weights for policy 0, policy_version 50632 (0.0008) +[2023-10-11 17:04:32,271][85176] Updated weights for policy 0, policy_version 50642 (0.0008) +[2023-10-11 17:04:32,640][85176] Updated weights for policy 0, policy_version 50652 (0.0011) +[2023-10-11 17:04:33,459][85175] Updated weights for policy 1, policy_version 51400 (0.0010) +[2023-10-11 17:04:33,839][85175] Updated weights for policy 1, policy_version 51410 (0.0007) +[2023-10-11 17:04:34,198][85175] Updated weights for policy 1, policy_version 51420 (0.0008) +[2023-10-11 17:04:36,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 104529920. Throughput: 0: 1665.2, 1: 1691.1. Samples: 26137068. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:04:36,063][84230] Avg episode reward: [(0, '7.470'), (1, '37.410')] +[2023-10-11 17:04:36,703][85176] Updated weights for policy 0, policy_version 50662 (0.0008) +[2023-10-11 17:04:37,078][85176] Updated weights for policy 0, policy_version 50672 (0.0009) +[2023-10-11 17:04:37,451][85176] Updated weights for policy 0, policy_version 50682 (0.0009) +[2023-10-11 17:04:38,192][85175] Updated weights for policy 1, policy_version 51430 (0.0008) +[2023-10-11 17:04:38,554][85175] Updated weights for policy 1, policy_version 51440 (0.0008) +[2023-10-11 17:04:38,922][85175] Updated weights for policy 1, policy_version 51450 (0.0008) +[2023-10-11 17:04:41,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 104595456. Throughput: 0: 1678.2, 1: 1673.1. Samples: 26157050. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:04:41,064][84230] Avg episode reward: [(0, '7.730'), (1, '42.360')] +[2023-10-11 17:04:41,519][85176] Updated weights for policy 0, policy_version 50692 (0.0008) +[2023-10-11 17:04:41,888][85176] Updated weights for policy 0, policy_version 50702 (0.0009) +[2023-10-11 17:04:42,268][85176] Updated weights for policy 0, policy_version 50712 (0.0008) +[2023-10-11 17:04:42,878][85175] Updated weights for policy 1, policy_version 51460 (0.0007) +[2023-10-11 17:04:43,238][85175] Updated weights for policy 1, policy_version 51470 (0.0007) +[2023-10-11 17:04:43,607][85175] Updated weights for policy 1, policy_version 51480 (0.0008) +[2023-10-11 17:04:46,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 104660992. Throughput: 0: 1678.3, 1: 1708.1. Samples: 26178136. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:04:46,064][84230] Avg episode reward: [(0, '8.800'), (1, '39.150')] +[2023-10-11 17:04:46,358][85176] Updated weights for policy 0, policy_version 50722 (0.0008) +[2023-10-11 17:04:46,731][85176] Updated weights for policy 0, policy_version 50732 (0.0009) +[2023-10-11 17:04:47,096][85176] Updated weights for policy 0, policy_version 50742 (0.0010) +[2023-10-11 17:04:47,471][85176] Updated weights for policy 0, policy_version 50752 (0.0009) +[2023-10-11 17:04:47,492][85175] Updated weights for policy 1, policy_version 51490 (0.0009) +[2023-10-11 17:04:47,848][85175] Updated weights for policy 1, policy_version 51500 (0.0008) +[2023-10-11 17:04:48,225][85175] Updated weights for policy 1, policy_version 51510 (0.0008) +[2023-10-11 17:04:48,594][85175] Updated weights for policy 1, policy_version 51520 (0.0009) +[2023-10-11 17:04:51,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 104726528. Throughput: 0: 1677.6, 1: 1689.1. Samples: 26187496. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:04:51,063][84230] Avg episode reward: [(0, '9.370'), (1, '42.990')] +[2023-10-11 17:04:51,661][85176] Updated weights for policy 0, policy_version 50762 (0.0010) +[2023-10-11 17:04:52,036][85176] Updated weights for policy 0, policy_version 50772 (0.0010) +[2023-10-11 17:04:52,417][85176] Updated weights for policy 0, policy_version 50782 (0.0008) +[2023-10-11 17:04:52,702][85175] Updated weights for policy 1, policy_version 51530 (0.0009) +[2023-10-11 17:04:53,075][85175] Updated weights for policy 1, policy_version 51540 (0.0010) +[2023-10-11 17:04:53,447][85175] Updated weights for policy 1, policy_version 51550 (0.0009) +[2023-10-11 17:04:56,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 104792064. Throughput: 0: 1672.8, 1: 1696.1. Samples: 26207848. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:04:56,063][84230] Avg episode reward: [(0, '8.470'), (1, '40.610')] +[2023-10-11 17:04:56,358][85176] Updated weights for policy 0, policy_version 50792 (0.0009) +[2023-10-11 17:04:56,726][85176] Updated weights for policy 0, policy_version 50802 (0.0009) +[2023-10-11 17:04:57,106][85176] Updated weights for policy 0, policy_version 50812 (0.0009) +[2023-10-11 17:04:57,486][85175] Updated weights for policy 1, policy_version 51560 (0.0008) +[2023-10-11 17:04:57,856][85175] Updated weights for policy 1, policy_version 51570 (0.0010) +[2023-10-11 17:04:58,230][85175] Updated weights for policy 1, policy_version 51580 (0.0008) +[2023-10-11 17:05:01,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 104857600. Throughput: 0: 1672.5, 1: 1714.6. Samples: 26228682. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:05:01,063][84230] Avg episode reward: [(0, '7.730'), (1, '42.360')] +[2023-10-11 17:05:01,377][85176] Updated weights for policy 0, policy_version 50822 (0.0008) +[2023-10-11 17:05:01,760][85176] Updated weights for policy 0, policy_version 50832 (0.0008) +[2023-10-11 17:05:02,138][85176] Updated weights for policy 0, policy_version 50842 (0.0009) +[2023-10-11 17:05:02,256][85175] Updated weights for policy 1, policy_version 51590 (0.0008) +[2023-10-11 17:05:02,620][85175] Updated weights for policy 1, policy_version 51600 (0.0007) +[2023-10-11 17:05:02,989][85175] Updated weights for policy 1, policy_version 51610 (0.0008) +[2023-10-11 17:05:06,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 104923136. Throughput: 0: 1669.9, 1: 1681.5. Samples: 26237706. Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) +[2023-10-11 17:05:06,063][84230] Avg episode reward: [(0, '7.720'), (1, '38.540')] +[2023-10-11 17:05:06,206][85176] Updated weights for policy 0, policy_version 50852 (0.0008) +[2023-10-11 17:05:06,577][85176] Updated weights for policy 0, policy_version 50862 (0.0009) +[2023-10-11 17:05:06,958][85176] Updated weights for policy 0, policy_version 50872 (0.0008) +[2023-10-11 17:05:07,033][85175] Updated weights for policy 1, policy_version 51620 (0.0008) +[2023-10-11 17:05:07,390][85175] Updated weights for policy 1, policy_version 51630 (0.0009) +[2023-10-11 17:05:07,761][85175] Updated weights for policy 1, policy_version 51640 (0.0010) +[2023-10-11 17:05:10,814][85176] Updated weights for policy 0, policy_version 50882 (0.0007) +[2023-10-11 17:05:11,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 104988672. Throughput: 0: 1674.5, 1: 1710.3. Samples: 26258808. 
Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) +[2023-10-11 17:05:11,063][84230] Avg episode reward: [(0, '7.720'), (1, '40.300')] +[2023-10-11 17:05:11,188][85176] Updated weights for policy 0, policy_version 50892 (0.0009) +[2023-10-11 17:05:11,567][85176] Updated weights for policy 0, policy_version 50902 (0.0007) +[2023-10-11 17:05:11,685][85175] Updated weights for policy 1, policy_version 51650 (0.0007) +[2023-10-11 17:05:11,935][85176] Updated weights for policy 0, policy_version 50912 (0.0007) +[2023-10-11 17:05:12,046][85175] Updated weights for policy 1, policy_version 51660 (0.0009) +[2023-10-11 17:05:12,410][85175] Updated weights for policy 1, policy_version 51670 (0.0008) +[2023-10-11 17:05:12,773][85175] Updated weights for policy 1, policy_version 51680 (0.0008) +[2023-10-11 17:05:16,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 105054208. Throughput: 0: 1675.4, 1: 1722.0. Samples: 26279884. Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) +[2023-10-11 17:05:16,064][84230] Avg episode reward: [(0, '7.280'), (1, '39.740')] +[2023-10-11 17:05:16,135][85176] Updated weights for policy 0, policy_version 50922 (0.0007) +[2023-10-11 17:05:16,508][85176] Updated weights for policy 0, policy_version 50932 (0.0009) +[2023-10-11 17:05:16,799][85175] Updated weights for policy 1, policy_version 51690 (0.0009) +[2023-10-11 17:05:16,881][85176] Updated weights for policy 0, policy_version 50942 (0.0007) +[2023-10-11 17:05:17,177][85175] Updated weights for policy 1, policy_version 51700 (0.0009) +[2023-10-11 17:05:17,552][85175] Updated weights for policy 1, policy_version 51710 (0.0008) +[2023-10-11 17:05:21,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.5). Total num frames: 105119744. Throughput: 0: 1676.8, 1: 1698.4. Samples: 26288954. 
Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) +[2023-10-11 17:05:21,063][84230] Avg episode reward: [(0, '8.860'), (1, '39.690')] +[2023-10-11 17:05:21,098][85176] Updated weights for policy 0, policy_version 50952 (0.0007) +[2023-10-11 17:05:21,468][85176] Updated weights for policy 0, policy_version 50962 (0.0007) +[2023-10-11 17:05:21,791][85175] Updated weights for policy 1, policy_version 51720 (0.0008) +[2023-10-11 17:05:21,846][85176] Updated weights for policy 0, policy_version 50972 (0.0009) +[2023-10-11 17:05:22,159][85175] Updated weights for policy 1, policy_version 51730 (0.0008) +[2023-10-11 17:05:22,530][85175] Updated weights for policy 1, policy_version 51740 (0.0008) +[2023-10-11 17:05:26,028][85176] Updated weights for policy 0, policy_version 50982 (0.0009) +[2023-10-11 17:05:26,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 105185280. Throughput: 0: 1667.3, 1: 1713.6. Samples: 26309188. Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) +[2023-10-11 17:05:26,063][84230] Avg episode reward: [(0, '10.140'), (1, '37.970')] +[2023-10-11 17:05:26,353][85175] Updated weights for policy 1, policy_version 51750 (0.0009) +[2023-10-11 17:05:26,392][85176] Updated weights for policy 0, policy_version 50992 (0.0010) +[2023-10-11 17:05:26,720][85175] Updated weights for policy 1, policy_version 51760 (0.0007) +[2023-10-11 17:05:26,763][85176] Updated weights for policy 0, policy_version 51002 (0.0007) +[2023-10-11 17:05:26,984][84801] Saving new best policy, reward=10.140! +[2023-10-11 17:05:27,087][85175] Updated weights for policy 1, policy_version 51770 (0.0008) +[2023-10-11 17:05:30,842][85176] Updated weights for policy 0, policy_version 51012 (0.0008) +[2023-10-11 17:05:31,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 105250816. Throughput: 0: 1661.5, 1: 1712.3. Samples: 26329956. 
Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) +[2023-10-11 17:05:31,063][84230] Avg episode reward: [(0, '10.460'), (1, '38.160')] +[2023-10-11 17:05:31,066][85175] Updated weights for policy 1, policy_version 51780 (0.0007) +[2023-10-11 17:05:31,214][85176] Updated weights for policy 0, policy_version 51022 (0.0008) +[2023-10-11 17:05:31,447][85175] Updated weights for policy 1, policy_version 51790 (0.0009) +[2023-10-11 17:05:31,577][85176] Updated weights for policy 0, policy_version 51032 (0.0007) +[2023-10-11 17:05:31,809][85175] Updated weights for policy 1, policy_version 51800 (0.0007) +[2023-10-11 17:05:31,874][84801] Saving new best policy, reward=10.460! +[2023-10-11 17:05:35,641][85176] Updated weights for policy 0, policy_version 51042 (0.0009) +[2023-10-11 17:05:35,796][85175] Updated weights for policy 1, policy_version 51810 (0.0007) +[2023-10-11 17:05:36,005][85176] Updated weights for policy 0, policy_version 51052 (0.0007) +[2023-10-11 17:05:36,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 105316352. Throughput: 0: 1660.9, 1: 1708.0. Samples: 26339098. 
Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) +[2023-10-11 17:05:36,064][84230] Avg episode reward: [(0, '9.200'), (1, '39.410')] +[2023-10-11 17:05:36,154][85175] Updated weights for policy 1, policy_version 51820 (0.0008) +[2023-10-11 17:05:36,386][85176] Updated weights for policy 0, policy_version 51062 (0.0009) +[2023-10-11 17:05:36,522][85175] Updated weights for policy 1, policy_version 51830 (0.0009) +[2023-10-11 17:05:36,756][85176] Updated weights for policy 0, policy_version 51072 (0.0009) +[2023-10-11 17:05:36,892][85175] Updated weights for policy 1, policy_version 51840 (0.0010) +[2023-10-11 17:05:40,926][85176] Updated weights for policy 0, policy_version 51082 (0.0007) +[2023-10-11 17:05:40,962][85175] Updated weights for policy 1, policy_version 51850 (0.0008) +[2023-10-11 17:05:41,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 105381888. Throughput: 0: 1665.1, 1: 1713.8. Samples: 26359898. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:05:41,063][84230] Avg episode reward: [(0, '7.260'), (1, '39.990')] +[2023-10-11 17:05:41,297][85176] Updated weights for policy 0, policy_version 51092 (0.0007) +[2023-10-11 17:05:41,325][85175] Updated weights for policy 1, policy_version 51860 (0.0009) +[2023-10-11 17:05:41,659][85176] Updated weights for policy 0, policy_version 51102 (0.0009) +[2023-10-11 17:05:41,680][85175] Updated weights for policy 1, policy_version 51870 (0.0007) +[2023-10-11 17:05:45,652][85175] Updated weights for policy 1, policy_version 51880 (0.0008) +[2023-10-11 17:05:45,804][85176] Updated weights for policy 0, policy_version 51112 (0.0009) +[2023-10-11 17:05:46,009][85175] Updated weights for policy 1, policy_version 51890 (0.0007) +[2023-10-11 17:05:46,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 105447424. Throughput: 0: 1663.6, 1: 1712.1. Samples: 26380588. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:05:46,063][84230] Avg episode reward: [(0, '7.370'), (1, '39.610')] +[2023-10-11 17:05:46,180][85176] Updated weights for policy 0, policy_version 51122 (0.0008) +[2023-10-11 17:05:46,380][85175] Updated weights for policy 1, policy_version 51900 (0.0007) +[2023-10-11 17:05:46,552][85176] Updated weights for policy 0, policy_version 51132 (0.0008) +[2023-10-11 17:05:50,396][85175] Updated weights for policy 1, policy_version 51910 (0.0008) +[2023-10-11 17:05:50,668][85176] Updated weights for policy 0, policy_version 51142 (0.0008) +[2023-10-11 17:05:50,769][85175] Updated weights for policy 1, policy_version 51920 (0.0008) +[2023-10-11 17:05:51,033][85176] Updated weights for policy 0, policy_version 51152 (0.0008) +[2023-10-11 17:05:51,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 105512960. Throughput: 0: 1665.3, 1: 1715.3. Samples: 26389834. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:05:51,064][84230] Avg episode reward: [(0, '7.800'), (1, '38.540')] +[2023-10-11 17:05:51,141][85175] Updated weights for policy 1, policy_version 51930 (0.0008) +[2023-10-11 17:05:51,410][85176] Updated weights for policy 0, policy_version 51162 (0.0008) +[2023-10-11 17:05:55,113][85175] Updated weights for policy 1, policy_version 51940 (0.0009) +[2023-10-11 17:05:55,483][85175] Updated weights for policy 1, policy_version 51950 (0.0010) +[2023-10-11 17:05:55,575][85176] Updated weights for policy 0, policy_version 51172 (0.0009) +[2023-10-11 17:05:55,843][85175] Updated weights for policy 1, policy_version 51960 (0.0007) +[2023-10-11 17:05:55,953][85176] Updated weights for policy 0, policy_version 51182 (0.0008) +[2023-10-11 17:05:56,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 105578496. Throughput: 0: 1662.7, 1: 1706.3. Samples: 26410412. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:05:56,063][84230] Avg episode reward: [(0, '7.750'), (1, '40.890')] +[2023-10-11 17:05:56,324][85176] Updated weights for policy 0, policy_version 51192 (0.0008) +[2023-10-11 17:05:59,892][85175] Updated weights for policy 1, policy_version 51970 (0.0007) +[2023-10-11 17:06:00,271][85175] Updated weights for policy 1, policy_version 51980 (0.0009) +[2023-10-11 17:06:00,443][85176] Updated weights for policy 0, policy_version 51202 (0.0010) +[2023-10-11 17:06:00,630][85175] Updated weights for policy 1, policy_version 51990 (0.0007) +[2023-10-11 17:06:00,809][85176] Updated weights for policy 0, policy_version 51212 (0.0008) +[2023-10-11 17:06:01,008][85175] Updated weights for policy 1, policy_version 52000 (0.0008) +[2023-10-11 17:06:01,063][84230] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 105676800. Throughput: 0: 1658.3, 1: 1684.4. Samples: 26430308. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:06:01,064][84230] Avg episode reward: [(0, '7.860'), (1, '38.400')] +[2023-10-11 17:06:01,182][85176] Updated weights for policy 0, policy_version 51222 (0.0008) +[2023-10-11 17:06:01,554][85176] Updated weights for policy 0, policy_version 51232 (0.0010) +[2023-10-11 17:06:05,192][85175] Updated weights for policy 1, policy_version 52010 (0.0008) +[2023-10-11 17:06:05,562][85175] Updated weights for policy 1, policy_version 52020 (0.0007) +[2023-10-11 17:06:05,598][85176] Updated weights for policy 0, policy_version 51242 (0.0008) +[2023-10-11 17:06:05,926][85175] Updated weights for policy 1, policy_version 52030 (0.0007) +[2023-10-11 17:06:05,976][85176] Updated weights for policy 0, policy_version 51252 (0.0009) +[2023-10-11 17:06:06,062][84230] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.5). Total num frames: 105742336. Throughput: 0: 1657.7, 1: 1702.3. Samples: 26440156. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:06:06,063][84230] Avg episode reward: [(0, '7.560'), (1, '41.190')] +[2023-10-11 17:06:06,358][85176] Updated weights for policy 0, policy_version 51262 (0.0007) +[2023-10-11 17:06:10,012][85175] Updated weights for policy 1, policy_version 52040 (0.0010) +[2023-10-11 17:06:10,220][85176] Updated weights for policy 0, policy_version 51272 (0.0008) +[2023-10-11 17:06:10,396][85175] Updated weights for policy 1, policy_version 52050 (0.0008) +[2023-10-11 17:06:10,592][85176] Updated weights for policy 0, policy_version 51282 (0.0008) +[2023-10-11 17:06:10,762][85175] Updated weights for policy 1, policy_version 52060 (0.0007) +[2023-10-11 17:06:10,961][85176] Updated weights for policy 0, policy_version 51292 (0.0007) +[2023-10-11 17:06:11,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 105807872. Throughput: 0: 1664.9, 1: 1709.5. Samples: 26461036. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:06:11,063][84230] Avg episode reward: [(0, '7.300'), (1, '38.390')] +[2023-10-11 17:06:14,661][85175] Updated weights for policy 1, policy_version 52070 (0.0008) +[2023-10-11 17:06:15,027][85175] Updated weights for policy 1, policy_version 52080 (0.0008) +[2023-10-11 17:06:15,124][85176] Updated weights for policy 0, policy_version 51302 (0.0008) +[2023-10-11 17:06:15,396][85175] Updated weights for policy 1, policy_version 52090 (0.0009) +[2023-10-11 17:06:15,496][85176] Updated weights for policy 0, policy_version 51312 (0.0008) +[2023-10-11 17:06:15,869][85176] Updated weights for policy 0, policy_version 51322 (0.0009) +[2023-10-11 17:06:16,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 105873408. Throughput: 0: 1649.1, 1: 1678.8. Samples: 26479710. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:06:16,063][84230] Avg episode reward: [(0, '7.150'), (1, '40.850')] +[2023-10-11 17:06:16,070][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000052096_53346304.pth... +[2023-10-11 17:06:16,088][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000051328_52559872.pth... +[2023-10-11 17:06:16,111][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000050496_51707904.pth +[2023-10-11 17:06:16,117][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000049760_50954240.pth +[2023-10-11 17:06:19,378][85175] Updated weights for policy 1, policy_version 52100 (0.0010) +[2023-10-11 17:06:19,741][85175] Updated weights for policy 1, policy_version 52110 (0.0008) +[2023-10-11 17:06:19,951][85176] Updated weights for policy 0, policy_version 51332 (0.0009) +[2023-10-11 17:06:20,116][85175] Updated weights for policy 1, policy_version 52120 (0.0007) +[2023-10-11 17:06:20,321][85176] Updated weights for policy 0, policy_version 51342 (0.0008) +[2023-10-11 17:06:20,696][85176] Updated weights for policy 0, policy_version 51352 (0.0009) +[2023-10-11 17:06:21,063][84230] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 105971712. Throughput: 0: 1663.7, 1: 1702.9. Samples: 26490596. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:06:21,064][84230] Avg episode reward: [(0, '9.010'), (1, '39.160')] +[2023-10-11 17:06:24,043][85175] Updated weights for policy 1, policy_version 52130 (0.0010) +[2023-10-11 17:06:24,421][85175] Updated weights for policy 1, policy_version 52140 (0.0009) +[2023-10-11 17:06:24,798][85175] Updated weights for policy 1, policy_version 52150 (0.0007) +[2023-10-11 17:06:24,861][85176] Updated weights for policy 0, policy_version 51362 (0.0007) +[2023-10-11 17:06:25,168][85175] Updated weights for policy 1, policy_version 52160 (0.0009) +[2023-10-11 17:06:25,220][85176] Updated weights for policy 0, policy_version 51372 (0.0008) +[2023-10-11 17:06:25,598][85176] Updated weights for policy 0, policy_version 51382 (0.0011) +[2023-10-11 17:06:25,960][85176] Updated weights for policy 0, policy_version 51392 (0.0011) +[2023-10-11 17:06:26,062][84230] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 106037248. Throughput: 0: 1664.3, 1: 1696.1. Samples: 26511118. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:06:26,063][84230] Avg episode reward: [(0, '9.420'), (1, '38.740')] +[2023-10-11 17:06:29,032][85175] Updated weights for policy 1, policy_version 52170 (0.0009) +[2023-10-11 17:06:29,391][85175] Updated weights for policy 1, policy_version 52180 (0.0008) +[2023-10-11 17:06:29,754][85175] Updated weights for policy 1, policy_version 52190 (0.0009) +[2023-10-11 17:06:30,256][85176] Updated weights for policy 0, policy_version 51402 (0.0009) +[2023-10-11 17:06:30,625][85176] Updated weights for policy 0, policy_version 51412 (0.0011) +[2023-10-11 17:06:30,991][85176] Updated weights for policy 0, policy_version 51422 (0.0010) +[2023-10-11 17:06:31,063][84230] Fps is (10 sec: 9830.5, 60 sec: 13653.3, 300 sec: 13329.3). Total num frames: 106070016. Throughput: 0: 1651.7, 1: 1684.5. Samples: 26530716. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:06:31,063][84230] Avg episode reward: [(0, '8.610'), (1, '38.230')] +[2023-10-11 17:06:33,875][85175] Updated weights for policy 1, policy_version 52200 (0.0009) +[2023-10-11 17:06:34,235][85175] Updated weights for policy 1, policy_version 52210 (0.0008) +[2023-10-11 17:06:34,601][85175] Updated weights for policy 1, policy_version 52220 (0.0008) +[2023-10-11 17:06:35,008][85176] Updated weights for policy 0, policy_version 51432 (0.0008) +[2023-10-11 17:06:35,386][85176] Updated weights for policy 0, policy_version 51442 (0.0009) +[2023-10-11 17:06:35,764][85176] Updated weights for policy 0, policy_version 51452 (0.0007) +[2023-10-11 17:06:36,063][84230] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 106168320. Throughput: 0: 1669.8, 1: 1708.9. Samples: 26541874. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:06:36,063][84230] Avg episode reward: [(0, '9.550'), (1, '38.360')] +[2023-10-11 17:06:38,586][85175] Updated weights for policy 1, policy_version 52230 (0.0009) +[2023-10-11 17:06:38,956][85175] Updated weights for policy 1, policy_version 52240 (0.0007) +[2023-10-11 17:06:39,321][85175] Updated weights for policy 1, policy_version 52250 (0.0007) +[2023-10-11 17:06:39,887][85176] Updated weights for policy 0, policy_version 51462 (0.0008) +[2023-10-11 17:06:40,262][85176] Updated weights for policy 0, policy_version 51472 (0.0007) +[2023-10-11 17:06:40,632][85176] Updated weights for policy 0, policy_version 51482 (0.0008) +[2023-10-11 17:06:41,063][84230] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 106233856. Throughput: 0: 1671.0, 1: 1685.9. Samples: 26561474. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:06:41,064][84230] Avg episode reward: [(0, '9.320'), (1, '40.080')] +[2023-10-11 17:06:43,339][85175] Updated weights for policy 1, policy_version 52260 (0.0009) +[2023-10-11 17:06:43,703][85175] Updated weights for policy 1, policy_version 52270 (0.0009) +[2023-10-11 17:06:44,068][85175] Updated weights for policy 1, policy_version 52280 (0.0009) +[2023-10-11 17:06:44,641][85176] Updated weights for policy 0, policy_version 51492 (0.0008) +[2023-10-11 17:06:45,016][85176] Updated weights for policy 0, policy_version 51502 (0.0011) +[2023-10-11 17:06:45,382][85176] Updated weights for policy 0, policy_version 51512 (0.0011) +[2023-10-11 17:06:46,063][84230] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 106299392. Throughput: 0: 1652.6, 1: 1697.0. Samples: 26581038. Policy #0 lag: (min: 11.0, avg: 15.1, max: 43.0) +[2023-10-11 17:06:46,064][84230] Avg episode reward: [(0, '7.560'), (1, '41.200')] +[2023-10-11 17:06:48,224][85175] Updated weights for policy 1, policy_version 52290 (0.0010) +[2023-10-11 17:06:48,591][85175] Updated weights for policy 1, policy_version 52300 (0.0008) +[2023-10-11 17:06:48,953][85175] Updated weights for policy 1, policy_version 52310 (0.0008) +[2023-10-11 17:06:49,319][85175] Updated weights for policy 1, policy_version 52320 (0.0011) +[2023-10-11 17:06:49,473][85176] Updated weights for policy 0, policy_version 51522 (0.0008) +[2023-10-11 17:06:49,839][85176] Updated weights for policy 0, policy_version 51532 (0.0010) +[2023-10-11 17:06:50,213][85176] Updated weights for policy 0, policy_version 51542 (0.0009) +[2023-10-11 17:06:50,583][85176] Updated weights for policy 0, policy_version 51552 (0.0009) +[2023-10-11 17:06:51,063][84230] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 106364928. Throughput: 0: 1677.9, 1: 1696.5. Samples: 26592008. 
Policy #0 lag: (min: 11.0, avg: 15.1, max: 43.0) +[2023-10-11 17:06:51,064][84230] Avg episode reward: [(0, '8.600'), (1, '36.390')] +[2023-10-11 17:06:53,358][85175] Updated weights for policy 1, policy_version 52330 (0.0008) +[2023-10-11 17:06:53,726][85175] Updated weights for policy 1, policy_version 52340 (0.0008) +[2023-10-11 17:06:54,094][85175] Updated weights for policy 1, policy_version 52350 (0.0008) +[2023-10-11 17:06:54,655][85176] Updated weights for policy 0, policy_version 51562 (0.0008) +[2023-10-11 17:06:55,028][85176] Updated weights for policy 0, policy_version 51572 (0.0008) +[2023-10-11 17:06:55,408][85176] Updated weights for policy 0, policy_version 51582 (0.0009) +[2023-10-11 17:06:56,063][84230] Fps is (10 sec: 13107.5, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 106430464. Throughput: 0: 1668.4, 1: 1679.3. Samples: 26611686. Policy #0 lag: (min: 11.0, avg: 15.1, max: 43.0) +[2023-10-11 17:06:56,064][84230] Avg episode reward: [(0, '10.050'), (1, '36.350')] +[2023-10-11 17:06:58,056][85175] Updated weights for policy 1, policy_version 52360 (0.0008) +[2023-10-11 17:06:58,438][85175] Updated weights for policy 1, policy_version 52370 (0.0010) +[2023-10-11 17:06:58,801][85175] Updated weights for policy 1, policy_version 52380 (0.0007) +[2023-10-11 17:06:59,442][85176] Updated weights for policy 0, policy_version 51592 (0.0010) +[2023-10-11 17:06:59,813][85176] Updated weights for policy 0, policy_version 51602 (0.0010) +[2023-10-11 17:07:00,188][85176] Updated weights for policy 0, policy_version 51612 (0.0008) +[2023-10-11 17:07:01,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 106496000. Throughput: 0: 1666.0, 1: 1708.0. Samples: 26631540. Policy #0 lag: (min: 11.0, avg: 15.1, max: 43.0) +[2023-10-11 17:07:01,064][84230] Avg episode reward: [(0, '10.700'), (1, '37.540')] +[2023-10-11 17:07:01,073][84801] Saving new best policy, reward=10.700! 
+[2023-10-11 17:07:02,766][85175] Updated weights for policy 1, policy_version 52390 (0.0008) +[2023-10-11 17:07:03,123][85175] Updated weights for policy 1, policy_version 52400 (0.0010) +[2023-10-11 17:07:03,496][85175] Updated weights for policy 1, policy_version 52410 (0.0008) +[2023-10-11 17:07:04,218][85176] Updated weights for policy 0, policy_version 51622 (0.0009) +[2023-10-11 17:07:04,597][85176] Updated weights for policy 0, policy_version 51632 (0.0009) +[2023-10-11 17:07:04,974][85176] Updated weights for policy 0, policy_version 51642 (0.0010) +[2023-10-11 17:07:06,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 106561536. Throughput: 0: 1681.6, 1: 1688.4. Samples: 26642246. Policy #0 lag: (min: 11.0, avg: 15.1, max: 43.0) +[2023-10-11 17:07:06,064][84230] Avg episode reward: [(0, '10.430'), (1, '40.440')] +[2023-10-11 17:07:07,465][85175] Updated weights for policy 1, policy_version 52420 (0.0007) +[2023-10-11 17:07:07,837][85175] Updated weights for policy 1, policy_version 52430 (0.0007) +[2023-10-11 17:07:08,197][85175] Updated weights for policy 1, policy_version 52440 (0.0010) +[2023-10-11 17:07:08,985][85176] Updated weights for policy 0, policy_version 51652 (0.0009) +[2023-10-11 17:07:09,343][85176] Updated weights for policy 0, policy_version 51662 (0.0011) +[2023-10-11 17:07:09,713][85176] Updated weights for policy 0, policy_version 51672 (0.0008) +[2023-10-11 17:07:11,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 106627072. Throughput: 0: 1666.0, 1: 1694.4. Samples: 26662334. 
Policy #0 lag: (min: 11.0, avg: 15.1, max: 43.0) +[2023-10-11 17:07:11,063][84230] Avg episode reward: [(0, '9.180'), (1, '39.370')] +[2023-10-11 17:07:12,348][85175] Updated weights for policy 1, policy_version 52450 (0.0008) +[2023-10-11 17:07:12,716][85175] Updated weights for policy 1, policy_version 52460 (0.0009) +[2023-10-11 17:07:13,093][85175] Updated weights for policy 1, policy_version 52470 (0.0009) +[2023-10-11 17:07:13,462][85175] Updated weights for policy 1, policy_version 52480 (0.0008) +[2023-10-11 17:07:13,912][85176] Updated weights for policy 0, policy_version 51682 (0.0008) +[2023-10-11 17:07:14,284][85176] Updated weights for policy 0, policy_version 51692 (0.0008) +[2023-10-11 17:07:14,654][85176] Updated weights for policy 0, policy_version 51702 (0.0009) +[2023-10-11 17:07:15,022][85176] Updated weights for policy 0, policy_version 51712 (0.0009) +[2023-10-11 17:07:16,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 106692608. Throughput: 0: 1665.0, 1: 1708.1. Samples: 26682506. Policy #0 lag: (min: 11.0, avg: 15.1, max: 43.0) +[2023-10-11 17:07:16,064][84230] Avg episode reward: [(0, '9.770'), (1, '36.720')] +[2023-10-11 17:07:17,378][85175] Updated weights for policy 1, policy_version 52490 (0.0008) +[2023-10-11 17:07:17,744][85175] Updated weights for policy 1, policy_version 52500 (0.0007) +[2023-10-11 17:07:18,109][85175] Updated weights for policy 1, policy_version 52510 (0.0007) +[2023-10-11 17:07:19,110][85176] Updated weights for policy 0, policy_version 51722 (0.0007) +[2023-10-11 17:07:19,476][85176] Updated weights for policy 0, policy_version 51732 (0.0011) +[2023-10-11 17:07:19,843][85176] Updated weights for policy 0, policy_version 51742 (0.0011) +[2023-10-11 17:07:21,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 106758144. Throughput: 0: 1677.1, 1: 1683.7. Samples: 26693108. 
Policy #0 lag: (min: 1.0, avg: 8.8, max: 33.0) +[2023-10-11 17:07:21,063][84230] Avg episode reward: [(0, '10.810'), (1, '38.630')] +[2023-10-11 17:07:21,063][84801] Saving new best policy, reward=10.810! +[2023-10-11 17:07:22,074][85175] Updated weights for policy 1, policy_version 52520 (0.0007) +[2023-10-11 17:07:22,436][85175] Updated weights for policy 1, policy_version 52530 (0.0009) +[2023-10-11 17:07:22,802][85175] Updated weights for policy 1, policy_version 52540 (0.0008) +[2023-10-11 17:07:23,880][85176] Updated weights for policy 0, policy_version 51752 (0.0009) +[2023-10-11 17:07:24,257][85176] Updated weights for policy 0, policy_version 51762 (0.0009) +[2023-10-11 17:07:24,633][85176] Updated weights for policy 0, policy_version 51772 (0.0010) +[2023-10-11 17:07:26,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 106823680. Throughput: 0: 1653.9, 1: 1707.8. Samples: 26712752. Policy #0 lag: (min: 1.0, avg: 8.8, max: 33.0) +[2023-10-11 17:07:26,064][84230] Avg episode reward: [(0, '10.450'), (1, '40.010')] +[2023-10-11 17:07:26,932][85175] Updated weights for policy 1, policy_version 52550 (0.0010) +[2023-10-11 17:07:27,303][85175] Updated weights for policy 1, policy_version 52560 (0.0008) +[2023-10-11 17:07:27,681][85175] Updated weights for policy 1, policy_version 52570 (0.0009) +[2023-10-11 17:07:28,574][85176] Updated weights for policy 0, policy_version 51782 (0.0008) +[2023-10-11 17:07:28,950][85176] Updated weights for policy 0, policy_version 51792 (0.0009) +[2023-10-11 17:07:29,319][85176] Updated weights for policy 0, policy_version 51802 (0.0010) +[2023-10-11 17:07:31,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 106889216. Throughput: 0: 1671.1, 1: 1707.5. Samples: 26733076. 
Policy #0 lag: (min: 1.0, avg: 8.8, max: 33.0) +[2023-10-11 17:07:31,063][84230] Avg episode reward: [(0, '10.240'), (1, '41.730')] +[2023-10-11 17:07:31,800][85175] Updated weights for policy 1, policy_version 52580 (0.0008) +[2023-10-11 17:07:32,177][85175] Updated weights for policy 1, policy_version 52590 (0.0009) +[2023-10-11 17:07:32,536][85175] Updated weights for policy 1, policy_version 52600 (0.0009) +[2023-10-11 17:07:33,509][85176] Updated weights for policy 0, policy_version 51812 (0.0008) +[2023-10-11 17:07:33,892][85176] Updated weights for policy 0, policy_version 51822 (0.0011) +[2023-10-11 17:07:34,259][85176] Updated weights for policy 0, policy_version 51832 (0.0008) +[2023-10-11 17:07:36,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 106954752. Throughput: 0: 1667.8, 1: 1690.4. Samples: 26743128. Policy #0 lag: (min: 1.0, avg: 8.8, max: 33.0) +[2023-10-11 17:07:36,063][84230] Avg episode reward: [(0, '10.800'), (1, '38.180')] +[2023-10-11 17:07:36,445][85175] Updated weights for policy 1, policy_version 52610 (0.0008) +[2023-10-11 17:07:36,822][85175] Updated weights for policy 1, policy_version 52620 (0.0008) +[2023-10-11 17:07:37,193][85175] Updated weights for policy 1, policy_version 52630 (0.0009) +[2023-10-11 17:07:37,562][85175] Updated weights for policy 1, policy_version 52640 (0.0008) +[2023-10-11 17:07:38,406][85176] Updated weights for policy 0, policy_version 51842 (0.0008) +[2023-10-11 17:07:38,785][85176] Updated weights for policy 0, policy_version 51852 (0.0008) +[2023-10-11 17:07:39,156][85176] Updated weights for policy 0, policy_version 51862 (0.0010) +[2023-10-11 17:07:39,537][85176] Updated weights for policy 0, policy_version 51872 (0.0010) +[2023-10-11 17:07:41,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 107020288. Throughput: 0: 1656.4, 1: 1713.8. Samples: 26763344. 
Policy #0 lag: (min: 1.0, avg: 8.8, max: 33.0) +[2023-10-11 17:07:41,064][84230] Avg episode reward: [(0, '10.130'), (1, '40.570')] +[2023-10-11 17:07:41,553][85175] Updated weights for policy 1, policy_version 52650 (0.0008) +[2023-10-11 17:07:41,929][85175] Updated weights for policy 1, policy_version 52660 (0.0009) +[2023-10-11 17:07:42,290][85175] Updated weights for policy 1, policy_version 52670 (0.0008) +[2023-10-11 17:07:43,570][85176] Updated weights for policy 0, policy_version 51882 (0.0007) +[2023-10-11 17:07:43,941][85176] Updated weights for policy 0, policy_version 51892 (0.0008) +[2023-10-11 17:07:44,323][85176] Updated weights for policy 0, policy_version 51902 (0.0010) +[2023-10-11 17:07:46,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 107085824. Throughput: 0: 1682.8, 1: 1711.7. Samples: 26784294. Policy #0 lag: (min: 1.0, avg: 8.8, max: 33.0) +[2023-10-11 17:07:46,063][84230] Avg episode reward: [(0, '11.210'), (1, '39.790')] +[2023-10-11 17:07:46,071][84801] Saving new best policy, reward=11.210! +[2023-10-11 17:07:46,199][85175] Updated weights for policy 1, policy_version 52680 (0.0008) +[2023-10-11 17:07:46,572][85175] Updated weights for policy 1, policy_version 52690 (0.0007) +[2023-10-11 17:07:46,932][85175] Updated weights for policy 1, policy_version 52700 (0.0010) +[2023-10-11 17:07:48,574][85176] Updated weights for policy 0, policy_version 51912 (0.0008) +[2023-10-11 17:07:48,957][85176] Updated weights for policy 0, policy_version 51922 (0.0008) +[2023-10-11 17:07:49,329][85176] Updated weights for policy 0, policy_version 51932 (0.0007) +[2023-10-11 17:07:50,884][85175] Updated weights for policy 1, policy_version 52710 (0.0008) +[2023-10-11 17:07:51,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 107151360. Throughput: 0: 1672.4, 1: 1701.7. Samples: 26794078. 
Policy #0 lag: (min: 1.0, avg: 8.8, max: 33.0) +[2023-10-11 17:07:51,063][84230] Avg episode reward: [(0, '12.970'), (1, '41.750')] +[2023-10-11 17:07:51,064][84801] Saving new best policy, reward=12.970! +[2023-10-11 17:07:51,251][85175] Updated weights for policy 1, policy_version 52720 (0.0008) +[2023-10-11 17:07:51,623][85175] Updated weights for policy 1, policy_version 52730 (0.0008) +[2023-10-11 17:07:53,381][85176] Updated weights for policy 0, policy_version 51942 (0.0007) +[2023-10-11 17:07:53,758][85176] Updated weights for policy 0, policy_version 51952 (0.0008) +[2023-10-11 17:07:54,133][85176] Updated weights for policy 0, policy_version 51962 (0.0011) +[2023-10-11 17:07:55,782][85175] Updated weights for policy 1, policy_version 52740 (0.0008) +[2023-10-11 17:07:56,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 107216896. Throughput: 0: 1661.2, 1: 1700.8. Samples: 26813626. Policy #0 lag: (min: 8.0, avg: 32.1, max: 40.0) +[2023-10-11 17:07:56,063][84230] Avg episode reward: [(0, '15.060'), (1, '36.070')] +[2023-10-11 17:07:56,064][84801] Saving new best policy, reward=15.060! +[2023-10-11 17:07:56,152][85175] Updated weights for policy 1, policy_version 52750 (0.0008) +[2023-10-11 17:07:56,519][85175] Updated weights for policy 1, policy_version 52760 (0.0010) +[2023-10-11 17:07:58,264][85176] Updated weights for policy 0, policy_version 51972 (0.0010) +[2023-10-11 17:07:58,634][85176] Updated weights for policy 0, policy_version 51982 (0.0009) +[2023-10-11 17:07:59,006][85176] Updated weights for policy 0, policy_version 51992 (0.0008) +[2023-10-11 17:08:00,461][85175] Updated weights for policy 1, policy_version 52770 (0.0010) +[2023-10-11 17:08:00,832][85175] Updated weights for policy 1, policy_version 52780 (0.0010) +[2023-10-11 17:08:01,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 107282432. Throughput: 0: 1675.1, 1: 1702.9. Samples: 26834518. 
Policy #0 lag: (min: 8.0, avg: 32.1, max: 40.0) +[2023-10-11 17:08:01,063][84230] Avg episode reward: [(0, '15.590'), (1, '40.210')] +[2023-10-11 17:08:01,073][84801] Saving new best policy, reward=15.590! +[2023-10-11 17:08:01,195][85175] Updated weights for policy 1, policy_version 52790 (0.0010) +[2023-10-11 17:08:01,566][85175] Updated weights for policy 1, policy_version 52800 (0.0009) +[2023-10-11 17:08:03,087][85176] Updated weights for policy 0, policy_version 52002 (0.0008) +[2023-10-11 17:08:03,458][85176] Updated weights for policy 0, policy_version 52012 (0.0007) +[2023-10-11 17:08:03,829][85176] Updated weights for policy 0, policy_version 52022 (0.0007) +[2023-10-11 17:08:04,192][85176] Updated weights for policy 0, policy_version 52032 (0.0008) +[2023-10-11 17:08:05,601][85175] Updated weights for policy 1, policy_version 52810 (0.0007) +[2023-10-11 17:08:05,968][85175] Updated weights for policy 1, policy_version 52820 (0.0010) +[2023-10-11 17:08:06,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 107347968. Throughput: 0: 1660.0, 1: 1703.8. Samples: 26844480. Policy #0 lag: (min: 8.0, avg: 32.1, max: 40.0) +[2023-10-11 17:08:06,063][84230] Avg episode reward: [(0, '16.470'), (1, '37.940')] +[2023-10-11 17:08:06,064][84801] Saving new best policy, reward=16.470! +[2023-10-11 17:08:06,337][85175] Updated weights for policy 1, policy_version 52830 (0.0009) +[2023-10-11 17:08:08,352][85176] Updated weights for policy 0, policy_version 52042 (0.0007) +[2023-10-11 17:08:08,722][85176] Updated weights for policy 0, policy_version 52052 (0.0009) +[2023-10-11 17:08:09,097][85176] Updated weights for policy 0, policy_version 52062 (0.0007) +[2023-10-11 17:08:10,502][85175] Updated weights for policy 1, policy_version 52840 (0.0011) +[2023-10-11 17:08:10,870][85175] Updated weights for policy 1, policy_version 52850 (0.0010) +[2023-10-11 17:08:11,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). 
Total num frames: 107413504. Throughput: 0: 1664.6, 1: 1699.4. Samples: 26864132. Policy #0 lag: (min: 8.0, avg: 32.1, max: 40.0) +[2023-10-11 17:08:11,063][84230] Avg episode reward: [(0, '17.700'), (1, '41.300')] +[2023-10-11 17:08:11,064][84801] Saving new best policy, reward=17.700! +[2023-10-11 17:08:11,254][85175] Updated weights for policy 1, policy_version 52860 (0.0011) +[2023-10-11 17:08:12,998][85176] Updated weights for policy 0, policy_version 52072 (0.0009) +[2023-10-11 17:08:13,376][85176] Updated weights for policy 0, policy_version 52082 (0.0008) +[2023-10-11 17:08:13,746][85176] Updated weights for policy 0, policy_version 52092 (0.0009) +[2023-10-11 17:08:15,302][85175] Updated weights for policy 1, policy_version 52870 (0.0010) +[2023-10-11 17:08:15,672][85175] Updated weights for policy 1, policy_version 52880 (0.0010) +[2023-10-11 17:08:16,044][85175] Updated weights for policy 1, policy_version 52890 (0.0009) +[2023-10-11 17:08:16,063][84230] Fps is (10 sec: 13106.7, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 107479040. Throughput: 0: 1675.6, 1: 1690.7. Samples: 26884560. Policy #0 lag: (min: 8.0, avg: 32.1, max: 40.0) +[2023-10-11 17:08:16,064][84230] Avg episode reward: [(0, '17.510'), (1, '40.020')] +[2023-10-11 17:08:16,074][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000052096_53346304.pth... +[2023-10-11 17:08:16,111][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000050528_51740672.pth +[2023-10-11 17:08:16,257][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000052896_54165504.pth... 
+[2023-10-11 17:08:16,285][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000051296_52527104.pth +[2023-10-11 17:08:17,867][85176] Updated weights for policy 0, policy_version 52102 (0.0008) +[2023-10-11 17:08:18,237][85176] Updated weights for policy 0, policy_version 52112 (0.0007) +[2023-10-11 17:08:18,619][85176] Updated weights for policy 0, policy_version 52122 (0.0007) +[2023-10-11 17:08:20,027][85175] Updated weights for policy 1, policy_version 52900 (0.0008) +[2023-10-11 17:08:20,399][85175] Updated weights for policy 1, policy_version 52910 (0.0007) +[2023-10-11 17:08:20,759][85175] Updated weights for policy 1, policy_version 52920 (0.0009) +[2023-10-11 17:08:21,062][84230] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 107577344. Throughput: 0: 1659.5, 1: 1703.1. Samples: 26894446. Policy #0 lag: (min: 8.0, avg: 32.1, max: 40.0) +[2023-10-11 17:08:21,063][84230] Avg episode reward: [(0, '17.820'), (1, '40.630')] +[2023-10-11 17:08:21,064][84801] Saving new best policy, reward=17.820! +[2023-10-11 17:08:22,662][85176] Updated weights for policy 0, policy_version 52132 (0.0008) +[2023-10-11 17:08:23,032][85176] Updated weights for policy 0, policy_version 52142 (0.0009) +[2023-10-11 17:08:23,398][85176] Updated weights for policy 0, policy_version 52152 (0.0007) +[2023-10-11 17:08:24,984][85175] Updated weights for policy 1, policy_version 52930 (0.0009) +[2023-10-11 17:08:25,348][85175] Updated weights for policy 1, policy_version 52940 (0.0008) +[2023-10-11 17:08:25,712][85175] Updated weights for policy 1, policy_version 52950 (0.0008) +[2023-10-11 17:08:26,062][84230] Fps is (10 sec: 13107.8, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 107610112. Throughput: 0: 1669.4, 1: 1690.6. Samples: 26914544. 
Policy #0 lag: (min: 8.0, avg: 32.1, max: 40.0) +[2023-10-11 17:08:26,063][84230] Avg episode reward: [(0, '17.610'), (1, '40.670')] +[2023-10-11 17:08:26,074][85175] Updated weights for policy 1, policy_version 52960 (0.0008) +[2023-10-11 17:08:27,428][85176] Updated weights for policy 0, policy_version 52162 (0.0008) +[2023-10-11 17:08:27,797][85176] Updated weights for policy 0, policy_version 52172 (0.0009) +[2023-10-11 17:08:28,170][85176] Updated weights for policy 0, policy_version 52182 (0.0007) +[2023-10-11 17:08:28,540][85176] Updated weights for policy 0, policy_version 52192 (0.0009) +[2023-10-11 17:08:30,170][85175] Updated weights for policy 1, policy_version 52970 (0.0008) +[2023-10-11 17:08:30,538][85175] Updated weights for policy 1, policy_version 52980 (0.0007) +[2023-10-11 17:08:30,915][85175] Updated weights for policy 1, policy_version 52990 (0.0009) +[2023-10-11 17:08:31,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 107708416. Throughput: 0: 1670.8, 1: 1672.6. Samples: 26934750. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:08:31,064][84230] Avg episode reward: [(0, '18.310'), (1, '39.450')] +[2023-10-11 17:08:31,075][84801] Saving new best policy, reward=18.310! +[2023-10-11 17:08:32,631][85176] Updated weights for policy 0, policy_version 52202 (0.0007) +[2023-10-11 17:08:33,018][85176] Updated weights for policy 0, policy_version 52212 (0.0008) +[2023-10-11 17:08:33,399][85176] Updated weights for policy 0, policy_version 52222 (0.0007) +[2023-10-11 17:08:34,910][85175] Updated weights for policy 1, policy_version 53000 (0.0009) +[2023-10-11 17:08:35,295][85175] Updated weights for policy 1, policy_version 53010 (0.0007) +[2023-10-11 17:08:35,659][85175] Updated weights for policy 1, policy_version 53020 (0.0007) +[2023-10-11 17:08:36,063][84230] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 107773952. Throughput: 0: 1651.9, 1: 1693.4. 
Samples: 26944616. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:08:36,063][84230] Avg episode reward: [(0, '22.720'), (1, '41.630')] +[2023-10-11 17:08:36,064][84801] Saving new best policy, reward=22.720! +[2023-10-11 17:08:37,458][85176] Updated weights for policy 0, policy_version 52232 (0.0008) +[2023-10-11 17:08:37,832][85176] Updated weights for policy 0, policy_version 52242 (0.0009) +[2023-10-11 17:08:38,204][85176] Updated weights for policy 0, policy_version 52252 (0.0011) +[2023-10-11 17:08:39,438][85175] Updated weights for policy 1, policy_version 53030 (0.0007) +[2023-10-11 17:08:39,799][85175] Updated weights for policy 1, policy_version 53040 (0.0008) +[2023-10-11 17:08:40,163][85175] Updated weights for policy 1, policy_version 53050 (0.0007) +[2023-10-11 17:08:41,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 107839488. Throughput: 0: 1676.4, 1: 1691.6. Samples: 26965182. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:08:41,064][84230] Avg episode reward: [(0, '21.040'), (1, '38.020')] +[2023-10-11 17:08:42,128][85176] Updated weights for policy 0, policy_version 52262 (0.0009) +[2023-10-11 17:08:42,496][85176] Updated weights for policy 0, policy_version 52272 (0.0008) +[2023-10-11 17:08:42,870][85176] Updated weights for policy 0, policy_version 52282 (0.0008) +[2023-10-11 17:08:44,270][85175] Updated weights for policy 1, policy_version 53060 (0.0007) +[2023-10-11 17:08:44,647][85175] Updated weights for policy 1, policy_version 53070 (0.0008) +[2023-10-11 17:08:45,009][85175] Updated weights for policy 1, policy_version 53080 (0.0009) +[2023-10-11 17:08:46,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 107905024. Throughput: 0: 1683.7, 1: 1665.3. Samples: 26985224. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:08:46,063][84230] Avg episode reward: [(0, '22.300'), (1, '40.690')] +[2023-10-11 17:08:46,814][85176] Updated weights for policy 0, policy_version 52292 (0.0008) +[2023-10-11 17:08:47,186][85176] Updated weights for policy 0, policy_version 52302 (0.0007) +[2023-10-11 17:08:47,557][85176] Updated weights for policy 0, policy_version 52312 (0.0008) +[2023-10-11 17:08:49,039][85175] Updated weights for policy 1, policy_version 53090 (0.0009) +[2023-10-11 17:08:49,414][85175] Updated weights for policy 1, policy_version 53100 (0.0007) +[2023-10-11 17:08:49,779][85175] Updated weights for policy 1, policy_version 53110 (0.0007) +[2023-10-11 17:08:50,145][85175] Updated weights for policy 1, policy_version 53120 (0.0007) +[2023-10-11 17:08:51,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 107970560. Throughput: 0: 1668.7, 1: 1687.7. Samples: 26995518. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:08:51,063][84230] Avg episode reward: [(0, '25.400'), (1, '35.220')] +[2023-10-11 17:08:51,064][84801] Saving new best policy, reward=25.400! +[2023-10-11 17:08:51,731][85176] Updated weights for policy 0, policy_version 52322 (0.0011) +[2023-10-11 17:08:52,113][85176] Updated weights for policy 0, policy_version 52332 (0.0007) +[2023-10-11 17:08:52,489][85176] Updated weights for policy 0, policy_version 52342 (0.0009) +[2023-10-11 17:08:52,852][85176] Updated weights for policy 0, policy_version 52352 (0.0009) +[2023-10-11 17:08:54,214][85175] Updated weights for policy 1, policy_version 53130 (0.0010) +[2023-10-11 17:08:54,578][85175] Updated weights for policy 1, policy_version 53140 (0.0011) +[2023-10-11 17:08:54,944][85175] Updated weights for policy 1, policy_version 53150 (0.0009) +[2023-10-11 17:08:56,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 108036096. Throughput: 0: 1682.2, 1: 1679.9. 
Samples: 27015426. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:08:56,063][84230] Avg episode reward: [(0, '26.210'), (1, '38.070')] +[2023-10-11 17:08:56,064][84801] Saving new best policy, reward=26.210! +[2023-10-11 17:08:57,073][85176] Updated weights for policy 0, policy_version 52362 (0.0010) +[2023-10-11 17:08:57,444][85176] Updated weights for policy 0, policy_version 52372 (0.0007) +[2023-10-11 17:08:57,822][85176] Updated weights for policy 0, policy_version 52382 (0.0007) +[2023-10-11 17:08:58,894][85175] Updated weights for policy 1, policy_version 53160 (0.0011) +[2023-10-11 17:08:59,257][85175] Updated weights for policy 1, policy_version 53170 (0.0008) +[2023-10-11 17:08:59,625][85175] Updated weights for policy 1, policy_version 53180 (0.0009) +[2023-10-11 17:09:01,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 108101632. Throughput: 0: 1677.8, 1: 1679.0. Samples: 27035616. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:09:01,064][84230] Avg episode reward: [(0, '27.640'), (1, '37.590')] +[2023-10-11 17:09:01,073][84801] Saving new best policy, reward=27.640! +[2023-10-11 17:09:01,940][85176] Updated weights for policy 0, policy_version 52392 (0.0008) +[2023-10-11 17:09:02,318][85176] Updated weights for policy 0, policy_version 52402 (0.0009) +[2023-10-11 17:09:02,685][85176] Updated weights for policy 0, policy_version 52412 (0.0010) +[2023-10-11 17:09:03,882][85175] Updated weights for policy 1, policy_version 53190 (0.0010) +[2023-10-11 17:09:04,254][85175] Updated weights for policy 1, policy_version 53200 (0.0010) +[2023-10-11 17:09:04,627][85175] Updated weights for policy 1, policy_version 53210 (0.0010) +[2023-10-11 17:09:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 108167168. Throughput: 0: 1668.8, 1: 1698.8. Samples: 27045986. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:09:06,063][84230] Avg episode reward: [(0, '25.890'), (1, '42.780')] +[2023-10-11 17:09:06,823][85176] Updated weights for policy 0, policy_version 52422 (0.0007) +[2023-10-11 17:09:07,200][85176] Updated weights for policy 0, policy_version 52432 (0.0010) +[2023-10-11 17:09:07,572][85176] Updated weights for policy 0, policy_version 52442 (0.0010) +[2023-10-11 17:09:08,742][85175] Updated weights for policy 1, policy_version 53220 (0.0010) +[2023-10-11 17:09:09,108][85175] Updated weights for policy 1, policy_version 53230 (0.0010) +[2023-10-11 17:09:09,476][85175] Updated weights for policy 1, policy_version 53240 (0.0008) +[2023-10-11 17:09:11,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 108232704. Throughput: 0: 1677.0, 1: 1680.1. Samples: 27065616. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:09:11,064][84230] Avg episode reward: [(0, '30.180'), (1, '40.660')] +[2023-10-11 17:09:11,065][84801] Saving new best policy, reward=30.180! +[2023-10-11 17:09:11,681][85176] Updated weights for policy 0, policy_version 52452 (0.0007) +[2023-10-11 17:09:12,061][85176] Updated weights for policy 0, policy_version 52462 (0.0008) +[2023-10-11 17:09:12,421][85176] Updated weights for policy 0, policy_version 52472 (0.0009) +[2023-10-11 17:09:13,416][85175] Updated weights for policy 1, policy_version 53250 (0.0008) +[2023-10-11 17:09:13,786][85175] Updated weights for policy 1, policy_version 53260 (0.0010) +[2023-10-11 17:09:14,153][85175] Updated weights for policy 1, policy_version 53270 (0.0009) +[2023-10-11 17:09:14,526][85175] Updated weights for policy 1, policy_version 53280 (0.0008) +[2023-10-11 17:09:16,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 108298240. Throughput: 0: 1675.7, 1: 1689.4. Samples: 27086182. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:09:16,063][84230] Avg episode reward: [(0, '29.750'), (1, '42.130')] +[2023-10-11 17:09:16,516][85176] Updated weights for policy 0, policy_version 52482 (0.0007) +[2023-10-11 17:09:16,891][85176] Updated weights for policy 0, policy_version 52492 (0.0008) +[2023-10-11 17:09:17,273][85176] Updated weights for policy 0, policy_version 52502 (0.0008) +[2023-10-11 17:09:17,637][85176] Updated weights for policy 0, policy_version 52512 (0.0010) +[2023-10-11 17:09:18,621][85175] Updated weights for policy 1, policy_version 53290 (0.0007) +[2023-10-11 17:09:18,987][85175] Updated weights for policy 1, policy_version 53300 (0.0009) +[2023-10-11 17:09:19,355][85175] Updated weights for policy 1, policy_version 53310 (0.0009) +[2023-10-11 17:09:21,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 108363776. Throughput: 0: 1676.1, 1: 1694.3. Samples: 27096286. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:09:21,063][84230] Avg episode reward: [(0, '30.380'), (1, '38.990')] +[2023-10-11 17:09:21,064][84801] Saving new best policy, reward=30.380! +[2023-10-11 17:09:21,765][85176] Updated weights for policy 0, policy_version 52522 (0.0008) +[2023-10-11 17:09:22,130][85176] Updated weights for policy 0, policy_version 52532 (0.0011) +[2023-10-11 17:09:22,505][85176] Updated weights for policy 0, policy_version 52542 (0.0011) +[2023-10-11 17:09:23,101][85175] Updated weights for policy 1, policy_version 53320 (0.0010) +[2023-10-11 17:09:23,470][85175] Updated weights for policy 1, policy_version 53330 (0.0010) +[2023-10-11 17:09:23,836][85175] Updated weights for policy 1, policy_version 53340 (0.0011) +[2023-10-11 17:09:26,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 108429312. Throughput: 0: 1679.3, 1: 1678.9. Samples: 27116298. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:09:26,063][84230] Avg episode reward: [(0, '27.760'), (1, '41.500')] +[2023-10-11 17:09:26,425][85176] Updated weights for policy 0, policy_version 52552 (0.0010) +[2023-10-11 17:09:26,791][85176] Updated weights for policy 0, policy_version 52562 (0.0009) +[2023-10-11 17:09:27,164][85176] Updated weights for policy 0, policy_version 52572 (0.0009) +[2023-10-11 17:09:27,874][85175] Updated weights for policy 1, policy_version 53350 (0.0011) +[2023-10-11 17:09:28,261][85175] Updated weights for policy 1, policy_version 53360 (0.0010) +[2023-10-11 17:09:28,637][85175] Updated weights for policy 1, policy_version 53370 (0.0008) +[2023-10-11 17:09:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 108494848. Throughput: 0: 1673.1, 1: 1705.9. Samples: 27137278. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:09:31,063][84230] Avg episode reward: [(0, '30.070'), (1, '39.130')] +[2023-10-11 17:09:31,271][85176] Updated weights for policy 0, policy_version 52582 (0.0009) +[2023-10-11 17:09:31,643][85176] Updated weights for policy 0, policy_version 52592 (0.0010) +[2023-10-11 17:09:32,015][85176] Updated weights for policy 0, policy_version 52602 (0.0010) +[2023-10-11 17:09:32,507][85175] Updated weights for policy 1, policy_version 53380 (0.0009) +[2023-10-11 17:09:32,871][85175] Updated weights for policy 1, policy_version 53390 (0.0010) +[2023-10-11 17:09:33,244][85175] Updated weights for policy 1, policy_version 53400 (0.0007) +[2023-10-11 17:09:36,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 108560384. Throughput: 0: 1672.2, 1: 1681.2. Samples: 27146420. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:09:36,063][84230] Avg episode reward: [(0, '29.430'), (1, '40.760')] +[2023-10-11 17:09:36,148][85176] Updated weights for policy 0, policy_version 52612 (0.0009) +[2023-10-11 17:09:36,516][85176] Updated weights for policy 0, policy_version 52622 (0.0009) +[2023-10-11 17:09:36,887][85176] Updated weights for policy 0, policy_version 52632 (0.0009) +[2023-10-11 17:09:37,396][85175] Updated weights for policy 1, policy_version 53410 (0.0007) +[2023-10-11 17:09:37,771][85175] Updated weights for policy 1, policy_version 53420 (0.0008) +[2023-10-11 17:09:38,127][85175] Updated weights for policy 1, policy_version 53430 (0.0008) +[2023-10-11 17:09:38,496][85175] Updated weights for policy 1, policy_version 53440 (0.0007) +[2023-10-11 17:09:41,039][85176] Updated weights for policy 0, policy_version 52642 (0.0008) +[2023-10-11 17:09:41,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 108625920. Throughput: 0: 1673.1, 1: 1689.3. Samples: 27166734. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:09:41,063][84230] Avg episode reward: [(0, '30.080'), (1, '38.970')] +[2023-10-11 17:09:41,408][85176] Updated weights for policy 0, policy_version 52652 (0.0008) +[2023-10-11 17:09:41,776][85176] Updated weights for policy 0, policy_version 52662 (0.0009) +[2023-10-11 17:09:42,143][85176] Updated weights for policy 0, policy_version 52672 (0.0008) +[2023-10-11 17:09:42,492][85175] Updated weights for policy 1, policy_version 53450 (0.0008) +[2023-10-11 17:09:42,846][85175] Updated weights for policy 1, policy_version 53460 (0.0007) +[2023-10-11 17:09:43,225][85175] Updated weights for policy 1, policy_version 53470 (0.0008) +[2023-10-11 17:09:46,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 108691456. Throughput: 0: 1679.3, 1: 1703.9. Samples: 27187860. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:09:46,063][84230] Avg episode reward: [(0, '26.660'), (1, '41.190')] +[2023-10-11 17:09:46,137][85176] Updated weights for policy 0, policy_version 52682 (0.0008) +[2023-10-11 17:09:46,511][85176] Updated weights for policy 0, policy_version 52692 (0.0008) +[2023-10-11 17:09:46,882][85176] Updated weights for policy 0, policy_version 52702 (0.0009) +[2023-10-11 17:09:47,305][85175] Updated weights for policy 1, policy_version 53480 (0.0007) +[2023-10-11 17:09:47,665][85175] Updated weights for policy 1, policy_version 53490 (0.0008) +[2023-10-11 17:09:48,040][85175] Updated weights for policy 1, policy_version 53500 (0.0007) +[2023-10-11 17:09:51,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 108756992. Throughput: 0: 1680.7, 1: 1676.8. Samples: 27197070. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:09:51,063][84230] Avg episode reward: [(0, '29.490'), (1, '39.800')] +[2023-10-11 17:09:51,168][85176] Updated weights for policy 0, policy_version 52712 (0.0008) +[2023-10-11 17:09:51,537][85176] Updated weights for policy 0, policy_version 52722 (0.0009) +[2023-10-11 17:09:51,901][85176] Updated weights for policy 0, policy_version 52732 (0.0008) +[2023-10-11 17:09:52,072][85175] Updated weights for policy 1, policy_version 53510 (0.0007) +[2023-10-11 17:09:52,431][85175] Updated weights for policy 1, policy_version 53520 (0.0008) +[2023-10-11 17:09:52,811][85175] Updated weights for policy 1, policy_version 53530 (0.0009) +[2023-10-11 17:09:55,997][85176] Updated weights for policy 0, policy_version 52742 (0.0008) +[2023-10-11 17:09:56,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 108822528. Throughput: 0: 1681.2, 1: 1698.2. Samples: 27217688. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:09:56,064][84230] Avg episode reward: [(0, '29.370'), (1, '41.700')] +[2023-10-11 17:09:56,370][85176] Updated weights for policy 0, policy_version 52752 (0.0012) +[2023-10-11 17:09:56,738][85176] Updated weights for policy 0, policy_version 52762 (0.0008) +[2023-10-11 17:09:56,897][85175] Updated weights for policy 1, policy_version 53540 (0.0009) +[2023-10-11 17:09:57,265][85175] Updated weights for policy 1, policy_version 53550 (0.0007) +[2023-10-11 17:09:57,628][85175] Updated weights for policy 1, policy_version 53560 (0.0007) +[2023-10-11 17:10:00,720][85176] Updated weights for policy 0, policy_version 52772 (0.0009) +[2023-10-11 17:10:01,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 108888064. Throughput: 0: 1673.4, 1: 1699.3. Samples: 27237952. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:10:01,064][84230] Avg episode reward: [(0, '30.940'), (1, '38.620')] +[2023-10-11 17:10:01,091][85176] Updated weights for policy 0, policy_version 52782 (0.0008) +[2023-10-11 17:10:01,460][85176] Updated weights for policy 0, policy_version 52792 (0.0009) +[2023-10-11 17:10:01,764][84801] Saving new best policy, reward=30.940! +[2023-10-11 17:10:01,807][85175] Updated weights for policy 1, policy_version 53570 (0.0007) +[2023-10-11 17:10:02,177][85175] Updated weights for policy 1, policy_version 53580 (0.0008) +[2023-10-11 17:10:02,552][85175] Updated weights for policy 1, policy_version 53590 (0.0011) +[2023-10-11 17:10:02,922][85175] Updated weights for policy 1, policy_version 53600 (0.0010) +[2023-10-11 17:10:05,456][85176] Updated weights for policy 0, policy_version 52802 (0.0009) +[2023-10-11 17:10:05,829][85176] Updated weights for policy 0, policy_version 52812 (0.0007) +[2023-10-11 17:10:06,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 108953600. Throughput: 0: 1675.6, 1: 1675.8. 
Samples: 27247100. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:10:06,063][84230] Avg episode reward: [(0, '29.000'), (1, '38.330')] +[2023-10-11 17:10:06,203][85176] Updated weights for policy 0, policy_version 52822 (0.0008) +[2023-10-11 17:10:06,580][85176] Updated weights for policy 0, policy_version 52832 (0.0009) +[2023-10-11 17:10:06,976][85175] Updated weights for policy 1, policy_version 53610 (0.0007) +[2023-10-11 17:10:07,351][85175] Updated weights for policy 1, policy_version 53620 (0.0010) +[2023-10-11 17:10:07,724][85175] Updated weights for policy 1, policy_version 53630 (0.0010) +[2023-10-11 17:10:10,866][85176] Updated weights for policy 0, policy_version 52842 (0.0007) +[2023-10-11 17:10:11,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 109019136. Throughput: 0: 1670.0, 1: 1698.5. Samples: 27267880. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:10:11,063][84230] Avg episode reward: [(0, '26.090'), (1, '40.400')] +[2023-10-11 17:10:11,234][85176] Updated weights for policy 0, policy_version 52852 (0.0007) +[2023-10-11 17:10:11,600][85176] Updated weights for policy 0, policy_version 52862 (0.0007) +[2023-10-11 17:10:11,859][85175] Updated weights for policy 1, policy_version 53640 (0.0008) +[2023-10-11 17:10:12,217][85175] Updated weights for policy 1, policy_version 53650 (0.0009) +[2023-10-11 17:10:12,587][85175] Updated weights for policy 1, policy_version 53660 (0.0009) +[2023-10-11 17:10:15,637][85176] Updated weights for policy 0, policy_version 52872 (0.0008) +[2023-10-11 17:10:16,011][85176] Updated weights for policy 0, policy_version 52882 (0.0009) +[2023-10-11 17:10:16,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 109084672. Throughput: 0: 1663.3, 1: 1694.4. Samples: 27288378. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-11 17:10:16,064][84230] Avg episode reward: [(0, '26.500'), (1, '42.480')] +[2023-10-11 17:10:16,075][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000053664_54951936.pth... +[2023-10-11 17:10:16,115][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000052096_53346304.pth +[2023-10-11 17:10:16,387][85176] Updated weights for policy 0, policy_version 52892 (0.0009) +[2023-10-11 17:10:16,529][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000052896_54165504.pth... +[2023-10-11 17:10:16,559][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000051328_52559872.pth +[2023-10-11 17:10:16,692][85175] Updated weights for policy 1, policy_version 53670 (0.0008) +[2023-10-11 17:10:17,093][85175] Updated weights for policy 1, policy_version 53680 (0.0007) +[2023-10-11 17:10:17,454][85175] Updated weights for policy 1, policy_version 53690 (0.0009) +[2023-10-11 17:10:20,241][85176] Updated weights for policy 0, policy_version 52902 (0.0010) +[2023-10-11 17:10:20,617][85176] Updated weights for policy 0, policy_version 52912 (0.0010) +[2023-10-11 17:10:20,981][85176] Updated weights for policy 0, policy_version 52922 (0.0007) +[2023-10-11 17:10:21,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 109150208. Throughput: 0: 1673.2, 1: 1691.2. Samples: 27297816. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-11 17:10:21,063][84230] Avg episode reward: [(0, '23.990'), (1, '41.790')] +[2023-10-11 17:10:21,451][85175] Updated weights for policy 1, policy_version 53700 (0.0008) +[2023-10-11 17:10:21,820][85175] Updated weights for policy 1, policy_version 53710 (0.0007) +[2023-10-11 17:10:22,190][85175] Updated weights for policy 1, policy_version 53720 (0.0009) +[2023-10-11 17:10:25,142][85176] Updated weights for policy 0, policy_version 52932 (0.0010) +[2023-10-11 17:10:25,509][85176] Updated weights for policy 0, policy_version 52942 (0.0008) +[2023-10-11 17:10:25,881][85176] Updated weights for policy 0, policy_version 52952 (0.0009) +[2023-10-11 17:10:26,062][84230] Fps is (10 sec: 13107.7, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 109215744. Throughput: 0: 1681.6, 1: 1691.6. Samples: 27318524. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-11 17:10:26,063][84230] Avg episode reward: [(0, '31.230'), (1, '42.170')] +[2023-10-11 17:10:26,180][84801] Saving new best policy, reward=31.230! +[2023-10-11 17:10:26,281][85175] Updated weights for policy 1, policy_version 53730 (0.0009) +[2023-10-11 17:10:26,648][85175] Updated weights for policy 1, policy_version 53740 (0.0008) +[2023-10-11 17:10:27,030][85175] Updated weights for policy 1, policy_version 53750 (0.0008) +[2023-10-11 17:10:27,393][85175] Updated weights for policy 1, policy_version 53760 (0.0009) +[2023-10-11 17:10:29,994][85176] Updated weights for policy 0, policy_version 52962 (0.0010) +[2023-10-11 17:10:30,365][85176] Updated weights for policy 0, policy_version 52972 (0.0009) +[2023-10-11 17:10:30,742][85176] Updated weights for policy 0, policy_version 52982 (0.0009) +[2023-10-11 17:10:31,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 109281280. Throughput: 0: 1660.1, 1: 1688.0. Samples: 27338522. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-11 17:10:31,063][84230] Avg episode reward: [(0, '29.890'), (1, '38.660')] +[2023-10-11 17:10:31,116][85176] Updated weights for policy 0, policy_version 52992 (0.0008) +[2023-10-11 17:10:31,461][85175] Updated weights for policy 1, policy_version 53770 (0.0008) +[2023-10-11 17:10:31,822][85175] Updated weights for policy 1, policy_version 53780 (0.0007) +[2023-10-11 17:10:32,197][85175] Updated weights for policy 1, policy_version 53790 (0.0009) +[2023-10-11 17:10:35,193][85176] Updated weights for policy 0, policy_version 53002 (0.0007) +[2023-10-11 17:10:35,557][85176] Updated weights for policy 0, policy_version 53012 (0.0007) +[2023-10-11 17:10:35,906][85175] Updated weights for policy 1, policy_version 53800 (0.0007) +[2023-10-11 17:10:35,939][85176] Updated weights for policy 0, policy_version 53022 (0.0007) +[2023-10-11 17:10:36,063][84230] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 109379584. Throughput: 0: 1673.6, 1: 1688.8. Samples: 27348380. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-11 17:10:36,063][84230] Avg episode reward: [(0, '28.960'), (1, '43.940')] +[2023-10-11 17:10:36,271][85175] Updated weights for policy 1, policy_version 53810 (0.0010) +[2023-10-11 17:10:36,633][85175] Updated weights for policy 1, policy_version 53820 (0.0010) +[2023-10-11 17:10:36,777][85000] Saving new best policy, reward=43.940! 
+[2023-10-11 17:10:40,057][85176] Updated weights for policy 0, policy_version 53032 (0.0008) +[2023-10-11 17:10:40,422][85176] Updated weights for policy 0, policy_version 53042 (0.0011) +[2023-10-11 17:10:40,676][85175] Updated weights for policy 1, policy_version 53830 (0.0010) +[2023-10-11 17:10:40,797][85176] Updated weights for policy 0, policy_version 53052 (0.0010) +[2023-10-11 17:10:41,051][85175] Updated weights for policy 1, policy_version 53840 (0.0009) +[2023-10-11 17:10:41,063][84230] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 109445120. Throughput: 0: 1671.0, 1: 1692.6. Samples: 27369048. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-11 17:10:41,064][84230] Avg episode reward: [(0, '28.870'), (1, '37.990')] +[2023-10-11 17:10:41,414][85175] Updated weights for policy 1, policy_version 53850 (0.0008) +[2023-10-11 17:10:44,932][85176] Updated weights for policy 0, policy_version 53062 (0.0008) +[2023-10-11 17:10:45,312][85176] Updated weights for policy 0, policy_version 53072 (0.0009) +[2023-10-11 17:10:45,490][85175] Updated weights for policy 1, policy_version 53860 (0.0009) +[2023-10-11 17:10:45,676][85176] Updated weights for policy 0, policy_version 53082 (0.0008) +[2023-10-11 17:10:45,861][85175] Updated weights for policy 1, policy_version 53870 (0.0007) +[2023-10-11 17:10:46,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 109510656. Throughput: 0: 1654.5, 1: 1697.1. Samples: 27388776. 
Policy #0 lag: (min: 31.0, avg: 33.0, max: 62.0) +[2023-10-11 17:10:46,063][84230] Avg episode reward: [(0, '30.980'), (1, '42.810')] +[2023-10-11 17:10:46,220][85175] Updated weights for policy 1, policy_version 53880 (0.0009) +[2023-10-11 17:10:49,818][85176] Updated weights for policy 0, policy_version 53092 (0.0007) +[2023-10-11 17:10:50,180][85176] Updated weights for policy 0, policy_version 53102 (0.0010) +[2023-10-11 17:10:50,351][85175] Updated weights for policy 1, policy_version 53890 (0.0007) +[2023-10-11 17:10:50,559][85176] Updated weights for policy 0, policy_version 53112 (0.0010) +[2023-10-11 17:10:50,712][85175] Updated weights for policy 1, policy_version 53900 (0.0007) +[2023-10-11 17:10:51,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 109576192. Throughput: 0: 1668.7, 1: 1698.6. Samples: 27398628. Policy #0 lag: (min: 31.0, avg: 33.0, max: 62.0) +[2023-10-11 17:10:51,064][84230] Avg episode reward: [(0, '31.930'), (1, '39.370')] +[2023-10-11 17:10:51,065][84801] Saving new best policy, reward=31.930! +[2023-10-11 17:10:51,077][85175] Updated weights for policy 1, policy_version 53910 (0.0008) +[2023-10-11 17:10:51,451][85175] Updated weights for policy 1, policy_version 53920 (0.0008) +[2023-10-11 17:10:54,824][85176] Updated weights for policy 0, policy_version 53122 (0.0008) +[2023-10-11 17:10:55,197][85176] Updated weights for policy 0, policy_version 53132 (0.0007) +[2023-10-11 17:10:55,572][85176] Updated weights for policy 0, policy_version 53142 (0.0008) +[2023-10-11 17:10:55,636][85175] Updated weights for policy 1, policy_version 53930 (0.0008) +[2023-10-11 17:10:55,958][85176] Updated weights for policy 0, policy_version 53152 (0.0009) +[2023-10-11 17:10:56,005][85175] Updated weights for policy 1, policy_version 53940 (0.0009) +[2023-10-11 17:10:56,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 109641728. Throughput: 0: 1666.9, 1: 1697.7. 
Samples: 27419288. Policy #0 lag: (min: 31.0, avg: 33.0, max: 62.0) +[2023-10-11 17:10:56,063][84230] Avg episode reward: [(0, '31.590'), (1, '43.510')] +[2023-10-11 17:10:56,375][85175] Updated weights for policy 1, policy_version 53950 (0.0008) +[2023-10-11 17:10:59,810][85176] Updated weights for policy 0, policy_version 53162 (0.0011) +[2023-10-11 17:11:00,179][85176] Updated weights for policy 0, policy_version 53172 (0.0010) +[2023-10-11 17:11:00,421][85175] Updated weights for policy 1, policy_version 53960 (0.0008) +[2023-10-11 17:11:00,546][85176] Updated weights for policy 0, policy_version 53182 (0.0008) +[2023-10-11 17:11:00,778][85175] Updated weights for policy 1, policy_version 53970 (0.0009) +[2023-10-11 17:11:01,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 109707264. Throughput: 0: 1650.5, 1: 1686.4. Samples: 27438540. Policy #0 lag: (min: 31.0, avg: 33.0, max: 62.0) +[2023-10-11 17:11:01,063][84230] Avg episode reward: [(0, '33.270'), (1, '39.970')] +[2023-10-11 17:11:01,070][84801] Saving new best policy, reward=33.270! +[2023-10-11 17:11:01,153][85175] Updated weights for policy 1, policy_version 53980 (0.0008) +[2023-10-11 17:11:04,811][85176] Updated weights for policy 0, policy_version 53192 (0.0011) +[2023-10-11 17:11:05,183][85176] Updated weights for policy 0, policy_version 53202 (0.0007) +[2023-10-11 17:11:05,428][85175] Updated weights for policy 1, policy_version 53990 (0.0008) +[2023-10-11 17:11:05,553][85176] Updated weights for policy 0, policy_version 53212 (0.0007) +[2023-10-11 17:11:05,817][85175] Updated weights for policy 1, policy_version 54000 (0.0008) +[2023-10-11 17:11:06,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 109772800. Throughput: 0: 1668.8, 1: 1695.9. Samples: 27449224. 
Policy #0 lag: (min: 31.0, avg: 33.0, max: 62.0) +[2023-10-11 17:11:06,064][84230] Avg episode reward: [(0, '30.430'), (1, '42.470')] +[2023-10-11 17:11:06,184][85175] Updated weights for policy 1, policy_version 54010 (0.0008) +[2023-10-11 17:11:09,583][85176] Updated weights for policy 0, policy_version 53222 (0.0007) +[2023-10-11 17:11:09,955][85176] Updated weights for policy 0, policy_version 53232 (0.0008) +[2023-10-11 17:11:09,992][85175] Updated weights for policy 1, policy_version 54020 (0.0009) +[2023-10-11 17:11:10,330][85176] Updated weights for policy 0, policy_version 53242 (0.0009) +[2023-10-11 17:11:10,354][85175] Updated weights for policy 1, policy_version 54030 (0.0007) +[2023-10-11 17:11:10,723][85175] Updated weights for policy 1, policy_version 54040 (0.0010) +[2023-10-11 17:11:11,063][84230] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 109871104. Throughput: 0: 1656.1, 1: 1694.8. Samples: 27469316. Policy #0 lag: (min: 31.0, avg: 33.0, max: 62.0) +[2023-10-11 17:11:11,064][84230] Avg episode reward: [(0, '31.100'), (1, '39.200')] +[2023-10-11 17:11:14,443][85176] Updated weights for policy 0, policy_version 53252 (0.0007) +[2023-10-11 17:11:14,820][85176] Updated weights for policy 0, policy_version 53262 (0.0007) +[2023-10-11 17:11:14,871][85175] Updated weights for policy 1, policy_version 54050 (0.0007) +[2023-10-11 17:11:15,184][85176] Updated weights for policy 0, policy_version 53272 (0.0007) +[2023-10-11 17:11:15,227][85175] Updated weights for policy 1, policy_version 54060 (0.0008) +[2023-10-11 17:11:15,595][85175] Updated weights for policy 1, policy_version 54070 (0.0007) +[2023-10-11 17:11:15,954][85175] Updated weights for policy 1, policy_version 54080 (0.0011) +[2023-10-11 17:11:16,063][84230] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 109936640. Throughput: 0: 1650.2, 1: 1679.8. Samples: 27488370. 
Policy #0 lag: (min: 31.0, avg: 33.0, max: 62.0) +[2023-10-11 17:11:16,064][84230] Avg episode reward: [(0, '29.210'), (1, '41.340')] +[2023-10-11 17:11:19,305][85176] Updated weights for policy 0, policy_version 53282 (0.0009) +[2023-10-11 17:11:19,682][85176] Updated weights for policy 0, policy_version 53292 (0.0008) +[2023-10-11 17:11:19,868][85175] Updated weights for policy 1, policy_version 54090 (0.0007) +[2023-10-11 17:11:20,054][85176] Updated weights for policy 0, policy_version 53302 (0.0009) +[2023-10-11 17:11:20,241][85175] Updated weights for policy 1, policy_version 54100 (0.0009) +[2023-10-11 17:11:20,416][85176] Updated weights for policy 0, policy_version 53312 (0.0009) +[2023-10-11 17:11:20,616][85175] Updated weights for policy 1, policy_version 54110 (0.0009) +[2023-10-11 17:11:21,063][84230] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 110002176. Throughput: 0: 1667.5, 1: 1693.5. Samples: 27499624. Policy #0 lag: (min: 10.0, avg: 13.8, max: 42.0) +[2023-10-11 17:11:21,063][84230] Avg episode reward: [(0, '31.550'), (1, '40.210')] +[2023-10-11 17:11:24,636][85175] Updated weights for policy 1, policy_version 54120 (0.0008) +[2023-10-11 17:11:24,655][85176] Updated weights for policy 0, policy_version 53322 (0.0008) +[2023-10-11 17:11:25,007][85175] Updated weights for policy 1, policy_version 54130 (0.0007) +[2023-10-11 17:11:25,025][85176] Updated weights for policy 0, policy_version 53332 (0.0008) +[2023-10-11 17:11:25,369][85175] Updated weights for policy 1, policy_version 54140 (0.0008) +[2023-10-11 17:11:25,395][85176] Updated weights for policy 0, policy_version 53342 (0.0008) +[2023-10-11 17:11:26,063][84230] Fps is (10 sec: 13107.5, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 110067712. Throughput: 0: 1662.7, 1: 1689.3. Samples: 27519884. 
Policy #0 lag: (min: 10.0, avg: 13.8, max: 42.0) +[2023-10-11 17:11:26,063][84230] Avg episode reward: [(0, '30.890'), (1, '42.000')] +[2023-10-11 17:11:29,258][85176] Updated weights for policy 0, policy_version 53352 (0.0007) +[2023-10-11 17:11:29,485][85175] Updated weights for policy 1, policy_version 54150 (0.0007) +[2023-10-11 17:11:29,622][85176] Updated weights for policy 0, policy_version 53362 (0.0007) +[2023-10-11 17:11:29,850][85175] Updated weights for policy 1, policy_version 54160 (0.0007) +[2023-10-11 17:11:29,985][85176] Updated weights for policy 0, policy_version 53372 (0.0007) +[2023-10-11 17:11:30,222][85175] Updated weights for policy 1, policy_version 54170 (0.0009) +[2023-10-11 17:11:31,063][84230] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 110133248. Throughput: 0: 1662.4, 1: 1664.0. Samples: 27538464. Policy #0 lag: (min: 10.0, avg: 13.8, max: 42.0) +[2023-10-11 17:11:31,063][84230] Avg episode reward: [(0, '30.330'), (1, '38.890')] +[2023-10-11 17:11:34,241][85176] Updated weights for policy 0, policy_version 53382 (0.0007) +[2023-10-11 17:11:34,247][85175] Updated weights for policy 1, policy_version 54180 (0.0008) +[2023-10-11 17:11:34,611][85175] Updated weights for policy 1, policy_version 54190 (0.0007) +[2023-10-11 17:11:34,611][85176] Updated weights for policy 0, policy_version 53392 (0.0010) +[2023-10-11 17:11:34,981][85176] Updated weights for policy 0, policy_version 53402 (0.0008) +[2023-10-11 17:11:34,987][85175] Updated weights for policy 1, policy_version 54200 (0.0007) +[2023-10-11 17:11:36,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 110198784. Throughput: 0: 1674.4, 1: 1695.1. Samples: 27550254. 
Policy #0 lag: (min: 10.0, avg: 13.8, max: 42.0) +[2023-10-11 17:11:36,064][84230] Avg episode reward: [(0, '30.230'), (1, '39.770')] +[2023-10-11 17:11:38,913][85175] Updated weights for policy 1, policy_version 54210 (0.0009) +[2023-10-11 17:11:38,942][85176] Updated weights for policy 0, policy_version 53412 (0.0010) +[2023-10-11 17:11:39,285][85175] Updated weights for policy 1, policy_version 54220 (0.0008) +[2023-10-11 17:11:39,316][85176] Updated weights for policy 0, policy_version 53422 (0.0008) +[2023-10-11 17:11:39,647][85175] Updated weights for policy 1, policy_version 54230 (0.0009) +[2023-10-11 17:11:39,695][85176] Updated weights for policy 0, policy_version 53432 (0.0007) +[2023-10-11 17:11:40,011][85175] Updated weights for policy 1, policy_version 54240 (0.0008) +[2023-10-11 17:11:41,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 110264320. Throughput: 0: 1659.9, 1: 1677.2. Samples: 27569460. Policy #0 lag: (min: 10.0, avg: 13.8, max: 42.0) +[2023-10-11 17:11:41,064][84230] Avg episode reward: [(0, '29.520'), (1, '38.430')] +[2023-10-11 17:11:43,689][85176] Updated weights for policy 0, policy_version 53442 (0.0008) +[2023-10-11 17:11:44,051][85176] Updated weights for policy 0, policy_version 53452 (0.0008) +[2023-10-11 17:11:44,138][85175] Updated weights for policy 1, policy_version 54250 (0.0007) +[2023-10-11 17:11:44,422][85176] Updated weights for policy 0, policy_version 53462 (0.0007) +[2023-10-11 17:11:44,508][85175] Updated weights for policy 1, policy_version 54260 (0.0007) +[2023-10-11 17:11:44,789][85176] Updated weights for policy 0, policy_version 53472 (0.0007) +[2023-10-11 17:11:44,866][85175] Updated weights for policy 1, policy_version 54270 (0.0008) +[2023-10-11 17:11:46,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 110329856. Throughput: 0: 1675.6, 1: 1675.6. Samples: 27589348. 
Policy #0 lag: (min: 10.0, avg: 13.8, max: 42.0) +[2023-10-11 17:11:46,064][84230] Avg episode reward: [(0, '31.070'), (1, '42.090')] +[2023-10-11 17:11:48,764][85175] Updated weights for policy 1, policy_version 54280 (0.0009) +[2023-10-11 17:11:49,027][85176] Updated weights for policy 0, policy_version 53482 (0.0009) +[2023-10-11 17:11:49,133][85175] Updated weights for policy 1, policy_version 54290 (0.0008) +[2023-10-11 17:11:49,391][85176] Updated weights for policy 0, policy_version 53492 (0.0008) +[2023-10-11 17:11:49,502][85175] Updated weights for policy 1, policy_version 54300 (0.0007) +[2023-10-11 17:11:49,769][85176] Updated weights for policy 0, policy_version 53502 (0.0008) +[2023-10-11 17:11:51,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 110395392. Throughput: 0: 1676.5, 1: 1696.5. Samples: 27601012. Policy #0 lag: (min: 10.0, avg: 13.8, max: 42.0) +[2023-10-11 17:11:51,064][84230] Avg episode reward: [(0, '29.460'), (1, '39.780')] +[2023-10-11 17:11:53,604][85175] Updated weights for policy 1, policy_version 54310 (0.0010) +[2023-10-11 17:11:53,976][85175] Updated weights for policy 1, policy_version 54320 (0.0009) +[2023-10-11 17:11:54,089][85176] Updated weights for policy 0, policy_version 53512 (0.0008) +[2023-10-11 17:11:54,335][85175] Updated weights for policy 1, policy_version 54330 (0.0007) +[2023-10-11 17:11:54,457][85176] Updated weights for policy 0, policy_version 53522 (0.0009) +[2023-10-11 17:11:54,825][85176] Updated weights for policy 0, policy_version 53532 (0.0008) +[2023-10-11 17:11:56,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 110460928. Throughput: 0: 1665.2, 1: 1678.5. Samples: 27619782. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 17:11:56,064][84230] Avg episode reward: [(0, '29.860'), (1, '44.120')] +[2023-10-11 17:11:56,065][85000] Saving new best policy, reward=44.120! 
+[2023-10-11 17:11:58,412][85175] Updated weights for policy 1, policy_version 54340 (0.0008) +[2023-10-11 17:11:58,818][85175] Updated weights for policy 1, policy_version 54350 (0.0007) +[2023-10-11 17:11:58,906][85176] Updated weights for policy 0, policy_version 53542 (0.0007) +[2023-10-11 17:11:59,187][85175] Updated weights for policy 1, policy_version 54360 (0.0008) +[2023-10-11 17:11:59,278][85176] Updated weights for policy 0, policy_version 53552 (0.0007) +[2023-10-11 17:11:59,654][85176] Updated weights for policy 0, policy_version 53562 (0.0008) +[2023-10-11 17:12:01,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 110526464. Throughput: 0: 1673.9, 1: 1690.4. Samples: 27639760. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 17:12:01,064][84230] Avg episode reward: [(0, '29.330'), (1, '40.180')] +[2023-10-11 17:12:03,020][85175] Updated weights for policy 1, policy_version 54370 (0.0010) +[2023-10-11 17:12:03,395][85175] Updated weights for policy 1, policy_version 54380 (0.0008) +[2023-10-11 17:12:03,547][85176] Updated weights for policy 0, policy_version 53572 (0.0007) +[2023-10-11 17:12:03,760][85175] Updated weights for policy 1, policy_version 54390 (0.0008) +[2023-10-11 17:12:03,915][85176] Updated weights for policy 0, policy_version 53582 (0.0009) +[2023-10-11 17:12:04,133][85175] Updated weights for policy 1, policy_version 54400 (0.0009) +[2023-10-11 17:12:04,289][85176] Updated weights for policy 0, policy_version 53592 (0.0011) +[2023-10-11 17:12:06,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 110592000. Throughput: 0: 1668.4, 1: 1690.5. Samples: 27650772. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 17:12:06,064][84230] Avg episode reward: [(0, '31.170'), (1, '43.850')] +[2023-10-11 17:12:08,184][85175] Updated weights for policy 1, policy_version 54410 (0.0008) +[2023-10-11 17:12:08,438][85176] Updated weights for policy 0, policy_version 53602 (0.0008) +[2023-10-11 17:12:08,542][85175] Updated weights for policy 1, policy_version 54420 (0.0008) +[2023-10-11 17:12:08,801][85176] Updated weights for policy 0, policy_version 53612 (0.0007) +[2023-10-11 17:12:08,920][85175] Updated weights for policy 1, policy_version 54430 (0.0009) +[2023-10-11 17:12:09,174][85176] Updated weights for policy 0, policy_version 53622 (0.0008) +[2023-10-11 17:12:09,551][85176] Updated weights for policy 0, policy_version 53632 (0.0008) +[2023-10-11 17:12:11,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 110657536. Throughput: 0: 1649.6, 1: 1678.2. Samples: 27669632. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 17:12:11,063][84230] Avg episode reward: [(0, '29.530'), (1, '42.060')] +[2023-10-11 17:12:12,872][85175] Updated weights for policy 1, policy_version 54440 (0.0011) +[2023-10-11 17:12:13,236][85175] Updated weights for policy 1, policy_version 54450 (0.0007) +[2023-10-11 17:12:13,508][85176] Updated weights for policy 0, policy_version 53642 (0.0009) +[2023-10-11 17:12:13,607][85175] Updated weights for policy 1, policy_version 54460 (0.0008) +[2023-10-11 17:12:13,871][85176] Updated weights for policy 0, policy_version 53652 (0.0009) +[2023-10-11 17:12:14,249][85176] Updated weights for policy 0, policy_version 53662 (0.0010) +[2023-10-11 17:12:16,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 110723072. Throughput: 0: 1671.1, 1: 1709.2. Samples: 27690574. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 17:12:16,063][84230] Avg episode reward: [(0, '30.570'), (1, '41.130')] +[2023-10-11 17:12:16,073][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000053664_54951936.pth... +[2023-10-11 17:12:16,074][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000054464_55771136.pth... +[2023-10-11 17:12:16,107][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000052896_54165504.pth +[2023-10-11 17:12:16,109][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000052096_53346304.pth +[2023-10-11 17:12:17,724][85175] Updated weights for policy 1, policy_version 54470 (0.0009) +[2023-10-11 17:12:18,080][85175] Updated weights for policy 1, policy_version 54480 (0.0008) +[2023-10-11 17:12:18,306][85176] Updated weights for policy 0, policy_version 53672 (0.0007) +[2023-10-11 17:12:18,446][85175] Updated weights for policy 1, policy_version 54490 (0.0009) +[2023-10-11 17:12:18,675][85176] Updated weights for policy 0, policy_version 53682 (0.0007) +[2023-10-11 17:12:19,053][85176] Updated weights for policy 0, policy_version 53692 (0.0007) +[2023-10-11 17:12:21,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 110788608. Throughput: 0: 1661.2, 1: 1680.8. Samples: 27700640. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 17:12:21,064][84230] Avg episode reward: [(0, '28.770'), (1, '38.170')] +[2023-10-11 17:12:22,494][85175] Updated weights for policy 1, policy_version 54500 (0.0008) +[2023-10-11 17:12:22,856][85175] Updated weights for policy 1, policy_version 54510 (0.0007) +[2023-10-11 17:12:23,133][85176] Updated weights for policy 0, policy_version 53702 (0.0008) +[2023-10-11 17:12:23,217][85175] Updated weights for policy 1, policy_version 54520 (0.0007) +[2023-10-11 17:12:23,510][85176] Updated weights for policy 0, policy_version 53712 (0.0009) +[2023-10-11 17:12:23,887][85176] Updated weights for policy 0, policy_version 53722 (0.0008) +[2023-10-11 17:12:26,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 110854144. Throughput: 0: 1661.8, 1: 1690.0. Samples: 27720292. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 17:12:26,064][84230] Avg episode reward: [(0, '31.930'), (1, '40.860')] +[2023-10-11 17:12:27,316][85175] Updated weights for policy 1, policy_version 54530 (0.0009) +[2023-10-11 17:12:27,680][85175] Updated weights for policy 1, policy_version 54540 (0.0010) +[2023-10-11 17:12:27,789][85176] Updated weights for policy 0, policy_version 53732 (0.0008) +[2023-10-11 17:12:28,049][85175] Updated weights for policy 1, policy_version 54550 (0.0007) +[2023-10-11 17:12:28,150][85176] Updated weights for policy 0, policy_version 53742 (0.0009) +[2023-10-11 17:12:28,416][85175] Updated weights for policy 1, policy_version 54560 (0.0008) +[2023-10-11 17:12:28,529][85176] Updated weights for policy 0, policy_version 53752 (0.0008) +[2023-10-11 17:12:31,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 110919680. Throughput: 0: 1672.3, 1: 1701.6. Samples: 27741172. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 17:12:31,063][84230] Avg episode reward: [(0, '31.270'), (1, '40.540')] +[2023-10-11 17:12:32,348][85175] Updated weights for policy 1, policy_version 54570 (0.0008) +[2023-10-11 17:12:32,715][85176] Updated weights for policy 0, policy_version 53762 (0.0008) +[2023-10-11 17:12:32,725][85175] Updated weights for policy 1, policy_version 54580 (0.0008) +[2023-10-11 17:12:33,079][85176] Updated weights for policy 0, policy_version 53772 (0.0007) +[2023-10-11 17:12:33,084][85175] Updated weights for policy 1, policy_version 54590 (0.0007) +[2023-10-11 17:12:33,453][85176] Updated weights for policy 0, policy_version 53782 (0.0008) +[2023-10-11 17:12:33,833][85176] Updated weights for policy 0, policy_version 53792 (0.0008) +[2023-10-11 17:12:36,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 110985216. Throughput: 0: 1648.8, 1: 1674.0. Samples: 27750538. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:12:36,064][84230] Avg episode reward: [(0, '33.910'), (1, '40.640')] +[2023-10-11 17:12:36,065][84801] Saving new best policy, reward=33.910! +[2023-10-11 17:12:37,228][85175] Updated weights for policy 1, policy_version 54600 (0.0010) +[2023-10-11 17:12:37,594][85175] Updated weights for policy 1, policy_version 54610 (0.0008) +[2023-10-11 17:12:37,862][85176] Updated weights for policy 0, policy_version 53802 (0.0009) +[2023-10-11 17:12:37,964][85175] Updated weights for policy 1, policy_version 54620 (0.0007) +[2023-10-11 17:12:38,240][85176] Updated weights for policy 0, policy_version 53812 (0.0008) +[2023-10-11 17:12:38,611][85176] Updated weights for policy 0, policy_version 53822 (0.0007) +[2023-10-11 17:12:41,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 111050752. Throughput: 0: 1666.9, 1: 1699.5. Samples: 27771270. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:12:41,063][84230] Avg episode reward: [(0, '30.620'), (1, '40.310')] +[2023-10-11 17:12:42,043][85175] Updated weights for policy 1, policy_version 54630 (0.0007) +[2023-10-11 17:12:42,410][85175] Updated weights for policy 1, policy_version 54640 (0.0008) +[2023-10-11 17:12:42,614][85176] Updated weights for policy 0, policy_version 53832 (0.0009) +[2023-10-11 17:12:42,782][85175] Updated weights for policy 1, policy_version 54650 (0.0007) +[2023-10-11 17:12:42,980][85176] Updated weights for policy 0, policy_version 53842 (0.0008) +[2023-10-11 17:12:43,362][85176] Updated weights for policy 0, policy_version 53852 (0.0010) +[2023-10-11 17:12:46,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 111116288. Throughput: 0: 1679.1, 1: 1702.6. Samples: 27791936. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:12:46,063][84230] Avg episode reward: [(0, '30.060'), (1, '41.750')] +[2023-10-11 17:12:46,820][85175] Updated weights for policy 1, policy_version 54660 (0.0008) +[2023-10-11 17:12:47,221][85175] Updated weights for policy 1, policy_version 54670 (0.0009) +[2023-10-11 17:12:47,408][85176] Updated weights for policy 0, policy_version 53862 (0.0009) +[2023-10-11 17:12:47,591][85175] Updated weights for policy 1, policy_version 54680 (0.0009) +[2023-10-11 17:12:47,777][85176] Updated weights for policy 0, policy_version 53872 (0.0008) +[2023-10-11 17:12:48,149][85176] Updated weights for policy 0, policy_version 53882 (0.0009) +[2023-10-11 17:12:51,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 111181824. Throughput: 0: 1654.4, 1: 1681.5. Samples: 27800886. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:12:51,064][84230] Avg episode reward: [(0, '30.060'), (1, '43.630')] +[2023-10-11 17:12:51,607][85175] Updated weights for policy 1, policy_version 54690 (0.0009) +[2023-10-11 17:12:51,970][85175] Updated weights for policy 1, policy_version 54700 (0.0008) +[2023-10-11 17:12:52,249][85176] Updated weights for policy 0, policy_version 53892 (0.0008) +[2023-10-11 17:12:52,337][85175] Updated weights for policy 1, policy_version 54710 (0.0008) +[2023-10-11 17:12:52,614][85176] Updated weights for policy 0, policy_version 53902 (0.0008) +[2023-10-11 17:12:52,696][85175] Updated weights for policy 1, policy_version 54720 (0.0008) +[2023-10-11 17:12:52,999][85176] Updated weights for policy 0, policy_version 53912 (0.0009) +[2023-10-11 17:12:56,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 111247360. Throughput: 0: 1681.2, 1: 1700.8. Samples: 27821822. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:12:56,064][84230] Avg episode reward: [(0, '30.650'), (1, '44.720')] +[2023-10-11 17:12:56,066][85000] Saving new best policy, reward=44.720! +[2023-10-11 17:12:56,617][85175] Updated weights for policy 1, policy_version 54730 (0.0011) +[2023-10-11 17:12:56,988][85175] Updated weights for policy 1, policy_version 54740 (0.0008) +[2023-10-11 17:12:57,347][85176] Updated weights for policy 0, policy_version 53922 (0.0010) +[2023-10-11 17:12:57,355][85175] Updated weights for policy 1, policy_version 54750 (0.0008) +[2023-10-11 17:12:57,716][85176] Updated weights for policy 0, policy_version 53932 (0.0007) +[2023-10-11 17:12:58,091][85176] Updated weights for policy 0, policy_version 53942 (0.0010) +[2023-10-11 17:12:58,456][85176] Updated weights for policy 0, policy_version 53952 (0.0008) +[2023-10-11 17:13:01,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 111312896. Throughput: 0: 1678.0, 1: 1700.7. 
Samples: 27842616. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:13:01,063][84230] Avg episode reward: [(0, '32.000'), (1, '41.820')] +[2023-10-11 17:13:01,224][85175] Updated weights for policy 1, policy_version 54760 (0.0010) +[2023-10-11 17:13:01,595][85175] Updated weights for policy 1, policy_version 54770 (0.0007) +[2023-10-11 17:13:01,966][85175] Updated weights for policy 1, policy_version 54780 (0.0009) +[2023-10-11 17:13:02,478][85176] Updated weights for policy 0, policy_version 53962 (0.0008) +[2023-10-11 17:13:02,845][85176] Updated weights for policy 0, policy_version 53972 (0.0009) +[2023-10-11 17:13:03,218][85176] Updated weights for policy 0, policy_version 53982 (0.0008) +[2023-10-11 17:13:05,929][85175] Updated weights for policy 1, policy_version 54790 (0.0009) +[2023-10-11 17:13:06,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 111378432. Throughput: 0: 1659.9, 1: 1700.0. Samples: 27851836. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:13:06,063][84230] Avg episode reward: [(0, '30.610'), (1, '41.750')] +[2023-10-11 17:13:06,294][85175] Updated weights for policy 1, policy_version 54800 (0.0011) +[2023-10-11 17:13:06,664][85175] Updated weights for policy 1, policy_version 54810 (0.0010) +[2023-10-11 17:13:07,285][85176] Updated weights for policy 0, policy_version 53992 (0.0008) +[2023-10-11 17:13:07,658][85176] Updated weights for policy 0, policy_version 54002 (0.0008) +[2023-10-11 17:13:08,037][85176] Updated weights for policy 0, policy_version 54012 (0.0007) +[2023-10-11 17:13:10,656][85175] Updated weights for policy 1, policy_version 54820 (0.0008) +[2023-10-11 17:13:11,017][85175] Updated weights for policy 1, policy_version 54830 (0.0008) +[2023-10-11 17:13:11,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 111443968. Throughput: 0: 1684.0, 1: 1708.5. Samples: 27872956. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:13:11,063][84230] Avg episode reward: [(0, '30.520'), (1, '41.030')] +[2023-10-11 17:13:11,392][85175] Updated weights for policy 1, policy_version 54840 (0.0008) +[2023-10-11 17:13:12,085][85176] Updated weights for policy 0, policy_version 54022 (0.0009) +[2023-10-11 17:13:12,461][85176] Updated weights for policy 0, policy_version 54032 (0.0007) +[2023-10-11 17:13:12,826][85176] Updated weights for policy 0, policy_version 54042 (0.0009) +[2023-10-11 17:13:15,505][85175] Updated weights for policy 1, policy_version 54850 (0.0009) +[2023-10-11 17:13:15,868][85175] Updated weights for policy 1, policy_version 54860 (0.0008) +[2023-10-11 17:13:16,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 111509504. Throughput: 0: 1683.0, 1: 1709.4. Samples: 27893828. Policy #0 lag: (min: 31.0, avg: 37.1, max: 63.0) +[2023-10-11 17:13:16,063][84230] Avg episode reward: [(0, '30.130'), (1, '45.040')] +[2023-10-11 17:13:16,243][85175] Updated weights for policy 1, policy_version 54870 (0.0009) +[2023-10-11 17:13:16,603][85000] Saving new best policy, reward=45.040! +[2023-10-11 17:13:16,608][85175] Updated weights for policy 1, policy_version 54880 (0.0009) +[2023-10-11 17:13:16,792][85176] Updated weights for policy 0, policy_version 54052 (0.0009) +[2023-10-11 17:13:17,175][85176] Updated weights for policy 0, policy_version 54062 (0.0008) +[2023-10-11 17:13:17,553][85176] Updated weights for policy 0, policy_version 54072 (0.0009) +[2023-10-11 17:13:20,697][85175] Updated weights for policy 1, policy_version 54890 (0.0009) +[2023-10-11 17:13:21,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 111575040. Throughput: 0: 1678.5, 1: 1708.1. Samples: 27902934. 
Policy #0 lag: (min: 31.0, avg: 37.1, max: 63.0) +[2023-10-11 17:13:21,064][84230] Avg episode reward: [(0, '29.690'), (1, '42.400')] +[2023-10-11 17:13:21,065][85175] Updated weights for policy 1, policy_version 54900 (0.0009) +[2023-10-11 17:13:21,426][85175] Updated weights for policy 1, policy_version 54910 (0.0009) +[2023-10-11 17:13:21,649][85176] Updated weights for policy 0, policy_version 54082 (0.0009) +[2023-10-11 17:13:22,027][85176] Updated weights for policy 0, policy_version 54092 (0.0009) +[2023-10-11 17:13:22,408][85176] Updated weights for policy 0, policy_version 54102 (0.0007) +[2023-10-11 17:13:22,787][85176] Updated weights for policy 0, policy_version 54112 (0.0008) +[2023-10-11 17:13:25,388][85175] Updated weights for policy 1, policy_version 54920 (0.0007) +[2023-10-11 17:13:25,757][85175] Updated weights for policy 1, policy_version 54930 (0.0010) +[2023-10-11 17:13:26,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 111640576. Throughput: 0: 1680.8, 1: 1705.7. Samples: 27923660. Policy #0 lag: (min: 31.0, avg: 37.1, max: 63.0) +[2023-10-11 17:13:26,063][84230] Avg episode reward: [(0, '30.380'), (1, '42.470')] +[2023-10-11 17:13:26,117][85175] Updated weights for policy 1, policy_version 54940 (0.0007) +[2023-10-11 17:13:26,848][85176] Updated weights for policy 0, policy_version 54122 (0.0007) +[2023-10-11 17:13:27,226][85176] Updated weights for policy 0, policy_version 54132 (0.0007) +[2023-10-11 17:13:27,602][85176] Updated weights for policy 0, policy_version 54142 (0.0010) +[2023-10-11 17:13:30,157][85175] Updated weights for policy 1, policy_version 54950 (0.0009) +[2023-10-11 17:13:30,529][85175] Updated weights for policy 1, policy_version 54960 (0.0008) +[2023-10-11 17:13:30,890][85175] Updated weights for policy 1, policy_version 54970 (0.0007) +[2023-10-11 17:13:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 111706112. 
Throughput: 0: 1680.0, 1: 1695.5. Samples: 27943834. Policy #0 lag: (min: 31.0, avg: 37.1, max: 63.0) +[2023-10-11 17:13:31,064][84230] Avg episode reward: [(0, '32.940'), (1, '37.040')] +[2023-10-11 17:13:31,799][85176] Updated weights for policy 0, policy_version 54152 (0.0008) +[2023-10-11 17:13:32,171][85176] Updated weights for policy 0, policy_version 54162 (0.0008) +[2023-10-11 17:13:32,554][85176] Updated weights for policy 0, policy_version 54172 (0.0008) +[2023-10-11 17:13:34,923][85175] Updated weights for policy 1, policy_version 54980 (0.0009) +[2023-10-11 17:13:35,324][85175] Updated weights for policy 1, policy_version 54990 (0.0008) +[2023-10-11 17:13:35,687][85175] Updated weights for policy 1, policy_version 55000 (0.0008) +[2023-10-11 17:13:36,063][84230] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 111804416. Throughput: 0: 1677.0, 1: 1714.8. Samples: 27953518. Policy #0 lag: (min: 31.0, avg: 37.1, max: 63.0) +[2023-10-11 17:13:36,063][84230] Avg episode reward: [(0, '31.440'), (1, '40.730')] +[2023-10-11 17:13:36,801][85176] Updated weights for policy 0, policy_version 54182 (0.0007) +[2023-10-11 17:13:37,180][85176] Updated weights for policy 0, policy_version 54192 (0.0008) +[2023-10-11 17:13:37,555][85176] Updated weights for policy 0, policy_version 54202 (0.0008) +[2023-10-11 17:13:39,703][85175] Updated weights for policy 1, policy_version 55010 (0.0009) +[2023-10-11 17:13:40,071][85175] Updated weights for policy 1, policy_version 55020 (0.0008) +[2023-10-11 17:13:40,434][85175] Updated weights for policy 1, policy_version 55030 (0.0009) +[2023-10-11 17:13:40,804][85175] Updated weights for policy 1, policy_version 55040 (0.0008) +[2023-10-11 17:13:41,062][84230] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 111869952. Throughput: 0: 1679.8, 1: 1708.7. Samples: 27974304. 
Policy #0 lag: (min: 31.0, avg: 37.1, max: 63.0) +[2023-10-11 17:13:41,063][84230] Avg episode reward: [(0, '32.050'), (1, '39.740')] +[2023-10-11 17:13:41,516][85176] Updated weights for policy 0, policy_version 54212 (0.0010) +[2023-10-11 17:13:41,897][85176] Updated weights for policy 0, policy_version 54222 (0.0008) +[2023-10-11 17:13:42,260][85176] Updated weights for policy 0, policy_version 54232 (0.0010) +[2023-10-11 17:13:44,825][85175] Updated weights for policy 1, policy_version 55050 (0.0008) +[2023-10-11 17:13:45,196][85175] Updated weights for policy 1, policy_version 55060 (0.0009) +[2023-10-11 17:13:45,562][85175] Updated weights for policy 1, policy_version 55070 (0.0007) +[2023-10-11 17:13:46,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 111935488. Throughput: 0: 1683.8, 1: 1681.2. Samples: 27994040. Policy #0 lag: (min: 31.0, avg: 37.1, max: 63.0) +[2023-10-11 17:13:46,063][84230] Avg episode reward: [(0, '29.980'), (1, '41.690')] +[2023-10-11 17:13:46,251][85176] Updated weights for policy 0, policy_version 54242 (0.0009) +[2023-10-11 17:13:46,619][85176] Updated weights for policy 0, policy_version 54252 (0.0008) +[2023-10-11 17:13:47,003][85176] Updated weights for policy 0, policy_version 54262 (0.0008) +[2023-10-11 17:13:47,362][85176] Updated weights for policy 0, policy_version 54272 (0.0009) +[2023-10-11 17:13:49,569][85175] Updated weights for policy 1, policy_version 55080 (0.0009) +[2023-10-11 17:13:49,952][85175] Updated weights for policy 1, policy_version 55090 (0.0009) +[2023-10-11 17:13:50,316][85175] Updated weights for policy 1, policy_version 55100 (0.0007) +[2023-10-11 17:13:51,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 112001024. Throughput: 0: 1680.4, 1: 1702.9. Samples: 28004082. 
Policy #0 lag: (min: 31.0, avg: 37.1, max: 63.0) +[2023-10-11 17:13:51,063][84230] Avg episode reward: [(0, '29.950'), (1, '39.040')] +[2023-10-11 17:13:51,637][85176] Updated weights for policy 0, policy_version 54282 (0.0008) +[2023-10-11 17:13:52,013][85176] Updated weights for policy 0, policy_version 54292 (0.0009) +[2023-10-11 17:13:52,389][85176] Updated weights for policy 0, policy_version 54302 (0.0007) +[2023-10-11 17:13:54,203][85175] Updated weights for policy 1, policy_version 55110 (0.0008) +[2023-10-11 17:13:54,568][85175] Updated weights for policy 1, policy_version 55120 (0.0007) +[2023-10-11 17:13:54,937][85175] Updated weights for policy 1, policy_version 55130 (0.0007) +[2023-10-11 17:13:56,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 112066560. Throughput: 0: 1674.4, 1: 1695.0. Samples: 28024576. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-11 17:13:56,064][84230] Avg episode reward: [(0, '30.130'), (1, '42.090')] +[2023-10-11 17:13:56,640][85176] Updated weights for policy 0, policy_version 54312 (0.0008) +[2023-10-11 17:13:57,022][85176] Updated weights for policy 0, policy_version 54322 (0.0008) +[2023-10-11 17:13:57,393][85176] Updated weights for policy 0, policy_version 54332 (0.0007) +[2023-10-11 17:13:59,065][85175] Updated weights for policy 1, policy_version 55140 (0.0008) +[2023-10-11 17:13:59,432][85175] Updated weights for policy 1, policy_version 55150 (0.0008) +[2023-10-11 17:13:59,797][85175] Updated weights for policy 1, policy_version 55160 (0.0009) +[2023-10-11 17:14:01,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 112132096. Throughput: 0: 1673.6, 1: 1677.7. Samples: 28044636. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-11 17:14:01,064][84230] Avg episode reward: [(0, '29.140'), (1, '37.600')] +[2023-10-11 17:14:01,192][85176] Updated weights for policy 0, policy_version 54342 (0.0008) +[2023-10-11 17:14:01,566][85176] Updated weights for policy 0, policy_version 54352 (0.0008) +[2023-10-11 17:14:01,931][85176] Updated weights for policy 0, policy_version 54362 (0.0009) +[2023-10-11 17:14:03,784][85175] Updated weights for policy 1, policy_version 55170 (0.0009) +[2023-10-11 17:14:04,142][85175] Updated weights for policy 1, policy_version 55180 (0.0011) +[2023-10-11 17:14:04,519][85175] Updated weights for policy 1, policy_version 55190 (0.0008) +[2023-10-11 17:14:04,893][85175] Updated weights for policy 1, policy_version 55200 (0.0009) +[2023-10-11 17:14:05,817][85176] Updated weights for policy 0, policy_version 54372 (0.0008) +[2023-10-11 17:14:06,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 112197632. Throughput: 0: 1674.4, 1: 1709.8. Samples: 28055222. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-11 17:14:06,063][84230] Avg episode reward: [(0, '32.340'), (1, '43.200')] +[2023-10-11 17:14:06,194][85176] Updated weights for policy 0, policy_version 54382 (0.0010) +[2023-10-11 17:14:06,569][85176] Updated weights for policy 0, policy_version 54392 (0.0009) +[2023-10-11 17:14:08,967][85175] Updated weights for policy 1, policy_version 55210 (0.0010) +[2023-10-11 17:14:09,328][85175] Updated weights for policy 1, policy_version 55220 (0.0008) +[2023-10-11 17:14:09,697][85175] Updated weights for policy 1, policy_version 55230 (0.0009) +[2023-10-11 17:14:10,714][85176] Updated weights for policy 0, policy_version 54402 (0.0009) +[2023-10-11 17:14:11,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 112263168. Throughput: 0: 1678.0, 1: 1687.5. Samples: 28075110. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-11 17:14:11,063][84230] Avg episode reward: [(0, '32.840'), (1, '37.510')] +[2023-10-11 17:14:11,083][85176] Updated weights for policy 0, policy_version 54412 (0.0008) +[2023-10-11 17:14:11,458][85176] Updated weights for policy 0, policy_version 54422 (0.0007) +[2023-10-11 17:14:11,822][85176] Updated weights for policy 0, policy_version 54432 (0.0008) +[2023-10-11 17:14:13,678][85175] Updated weights for policy 1, policy_version 55240 (0.0007) +[2023-10-11 17:14:14,040][85175] Updated weights for policy 1, policy_version 55250 (0.0009) +[2023-10-11 17:14:14,409][85175] Updated weights for policy 1, policy_version 55260 (0.0010) +[2023-10-11 17:14:15,863][85176] Updated weights for policy 0, policy_version 54442 (0.0007) +[2023-10-11 17:14:16,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 112328704. Throughput: 0: 1678.7, 1: 1692.9. Samples: 28095558. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-11 17:14:16,064][84230] Avg episode reward: [(0, '33.180'), (1, '42.310')] +[2023-10-11 17:14:16,076][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000055264_56590336.pth... +[2023-10-11 17:14:16,112][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000053664_54951936.pth +[2023-10-11 17:14:16,116][85000] Saving a milestone ./train_atari/atari_frostbite_APPO/checkpoint_p1/milestones/checkpoint_000055264_56590336.pth +[2023-10-11 17:14:16,238][85176] Updated weights for policy 0, policy_version 54452 (0.0008) +[2023-10-11 17:14:16,603][85176] Updated weights for policy 0, policy_version 54462 (0.0009) +[2023-10-11 17:14:16,679][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000054464_55771136.pth... 
+[2023-10-11 17:14:16,719][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000052896_54165504.pth +[2023-10-11 17:14:16,723][84801] Saving a milestone ./train_atari/atari_frostbite_APPO/checkpoint_p0/milestones/checkpoint_000054464_55771136.pth +[2023-10-11 17:14:18,317][85175] Updated weights for policy 1, policy_version 55270 (0.0008) +[2023-10-11 17:14:18,682][85175] Updated weights for policy 1, policy_version 55280 (0.0009) +[2023-10-11 17:14:19,053][85175] Updated weights for policy 1, policy_version 55290 (0.0009) +[2023-10-11 17:14:20,788][85176] Updated weights for policy 0, policy_version 54472 (0.0008) +[2023-10-11 17:14:21,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 112394240. Throughput: 0: 1682.4, 1: 1697.2. Samples: 28105602. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-11 17:14:21,063][84230] Avg episode reward: [(0, '31.900'), (1, '36.080')] +[2023-10-11 17:14:21,163][85176] Updated weights for policy 0, policy_version 54482 (0.0008) +[2023-10-11 17:14:21,531][85176] Updated weights for policy 0, policy_version 54492 (0.0008) +[2023-10-11 17:14:23,143][85175] Updated weights for policy 1, policy_version 55300 (0.0011) +[2023-10-11 17:14:23,503][85175] Updated weights for policy 1, policy_version 55310 (0.0010) +[2023-10-11 17:14:23,867][85175] Updated weights for policy 1, policy_version 55320 (0.0011) +[2023-10-11 17:14:25,712][85176] Updated weights for policy 0, policy_version 54502 (0.0008) +[2023-10-11 17:14:26,062][84230] Fps is (10 sec: 13107.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 112459776. Throughput: 0: 1676.5, 1: 1684.8. Samples: 28125562. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-11 17:14:26,063][84230] Avg episode reward: [(0, '31.580'), (1, '41.020')] +[2023-10-11 17:14:26,091][85176] Updated weights for policy 0, policy_version 54512 (0.0009) +[2023-10-11 17:14:26,460][85176] Updated weights for policy 0, policy_version 54522 (0.0007) +[2023-10-11 17:14:27,921][85175] Updated weights for policy 1, policy_version 55330 (0.0007) +[2023-10-11 17:14:28,320][85175] Updated weights for policy 1, policy_version 55340 (0.0007) +[2023-10-11 17:14:28,700][85175] Updated weights for policy 1, policy_version 55350 (0.0010) +[2023-10-11 17:14:29,062][85175] Updated weights for policy 1, policy_version 55360 (0.0011) +[2023-10-11 17:14:30,431][85176] Updated weights for policy 0, policy_version 54532 (0.0007) +[2023-10-11 17:14:30,806][85176] Updated weights for policy 0, policy_version 54542 (0.0010) +[2023-10-11 17:14:31,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 112525312. Throughput: 0: 1670.0, 1: 1707.1. Samples: 28146008. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-11 17:14:31,063][84230] Avg episode reward: [(0, '32.230'), (1, '38.050')] +[2023-10-11 17:14:31,177][85176] Updated weights for policy 0, policy_version 54552 (0.0007) +[2023-10-11 17:14:32,931][85175] Updated weights for policy 1, policy_version 55370 (0.0007) +[2023-10-11 17:14:33,295][85175] Updated weights for policy 1, policy_version 55380 (0.0007) +[2023-10-11 17:14:33,663][85175] Updated weights for policy 1, policy_version 55390 (0.0007) +[2023-10-11 17:14:35,429][85176] Updated weights for policy 0, policy_version 54562 (0.0007) +[2023-10-11 17:14:35,803][85176] Updated weights for policy 0, policy_version 54572 (0.0007) +[2023-10-11 17:14:36,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 112590848. Throughput: 0: 1678.9, 1: 1694.2. Samples: 28155870. 
Policy #0 lag: (min: 31.0, avg: 31.4, max: 45.0) +[2023-10-11 17:14:36,063][84230] Avg episode reward: [(0, '33.020'), (1, '43.560')] +[2023-10-11 17:14:36,176][85176] Updated weights for policy 0, policy_version 54582 (0.0009) +[2023-10-11 17:14:36,550][85176] Updated weights for policy 0, policy_version 54592 (0.0007) +[2023-10-11 17:14:37,579][85175] Updated weights for policy 1, policy_version 55400 (0.0010) +[2023-10-11 17:14:37,955][85175] Updated weights for policy 1, policy_version 55410 (0.0008) +[2023-10-11 17:14:38,314][85175] Updated weights for policy 1, policy_version 55420 (0.0008) +[2023-10-11 17:14:40,466][85176] Updated weights for policy 0, policy_version 54602 (0.0009) +[2023-10-11 17:14:40,835][85176] Updated weights for policy 0, policy_version 54612 (0.0008) +[2023-10-11 17:14:41,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 112656384. Throughput: 0: 1682.0, 1: 1696.7. Samples: 28176618. Policy #0 lag: (min: 31.0, avg: 31.4, max: 45.0) +[2023-10-11 17:14:41,064][84230] Avg episode reward: [(0, '34.010'), (1, '38.930')] +[2023-10-11 17:14:41,209][85176] Updated weights for policy 0, policy_version 54622 (0.0007) +[2023-10-11 17:14:41,282][84801] Saving new best policy, reward=34.010! +[2023-10-11 17:14:42,228][85175] Updated weights for policy 1, policy_version 55430 (0.0009) +[2023-10-11 17:14:42,589][85175] Updated weights for policy 1, policy_version 55440 (0.0008) +[2023-10-11 17:14:42,966][85175] Updated weights for policy 1, policy_version 55450 (0.0009) +[2023-10-11 17:14:45,208][85176] Updated weights for policy 0, policy_version 54632 (0.0008) +[2023-10-11 17:14:45,582][85176] Updated weights for policy 0, policy_version 54642 (0.0007) +[2023-10-11 17:14:45,953][85176] Updated weights for policy 0, policy_version 54652 (0.0008) +[2023-10-11 17:14:46,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 112721920. Throughput: 0: 1662.9, 1: 1719.8. 
Samples: 28196860. Policy #0 lag: (min: 31.0, avg: 31.4, max: 45.0) +[2023-10-11 17:14:46,063][84230] Avg episode reward: [(0, '33.060'), (1, '43.400')] +[2023-10-11 17:14:46,962][85175] Updated weights for policy 1, policy_version 55460 (0.0009) +[2023-10-11 17:14:47,333][85175] Updated weights for policy 1, policy_version 55470 (0.0008) +[2023-10-11 17:14:47,702][85175] Updated weights for policy 1, policy_version 55480 (0.0011) +[2023-10-11 17:14:50,057][85176] Updated weights for policy 0, policy_version 54662 (0.0009) +[2023-10-11 17:14:50,439][85176] Updated weights for policy 0, policy_version 54672 (0.0009) +[2023-10-11 17:14:50,813][85176] Updated weights for policy 0, policy_version 54682 (0.0010) +[2023-10-11 17:14:51,062][84230] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 112820224. Throughput: 0: 1677.2, 1: 1691.2. Samples: 28206800. Policy #0 lag: (min: 31.0, avg: 31.4, max: 45.0) +[2023-10-11 17:14:51,063][84230] Avg episode reward: [(0, '31.490'), (1, '39.810')] +[2023-10-11 17:14:51,809][85175] Updated weights for policy 1, policy_version 55490 (0.0008) +[2023-10-11 17:14:52,170][85175] Updated weights for policy 1, policy_version 55500 (0.0008) +[2023-10-11 17:14:52,536][85175] Updated weights for policy 1, policy_version 55510 (0.0007) +[2023-10-11 17:14:52,902][85175] Updated weights for policy 1, policy_version 55520 (0.0007) +[2023-10-11 17:14:54,867][85176] Updated weights for policy 0, policy_version 54692 (0.0010) +[2023-10-11 17:14:55,242][85176] Updated weights for policy 0, policy_version 54702 (0.0009) +[2023-10-11 17:14:55,620][85176] Updated weights for policy 0, policy_version 54712 (0.0007) +[2023-10-11 17:14:56,062][84230] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 112885760. Throughput: 0: 1677.0, 1: 1716.4. Samples: 28227816. 
Policy #0 lag: (min: 31.0, avg: 31.4, max: 45.0) +[2023-10-11 17:14:56,063][84230] Avg episode reward: [(0, '31.510'), (1, '45.270')] +[2023-10-11 17:14:56,064][85000] Saving new best policy, reward=45.270! +[2023-10-11 17:14:57,048][85175] Updated weights for policy 1, policy_version 55530 (0.0010) +[2023-10-11 17:14:57,417][85175] Updated weights for policy 1, policy_version 55540 (0.0008) +[2023-10-11 17:14:57,779][85175] Updated weights for policy 1, policy_version 55550 (0.0007) +[2023-10-11 17:14:59,561][85176] Updated weights for policy 0, policy_version 54722 (0.0007) +[2023-10-11 17:14:59,938][85176] Updated weights for policy 0, policy_version 54732 (0.0009) +[2023-10-11 17:15:00,302][85176] Updated weights for policy 0, policy_version 54742 (0.0007) +[2023-10-11 17:15:00,679][85176] Updated weights for policy 0, policy_version 54752 (0.0010) +[2023-10-11 17:15:01,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 112951296. Throughput: 0: 1657.4, 1: 1720.8. Samples: 28247574. Policy #0 lag: (min: 31.0, avg: 31.4, max: 45.0) +[2023-10-11 17:15:01,063][84230] Avg episode reward: [(0, '31.640'), (1, '42.820')] +[2023-10-11 17:15:01,637][85175] Updated weights for policy 1, policy_version 55560 (0.0007) +[2023-10-11 17:15:02,008][85175] Updated weights for policy 1, policy_version 55570 (0.0008) +[2023-10-11 17:15:02,380][85175] Updated weights for policy 1, policy_version 55580 (0.0008) +[2023-10-11 17:15:04,730][85176] Updated weights for policy 0, policy_version 54762 (0.0008) +[2023-10-11 17:15:05,099][85176] Updated weights for policy 0, policy_version 54772 (0.0009) +[2023-10-11 17:15:05,480][85176] Updated weights for policy 0, policy_version 54782 (0.0009) +[2023-10-11 17:15:06,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 113016832. Throughput: 0: 1687.8, 1: 1700.6. Samples: 28258078. 
Policy #0 lag: (min: 31.0, avg: 31.4, max: 45.0) +[2023-10-11 17:15:06,063][84230] Avg episode reward: [(0, '31.790'), (1, '45.200')] +[2023-10-11 17:15:06,435][85175] Updated weights for policy 1, policy_version 55590 (0.0008) +[2023-10-11 17:15:06,804][85175] Updated weights for policy 1, policy_version 55600 (0.0008) +[2023-10-11 17:15:07,178][85175] Updated weights for policy 1, policy_version 55610 (0.0009) +[2023-10-11 17:15:09,567][85176] Updated weights for policy 0, policy_version 54792 (0.0009) +[2023-10-11 17:15:09,936][85176] Updated weights for policy 0, policy_version 54802 (0.0009) +[2023-10-11 17:15:10,310][85176] Updated weights for policy 0, policy_version 54812 (0.0007) +[2023-10-11 17:15:11,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 113082368. Throughput: 0: 1683.4, 1: 1714.3. Samples: 28278456. Policy #0 lag: (min: 29.0, avg: 30.0, max: 51.0) +[2023-10-11 17:15:11,063][84230] Avg episode reward: [(0, '34.730'), (1, '42.750')] +[2023-10-11 17:15:11,064][84801] Saving new best policy, reward=34.730! +[2023-10-11 17:15:11,239][85175] Updated weights for policy 1, policy_version 55620 (0.0010) +[2023-10-11 17:15:11,601][85175] Updated weights for policy 1, policy_version 55630 (0.0009) +[2023-10-11 17:15:11,965][85175] Updated weights for policy 1, policy_version 55640 (0.0009) +[2023-10-11 17:15:14,237][85176] Updated weights for policy 0, policy_version 54822 (0.0008) +[2023-10-11 17:15:14,603][85176] Updated weights for policy 0, policy_version 54832 (0.0007) +[2023-10-11 17:15:14,972][85176] Updated weights for policy 0, policy_version 54842 (0.0007) +[2023-10-11 17:15:16,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 113147904. Throughput: 0: 1669.0, 1: 1713.0. Samples: 28298198. 
Policy #0 lag: (min: 29.0, avg: 30.0, max: 51.0) +[2023-10-11 17:15:16,063][84230] Avg episode reward: [(0, '32.070'), (1, '41.680')] +[2023-10-11 17:15:16,112][85175] Updated weights for policy 1, policy_version 55650 (0.0010) +[2023-10-11 17:15:16,532][85175] Updated weights for policy 1, policy_version 55660 (0.0008) +[2023-10-11 17:15:16,896][85175] Updated weights for policy 1, policy_version 55670 (0.0009) +[2023-10-11 17:15:17,262][85175] Updated weights for policy 1, policy_version 55680 (0.0009) +[2023-10-11 17:15:19,003][85176] Updated weights for policy 0, policy_version 54852 (0.0008) +[2023-10-11 17:15:19,368][85176] Updated weights for policy 0, policy_version 54862 (0.0010) +[2023-10-11 17:15:19,747][85176] Updated weights for policy 0, policy_version 54872 (0.0007) +[2023-10-11 17:15:21,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 113213440. Throughput: 0: 1694.4, 1: 1701.6. Samples: 28308692. Policy #0 lag: (min: 29.0, avg: 30.0, max: 51.0) +[2023-10-11 17:15:21,063][84230] Avg episode reward: [(0, '33.770'), (1, '44.420')] +[2023-10-11 17:15:21,295][85175] Updated weights for policy 1, policy_version 55690 (0.0009) +[2023-10-11 17:15:21,659][85175] Updated weights for policy 1, policy_version 55700 (0.0009) +[2023-10-11 17:15:22,033][85175] Updated weights for policy 1, policy_version 55710 (0.0008) +[2023-10-11 17:15:23,833][85176] Updated weights for policy 0, policy_version 54882 (0.0009) +[2023-10-11 17:15:24,197][85176] Updated weights for policy 0, policy_version 54892 (0.0007) +[2023-10-11 17:15:24,566][85176] Updated weights for policy 0, policy_version 54902 (0.0007) +[2023-10-11 17:15:24,932][85176] Updated weights for policy 0, policy_version 54912 (0.0010) +[2023-10-11 17:15:26,055][85175] Updated weights for policy 1, policy_version 55720 (0.0009) +[2023-10-11 17:15:26,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 113278976. 
Throughput: 0: 1671.9, 1: 1703.2. Samples: 28328496. Policy #0 lag: (min: 29.0, avg: 30.0, max: 51.0) +[2023-10-11 17:15:26,063][84230] Avg episode reward: [(0, '29.030'), (1, '41.170')] +[2023-10-11 17:15:26,423][85175] Updated weights for policy 1, policy_version 55730 (0.0010) +[2023-10-11 17:15:26,788][85175] Updated weights for policy 1, policy_version 55740 (0.0007) +[2023-10-11 17:15:29,127][85176] Updated weights for policy 0, policy_version 54922 (0.0010) +[2023-10-11 17:15:29,496][85176] Updated weights for policy 0, policy_version 54932 (0.0008) +[2023-10-11 17:15:29,868][85176] Updated weights for policy 0, policy_version 54942 (0.0008) +[2023-10-11 17:15:30,474][85175] Updated weights for policy 1, policy_version 55750 (0.0009) +[2023-10-11 17:15:30,844][85175] Updated weights for policy 1, policy_version 55760 (0.0008) +[2023-10-11 17:15:31,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 113344512. Throughput: 0: 1682.3, 1: 1703.2. Samples: 28349210. Policy #0 lag: (min: 29.0, avg: 30.0, max: 51.0) +[2023-10-11 17:15:31,064][84230] Avg episode reward: [(0, '32.250'), (1, '42.390')] +[2023-10-11 17:15:31,225][85175] Updated weights for policy 1, policy_version 55770 (0.0010) +[2023-10-11 17:15:33,970][85176] Updated weights for policy 0, policy_version 54952 (0.0008) +[2023-10-11 17:15:34,329][85176] Updated weights for policy 0, policy_version 54962 (0.0008) +[2023-10-11 17:15:34,707][85176] Updated weights for policy 0, policy_version 54972 (0.0008) +[2023-10-11 17:15:35,173][85175] Updated weights for policy 1, policy_version 55780 (0.0009) +[2023-10-11 17:15:35,547][85175] Updated weights for policy 1, policy_version 55790 (0.0010) +[2023-10-11 17:15:35,923][85175] Updated weights for policy 1, policy_version 55800 (0.0010) +[2023-10-11 17:15:36,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 113410048. Throughput: 0: 1696.2, 1: 1707.2. Samples: 28359952. 
Policy #0 lag: (min: 29.0, avg: 30.0, max: 51.0) +[2023-10-11 17:15:36,063][84230] Avg episode reward: [(0, '30.340'), (1, '39.900')] +[2023-10-11 17:15:38,619][85176] Updated weights for policy 0, policy_version 54982 (0.0008) +[2023-10-11 17:15:38,983][85176] Updated weights for policy 0, policy_version 54992 (0.0007) +[2023-10-11 17:15:39,353][85176] Updated weights for policy 0, policy_version 55002 (0.0009) +[2023-10-11 17:15:39,926][85175] Updated weights for policy 1, policy_version 55810 (0.0011) +[2023-10-11 17:15:40,300][85175] Updated weights for policy 1, policy_version 55820 (0.0009) +[2023-10-11 17:15:40,674][85175] Updated weights for policy 1, policy_version 55830 (0.0011) +[2023-10-11 17:15:41,030][85175] Updated weights for policy 1, policy_version 55840 (0.0009) +[2023-10-11 17:15:41,062][84230] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 113508352. Throughput: 0: 1667.6, 1: 1706.9. Samples: 28379666. Policy #0 lag: (min: 29.0, avg: 30.0, max: 51.0) +[2023-10-11 17:15:41,063][84230] Avg episode reward: [(0, '32.050'), (1, '43.220')] +[2023-10-11 17:15:43,498][85176] Updated weights for policy 0, policy_version 55012 (0.0008) +[2023-10-11 17:15:43,867][85176] Updated weights for policy 0, policy_version 55022 (0.0007) +[2023-10-11 17:15:44,238][85176] Updated weights for policy 0, policy_version 55032 (0.0008) +[2023-10-11 17:15:45,024][85175] Updated weights for policy 1, policy_version 55850 (0.0009) +[2023-10-11 17:15:45,390][85175] Updated weights for policy 1, policy_version 55860 (0.0007) +[2023-10-11 17:15:45,751][85175] Updated weights for policy 1, policy_version 55870 (0.0008) +[2023-10-11 17:15:46,062][84230] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 113573888. Throughput: 0: 1688.4, 1: 1688.9. Samples: 28399554. 
Policy #0 lag: (min: 29.0, avg: 30.0, max: 51.0) +[2023-10-11 17:15:46,063][84230] Avg episode reward: [(0, '31.680'), (1, '40.290')] +[2023-10-11 17:15:48,109][85176] Updated weights for policy 0, policy_version 55042 (0.0008) +[2023-10-11 17:15:48,491][85176] Updated weights for policy 0, policy_version 55052 (0.0009) +[2023-10-11 17:15:48,860][85176] Updated weights for policy 0, policy_version 55062 (0.0009) +[2023-10-11 17:15:49,238][85176] Updated weights for policy 0, policy_version 55072 (0.0008) +[2023-10-11 17:15:49,728][85175] Updated weights for policy 1, policy_version 55880 (0.0007) +[2023-10-11 17:15:50,093][85175] Updated weights for policy 1, policy_version 55890 (0.0007) +[2023-10-11 17:15:50,463][85175] Updated weights for policy 1, policy_version 55900 (0.0009) +[2023-10-11 17:15:51,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 113639424. Throughput: 0: 1674.9, 1: 1708.1. Samples: 28410312. Policy #0 lag: (min: 23.0, avg: 30.4, max: 55.0) +[2023-10-11 17:15:51,063][84230] Avg episode reward: [(0, '31.250'), (1, '44.900')] +[2023-10-11 17:15:53,282][85176] Updated weights for policy 0, policy_version 55082 (0.0007) +[2023-10-11 17:15:53,651][85176] Updated weights for policy 0, policy_version 55092 (0.0007) +[2023-10-11 17:15:54,031][85176] Updated weights for policy 0, policy_version 55102 (0.0009) +[2023-10-11 17:15:54,493][85175] Updated weights for policy 1, policy_version 55910 (0.0009) +[2023-10-11 17:15:54,859][85175] Updated weights for policy 1, policy_version 55920 (0.0008) +[2023-10-11 17:15:55,226][85175] Updated weights for policy 1, policy_version 55930 (0.0008) +[2023-10-11 17:15:56,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 113704960. Throughput: 0: 1663.9, 1: 1710.2. Samples: 28430292. 
Policy #0 lag: (min: 23.0, avg: 30.4, max: 55.0) +[2023-10-11 17:15:56,063][84230] Avg episode reward: [(0, '33.650'), (1, '41.190')] +[2023-10-11 17:15:58,319][85176] Updated weights for policy 0, policy_version 55112 (0.0010) +[2023-10-11 17:15:58,696][85176] Updated weights for policy 0, policy_version 55122 (0.0009) +[2023-10-11 17:15:58,950][85175] Updated weights for policy 1, policy_version 55940 (0.0010) +[2023-10-11 17:15:59,062][85176] Updated weights for policy 0, policy_version 55132 (0.0010) +[2023-10-11 17:15:59,322][85175] Updated weights for policy 1, policy_version 55950 (0.0010) +[2023-10-11 17:15:59,677][85175] Updated weights for policy 1, policy_version 55960 (0.0009) +[2023-10-11 17:16:01,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 113770496. Throughput: 0: 1684.4, 1: 1695.5. Samples: 28450292. Policy #0 lag: (min: 23.0, avg: 30.4, max: 55.0) +[2023-10-11 17:16:01,064][84230] Avg episode reward: [(0, '33.090'), (1, '44.500')] +[2023-10-11 17:16:03,154][85176] Updated weights for policy 0, policy_version 55142 (0.0010) +[2023-10-11 17:16:03,524][85176] Updated weights for policy 0, policy_version 55152 (0.0010) +[2023-10-11 17:16:03,869][85175] Updated weights for policy 1, policy_version 55970 (0.0008) +[2023-10-11 17:16:03,903][85176] Updated weights for policy 0, policy_version 55162 (0.0008) +[2023-10-11 17:16:04,263][85175] Updated weights for policy 1, policy_version 55980 (0.0009) +[2023-10-11 17:16:04,628][85175] Updated weights for policy 1, policy_version 55990 (0.0008) +[2023-10-11 17:16:04,997][85175] Updated weights for policy 1, policy_version 56000 (0.0008) +[2023-10-11 17:16:06,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 113836032. Throughput: 0: 1666.3, 1: 1722.8. Samples: 28461202. 
Policy #0 lag: (min: 23.0, avg: 30.4, max: 55.0) +[2023-10-11 17:16:06,064][84230] Avg episode reward: [(0, '33.140'), (1, '40.060')] +[2023-10-11 17:16:07,862][85176] Updated weights for policy 0, policy_version 55172 (0.0008) +[2023-10-11 17:16:08,228][85176] Updated weights for policy 0, policy_version 55182 (0.0007) +[2023-10-11 17:16:08,603][85176] Updated weights for policy 0, policy_version 55192 (0.0007) +[2023-10-11 17:16:08,938][85175] Updated weights for policy 1, policy_version 56010 (0.0008) +[2023-10-11 17:16:09,304][85175] Updated weights for policy 1, policy_version 56020 (0.0008) +[2023-10-11 17:16:09,676][85175] Updated weights for policy 1, policy_version 56030 (0.0007) +[2023-10-11 17:16:11,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 113901568. Throughput: 0: 1670.7, 1: 1701.6. Samples: 28480250. Policy #0 lag: (min: 23.0, avg: 30.4, max: 55.0) +[2023-10-11 17:16:11,064][84230] Avg episode reward: [(0, '29.700'), (1, '41.700')] +[2023-10-11 17:16:12,733][85176] Updated weights for policy 0, policy_version 55202 (0.0008) +[2023-10-11 17:16:13,094][85176] Updated weights for policy 0, policy_version 55212 (0.0007) +[2023-10-11 17:16:13,472][85176] Updated weights for policy 0, policy_version 55222 (0.0007) +[2023-10-11 17:16:13,746][85175] Updated weights for policy 1, policy_version 56040 (0.0007) +[2023-10-11 17:16:13,851][85176] Updated weights for policy 0, policy_version 55232 (0.0008) +[2023-10-11 17:16:14,114][85175] Updated weights for policy 1, policy_version 56050 (0.0008) +[2023-10-11 17:16:14,481][85175] Updated weights for policy 1, policy_version 56060 (0.0009) +[2023-10-11 17:16:16,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 113967104. Throughput: 0: 1675.3, 1: 1691.9. Samples: 28500734. 
Policy #0 lag: (min: 23.0, avg: 30.4, max: 55.0) +[2023-10-11 17:16:16,063][84230] Avg episode reward: [(0, '30.650'), (1, '39.620')] +[2023-10-11 17:16:16,073][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000055232_56557568.pth... +[2023-10-11 17:16:16,073][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000056064_57409536.pth... +[2023-10-11 17:16:16,109][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000054464_55771136.pth +[2023-10-11 17:16:16,112][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000053664_54951936.pth +[2023-10-11 17:16:18,152][85176] Updated weights for policy 0, policy_version 55242 (0.0009) +[2023-10-11 17:16:18,492][85175] Updated weights for policy 1, policy_version 56070 (0.0008) +[2023-10-11 17:16:18,524][85176] Updated weights for policy 0, policy_version 55252 (0.0007) +[2023-10-11 17:16:18,851][85175] Updated weights for policy 1, policy_version 56080 (0.0008) +[2023-10-11 17:16:18,896][85176] Updated weights for policy 0, policy_version 55262 (0.0008) +[2023-10-11 17:16:19,229][85175] Updated weights for policy 1, policy_version 56090 (0.0008) +[2023-10-11 17:16:21,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 114032640. Throughput: 0: 1654.1, 1: 1709.2. Samples: 28511302. 
Policy #0 lag: (min: 23.0, avg: 30.4, max: 55.0) +[2023-10-11 17:16:21,063][84230] Avg episode reward: [(0, '30.200'), (1, '44.610')] +[2023-10-11 17:16:22,937][85176] Updated weights for policy 0, policy_version 55272 (0.0008) +[2023-10-11 17:16:23,309][85176] Updated weights for policy 0, policy_version 55282 (0.0007) +[2023-10-11 17:16:23,312][85175] Updated weights for policy 1, policy_version 56100 (0.0008) +[2023-10-11 17:16:23,672][85176] Updated weights for policy 0, policy_version 55292 (0.0009) +[2023-10-11 17:16:23,678][85175] Updated weights for policy 1, policy_version 56110 (0.0007) +[2023-10-11 17:16:24,045][85175] Updated weights for policy 1, policy_version 56120 (0.0008) +[2023-10-11 17:16:26,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 114098176. Throughput: 0: 1670.1, 1: 1684.6. Samples: 28530628. Policy #0 lag: (min: 23.0, avg: 30.4, max: 55.0) +[2023-10-11 17:16:26,063][84230] Avg episode reward: [(0, '32.230'), (1, '40.950')] +[2023-10-11 17:16:27,736][85176] Updated weights for policy 0, policy_version 55302 (0.0009) +[2023-10-11 17:16:27,918][85175] Updated weights for policy 1, policy_version 56130 (0.0007) +[2023-10-11 17:16:28,107][85176] Updated weights for policy 0, policy_version 55312 (0.0008) +[2023-10-11 17:16:28,281][85175] Updated weights for policy 1, policy_version 56140 (0.0007) +[2023-10-11 17:16:28,488][85176] Updated weights for policy 0, policy_version 55322 (0.0007) +[2023-10-11 17:16:28,645][85175] Updated weights for policy 1, policy_version 56150 (0.0008) +[2023-10-11 17:16:29,013][85175] Updated weights for policy 1, policy_version 56160 (0.0008) +[2023-10-11 17:16:31,063][84230] Fps is (10 sec: 13106.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 114163712. Throughput: 0: 1668.9, 1: 1708.7. Samples: 28551548. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:16:31,064][84230] Avg episode reward: [(0, '32.050'), (1, '44.100')] +[2023-10-11 17:16:32,496][85176] Updated weights for policy 0, policy_version 55332 (0.0009) +[2023-10-11 17:16:32,874][85176] Updated weights for policy 0, policy_version 55342 (0.0008) +[2023-10-11 17:16:33,019][85175] Updated weights for policy 1, policy_version 56170 (0.0007) +[2023-10-11 17:16:33,250][85176] Updated weights for policy 0, policy_version 55352 (0.0009) +[2023-10-11 17:16:33,382][85175] Updated weights for policy 1, policy_version 56180 (0.0007) +[2023-10-11 17:16:33,743][85175] Updated weights for policy 1, policy_version 56190 (0.0009) +[2023-10-11 17:16:36,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 114229248. Throughput: 0: 1653.8, 1: 1698.2. Samples: 28561152. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:16:36,064][84230] Avg episode reward: [(0, '32.730'), (1, '39.640')] +[2023-10-11 17:16:37,375][85176] Updated weights for policy 0, policy_version 55362 (0.0007) +[2023-10-11 17:16:37,654][85175] Updated weights for policy 1, policy_version 56200 (0.0010) +[2023-10-11 17:16:37,755][85176] Updated weights for policy 0, policy_version 55372 (0.0009) +[2023-10-11 17:16:38,022][85175] Updated weights for policy 1, policy_version 56210 (0.0009) +[2023-10-11 17:16:38,126][85176] Updated weights for policy 0, policy_version 55382 (0.0008) +[2023-10-11 17:16:38,393][85175] Updated weights for policy 1, policy_version 56220 (0.0007) +[2023-10-11 17:16:38,496][85176] Updated weights for policy 0, policy_version 55392 (0.0009) +[2023-10-11 17:16:41,062][84230] Fps is (10 sec: 13107.7, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 114294784. Throughput: 0: 1667.8, 1: 1690.0. Samples: 28581396. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:16:41,063][84230] Avg episode reward: [(0, '30.530'), (1, '45.620')] +[2023-10-11 17:16:41,064][85000] Saving new best policy, reward=45.620! +[2023-10-11 17:16:42,421][85175] Updated weights for policy 1, policy_version 56230 (0.0009) +[2023-10-11 17:16:42,506][85176] Updated weights for policy 0, policy_version 55402 (0.0007) +[2023-10-11 17:16:42,778][85175] Updated weights for policy 1, policy_version 56240 (0.0009) +[2023-10-11 17:16:42,871][85176] Updated weights for policy 0, policy_version 55412 (0.0010) +[2023-10-11 17:16:43,144][85175] Updated weights for policy 1, policy_version 56250 (0.0009) +[2023-10-11 17:16:43,246][85176] Updated weights for policy 0, policy_version 55422 (0.0008) +[2023-10-11 17:16:46,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 114360320. Throughput: 0: 1671.6, 1: 1703.0. Samples: 28602148. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:16:46,064][84230] Avg episode reward: [(0, '30.540'), (1, '41.200')] +[2023-10-11 17:16:47,277][85175] Updated weights for policy 1, policy_version 56260 (0.0008) +[2023-10-11 17:16:47,440][85176] Updated weights for policy 0, policy_version 55432 (0.0009) +[2023-10-11 17:16:47,644][85175] Updated weights for policy 1, policy_version 56270 (0.0010) +[2023-10-11 17:16:47,806][85176] Updated weights for policy 0, policy_version 55442 (0.0010) +[2023-10-11 17:16:48,010][85175] Updated weights for policy 1, policy_version 56280 (0.0007) +[2023-10-11 17:16:48,177][85176] Updated weights for policy 0, policy_version 55452 (0.0009) +[2023-10-11 17:16:51,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 114425856. Throughput: 0: 1652.9, 1: 1672.9. Samples: 28610866. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:16:51,064][84230] Avg episode reward: [(0, '30.560'), (1, '46.240')] +[2023-10-11 17:16:51,065][85000] Saving new best policy, reward=46.240! +[2023-10-11 17:16:51,908][85175] Updated weights for policy 1, policy_version 56290 (0.0008) +[2023-10-11 17:16:52,284][85175] Updated weights for policy 1, policy_version 56300 (0.0009) +[2023-10-11 17:16:52,355][85176] Updated weights for policy 0, policy_version 55462 (0.0008) +[2023-10-11 17:16:52,645][85175] Updated weights for policy 1, policy_version 56310 (0.0009) +[2023-10-11 17:16:52,731][85176] Updated weights for policy 0, policy_version 55472 (0.0007) +[2023-10-11 17:16:53,007][85175] Updated weights for policy 1, policy_version 56320 (0.0007) +[2023-10-11 17:16:53,105][85176] Updated weights for policy 0, policy_version 55482 (0.0009) +[2023-10-11 17:16:56,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 114491392. Throughput: 0: 1663.2, 1: 1700.4. Samples: 28631610. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:16:56,063][84230] Avg episode reward: [(0, '32.610'), (1, '40.870')] +[2023-10-11 17:16:57,243][85176] Updated weights for policy 0, policy_version 55492 (0.0010) +[2023-10-11 17:16:57,370][85175] Updated weights for policy 1, policy_version 56330 (0.0007) +[2023-10-11 17:16:57,617][85176] Updated weights for policy 0, policy_version 55502 (0.0009) +[2023-10-11 17:16:57,740][85175] Updated weights for policy 1, policy_version 56340 (0.0008) +[2023-10-11 17:16:57,985][85176] Updated weights for policy 0, policy_version 55512 (0.0007) +[2023-10-11 17:16:58,115][85175] Updated weights for policy 1, policy_version 56350 (0.0007) +[2023-10-11 17:17:01,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 114556928. Throughput: 0: 1663.5, 1: 1701.5. Samples: 28652160. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:17:01,064][84230] Avg episode reward: [(0, '33.870'), (1, '47.420')] +[2023-10-11 17:17:01,078][85000] Saving new best policy, reward=47.420! +[2023-10-11 17:17:02,083][85175] Updated weights for policy 1, policy_version 56360 (0.0007) +[2023-10-11 17:17:02,141][85176] Updated weights for policy 0, policy_version 55522 (0.0008) +[2023-10-11 17:17:02,446][85175] Updated weights for policy 1, policy_version 56370 (0.0008) +[2023-10-11 17:17:02,504][85176] Updated weights for policy 0, policy_version 55532 (0.0009) +[2023-10-11 17:17:02,816][85175] Updated weights for policy 1, policy_version 56380 (0.0009) +[2023-10-11 17:17:02,876][85176] Updated weights for policy 0, policy_version 55542 (0.0008) +[2023-10-11 17:17:03,248][85176] Updated weights for policy 0, policy_version 55552 (0.0008) +[2023-10-11 17:17:06,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 114622464. Throughput: 0: 1654.7, 1: 1675.5. Samples: 28661160. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:17:06,063][84230] Avg episode reward: [(0, '30.960'), (1, '41.570')] +[2023-10-11 17:17:06,723][85175] Updated weights for policy 1, policy_version 56390 (0.0009) +[2023-10-11 17:17:07,094][85175] Updated weights for policy 1, policy_version 56400 (0.0008) +[2023-10-11 17:17:07,455][85175] Updated weights for policy 1, policy_version 56410 (0.0008) +[2023-10-11 17:17:07,485][85176] Updated weights for policy 0, policy_version 55562 (0.0007) +[2023-10-11 17:17:07,852][85176] Updated weights for policy 0, policy_version 55572 (0.0009) +[2023-10-11 17:17:08,225][85176] Updated weights for policy 0, policy_version 55582 (0.0007) +[2023-10-11 17:17:11,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 114688000. Throughput: 0: 1662.3, 1: 1697.8. Samples: 28681834. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 17:17:11,063][84230] Avg episode reward: [(0, '33.060'), (1, '43.880')] +[2023-10-11 17:17:11,567][85175] Updated weights for policy 1, policy_version 56420 (0.0009) +[2023-10-11 17:17:11,941][85175] Updated weights for policy 1, policy_version 56430 (0.0007) +[2023-10-11 17:17:12,238][85176] Updated weights for policy 0, policy_version 55592 (0.0008) +[2023-10-11 17:17:12,305][85175] Updated weights for policy 1, policy_version 56440 (0.0008) +[2023-10-11 17:17:12,612][85176] Updated weights for policy 0, policy_version 55602 (0.0007) +[2023-10-11 17:17:12,996][85176] Updated weights for policy 0, policy_version 55612 (0.0008) +[2023-10-11 17:17:16,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 114753536. Throughput: 0: 1668.9, 1: 1700.7. Samples: 28703182. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 17:17:16,064][84230] Avg episode reward: [(0, '29.860'), (1, '39.930')] +[2023-10-11 17:17:16,292][85175] Updated weights for policy 1, policy_version 56450 (0.0008) +[2023-10-11 17:17:16,658][85175] Updated weights for policy 1, policy_version 56460 (0.0009) +[2023-10-11 17:17:17,025][85175] Updated weights for policy 1, policy_version 56470 (0.0008) +[2023-10-11 17:17:17,081][85176] Updated weights for policy 0, policy_version 55622 (0.0007) +[2023-10-11 17:17:17,391][85175] Updated weights for policy 1, policy_version 56480 (0.0008) +[2023-10-11 17:17:17,447][85176] Updated weights for policy 0, policy_version 55632 (0.0008) +[2023-10-11 17:17:17,819][85176] Updated weights for policy 0, policy_version 55642 (0.0009) +[2023-10-11 17:17:21,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 114819072. Throughput: 0: 1665.8, 1: 1694.9. Samples: 28712382. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 17:17:21,064][84230] Avg episode reward: [(0, '32.710'), (1, '43.820')] +[2023-10-11 17:17:21,403][85175] Updated weights for policy 1, policy_version 56490 (0.0008) +[2023-10-11 17:17:21,777][85175] Updated weights for policy 1, policy_version 56500 (0.0007) +[2023-10-11 17:17:21,924][85176] Updated weights for policy 0, policy_version 55652 (0.0007) +[2023-10-11 17:17:22,142][85175] Updated weights for policy 1, policy_version 56510 (0.0007) +[2023-10-11 17:17:22,292][85176] Updated weights for policy 0, policy_version 55662 (0.0009) +[2023-10-11 17:17:22,666][85176] Updated weights for policy 0, policy_version 55672 (0.0010) +[2023-10-11 17:17:26,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 114884608. Throughput: 0: 1667.2, 1: 1699.1. Samples: 28732880. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 17:17:26,064][84230] Avg episode reward: [(0, '31.770'), (1, '42.970')] +[2023-10-11 17:17:26,197][85175] Updated weights for policy 1, policy_version 56520 (0.0009) +[2023-10-11 17:17:26,565][85175] Updated weights for policy 1, policy_version 56530 (0.0009) +[2023-10-11 17:17:26,727][85176] Updated weights for policy 0, policy_version 55682 (0.0008) +[2023-10-11 17:17:26,928][85175] Updated weights for policy 1, policy_version 56540 (0.0009) +[2023-10-11 17:17:27,084][85176] Updated weights for policy 0, policy_version 55692 (0.0007) +[2023-10-11 17:17:27,457][85176] Updated weights for policy 0, policy_version 55702 (0.0009) +[2023-10-11 17:17:27,825][85176] Updated weights for policy 0, policy_version 55712 (0.0007) +[2023-10-11 17:17:30,889][85175] Updated weights for policy 1, policy_version 56550 (0.0007) +[2023-10-11 17:17:31,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 114950144. Throughput: 0: 1663.3, 1: 1705.5. Samples: 28753740. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 17:17:31,063][84230] Avg episode reward: [(0, '34.070'), (1, '42.410')] +[2023-10-11 17:17:31,263][85175] Updated weights for policy 1, policy_version 56560 (0.0008) +[2023-10-11 17:17:31,626][85175] Updated weights for policy 1, policy_version 56570 (0.0007) +[2023-10-11 17:17:31,786][85176] Updated weights for policy 0, policy_version 55722 (0.0010) +[2023-10-11 17:17:32,165][85176] Updated weights for policy 0, policy_version 55732 (0.0007) +[2023-10-11 17:17:32,537][85176] Updated weights for policy 0, policy_version 55742 (0.0008) +[2023-10-11 17:17:35,749][85175] Updated weights for policy 1, policy_version 56580 (0.0008) +[2023-10-11 17:17:36,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 115015680. Throughput: 0: 1667.6, 1: 1709.7. Samples: 28762848. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 17:17:36,064][84230] Avg episode reward: [(0, '35.240'), (1, '42.280')] +[2023-10-11 17:17:36,065][84801] Saving new best policy, reward=35.240! +[2023-10-11 17:17:36,120][85175] Updated weights for policy 1, policy_version 56590 (0.0009) +[2023-10-11 17:17:36,483][85175] Updated weights for policy 1, policy_version 56600 (0.0008) +[2023-10-11 17:17:36,730][85176] Updated weights for policy 0, policy_version 55752 (0.0008) +[2023-10-11 17:17:37,104][85176] Updated weights for policy 0, policy_version 55762 (0.0010) +[2023-10-11 17:17:37,475][85176] Updated weights for policy 0, policy_version 55772 (0.0008) +[2023-10-11 17:17:40,439][85175] Updated weights for policy 1, policy_version 56610 (0.0010) +[2023-10-11 17:17:40,804][85175] Updated weights for policy 1, policy_version 56620 (0.0007) +[2023-10-11 17:17:41,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 115081216. Throughput: 0: 1671.5, 1: 1705.8. Samples: 28783590. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 17:17:41,063][84230] Avg episode reward: [(0, '33.400'), (1, '44.180')] +[2023-10-11 17:17:41,173][85175] Updated weights for policy 1, policy_version 56630 (0.0008) +[2023-10-11 17:17:41,535][85176] Updated weights for policy 0, policy_version 55782 (0.0010) +[2023-10-11 17:17:41,538][85175] Updated weights for policy 1, policy_version 56640 (0.0008) +[2023-10-11 17:17:41,917][85176] Updated weights for policy 0, policy_version 55792 (0.0007) +[2023-10-11 17:17:42,286][85176] Updated weights for policy 0, policy_version 55802 (0.0008) +[2023-10-11 17:17:45,618][85175] Updated weights for policy 1, policy_version 56650 (0.0008) +[2023-10-11 17:17:45,995][85175] Updated weights for policy 1, policy_version 56660 (0.0011) +[2023-10-11 17:17:46,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 115146752. Throughput: 0: 1676.4, 1: 1700.3. Samples: 28804112. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 17:17:46,064][84230] Avg episode reward: [(0, '34.510'), (1, '46.450')] +[2023-10-11 17:17:46,357][85176] Updated weights for policy 0, policy_version 55812 (0.0010) +[2023-10-11 17:17:46,362][85175] Updated weights for policy 1, policy_version 56670 (0.0008) +[2023-10-11 17:17:46,730][85176] Updated weights for policy 0, policy_version 55822 (0.0009) +[2023-10-11 17:17:47,106][85176] Updated weights for policy 0, policy_version 55832 (0.0007) +[2023-10-11 17:17:50,356][85175] Updated weights for policy 1, policy_version 56680 (0.0010) +[2023-10-11 17:17:50,728][85175] Updated weights for policy 1, policy_version 56690 (0.0009) +[2023-10-11 17:17:51,029][85176] Updated weights for policy 0, policy_version 55842 (0.0009) +[2023-10-11 17:17:51,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 115212288. Throughput: 0: 1677.3, 1: 1706.6. Samples: 28813436. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:17:51,063][84230] Avg episode reward: [(0, '32.300'), (1, '41.230')] +[2023-10-11 17:17:51,096][85175] Updated weights for policy 1, policy_version 56700 (0.0007) +[2023-10-11 17:17:51,406][85176] Updated weights for policy 0, policy_version 55852 (0.0008) +[2023-10-11 17:17:51,768][85176] Updated weights for policy 0, policy_version 55862 (0.0007) +[2023-10-11 17:17:52,136][85176] Updated weights for policy 0, policy_version 55872 (0.0007) +[2023-10-11 17:17:55,067][85175] Updated weights for policy 1, policy_version 56710 (0.0007) +[2023-10-11 17:17:55,432][85175] Updated weights for policy 1, policy_version 56720 (0.0008) +[2023-10-11 17:17:55,813][85175] Updated weights for policy 1, policy_version 56730 (0.0011) +[2023-10-11 17:17:56,062][84230] Fps is (10 sec: 16384.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 115310592. Throughput: 0: 1682.5, 1: 1709.0. Samples: 28834452. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:17:56,063][84230] Avg episode reward: [(0, '34.870'), (1, '44.340')] +[2023-10-11 17:17:56,210][85176] Updated weights for policy 0, policy_version 55882 (0.0007) +[2023-10-11 17:17:56,581][85176] Updated weights for policy 0, policy_version 55892 (0.0007) +[2023-10-11 17:17:56,957][85176] Updated weights for policy 0, policy_version 55902 (0.0008) +[2023-10-11 17:17:59,767][85175] Updated weights for policy 1, policy_version 56740 (0.0009) +[2023-10-11 17:18:00,133][85175] Updated weights for policy 1, policy_version 56750 (0.0007) +[2023-10-11 17:18:00,508][85175] Updated weights for policy 1, policy_version 56760 (0.0008) +[2023-10-11 17:18:00,952][85176] Updated weights for policy 0, policy_version 55912 (0.0009) +[2023-10-11 17:18:01,062][84230] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 115376128. Throughput: 0: 1682.9, 1: 1682.1. Samples: 28854606. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:18:01,063][84230] Avg episode reward: [(0, '34.580'), (1, '43.130')] +[2023-10-11 17:18:01,327][85176] Updated weights for policy 0, policy_version 55922 (0.0010) +[2023-10-11 17:18:01,699][85176] Updated weights for policy 0, policy_version 55932 (0.0009) +[2023-10-11 17:18:04,461][85175] Updated weights for policy 1, policy_version 56770 (0.0007) +[2023-10-11 17:18:04,830][85175] Updated weights for policy 1, policy_version 56780 (0.0007) +[2023-10-11 17:18:05,202][85175] Updated weights for policy 1, policy_version 56790 (0.0011) +[2023-10-11 17:18:05,573][85175] Updated weights for policy 1, policy_version 56800 (0.0008) +[2023-10-11 17:18:06,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 115441664. Throughput: 0: 1680.3, 1: 1706.3. Samples: 28864778. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:18:06,064][84230] Avg episode reward: [(0, '35.150'), (1, '41.580')] +[2023-10-11 17:18:06,103][85176] Updated weights for policy 0, policy_version 55942 (0.0010) +[2023-10-11 17:18:06,482][85176] Updated weights for policy 0, policy_version 55952 (0.0010) +[2023-10-11 17:18:06,855][85176] Updated weights for policy 0, policy_version 55962 (0.0010) +[2023-10-11 17:18:09,525][85175] Updated weights for policy 1, policy_version 56810 (0.0008) +[2023-10-11 17:18:09,884][85175] Updated weights for policy 1, policy_version 56820 (0.0008) +[2023-10-11 17:18:10,259][85175] Updated weights for policy 1, policy_version 56830 (0.0008) +[2023-10-11 17:18:10,891][85176] Updated weights for policy 0, policy_version 55972 (0.0008) +[2023-10-11 17:18:11,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 115507200. Throughput: 0: 1679.5, 1: 1701.9. Samples: 28885040. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:18:11,063][84230] Avg episode reward: [(0, '30.770'), (1, '41.550')] +[2023-10-11 17:18:11,259][85176] Updated weights for policy 0, policy_version 55982 (0.0009) +[2023-10-11 17:18:11,626][85176] Updated weights for policy 0, policy_version 55992 (0.0011) +[2023-10-11 17:18:14,413][85175] Updated weights for policy 1, policy_version 56840 (0.0008) +[2023-10-11 17:18:14,780][85175] Updated weights for policy 1, policy_version 56850 (0.0009) +[2023-10-11 17:18:15,149][85175] Updated weights for policy 1, policy_version 56860 (0.0009) +[2023-10-11 17:18:15,752][85176] Updated weights for policy 0, policy_version 56002 (0.0010) +[2023-10-11 17:18:16,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 115572736. Throughput: 0: 1678.3, 1: 1678.8. Samples: 28904812. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:18:16,063][84230] Avg episode reward: [(0, '33.650'), (1, '43.420')] +[2023-10-11 17:18:16,072][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000056864_58228736.pth... +[2023-10-11 17:18:16,105][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000055264_56590336.pth +[2023-10-11 17:18:16,119][85176] Updated weights for policy 0, policy_version 56012 (0.0007) +[2023-10-11 17:18:16,497][85176] Updated weights for policy 0, policy_version 56022 (0.0007) +[2023-10-11 17:18:16,866][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000056032_57376768.pth... 
+[2023-10-11 17:18:16,871][85176] Updated weights for policy 0, policy_version 56032 (0.0008) +[2023-10-11 17:18:16,895][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000054464_55771136.pth +[2023-10-11 17:18:19,251][85175] Updated weights for policy 1, policy_version 56870 (0.0009) +[2023-10-11 17:18:19,620][85175] Updated weights for policy 1, policy_version 56880 (0.0009) +[2023-10-11 17:18:19,995][85175] Updated weights for policy 1, policy_version 56890 (0.0009) +[2023-10-11 17:18:20,782][85176] Updated weights for policy 0, policy_version 56042 (0.0008) +[2023-10-11 17:18:21,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 115638272. Throughput: 0: 1682.4, 1: 1706.3. Samples: 28915338. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:18:21,063][84230] Avg episode reward: [(0, '33.070'), (1, '41.910')] +[2023-10-11 17:18:21,162][85176] Updated weights for policy 0, policy_version 56052 (0.0009) +[2023-10-11 17:18:21,538][85176] Updated weights for policy 0, policy_version 56062 (0.0008) +[2023-10-11 17:18:23,956][85175] Updated weights for policy 1, policy_version 56900 (0.0009) +[2023-10-11 17:18:24,321][85175] Updated weights for policy 1, policy_version 56910 (0.0009) +[2023-10-11 17:18:24,691][85175] Updated weights for policy 1, policy_version 56920 (0.0009) +[2023-10-11 17:18:25,652][85176] Updated weights for policy 0, policy_version 56072 (0.0009) +[2023-10-11 17:18:26,023][85176] Updated weights for policy 0, policy_version 56082 (0.0010) +[2023-10-11 17:18:26,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 115703808. Throughput: 0: 1684.4, 1: 1690.2. Samples: 28935448. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:18:26,064][84230] Avg episode reward: [(0, '33.850'), (1, '44.870')] +[2023-10-11 17:18:26,399][85176] Updated weights for policy 0, policy_version 56092 (0.0007) +[2023-10-11 17:18:28,637][85175] Updated weights for policy 1, policy_version 56930 (0.0009) +[2023-10-11 17:18:29,006][85175] Updated weights for policy 1, policy_version 56940 (0.0008) +[2023-10-11 17:18:29,369][85175] Updated weights for policy 1, policy_version 56950 (0.0008) +[2023-10-11 17:18:29,728][85175] Updated weights for policy 1, policy_version 56960 (0.0007) +[2023-10-11 17:18:30,525][85176] Updated weights for policy 0, policy_version 56102 (0.0007) +[2023-10-11 17:18:30,904][85176] Updated weights for policy 0, policy_version 56112 (0.0007) +[2023-10-11 17:18:31,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 115769344. Throughput: 0: 1669.3, 1: 1692.0. Samples: 28955372. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:18:31,064][84230] Avg episode reward: [(0, '31.670'), (1, '44.220')] +[2023-10-11 17:18:31,275][85176] Updated weights for policy 0, policy_version 56122 (0.0007) +[2023-10-11 17:18:33,816][85175] Updated weights for policy 1, policy_version 56970 (0.0009) +[2023-10-11 17:18:34,182][85175] Updated weights for policy 1, policy_version 56980 (0.0010) +[2023-10-11 17:18:34,545][85175] Updated weights for policy 1, policy_version 56990 (0.0010) +[2023-10-11 17:18:35,363][85176] Updated weights for policy 0, policy_version 56132 (0.0007) +[2023-10-11 17:18:35,734][85176] Updated weights for policy 0, policy_version 56142 (0.0009) +[2023-10-11 17:18:36,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 115834880. Throughput: 0: 1676.5, 1: 1712.4. Samples: 28965938. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:18:36,063][84230] Avg episode reward: [(0, '27.510'), (1, '43.480')] +[2023-10-11 17:18:36,109][85176] Updated weights for policy 0, policy_version 56152 (0.0009) +[2023-10-11 17:18:38,466][85175] Updated weights for policy 1, policy_version 57000 (0.0007) +[2023-10-11 17:18:38,842][85175] Updated weights for policy 1, policy_version 57010 (0.0007) +[2023-10-11 17:18:39,207][85175] Updated weights for policy 1, policy_version 57020 (0.0008) +[2023-10-11 17:18:40,172][85176] Updated weights for policy 0, policy_version 56162 (0.0008) +[2023-10-11 17:18:40,545][85176] Updated weights for policy 0, policy_version 56172 (0.0009) +[2023-10-11 17:18:40,924][85176] Updated weights for policy 0, policy_version 56182 (0.0009) +[2023-10-11 17:18:41,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 115900416. Throughput: 0: 1673.6, 1: 1683.4. Samples: 28985516. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:18:41,063][84230] Avg episode reward: [(0, '30.690'), (1, '43.310')] +[2023-10-11 17:18:41,298][85176] Updated weights for policy 0, policy_version 56192 (0.0007) +[2023-10-11 17:18:43,229][85175] Updated weights for policy 1, policy_version 57030 (0.0010) +[2023-10-11 17:18:43,602][85175] Updated weights for policy 1, policy_version 57040 (0.0009) +[2023-10-11 17:18:43,971][85175] Updated weights for policy 1, policy_version 57050 (0.0008) +[2023-10-11 17:18:45,146][85176] Updated weights for policy 0, policy_version 56202 (0.0008) +[2023-10-11 17:18:45,505][85176] Updated weights for policy 0, policy_version 56212 (0.0010) +[2023-10-11 17:18:45,891][85176] Updated weights for policy 0, policy_version 56222 (0.0008) +[2023-10-11 17:18:46,063][84230] Fps is (10 sec: 16383.7, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 115998720. Throughput: 0: 1651.6, 1: 1703.5. Samples: 29005590. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:18:46,064][84230] Avg episode reward: [(0, '32.610'), (1, '42.300')] +[2023-10-11 17:18:47,974][85175] Updated weights for policy 1, policy_version 57060 (0.0008) +[2023-10-11 17:18:48,339][85175] Updated weights for policy 1, policy_version 57070 (0.0010) +[2023-10-11 17:18:48,708][85175] Updated weights for policy 1, policy_version 57080 (0.0010) +[2023-10-11 17:18:50,143][85176] Updated weights for policy 0, policy_version 56232 (0.0009) +[2023-10-11 17:18:50,518][85176] Updated weights for policy 0, policy_version 56242 (0.0010) +[2023-10-11 17:18:50,902][85176] Updated weights for policy 0, policy_version 56252 (0.0008) +[2023-10-11 17:18:51,062][84230] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 116064256. Throughput: 0: 1676.9, 1: 1691.0. Samples: 29016330. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:18:51,063][84230] Avg episode reward: [(0, '35.640'), (1, '42.700')] +[2023-10-11 17:18:51,063][84801] Saving new best policy, reward=35.640! +[2023-10-11 17:18:52,707][85175] Updated weights for policy 1, policy_version 57090 (0.0009) +[2023-10-11 17:18:53,080][85175] Updated weights for policy 1, policy_version 57100 (0.0007) +[2023-10-11 17:18:53,447][85175] Updated weights for policy 1, policy_version 57110 (0.0007) +[2023-10-11 17:18:53,805][85175] Updated weights for policy 1, policy_version 57120 (0.0008) +[2023-10-11 17:18:54,991][85176] Updated weights for policy 0, policy_version 56262 (0.0008) +[2023-10-11 17:18:55,371][85176] Updated weights for policy 0, policy_version 56272 (0.0009) +[2023-10-11 17:18:55,748][85176] Updated weights for policy 0, policy_version 56282 (0.0007) +[2023-10-11 17:18:56,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 116129792. Throughput: 0: 1679.8, 1: 1685.9. Samples: 29036496. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:18:56,064][84230] Avg episode reward: [(0, '31.720'), (1, '42.000')] +[2023-10-11 17:18:57,797][85175] Updated weights for policy 1, policy_version 57130 (0.0007) +[2023-10-11 17:18:58,171][85175] Updated weights for policy 1, policy_version 57140 (0.0008) +[2023-10-11 17:18:58,544][85175] Updated weights for policy 1, policy_version 57150 (0.0008) +[2023-10-11 17:18:59,850][85176] Updated weights for policy 0, policy_version 56292 (0.0008) +[2023-10-11 17:19:00,219][85176] Updated weights for policy 0, policy_version 56302 (0.0007) +[2023-10-11 17:19:00,592][85176] Updated weights for policy 0, policy_version 56312 (0.0007) +[2023-10-11 17:19:01,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 116195328. Throughput: 0: 1658.8, 1: 1712.3. Samples: 29056512. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:19:01,063][84230] Avg episode reward: [(0, '32.400'), (1, '42.450')] +[2023-10-11 17:19:02,605][85175] Updated weights for policy 1, policy_version 57160 (0.0009) +[2023-10-11 17:19:02,975][85175] Updated weights for policy 1, policy_version 57170 (0.0007) +[2023-10-11 17:19:03,339][85175] Updated weights for policy 1, policy_version 57180 (0.0008) +[2023-10-11 17:19:04,666][85176] Updated weights for policy 0, policy_version 56322 (0.0007) +[2023-10-11 17:19:05,036][85176] Updated weights for policy 0, policy_version 56332 (0.0011) +[2023-10-11 17:19:05,411][85176] Updated weights for policy 0, policy_version 56342 (0.0010) +[2023-10-11 17:19:05,788][85176] Updated weights for policy 0, policy_version 56352 (0.0010) +[2023-10-11 17:19:06,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 116260864. Throughput: 0: 1675.8, 1: 1688.4. Samples: 29066726. 
Policy #0 lag: (min: 31.0, avg: 32.0, max: 53.0) +[2023-10-11 17:19:06,064][84230] Avg episode reward: [(0, '32.620'), (1, '40.910')] +[2023-10-11 17:19:07,396][85175] Updated weights for policy 1, policy_version 57190 (0.0008) +[2023-10-11 17:19:07,759][85175] Updated weights for policy 1, policy_version 57200 (0.0009) +[2023-10-11 17:19:08,125][85175] Updated weights for policy 1, policy_version 57210 (0.0010) +[2023-10-11 17:19:09,855][85176] Updated weights for policy 0, policy_version 56362 (0.0008) +[2023-10-11 17:19:10,226][85176] Updated weights for policy 0, policy_version 56372 (0.0010) +[2023-10-11 17:19:10,607][85176] Updated weights for policy 0, policy_version 56382 (0.0009) +[2023-10-11 17:19:11,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 116326400. Throughput: 0: 1677.3, 1: 1703.1. Samples: 29087564. Policy #0 lag: (min: 31.0, avg: 32.0, max: 53.0) +[2023-10-11 17:19:11,064][84230] Avg episode reward: [(0, '34.790'), (1, '40.630')] +[2023-10-11 17:19:11,977][85175] Updated weights for policy 1, policy_version 57220 (0.0010) +[2023-10-11 17:19:12,357][85175] Updated weights for policy 1, policy_version 57230 (0.0011) +[2023-10-11 17:19:12,722][85175] Updated weights for policy 1, policy_version 57240 (0.0011) +[2023-10-11 17:19:14,714][85176] Updated weights for policy 0, policy_version 56392 (0.0010) +[2023-10-11 17:19:15,087][85176] Updated weights for policy 0, policy_version 56402 (0.0007) +[2023-10-11 17:19:15,461][85176] Updated weights for policy 0, policy_version 56412 (0.0009) +[2023-10-11 17:19:16,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 116391936. Throughput: 0: 1663.3, 1: 1711.9. Samples: 29107256. 
Policy #0 lag: (min: 31.0, avg: 32.0, max: 53.0) +[2023-10-11 17:19:16,064][84230] Avg episode reward: [(0, '31.900'), (1, '40.700')] +[2023-10-11 17:19:16,753][85175] Updated weights for policy 1, policy_version 57250 (0.0011) +[2023-10-11 17:19:17,127][85175] Updated weights for policy 1, policy_version 57260 (0.0008) +[2023-10-11 17:19:17,483][85175] Updated weights for policy 1, policy_version 57270 (0.0010) +[2023-10-11 17:19:17,852][85175] Updated weights for policy 1, policy_version 57280 (0.0009) +[2023-10-11 17:19:19,384][85176] Updated weights for policy 0, policy_version 56422 (0.0010) +[2023-10-11 17:19:19,752][85176] Updated weights for policy 0, policy_version 56432 (0.0009) +[2023-10-11 17:19:20,134][85176] Updated weights for policy 0, policy_version 56442 (0.0008) +[2023-10-11 17:19:21,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 116457472. Throughput: 0: 1688.9, 1: 1688.3. Samples: 29117914. Policy #0 lag: (min: 31.0, avg: 32.0, max: 53.0) +[2023-10-11 17:19:21,064][84230] Avg episode reward: [(0, '31.620'), (1, '39.920')] +[2023-10-11 17:19:21,981][85175] Updated weights for policy 1, policy_version 57290 (0.0010) +[2023-10-11 17:19:22,360][85175] Updated weights for policy 1, policy_version 57300 (0.0009) +[2023-10-11 17:19:22,729][85175] Updated weights for policy 1, policy_version 57310 (0.0008) +[2023-10-11 17:19:24,171][85176] Updated weights for policy 0, policy_version 56452 (0.0010) +[2023-10-11 17:19:24,537][85176] Updated weights for policy 0, policy_version 56462 (0.0010) +[2023-10-11 17:19:24,913][85176] Updated weights for policy 0, policy_version 56472 (0.0008) +[2023-10-11 17:19:26,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 116523008. Throughput: 0: 1679.7, 1: 1716.9. Samples: 29138364. 
Policy #0 lag: (min: 31.0, avg: 32.0, max: 53.0) +[2023-10-11 17:19:26,063][84230] Avg episode reward: [(0, '32.070'), (1, '44.310')] +[2023-10-11 17:19:26,549][85175] Updated weights for policy 1, policy_version 57320 (0.0007) +[2023-10-11 17:19:26,914][85175] Updated weights for policy 1, policy_version 57330 (0.0009) +[2023-10-11 17:19:27,270][85175] Updated weights for policy 1, policy_version 57340 (0.0008) +[2023-10-11 17:19:28,979][85176] Updated weights for policy 0, policy_version 56482 (0.0008) +[2023-10-11 17:19:29,351][85176] Updated weights for policy 0, policy_version 56492 (0.0009) +[2023-10-11 17:19:29,724][85176] Updated weights for policy 0, policy_version 56502 (0.0007) +[2023-10-11 17:19:30,098][85176] Updated weights for policy 0, policy_version 56512 (0.0009) +[2023-10-11 17:19:31,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 116588544. Throughput: 0: 1681.8, 1: 1726.8. Samples: 29158980. Policy #0 lag: (min: 31.0, avg: 32.0, max: 53.0) +[2023-10-11 17:19:31,063][84230] Avg episode reward: [(0, '33.580'), (1, '41.980')] +[2023-10-11 17:19:31,089][85175] Updated weights for policy 1, policy_version 57350 (0.0010) +[2023-10-11 17:19:31,457][85175] Updated weights for policy 1, policy_version 57360 (0.0007) +[2023-10-11 17:19:31,820][85175] Updated weights for policy 1, policy_version 57370 (0.0009) +[2023-10-11 17:19:34,149][85176] Updated weights for policy 0, policy_version 56522 (0.0009) +[2023-10-11 17:19:34,527][85176] Updated weights for policy 0, policy_version 56532 (0.0008) +[2023-10-11 17:19:34,891][85176] Updated weights for policy 0, policy_version 56542 (0.0007) +[2023-10-11 17:19:35,902][85175] Updated weights for policy 1, policy_version 57380 (0.0009) +[2023-10-11 17:19:36,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 116654080. Throughput: 0: 1688.8, 1: 1712.2. Samples: 29169378. 
Policy #0 lag: (min: 31.0, avg: 32.0, max: 53.0) +[2023-10-11 17:19:36,063][84230] Avg episode reward: [(0, '35.180'), (1, '44.390')] +[2023-10-11 17:19:36,269][85175] Updated weights for policy 1, policy_version 57390 (0.0008) +[2023-10-11 17:19:36,631][85175] Updated weights for policy 1, policy_version 57400 (0.0009) +[2023-10-11 17:19:38,936][85176] Updated weights for policy 0, policy_version 56552 (0.0009) +[2023-10-11 17:19:39,308][85176] Updated weights for policy 0, policy_version 56562 (0.0009) +[2023-10-11 17:19:39,676][85176] Updated weights for policy 0, policy_version 56572 (0.0007) +[2023-10-11 17:19:40,503][85175] Updated weights for policy 1, policy_version 57410 (0.0007) +[2023-10-11 17:19:40,886][85175] Updated weights for policy 1, policy_version 57420 (0.0008) +[2023-10-11 17:19:41,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 116719616. Throughput: 0: 1668.7, 1: 1731.6. Samples: 29189508. Policy #0 lag: (min: 31.0, avg: 32.0, max: 53.0) +[2023-10-11 17:19:41,064][84230] Avg episode reward: [(0, '35.150'), (1, '40.430')] +[2023-10-11 17:19:41,248][85175] Updated weights for policy 1, policy_version 57430 (0.0009) +[2023-10-11 17:19:41,618][85175] Updated weights for policy 1, policy_version 57440 (0.0008) +[2023-10-11 17:19:43,618][85176] Updated weights for policy 0, policy_version 56582 (0.0007) +[2023-10-11 17:19:43,989][85176] Updated weights for policy 0, policy_version 56592 (0.0010) +[2023-10-11 17:19:44,373][85176] Updated weights for policy 0, policy_version 56602 (0.0009) +[2023-10-11 17:19:45,639][85175] Updated weights for policy 1, policy_version 57450 (0.0009) +[2023-10-11 17:19:46,001][85175] Updated weights for policy 1, policy_version 57460 (0.0008) +[2023-10-11 17:19:46,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 116785152. Throughput: 0: 1685.3, 1: 1721.1. Samples: 29209800. 
Policy #0 lag: (min: 7.0, avg: 15.0, max: 39.0) +[2023-10-11 17:19:46,064][84230] Avg episode reward: [(0, '34.880'), (1, '44.050')] +[2023-10-11 17:19:46,371][85175] Updated weights for policy 1, policy_version 57470 (0.0008) +[2023-10-11 17:19:48,614][85176] Updated weights for policy 0, policy_version 56612 (0.0009) +[2023-10-11 17:19:48,992][85176] Updated weights for policy 0, policy_version 56622 (0.0007) +[2023-10-11 17:19:49,363][85176] Updated weights for policy 0, policy_version 56632 (0.0009) +[2023-10-11 17:19:50,303][85175] Updated weights for policy 1, policy_version 57480 (0.0008) +[2023-10-11 17:19:50,661][85175] Updated weights for policy 1, policy_version 57490 (0.0007) +[2023-10-11 17:19:51,024][85175] Updated weights for policy 1, policy_version 57500 (0.0007) +[2023-10-11 17:19:51,062][84230] Fps is (10 sec: 13107.7, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 116850688. Throughput: 0: 1689.5, 1: 1721.7. Samples: 29220232. Policy #0 lag: (min: 7.0, avg: 15.0, max: 39.0) +[2023-10-11 17:19:51,063][84230] Avg episode reward: [(0, '32.060'), (1, '42.340')] +[2023-10-11 17:19:53,379][85176] Updated weights for policy 0, policy_version 56642 (0.0008) +[2023-10-11 17:19:53,754][85176] Updated weights for policy 0, policy_version 56652 (0.0009) +[2023-10-11 17:19:54,123][85176] Updated weights for policy 0, policy_version 56662 (0.0009) +[2023-10-11 17:19:54,499][85176] Updated weights for policy 0, policy_version 56672 (0.0009) +[2023-10-11 17:19:55,138][85175] Updated weights for policy 1, policy_version 57510 (0.0008) +[2023-10-11 17:19:55,504][85175] Updated weights for policy 1, policy_version 57520 (0.0007) +[2023-10-11 17:19:55,870][85175] Updated weights for policy 1, policy_version 57530 (0.0010) +[2023-10-11 17:19:56,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 116916224. Throughput: 0: 1664.1, 1: 1726.1. Samples: 29240120. 
Policy #0 lag: (min: 7.0, avg: 15.0, max: 39.0) +[2023-10-11 17:19:56,063][84230] Avg episode reward: [(0, '32.380'), (1, '45.210')] +[2023-10-11 17:19:58,353][85176] Updated weights for policy 0, policy_version 56682 (0.0008) +[2023-10-11 17:19:58,735][85176] Updated weights for policy 0, policy_version 56692 (0.0008) +[2023-10-11 17:19:59,099][85176] Updated weights for policy 0, policy_version 56702 (0.0008) +[2023-10-11 17:19:59,771][85175] Updated weights for policy 1, policy_version 57540 (0.0009) +[2023-10-11 17:20:00,131][85175] Updated weights for policy 1, policy_version 57550 (0.0009) +[2023-10-11 17:20:00,490][85175] Updated weights for policy 1, policy_version 57560 (0.0010) +[2023-10-11 17:20:01,062][84230] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 117014528. Throughput: 0: 1694.3, 1: 1702.5. Samples: 29260110. Policy #0 lag: (min: 7.0, avg: 15.0, max: 39.0) +[2023-10-11 17:20:01,063][84230] Avg episode reward: [(0, '32.910'), (1, '42.650')] +[2023-10-11 17:20:03,238][85176] Updated weights for policy 0, policy_version 56712 (0.0008) +[2023-10-11 17:20:03,619][85176] Updated weights for policy 0, policy_version 56722 (0.0008) +[2023-10-11 17:20:03,979][85176] Updated weights for policy 0, policy_version 56732 (0.0008) +[2023-10-11 17:20:04,548][85175] Updated weights for policy 1, policy_version 57570 (0.0009) +[2023-10-11 17:20:04,914][85175] Updated weights for policy 1, policy_version 57580 (0.0007) +[2023-10-11 17:20:05,287][85175] Updated weights for policy 1, policy_version 57590 (0.0007) +[2023-10-11 17:20:05,654][85175] Updated weights for policy 1, policy_version 57600 (0.0009) +[2023-10-11 17:20:06,063][84230] Fps is (10 sec: 16383.7, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 117080064. Throughput: 0: 1671.2, 1: 1728.0. Samples: 29270878. 
Policy #0 lag: (min: 7.0, avg: 15.0, max: 39.0) +[2023-10-11 17:20:06,064][84230] Avg episode reward: [(0, '35.380'), (1, '44.830')] +[2023-10-11 17:20:07,885][85176] Updated weights for policy 0, policy_version 56742 (0.0008) +[2023-10-11 17:20:08,252][85176] Updated weights for policy 0, policy_version 56752 (0.0007) +[2023-10-11 17:20:08,624][85176] Updated weights for policy 0, policy_version 56762 (0.0007) +[2023-10-11 17:20:09,890][85175] Updated weights for policy 1, policy_version 57610 (0.0007) +[2023-10-11 17:20:10,256][85175] Updated weights for policy 1, policy_version 57620 (0.0010) +[2023-10-11 17:20:10,638][85175] Updated weights for policy 1, policy_version 57630 (0.0009) +[2023-10-11 17:20:11,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 117145600. Throughput: 0: 1672.5, 1: 1720.1. Samples: 29291032. Policy #0 lag: (min: 7.0, avg: 15.0, max: 39.0) +[2023-10-11 17:20:11,063][84230] Avg episode reward: [(0, '35.740'), (1, '40.130')] +[2023-10-11 17:20:11,064][84801] Saving new best policy, reward=35.740! +[2023-10-11 17:20:12,585][85176] Updated weights for policy 0, policy_version 56772 (0.0008) +[2023-10-11 17:20:12,965][85176] Updated weights for policy 0, policy_version 56782 (0.0008) +[2023-10-11 17:20:13,330][85176] Updated weights for policy 0, policy_version 56792 (0.0008) +[2023-10-11 17:20:14,596][85175] Updated weights for policy 1, policy_version 57640 (0.0011) +[2023-10-11 17:20:14,962][85175] Updated weights for policy 1, policy_version 57650 (0.0009) +[2023-10-11 17:20:15,332][85175] Updated weights for policy 1, policy_version 57660 (0.0007) +[2023-10-11 17:20:16,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 117211136. Throughput: 0: 1693.8, 1: 1680.2. Samples: 29310808. 
Policy #0 lag: (min: 7.0, avg: 15.0, max: 39.0) +[2023-10-11 17:20:16,063][84230] Avg episode reward: [(0, '35.640'), (1, '42.940')] +[2023-10-11 17:20:16,073][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000057664_59047936.pth... +[2023-10-11 17:20:16,073][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000056800_58163200.pth... +[2023-10-11 17:20:16,103][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000056064_57409536.pth +[2023-10-11 17:20:16,110][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000055232_56557568.pth +[2023-10-11 17:20:17,455][85176] Updated weights for policy 0, policy_version 56802 (0.0008) +[2023-10-11 17:20:17,828][85176] Updated weights for policy 0, policy_version 56812 (0.0009) +[2023-10-11 17:20:18,206][85176] Updated weights for policy 0, policy_version 56822 (0.0010) +[2023-10-11 17:20:18,577][85176] Updated weights for policy 0, policy_version 56832 (0.0009) +[2023-10-11 17:20:19,476][85175] Updated weights for policy 1, policy_version 57670 (0.0010) +[2023-10-11 17:20:19,842][85175] Updated weights for policy 1, policy_version 57680 (0.0010) +[2023-10-11 17:20:20,210][85175] Updated weights for policy 1, policy_version 57690 (0.0009) +[2023-10-11 17:20:21,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 117276672. Throughput: 0: 1665.1, 1: 1708.8. Samples: 29321202. Policy #0 lag: (min: 7.0, avg: 15.0, max: 39.0) +[2023-10-11 17:20:21,064][84230] Avg episode reward: [(0, '37.540'), (1, '39.740')] +[2023-10-11 17:20:21,065][84801] Saving new best policy, reward=37.540! 
+[2023-10-11 17:20:22,749][85176] Updated weights for policy 0, policy_version 56842 (0.0007) +[2023-10-11 17:20:23,125][85176] Updated weights for policy 0, policy_version 56852 (0.0009) +[2023-10-11 17:20:23,511][85176] Updated weights for policy 0, policy_version 56862 (0.0010) +[2023-10-11 17:20:24,289][85175] Updated weights for policy 1, policy_version 57700 (0.0009) +[2023-10-11 17:20:24,654][85175] Updated weights for policy 1, policy_version 57710 (0.0010) +[2023-10-11 17:20:25,027][85175] Updated weights for policy 1, policy_version 57720 (0.0009) +[2023-10-11 17:20:26,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 117342208. Throughput: 0: 1682.6, 1: 1687.1. Samples: 29341144. Policy #0 lag: (min: 15.0, avg: 16.1, max: 38.0) +[2023-10-11 17:20:26,063][84230] Avg episode reward: [(0, '35.850'), (1, '44.850')] +[2023-10-11 17:20:27,457][85176] Updated weights for policy 0, policy_version 56872 (0.0010) +[2023-10-11 17:20:27,828][85176] Updated weights for policy 0, policy_version 56882 (0.0009) +[2023-10-11 17:20:28,201][85176] Updated weights for policy 0, policy_version 56892 (0.0008) +[2023-10-11 17:20:28,972][85175] Updated weights for policy 1, policy_version 57730 (0.0008) +[2023-10-11 17:20:29,338][85175] Updated weights for policy 1, policy_version 57740 (0.0010) +[2023-10-11 17:20:29,703][85175] Updated weights for policy 1, policy_version 57750 (0.0008) +[2023-10-11 17:20:30,081][85175] Updated weights for policy 1, policy_version 57760 (0.0008) +[2023-10-11 17:20:31,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 117407744. Throughput: 0: 1687.7, 1: 1673.3. Samples: 29361048. 
Policy #0 lag: (min: 15.0, avg: 16.1, max: 38.0) +[2023-10-11 17:20:31,063][84230] Avg episode reward: [(0, '35.010'), (1, '40.190')] +[2023-10-11 17:20:32,237][85176] Updated weights for policy 0, policy_version 56902 (0.0010) +[2023-10-11 17:20:32,616][85176] Updated weights for policy 0, policy_version 56912 (0.0009) +[2023-10-11 17:20:32,983][85176] Updated weights for policy 0, policy_version 56922 (0.0010) +[2023-10-11 17:20:34,004][85175] Updated weights for policy 1, policy_version 57770 (0.0007) +[2023-10-11 17:20:34,373][85175] Updated weights for policy 1, policy_version 57780 (0.0008) +[2023-10-11 17:20:34,745][85175] Updated weights for policy 1, policy_version 57790 (0.0007) +[2023-10-11 17:20:36,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 117473280. Throughput: 0: 1664.8, 1: 1700.2. Samples: 29371658. Policy #0 lag: (min: 15.0, avg: 16.1, max: 38.0) +[2023-10-11 17:20:36,064][84230] Avg episode reward: [(0, '31.960'), (1, '42.720')] +[2023-10-11 17:20:37,130][85176] Updated weights for policy 0, policy_version 56932 (0.0009) +[2023-10-11 17:20:37,510][85176] Updated weights for policy 0, policy_version 56942 (0.0009) +[2023-10-11 17:20:37,880][85176] Updated weights for policy 0, policy_version 56952 (0.0008) +[2023-10-11 17:20:38,796][85175] Updated weights for policy 1, policy_version 57800 (0.0007) +[2023-10-11 17:20:39,165][85175] Updated weights for policy 1, policy_version 57810 (0.0007) +[2023-10-11 17:20:39,525][85175] Updated weights for policy 1, policy_version 57820 (0.0008) +[2023-10-11 17:20:41,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 117538816. Throughput: 0: 1690.2, 1: 1676.7. Samples: 29391632. 
Policy #0 lag: (min: 15.0, avg: 16.1, max: 38.0) +[2023-10-11 17:20:41,064][84230] Avg episode reward: [(0, '34.380'), (1, '36.690')] +[2023-10-11 17:20:42,019][85176] Updated weights for policy 0, policy_version 56962 (0.0010) +[2023-10-11 17:20:42,387][85176] Updated weights for policy 0, policy_version 56972 (0.0009) +[2023-10-11 17:20:42,770][85176] Updated weights for policy 0, policy_version 56982 (0.0007) +[2023-10-11 17:20:43,137][85176] Updated weights for policy 0, policy_version 56992 (0.0007) +[2023-10-11 17:20:43,574][85175] Updated weights for policy 1, policy_version 57830 (0.0008) +[2023-10-11 17:20:43,942][85175] Updated weights for policy 1, policy_version 57840 (0.0008) +[2023-10-11 17:20:44,306][85175] Updated weights for policy 1, policy_version 57850 (0.0007) +[2023-10-11 17:20:46,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 117604352. Throughput: 0: 1684.1, 1: 1695.4. Samples: 29412186. Policy #0 lag: (min: 15.0, avg: 16.1, max: 38.0) +[2023-10-11 17:20:46,064][84230] Avg episode reward: [(0, '34.580'), (1, '41.110')] +[2023-10-11 17:20:47,346][85176] Updated weights for policy 0, policy_version 57002 (0.0008) +[2023-10-11 17:20:47,725][85176] Updated weights for policy 0, policy_version 57012 (0.0007) +[2023-10-11 17:20:48,091][85176] Updated weights for policy 0, policy_version 57022 (0.0009) +[2023-10-11 17:20:48,269][85175] Updated weights for policy 1, policy_version 57860 (0.0007) +[2023-10-11 17:20:48,627][85175] Updated weights for policy 1, policy_version 57870 (0.0008) +[2023-10-11 17:20:49,001][85175] Updated weights for policy 1, policy_version 57880 (0.0010) +[2023-10-11 17:20:51,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 117669888. Throughput: 0: 1673.6, 1: 1689.6. Samples: 29422222. 
Policy #0 lag: (min: 15.0, avg: 16.1, max: 38.0) +[2023-10-11 17:20:51,064][84230] Avg episode reward: [(0, '36.850'), (1, '36.310')] +[2023-10-11 17:20:52,176][85176] Updated weights for policy 0, policy_version 57032 (0.0008) +[2023-10-11 17:20:52,547][85176] Updated weights for policy 0, policy_version 57042 (0.0007) +[2023-10-11 17:20:52,793][85175] Updated weights for policy 1, policy_version 57890 (0.0007) +[2023-10-11 17:20:52,912][85176] Updated weights for policy 0, policy_version 57052 (0.0007) +[2023-10-11 17:20:53,164][85175] Updated weights for policy 1, policy_version 57900 (0.0007) +[2023-10-11 17:20:53,544][85175] Updated weights for policy 1, policy_version 57910 (0.0010) +[2023-10-11 17:20:53,909][85175] Updated weights for policy 1, policy_version 57920 (0.0008) +[2023-10-11 17:20:56,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 117735424. Throughput: 0: 1679.9, 1: 1679.2. Samples: 29442196. Policy #0 lag: (min: 15.0, avg: 16.1, max: 38.0) +[2023-10-11 17:20:56,064][84230] Avg episode reward: [(0, '34.040'), (1, '42.290')] +[2023-10-11 17:20:57,127][85176] Updated weights for policy 0, policy_version 57062 (0.0008) +[2023-10-11 17:20:57,512][85176] Updated weights for policy 0, policy_version 57072 (0.0008) +[2023-10-11 17:20:57,874][85176] Updated weights for policy 0, policy_version 57082 (0.0007) +[2023-10-11 17:20:58,170][85175] Updated weights for policy 1, policy_version 57930 (0.0009) +[2023-10-11 17:20:58,536][85175] Updated weights for policy 1, policy_version 57940 (0.0009) +[2023-10-11 17:20:58,900][85175] Updated weights for policy 1, policy_version 57950 (0.0009) +[2023-10-11 17:21:01,062][84230] Fps is (10 sec: 13107.7, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 117800960. Throughput: 0: 1670.1, 1: 1701.8. Samples: 29462544. 
Policy #0 lag: (min: 15.0, avg: 16.1, max: 38.0) +[2023-10-11 17:21:01,063][84230] Avg episode reward: [(0, '35.390'), (1, '40.690')] +[2023-10-11 17:21:01,917][85176] Updated weights for policy 0, policy_version 57092 (0.0007) +[2023-10-11 17:21:02,281][85176] Updated weights for policy 0, policy_version 57102 (0.0007) +[2023-10-11 17:21:02,659][85176] Updated weights for policy 0, policy_version 57112 (0.0010) +[2023-10-11 17:21:02,797][85175] Updated weights for policy 1, policy_version 57960 (0.0008) +[2023-10-11 17:21:03,163][85175] Updated weights for policy 1, policy_version 57970 (0.0007) +[2023-10-11 17:21:03,529][85175] Updated weights for policy 1, policy_version 57980 (0.0009) +[2023-10-11 17:21:06,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 117866496. Throughput: 0: 1669.3, 1: 1680.4. Samples: 29471938. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:21:06,063][84230] Avg episode reward: [(0, '35.050'), (1, '45.400')] +[2023-10-11 17:21:06,482][85176] Updated weights for policy 0, policy_version 57122 (0.0007) +[2023-10-11 17:21:06,857][85176] Updated weights for policy 0, policy_version 57132 (0.0010) +[2023-10-11 17:21:07,223][85176] Updated weights for policy 0, policy_version 57142 (0.0007) +[2023-10-11 17:21:07,573][85175] Updated weights for policy 1, policy_version 57990 (0.0008) +[2023-10-11 17:21:07,595][85176] Updated weights for policy 0, policy_version 57152 (0.0008) +[2023-10-11 17:21:07,941][85175] Updated weights for policy 1, policy_version 58000 (0.0007) +[2023-10-11 17:21:08,309][85175] Updated weights for policy 1, policy_version 58010 (0.0007) +[2023-10-11 17:21:11,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 117932032. Throughput: 0: 1675.9, 1: 1687.9. Samples: 29492516. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:21:11,063][84230] Avg episode reward: [(0, '36.030'), (1, '39.710')] +[2023-10-11 17:21:11,689][85176] Updated weights for policy 0, policy_version 57162 (0.0008) +[2023-10-11 17:21:12,061][85176] Updated weights for policy 0, policy_version 57172 (0.0009) +[2023-10-11 17:21:12,350][85175] Updated weights for policy 1, policy_version 58020 (0.0007) +[2023-10-11 17:21:12,440][85176] Updated weights for policy 0, policy_version 57182 (0.0008) +[2023-10-11 17:21:12,714][85175] Updated weights for policy 1, policy_version 58030 (0.0010) +[2023-10-11 17:21:13,079][85175] Updated weights for policy 1, policy_version 58040 (0.0011) +[2023-10-11 17:21:16,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 117997568. Throughput: 0: 1677.1, 1: 1706.1. Samples: 29513294. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:21:16,064][84230] Avg episode reward: [(0, '35.730'), (1, '44.810')] +[2023-10-11 17:21:16,413][85176] Updated weights for policy 0, policy_version 57192 (0.0010) +[2023-10-11 17:21:16,788][85176] Updated weights for policy 0, policy_version 57202 (0.0011) +[2023-10-11 17:21:16,950][85175] Updated weights for policy 1, policy_version 58050 (0.0010) +[2023-10-11 17:21:17,155][85176] Updated weights for policy 0, policy_version 57212 (0.0009) +[2023-10-11 17:21:17,319][85175] Updated weights for policy 1, policy_version 58060 (0.0007) +[2023-10-11 17:21:17,694][85175] Updated weights for policy 1, policy_version 58070 (0.0008) +[2023-10-11 17:21:18,063][85175] Updated weights for policy 1, policy_version 58080 (0.0008) +[2023-10-11 17:21:21,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 118063104. Throughput: 0: 1675.0, 1: 1676.2. Samples: 29522460. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:21:21,063][84230] Avg episode reward: [(0, '36.930'), (1, '38.240')] +[2023-10-11 17:21:21,371][85176] Updated weights for policy 0, policy_version 57222 (0.0010) +[2023-10-11 17:21:21,745][85176] Updated weights for policy 0, policy_version 57232 (0.0009) +[2023-10-11 17:21:22,040][85175] Updated weights for policy 1, policy_version 58090 (0.0008) +[2023-10-11 17:21:22,109][85176] Updated weights for policy 0, policy_version 57242 (0.0007) +[2023-10-11 17:21:22,410][85175] Updated weights for policy 1, policy_version 58100 (0.0009) +[2023-10-11 17:21:22,777][85175] Updated weights for policy 1, policy_version 58110 (0.0010) +[2023-10-11 17:21:26,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 118128640. Throughput: 0: 1670.5, 1: 1695.1. Samples: 29543084. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:21:26,064][84230] Avg episode reward: [(0, '34.830'), (1, '45.080')] +[2023-10-11 17:21:26,341][85176] Updated weights for policy 0, policy_version 57252 (0.0009) +[2023-10-11 17:21:26,714][85176] Updated weights for policy 0, policy_version 57262 (0.0009) +[2023-10-11 17:21:27,003][85175] Updated weights for policy 1, policy_version 58120 (0.0008) +[2023-10-11 17:21:27,079][85176] Updated weights for policy 0, policy_version 57272 (0.0007) +[2023-10-11 17:21:27,375][85175] Updated weights for policy 1, policy_version 58130 (0.0008) +[2023-10-11 17:21:27,734][85175] Updated weights for policy 1, policy_version 58140 (0.0011) +[2023-10-11 17:21:31,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 118194176. Throughput: 0: 1671.5, 1: 1694.2. Samples: 29563642. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:21:31,064][84230] Avg episode reward: [(0, '36.890'), (1, '40.390')] +[2023-10-11 17:21:31,232][85176] Updated weights for policy 0, policy_version 57282 (0.0009) +[2023-10-11 17:21:31,599][85176] Updated weights for policy 0, policy_version 57292 (0.0010) +[2023-10-11 17:21:31,797][85175] Updated weights for policy 1, policy_version 58150 (0.0009) +[2023-10-11 17:21:31,973][85176] Updated weights for policy 0, policy_version 57302 (0.0009) +[2023-10-11 17:21:32,155][85175] Updated weights for policy 1, policy_version 58160 (0.0008) +[2023-10-11 17:21:32,333][85176] Updated weights for policy 0, policy_version 57312 (0.0009) +[2023-10-11 17:21:32,521][85175] Updated weights for policy 1, policy_version 58170 (0.0010) +[2023-10-11 17:21:36,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 118259712. Throughput: 0: 1669.1, 1: 1673.3. Samples: 29572628. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:21:36,063][84230] Avg episode reward: [(0, '34.990'), (1, '47.040')] +[2023-10-11 17:21:36,562][85175] Updated weights for policy 1, policy_version 58180 (0.0007) +[2023-10-11 17:21:36,576][85176] Updated weights for policy 0, policy_version 57322 (0.0008) +[2023-10-11 17:21:36,931][85175] Updated weights for policy 1, policy_version 58190 (0.0008) +[2023-10-11 17:21:36,942][85176] Updated weights for policy 0, policy_version 57332 (0.0009) +[2023-10-11 17:21:37,298][85175] Updated weights for policy 1, policy_version 58200 (0.0008) +[2023-10-11 17:21:37,307][85176] Updated weights for policy 0, policy_version 57342 (0.0008) +[2023-10-11 17:21:41,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 118325248. Throughput: 0: 1669.5, 1: 1688.4. Samples: 29593306. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:21:41,064][84230] Avg episode reward: [(0, '36.170'), (1, '38.970')] +[2023-10-11 17:21:41,349][85176] Updated weights for policy 0, policy_version 57352 (0.0007) +[2023-10-11 17:21:41,357][85175] Updated weights for policy 1, policy_version 58210 (0.0008) +[2023-10-11 17:21:41,722][85175] Updated weights for policy 1, policy_version 58220 (0.0009) +[2023-10-11 17:21:41,727][85176] Updated weights for policy 0, policy_version 57362 (0.0008) +[2023-10-11 17:21:42,092][85175] Updated weights for policy 1, policy_version 58230 (0.0008) +[2023-10-11 17:21:42,100][85176] Updated weights for policy 0, policy_version 57372 (0.0008) +[2023-10-11 17:21:42,461][85175] Updated weights for policy 1, policy_version 58240 (0.0008) +[2023-10-11 17:21:46,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 118390784. Throughput: 0: 1669.1, 1: 1699.4. Samples: 29614128. Policy #0 lag: (min: 31.0, avg: 34.5, max: 63.0) +[2023-10-11 17:21:46,063][84230] Avg episode reward: [(0, '32.800'), (1, '46.010')] +[2023-10-11 17:21:46,110][85176] Updated weights for policy 0, policy_version 57382 (0.0008) +[2023-10-11 17:21:46,498][85176] Updated weights for policy 0, policy_version 57392 (0.0008) +[2023-10-11 17:21:46,519][85175] Updated weights for policy 1, policy_version 58250 (0.0008) +[2023-10-11 17:21:46,875][85176] Updated weights for policy 0, policy_version 57402 (0.0008) +[2023-10-11 17:21:46,883][85175] Updated weights for policy 1, policy_version 58260 (0.0008) +[2023-10-11 17:21:47,248][85175] Updated weights for policy 1, policy_version 58270 (0.0008) +[2023-10-11 17:21:50,982][85176] Updated weights for policy 0, policy_version 57412 (0.0009) +[2023-10-11 17:21:51,062][84230] Fps is (10 sec: 13107.7, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 118456320. Throughput: 0: 1666.4, 1: 1691.2. Samples: 29623026. 
Policy #0 lag: (min: 31.0, avg: 34.5, max: 63.0) +[2023-10-11 17:21:51,063][84230] Avg episode reward: [(0, '35.090'), (1, '38.750')] +[2023-10-11 17:21:51,315][85175] Updated weights for policy 1, policy_version 58280 (0.0009) +[2023-10-11 17:21:51,358][85176] Updated weights for policy 0, policy_version 57422 (0.0011) +[2023-10-11 17:21:51,677][85175] Updated weights for policy 1, policy_version 58290 (0.0008) +[2023-10-11 17:21:51,734][85176] Updated weights for policy 0, policy_version 57432 (0.0009) +[2023-10-11 17:21:52,040][85175] Updated weights for policy 1, policy_version 58300 (0.0007) +[2023-10-11 17:21:55,783][85176] Updated weights for policy 0, policy_version 57442 (0.0007) +[2023-10-11 17:21:56,022][85175] Updated weights for policy 1, policy_version 58310 (0.0007) +[2023-10-11 17:21:56,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 118521856. Throughput: 0: 1663.2, 1: 1694.7. Samples: 29643624. Policy #0 lag: (min: 31.0, avg: 34.5, max: 63.0) +[2023-10-11 17:21:56,064][84230] Avg episode reward: [(0, '33.210'), (1, '48.970')] +[2023-10-11 17:21:56,149][85176] Updated weights for policy 0, policy_version 57452 (0.0008) +[2023-10-11 17:21:56,393][85175] Updated weights for policy 1, policy_version 58320 (0.0007) +[2023-10-11 17:21:56,529][85176] Updated weights for policy 0, policy_version 57462 (0.0007) +[2023-10-11 17:21:56,759][85175] Updated weights for policy 1, policy_version 58330 (0.0007) +[2023-10-11 17:21:56,904][85176] Updated weights for policy 0, policy_version 57472 (0.0009) +[2023-10-11 17:21:56,973][85000] Saving new best policy, reward=48.970! +[2023-10-11 17:22:00,859][85175] Updated weights for policy 1, policy_version 58340 (0.0008) +[2023-10-11 17:22:00,929][85176] Updated weights for policy 0, policy_version 57482 (0.0010) +[2023-10-11 17:22:01,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 118587392. Throughput: 0: 1665.8, 1: 1697.1. 
Samples: 29664624. Policy #0 lag: (min: 31.0, avg: 34.5, max: 63.0) +[2023-10-11 17:22:01,063][84230] Avg episode reward: [(0, '34.420'), (1, '39.730')] +[2023-10-11 17:22:01,216][85175] Updated weights for policy 1, policy_version 58350 (0.0007) +[2023-10-11 17:22:01,313][85176] Updated weights for policy 0, policy_version 57492 (0.0008) +[2023-10-11 17:22:01,588][85175] Updated weights for policy 1, policy_version 58360 (0.0008) +[2023-10-11 17:22:01,683][85176] Updated weights for policy 0, policy_version 57502 (0.0007) +[2023-10-11 17:22:05,554][85175] Updated weights for policy 1, policy_version 58370 (0.0008) +[2023-10-11 17:22:05,905][85176] Updated weights for policy 0, policy_version 57512 (0.0008) +[2023-10-11 17:22:05,925][85175] Updated weights for policy 1, policy_version 58380 (0.0008) +[2023-10-11 17:22:06,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 118652928. Throughput: 0: 1665.5, 1: 1694.4. Samples: 29673656. Policy #0 lag: (min: 31.0, avg: 34.5, max: 63.0) +[2023-10-11 17:22:06,063][84230] Avg episode reward: [(0, '32.480'), (1, '46.680')] +[2023-10-11 17:22:06,285][85176] Updated weights for policy 0, policy_version 57522 (0.0008) +[2023-10-11 17:22:06,291][85175] Updated weights for policy 1, policy_version 58390 (0.0007) +[2023-10-11 17:22:06,649][85176] Updated weights for policy 0, policy_version 57532 (0.0007) +[2023-10-11 17:22:06,660][85175] Updated weights for policy 1, policy_version 58400 (0.0007) +[2023-10-11 17:22:10,842][85176] Updated weights for policy 0, policy_version 57542 (0.0008) +[2023-10-11 17:22:10,862][85175] Updated weights for policy 1, policy_version 58410 (0.0007) +[2023-10-11 17:22:11,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 118718464. Throughput: 0: 1667.8, 1: 1691.6. Samples: 29694254. 
Policy #0 lag: (min: 31.0, avg: 34.5, max: 63.0) +[2023-10-11 17:22:11,063][84230] Avg episode reward: [(0, '32.030'), (1, '36.950')] +[2023-10-11 17:22:11,214][85176] Updated weights for policy 0, policy_version 57552 (0.0008) +[2023-10-11 17:22:11,225][85175] Updated weights for policy 1, policy_version 58420 (0.0009) +[2023-10-11 17:22:11,572][85176] Updated weights for policy 0, policy_version 57562 (0.0009) +[2023-10-11 17:22:11,588][85175] Updated weights for policy 1, policy_version 58430 (0.0008) +[2023-10-11 17:22:15,411][85175] Updated weights for policy 1, policy_version 58440 (0.0008) +[2023-10-11 17:22:15,412][85176] Updated weights for policy 0, policy_version 57572 (0.0008) +[2023-10-11 17:22:15,778][85176] Updated weights for policy 0, policy_version 57582 (0.0007) +[2023-10-11 17:22:15,780][85175] Updated weights for policy 1, policy_version 58450 (0.0007) +[2023-10-11 17:22:16,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 118784000. Throughput: 0: 1663.7, 1: 1691.5. Samples: 29714624. Policy #0 lag: (min: 31.0, avg: 34.5, max: 63.0) +[2023-10-11 17:22:16,063][84230] Avg episode reward: [(0, '34.710'), (1, '45.380')] +[2023-10-11 17:22:16,147][85176] Updated weights for policy 0, policy_version 57592 (0.0009) +[2023-10-11 17:22:16,150][85175] Updated weights for policy 1, policy_version 58460 (0.0008) +[2023-10-11 17:22:16,288][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000058464_59867136.pth... +[2023-10-11 17:22:16,326][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000056864_58228736.pth +[2023-10-11 17:22:16,445][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000057600_58982400.pth... 
+[2023-10-11 17:22:16,483][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000056032_57376768.pth +[2023-10-11 17:22:20,192][85175] Updated weights for policy 1, policy_version 58470 (0.0009) +[2023-10-11 17:22:20,351][85176] Updated weights for policy 0, policy_version 57602 (0.0007) +[2023-10-11 17:22:20,559][85175] Updated weights for policy 1, policy_version 58480 (0.0009) +[2023-10-11 17:22:20,721][85176] Updated weights for policy 0, policy_version 57612 (0.0008) +[2023-10-11 17:22:20,921][85175] Updated weights for policy 1, policy_version 58490 (0.0007) +[2023-10-11 17:22:21,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 118849536. Throughput: 0: 1675.3, 1: 1701.0. Samples: 29724562. Policy #0 lag: (min: 31.0, avg: 34.5, max: 63.0) +[2023-10-11 17:22:21,063][84230] Avg episode reward: [(0, '35.520'), (1, '35.640')] +[2023-10-11 17:22:21,093][85176] Updated weights for policy 0, policy_version 57622 (0.0010) +[2023-10-11 17:22:21,470][85176] Updated weights for policy 0, policy_version 57632 (0.0009) +[2023-10-11 17:22:25,203][85175] Updated weights for policy 1, policy_version 58500 (0.0007) +[2023-10-11 17:22:25,421][85176] Updated weights for policy 0, policy_version 57642 (0.0009) +[2023-10-11 17:22:25,558][85175] Updated weights for policy 1, policy_version 58510 (0.0008) +[2023-10-11 17:22:25,788][85176] Updated weights for policy 0, policy_version 57652 (0.0008) +[2023-10-11 17:22:25,927][85175] Updated weights for policy 1, policy_version 58520 (0.0008) +[2023-10-11 17:22:26,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 118915072. Throughput: 0: 1677.7, 1: 1700.2. Samples: 29745312. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-11 17:22:26,063][84230] Avg episode reward: [(0, '37.730'), (1, '45.400')] +[2023-10-11 17:22:26,160][85176] Updated weights for policy 0, policy_version 57662 (0.0009) +[2023-10-11 17:22:26,227][84801] Saving new best policy, reward=37.730! +[2023-10-11 17:22:30,035][85175] Updated weights for policy 1, policy_version 58530 (0.0009) +[2023-10-11 17:22:30,198][85176] Updated weights for policy 0, policy_version 57672 (0.0008) +[2023-10-11 17:22:30,406][85175] Updated weights for policy 1, policy_version 58540 (0.0008) +[2023-10-11 17:22:30,569][85176] Updated weights for policy 0, policy_version 57682 (0.0009) +[2023-10-11 17:22:30,767][85175] Updated weights for policy 1, policy_version 58550 (0.0008) +[2023-10-11 17:22:30,949][85176] Updated weights for policy 0, policy_version 57692 (0.0009) +[2023-10-11 17:22:31,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 118980608. Throughput: 0: 1665.0, 1: 1679.9. Samples: 29764648. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-11 17:22:31,063][84230] Avg episode reward: [(0, '32.330'), (1, '39.860')] +[2023-10-11 17:22:31,131][85175] Updated weights for policy 1, policy_version 58560 (0.0008) +[2023-10-11 17:22:35,025][85176] Updated weights for policy 0, policy_version 57702 (0.0009) +[2023-10-11 17:22:35,202][85175] Updated weights for policy 1, policy_version 58570 (0.0009) +[2023-10-11 17:22:35,417][85176] Updated weights for policy 0, policy_version 57712 (0.0008) +[2023-10-11 17:22:35,576][85175] Updated weights for policy 1, policy_version 58580 (0.0007) +[2023-10-11 17:22:35,782][85176] Updated weights for policy 0, policy_version 57722 (0.0007) +[2023-10-11 17:22:35,942][85175] Updated weights for policy 1, policy_version 58590 (0.0007) +[2023-10-11 17:22:36,062][84230] Fps is (10 sec: 19660.9, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 119111680. Throughput: 0: 1687.0, 1: 1694.8. 
Samples: 29775210. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-11 17:22:36,063][84230] Avg episode reward: [(0, '35.960'), (1, '48.170')] +[2023-10-11 17:22:39,804][85176] Updated weights for policy 0, policy_version 57732 (0.0007) +[2023-10-11 17:22:39,908][85175] Updated weights for policy 1, policy_version 58600 (0.0008) +[2023-10-11 17:22:40,182][85176] Updated weights for policy 0, policy_version 57742 (0.0007) +[2023-10-11 17:22:40,277][85175] Updated weights for policy 1, policy_version 58610 (0.0008) +[2023-10-11 17:22:40,546][85176] Updated weights for policy 0, policy_version 57752 (0.0009) +[2023-10-11 17:22:40,636][85175] Updated weights for policy 1, policy_version 58620 (0.0007) +[2023-10-11 17:22:41,063][84230] Fps is (10 sec: 19660.6, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 119177216. Throughput: 0: 1680.8, 1: 1696.3. Samples: 29795592. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-11 17:22:41,063][84230] Avg episode reward: [(0, '31.590'), (1, '44.630')] +[2023-10-11 17:22:44,522][85176] Updated weights for policy 0, policy_version 57762 (0.0009) +[2023-10-11 17:22:44,596][85175] Updated weights for policy 1, policy_version 58630 (0.0009) +[2023-10-11 17:22:44,892][85176] Updated weights for policy 0, policy_version 57772 (0.0008) +[2023-10-11 17:22:44,968][85175] Updated weights for policy 1, policy_version 58640 (0.0007) +[2023-10-11 17:22:45,260][85176] Updated weights for policy 0, policy_version 57782 (0.0007) +[2023-10-11 17:22:45,337][85175] Updated weights for policy 1, policy_version 58650 (0.0007) +[2023-10-11 17:22:45,632][85176] Updated weights for policy 0, policy_version 57792 (0.0007) +[2023-10-11 17:22:46,062][84230] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 119242752. Throughput: 0: 1654.0, 1: 1668.1. Samples: 29814116. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-11 17:22:46,063][84230] Avg episode reward: [(0, '33.660'), (1, '47.550')] +[2023-10-11 17:22:49,367][85175] Updated weights for policy 1, policy_version 58660 (0.0008) +[2023-10-11 17:22:49,736][85175] Updated weights for policy 1, policy_version 58670 (0.0009) +[2023-10-11 17:22:49,824][85176] Updated weights for policy 0, policy_version 57802 (0.0009) +[2023-10-11 17:22:50,100][85175] Updated weights for policy 1, policy_version 58680 (0.0007) +[2023-10-11 17:22:50,192][85176] Updated weights for policy 0, policy_version 57812 (0.0008) +[2023-10-11 17:22:50,562][85176] Updated weights for policy 0, policy_version 57822 (0.0007) +[2023-10-11 17:22:51,063][84230] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 119308288. Throughput: 0: 1680.9, 1: 1695.2. Samples: 29825580. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-11 17:22:51,063][84230] Avg episode reward: [(0, '33.870'), (1, '40.890')] +[2023-10-11 17:22:54,182][85175] Updated weights for policy 1, policy_version 58690 (0.0009) +[2023-10-11 17:22:54,533][85176] Updated weights for policy 0, policy_version 57832 (0.0008) +[2023-10-11 17:22:54,537][85175] Updated weights for policy 1, policy_version 58700 (0.0009) +[2023-10-11 17:22:54,908][85175] Updated weights for policy 1, policy_version 58710 (0.0008) +[2023-10-11 17:22:54,909][85176] Updated weights for policy 0, policy_version 57842 (0.0007) +[2023-10-11 17:22:55,271][85175] Updated weights for policy 1, policy_version 58720 (0.0007) +[2023-10-11 17:22:55,276][85176] Updated weights for policy 0, policy_version 57852 (0.0008) +[2023-10-11 17:22:56,063][84230] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 119373824. Throughput: 0: 1675.5, 1: 1690.9. Samples: 29845740. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) +[2023-10-11 17:22:56,064][84230] Avg episode reward: [(0, '35.930'), (1, '46.800')] +[2023-10-11 17:22:59,445][85175] Updated weights for policy 1, policy_version 58730 (0.0007) +[2023-10-11 17:22:59,552][85176] Updated weights for policy 0, policy_version 57862 (0.0009) +[2023-10-11 17:22:59,803][85175] Updated weights for policy 1, policy_version 58740 (0.0007) +[2023-10-11 17:22:59,931][85176] Updated weights for policy 0, policy_version 57872 (0.0008) +[2023-10-11 17:23:00,163][85175] Updated weights for policy 1, policy_version 58750 (0.0008) +[2023-10-11 17:23:00,301][85176] Updated weights for policy 0, policy_version 57882 (0.0008) +[2023-10-11 17:23:01,063][84230] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 119439360. Throughput: 0: 1658.1, 1: 1674.2. Samples: 29864576. Policy #0 lag: (min: 14.0, avg: 17.3, max: 46.0) +[2023-10-11 17:23:01,064][84230] Avg episode reward: [(0, '39.180'), (1, '40.530')] +[2023-10-11 17:23:01,076][84801] Saving new best policy, reward=39.180! +[2023-10-11 17:23:04,216][85175] Updated weights for policy 1, policy_version 58760 (0.0009) +[2023-10-11 17:23:04,471][85176] Updated weights for policy 0, policy_version 57892 (0.0008) +[2023-10-11 17:23:04,586][85175] Updated weights for policy 1, policy_version 58770 (0.0008) +[2023-10-11 17:23:04,842][85176] Updated weights for policy 0, policy_version 57902 (0.0007) +[2023-10-11 17:23:04,955][85175] Updated weights for policy 1, policy_version 58780 (0.0007) +[2023-10-11 17:23:05,212][85176] Updated weights for policy 0, policy_version 57912 (0.0007) +[2023-10-11 17:23:06,063][84230] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 119504896. Throughput: 0: 1680.5, 1: 1692.9. Samples: 29876366. 
Policy #0 lag: (min: 14.0, avg: 17.3, max: 46.0) +[2023-10-11 17:23:06,063][84230] Avg episode reward: [(0, '36.040'), (1, '44.210')] +[2023-10-11 17:23:09,080][85175] Updated weights for policy 1, policy_version 58790 (0.0010) +[2023-10-11 17:23:09,126][85176] Updated weights for policy 0, policy_version 57922 (0.0008) +[2023-10-11 17:23:09,451][85175] Updated weights for policy 1, policy_version 58800 (0.0008) +[2023-10-11 17:23:09,504][85176] Updated weights for policy 0, policy_version 57932 (0.0008) +[2023-10-11 17:23:09,816][85175] Updated weights for policy 1, policy_version 58810 (0.0007) +[2023-10-11 17:23:09,868][85176] Updated weights for policy 0, policy_version 57942 (0.0007) +[2023-10-11 17:23:10,246][85176] Updated weights for policy 0, policy_version 57952 (0.0008) +[2023-10-11 17:23:11,063][84230] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 119570432. Throughput: 0: 1669.3, 1: 1674.8. Samples: 29895794. Policy #0 lag: (min: 14.0, avg: 17.3, max: 46.0) +[2023-10-11 17:23:11,064][84230] Avg episode reward: [(0, '39.670'), (1, '37.680')] +[2023-10-11 17:23:11,065][84801] Saving new best policy, reward=39.670! +[2023-10-11 17:23:13,975][85175] Updated weights for policy 1, policy_version 58820 (0.0008) +[2023-10-11 17:23:14,191][85176] Updated weights for policy 0, policy_version 57962 (0.0010) +[2023-10-11 17:23:14,338][85175] Updated weights for policy 1, policy_version 58830 (0.0009) +[2023-10-11 17:23:14,567][85176] Updated weights for policy 0, policy_version 57972 (0.0007) +[2023-10-11 17:23:14,702][85175] Updated weights for policy 1, policy_version 58840 (0.0008) +[2023-10-11 17:23:14,950][85176] Updated weights for policy 0, policy_version 57982 (0.0007) +[2023-10-11 17:23:16,063][84230] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 119635968. Throughput: 0: 1671.9, 1: 1671.7. Samples: 29915112. 
Policy #0 lag: (min: 14.0, avg: 17.3, max: 46.0) +[2023-10-11 17:23:16,063][84230] Avg episode reward: [(0, '33.360'), (1, '43.400')] +[2023-10-11 17:23:18,725][85175] Updated weights for policy 1, policy_version 58850 (0.0008) +[2023-10-11 17:23:18,928][85176] Updated weights for policy 0, policy_version 57992 (0.0008) +[2023-10-11 17:23:19,090][85175] Updated weights for policy 1, policy_version 58860 (0.0008) +[2023-10-11 17:23:19,300][85176] Updated weights for policy 0, policy_version 58002 (0.0008) +[2023-10-11 17:23:19,464][85175] Updated weights for policy 1, policy_version 58870 (0.0009) +[2023-10-11 17:23:19,666][85176] Updated weights for policy 0, policy_version 58012 (0.0009) +[2023-10-11 17:23:19,820][85175] Updated weights for policy 1, policy_version 58880 (0.0008) +[2023-10-11 17:23:21,063][84230] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 119701504. Throughput: 0: 1683.1, 1: 1684.0. Samples: 29926734. Policy #0 lag: (min: 14.0, avg: 17.3, max: 46.0) +[2023-10-11 17:23:21,064][84230] Avg episode reward: [(0, '38.210'), (1, '38.450')] +[2023-10-11 17:23:23,791][85176] Updated weights for policy 0, policy_version 58022 (0.0009) +[2023-10-11 17:23:24,000][85175] Updated weights for policy 1, policy_version 58890 (0.0008) +[2023-10-11 17:23:24,156][85176] Updated weights for policy 0, policy_version 58032 (0.0009) +[2023-10-11 17:23:24,367][85175] Updated weights for policy 1, policy_version 58900 (0.0007) +[2023-10-11 17:23:24,521][85176] Updated weights for policy 0, policy_version 58042 (0.0007) +[2023-10-11 17:23:24,730][85175] Updated weights for policy 1, policy_version 58910 (0.0007) +[2023-10-11 17:23:26,063][84230] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 119767040. Throughput: 0: 1665.6, 1: 1662.8. Samples: 29945374. 
Policy #0 lag: (min: 14.0, avg: 17.3, max: 46.0) +[2023-10-11 17:23:26,064][84230] Avg episode reward: [(0, '33.650'), (1, '43.160')] +[2023-10-11 17:23:28,678][85175] Updated weights for policy 1, policy_version 58920 (0.0008) +[2023-10-11 17:23:28,705][85176] Updated weights for policy 0, policy_version 58052 (0.0007) +[2023-10-11 17:23:29,046][85175] Updated weights for policy 1, policy_version 58930 (0.0008) +[2023-10-11 17:23:29,088][85176] Updated weights for policy 0, policy_version 58062 (0.0009) +[2023-10-11 17:23:29,411][85175] Updated weights for policy 1, policy_version 58940 (0.0008) +[2023-10-11 17:23:29,449][85176] Updated weights for policy 0, policy_version 58072 (0.0007) +[2023-10-11 17:23:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 119832576. Throughput: 0: 1680.5, 1: 1682.7. Samples: 29965460. Policy #0 lag: (min: 14.0, avg: 17.3, max: 46.0) +[2023-10-11 17:23:31,064][84230] Avg episode reward: [(0, '37.310'), (1, '37.710')] +[2023-10-11 17:23:33,355][85175] Updated weights for policy 1, policy_version 58950 (0.0010) +[2023-10-11 17:23:33,725][85176] Updated weights for policy 0, policy_version 58082 (0.0007) +[2023-10-11 17:23:33,733][85175] Updated weights for policy 1, policy_version 58960 (0.0009) +[2023-10-11 17:23:34,086][85175] Updated weights for policy 1, policy_version 58970 (0.0008) +[2023-10-11 17:23:34,095][85176] Updated weights for policy 0, policy_version 58092 (0.0009) +[2023-10-11 17:23:34,469][85176] Updated weights for policy 0, policy_version 58102 (0.0007) +[2023-10-11 17:23:34,841][85176] Updated weights for policy 0, policy_version 58112 (0.0007) +[2023-10-11 17:23:36,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13551.5). Total num frames: 119898112. Throughput: 0: 1683.7, 1: 1677.5. Samples: 29976834. 
Policy #0 lag: (min: 14.0, avg: 17.3, max: 46.0) +[2023-10-11 17:23:36,064][84230] Avg episode reward: [(0, '34.290'), (1, '41.980')] +[2023-10-11 17:23:38,239][85175] Updated weights for policy 1, policy_version 58980 (0.0007) +[2023-10-11 17:23:38,604][85175] Updated weights for policy 1, policy_version 58990 (0.0008) +[2023-10-11 17:23:38,868][85176] Updated weights for policy 0, policy_version 58122 (0.0008) +[2023-10-11 17:23:38,967][85175] Updated weights for policy 1, policy_version 59000 (0.0008) +[2023-10-11 17:23:39,239][85176] Updated weights for policy 0, policy_version 58132 (0.0007) +[2023-10-11 17:23:39,613][85176] Updated weights for policy 0, policy_version 58142 (0.0007) +[2023-10-11 17:23:41,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 119963648. Throughput: 0: 1663.3, 1: 1667.1. Samples: 29995604. Policy #0 lag: (min: 16.0, avg: 34.6, max: 48.0) +[2023-10-11 17:23:41,063][84230] Avg episode reward: [(0, '36.520'), (1, '37.200')] +[2023-10-11 17:23:42,983][85175] Updated weights for policy 1, policy_version 59010 (0.0008) +[2023-10-11 17:23:43,354][85175] Updated weights for policy 1, policy_version 59020 (0.0010) +[2023-10-11 17:23:43,730][85175] Updated weights for policy 1, policy_version 59030 (0.0009) +[2023-10-11 17:23:43,752][85176] Updated weights for policy 0, policy_version 58152 (0.0008) +[2023-10-11 17:23:44,094][85175] Updated weights for policy 1, policy_version 59040 (0.0010) +[2023-10-11 17:23:44,114][85176] Updated weights for policy 0, policy_version 58162 (0.0008) +[2023-10-11 17:23:44,487][85176] Updated weights for policy 0, policy_version 58172 (0.0008) +[2023-10-11 17:23:46,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 120029184. Throughput: 0: 1681.8, 1: 1688.9. Samples: 30016258. 
Policy #0 lag: (min: 16.0, avg: 34.6, max: 48.0) +[2023-10-11 17:23:46,064][84230] Avg episode reward: [(0, '39.640'), (1, '43.130')] +[2023-10-11 17:23:48,028][85175] Updated weights for policy 1, policy_version 59050 (0.0010) +[2023-10-11 17:23:48,400][85175] Updated weights for policy 1, policy_version 59060 (0.0007) +[2023-10-11 17:23:48,480][85176] Updated weights for policy 0, policy_version 58182 (0.0008) +[2023-10-11 17:23:48,771][85175] Updated weights for policy 1, policy_version 59070 (0.0008) +[2023-10-11 17:23:48,859][85176] Updated weights for policy 0, policy_version 58192 (0.0008) +[2023-10-11 17:23:49,230][85176] Updated weights for policy 0, policy_version 58202 (0.0008) +[2023-10-11 17:23:51,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 120094720. Throughput: 0: 1670.6, 1: 1670.4. Samples: 30026714. Policy #0 lag: (min: 16.0, avg: 34.6, max: 48.0) +[2023-10-11 17:23:51,064][84230] Avg episode reward: [(0, '36.390'), (1, '38.990')] +[2023-10-11 17:23:52,935][85175] Updated weights for policy 1, policy_version 59080 (0.0008) +[2023-10-11 17:23:53,130][85176] Updated weights for policy 0, policy_version 58212 (0.0010) +[2023-10-11 17:23:53,300][85175] Updated weights for policy 1, policy_version 59090 (0.0007) +[2023-10-11 17:23:53,507][85176] Updated weights for policy 0, policy_version 58222 (0.0007) +[2023-10-11 17:23:53,672][85175] Updated weights for policy 1, policy_version 59100 (0.0008) +[2023-10-11 17:23:53,877][85176] Updated weights for policy 0, policy_version 58232 (0.0009) +[2023-10-11 17:23:56,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 120160256. Throughput: 0: 1664.5, 1: 1678.8. Samples: 30046240. 
Policy #0 lag: (min: 16.0, avg: 34.6, max: 48.0) +[2023-10-11 17:23:56,064][84230] Avg episode reward: [(0, '39.460'), (1, '42.770')] +[2023-10-11 17:23:57,888][85175] Updated weights for policy 1, policy_version 59110 (0.0009) +[2023-10-11 17:23:57,924][85176] Updated weights for policy 0, policy_version 58242 (0.0009) +[2023-10-11 17:23:58,258][85175] Updated weights for policy 1, policy_version 59120 (0.0007) +[2023-10-11 17:23:58,300][85176] Updated weights for policy 0, policy_version 58252 (0.0007) +[2023-10-11 17:23:58,618][85175] Updated weights for policy 1, policy_version 59130 (0.0007) +[2023-10-11 17:23:58,673][85176] Updated weights for policy 0, policy_version 58262 (0.0008) +[2023-10-11 17:23:59,041][85176] Updated weights for policy 0, policy_version 58272 (0.0008) +[2023-10-11 17:24:01,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 120225792. Throughput: 0: 1677.4, 1: 1694.2. Samples: 30066834. Policy #0 lag: (min: 16.0, avg: 34.6, max: 48.0) +[2023-10-11 17:24:01,064][84230] Avg episode reward: [(0, '32.390'), (1, '42.160')] +[2023-10-11 17:24:02,501][85175] Updated weights for policy 1, policy_version 59140 (0.0008) +[2023-10-11 17:24:02,873][85175] Updated weights for policy 1, policy_version 59150 (0.0007) +[2023-10-11 17:24:03,134][85176] Updated weights for policy 0, policy_version 58282 (0.0007) +[2023-10-11 17:24:03,240][85175] Updated weights for policy 1, policy_version 59160 (0.0007) +[2023-10-11 17:24:03,494][85176] Updated weights for policy 0, policy_version 58292 (0.0009) +[2023-10-11 17:24:03,870][85176] Updated weights for policy 0, policy_version 58302 (0.0009) +[2023-10-11 17:24:06,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 120291328. Throughput: 0: 1657.1, 1: 1669.6. Samples: 30076434. 
Policy #0 lag: (min: 16.0, avg: 34.6, max: 48.0) +[2023-10-11 17:24:06,063][84230] Avg episode reward: [(0, '39.520'), (1, '43.840')] +[2023-10-11 17:24:07,424][85175] Updated weights for policy 1, policy_version 59170 (0.0009) +[2023-10-11 17:24:07,800][85175] Updated weights for policy 1, policy_version 59180 (0.0009) +[2023-10-11 17:24:07,971][85176] Updated weights for policy 0, policy_version 58312 (0.0008) +[2023-10-11 17:24:08,174][85175] Updated weights for policy 1, policy_version 59190 (0.0007) +[2023-10-11 17:24:08,336][85176] Updated weights for policy 0, policy_version 58322 (0.0007) +[2023-10-11 17:24:08,532][85175] Updated weights for policy 1, policy_version 59200 (0.0007) +[2023-10-11 17:24:08,701][85176] Updated weights for policy 0, policy_version 58332 (0.0008) +[2023-10-11 17:24:11,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 120356864. Throughput: 0: 1670.4, 1: 1693.1. Samples: 30096730. Policy #0 lag: (min: 16.0, avg: 34.6, max: 48.0) +[2023-10-11 17:24:11,064][84230] Avg episode reward: [(0, '31.660'), (1, '40.600')] +[2023-10-11 17:24:12,526][85175] Updated weights for policy 1, policy_version 59210 (0.0008) +[2023-10-11 17:24:12,647][85176] Updated weights for policy 0, policy_version 58342 (0.0009) +[2023-10-11 17:24:12,899][85175] Updated weights for policy 1, policy_version 59220 (0.0009) +[2023-10-11 17:24:13,020][85176] Updated weights for policy 0, policy_version 58352 (0.0010) +[2023-10-11 17:24:13,264][85175] Updated weights for policy 1, policy_version 59230 (0.0008) +[2023-10-11 17:24:13,391][85176] Updated weights for policy 0, policy_version 58362 (0.0007) +[2023-10-11 17:24:16,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 120422400. Throughput: 0: 1680.3, 1: 1699.3. Samples: 30117542. 
Policy #0 lag: (min: 16.0, avg: 34.6, max: 48.0) +[2023-10-11 17:24:16,063][84230] Avg episode reward: [(0, '38.130'), (1, '39.970')] +[2023-10-11 17:24:16,074][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000059232_60653568.pth... +[2023-10-11 17:24:16,074][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000058368_59768832.pth... +[2023-10-11 17:24:16,126][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000057664_59047936.pth +[2023-10-11 17:24:16,126][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000056800_58163200.pth +[2023-10-11 17:24:17,193][85175] Updated weights for policy 1, policy_version 59240 (0.0009) +[2023-10-11 17:24:17,557][85175] Updated weights for policy 1, policy_version 59250 (0.0009) +[2023-10-11 17:24:17,576][85176] Updated weights for policy 0, policy_version 58372 (0.0007) +[2023-10-11 17:24:17,925][85175] Updated weights for policy 1, policy_version 59260 (0.0009) +[2023-10-11 17:24:17,966][85176] Updated weights for policy 0, policy_version 58382 (0.0008) +[2023-10-11 17:24:18,332][85176] Updated weights for policy 0, policy_version 58392 (0.0007) +[2023-10-11 17:24:21,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 120487936. Throughput: 0: 1651.5, 1: 1678.9. Samples: 30126700. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:24:21,063][84230] Avg episode reward: [(0, '31.450'), (1, '39.430')] +[2023-10-11 17:24:22,061][85175] Updated weights for policy 1, policy_version 59270 (0.0008) +[2023-10-11 17:24:22,308][85176] Updated weights for policy 0, policy_version 58402 (0.0008) +[2023-10-11 17:24:22,433][85175] Updated weights for policy 1, policy_version 59280 (0.0010) +[2023-10-11 17:24:22,685][85176] Updated weights for policy 0, policy_version 58412 (0.0008) +[2023-10-11 17:24:22,805][85175] Updated weights for policy 1, policy_version 59290 (0.0009) +[2023-10-11 17:24:23,063][85176] Updated weights for policy 0, policy_version 58422 (0.0010) +[2023-10-11 17:24:23,441][85176] Updated weights for policy 0, policy_version 58432 (0.0010) +[2023-10-11 17:24:26,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 120553472. Throughput: 0: 1675.0, 1: 1694.5. Samples: 30147230. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:24:26,063][84230] Avg episode reward: [(0, '39.200'), (1, '42.930')] +[2023-10-11 17:24:26,769][85175] Updated weights for policy 1, policy_version 59300 (0.0009) +[2023-10-11 17:24:27,142][85175] Updated weights for policy 1, policy_version 59310 (0.0009) +[2023-10-11 17:24:27,512][85175] Updated weights for policy 1, policy_version 59320 (0.0008) +[2023-10-11 17:24:27,656][85176] Updated weights for policy 0, policy_version 58442 (0.0008) +[2023-10-11 17:24:28,019][85176] Updated weights for policy 0, policy_version 58452 (0.0008) +[2023-10-11 17:24:28,394][85176] Updated weights for policy 0, policy_version 58462 (0.0008) +[2023-10-11 17:24:31,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 120619008. Throughput: 0: 1673.8, 1: 1701.7. Samples: 30168156. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:24:31,064][84230] Avg episode reward: [(0, '33.010'), (1, '43.570')] +[2023-10-11 17:24:31,461][85175] Updated weights for policy 1, policy_version 59330 (0.0009) +[2023-10-11 17:24:31,838][85175] Updated weights for policy 1, policy_version 59340 (0.0008) +[2023-10-11 17:24:32,209][85175] Updated weights for policy 1, policy_version 59350 (0.0009) +[2023-10-11 17:24:32,573][85175] Updated weights for policy 1, policy_version 59360 (0.0007) +[2023-10-11 17:24:32,764][85176] Updated weights for policy 0, policy_version 58472 (0.0009) +[2023-10-11 17:24:33,134][85176] Updated weights for policy 0, policy_version 58482 (0.0007) +[2023-10-11 17:24:33,506][85176] Updated weights for policy 0, policy_version 58492 (0.0009) +[2023-10-11 17:24:36,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 120684544. Throughput: 0: 1656.9, 1: 1692.8. Samples: 30177452. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:24:36,064][84230] Avg episode reward: [(0, '39.090'), (1, '43.120')] +[2023-10-11 17:24:36,459][85175] Updated weights for policy 1, policy_version 59370 (0.0009) +[2023-10-11 17:24:36,834][85175] Updated weights for policy 1, policy_version 59380 (0.0011) +[2023-10-11 17:24:37,199][85175] Updated weights for policy 1, policy_version 59390 (0.0009) +[2023-10-11 17:24:37,475][85176] Updated weights for policy 0, policy_version 58502 (0.0009) +[2023-10-11 17:24:37,843][85176] Updated weights for policy 0, policy_version 58512 (0.0011) +[2023-10-11 17:24:38,208][85176] Updated weights for policy 0, policy_version 58522 (0.0008) +[2023-10-11 17:24:41,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 120750080. Throughput: 0: 1670.5, 1: 1706.3. Samples: 30198194. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:24:41,063][84230] Avg episode reward: [(0, '33.480'), (1, '41.210')] +[2023-10-11 17:24:41,207][85175] Updated weights for policy 1, policy_version 59400 (0.0009) +[2023-10-11 17:24:41,572][85175] Updated weights for policy 1, policy_version 59410 (0.0009) +[2023-10-11 17:24:41,938][85175] Updated weights for policy 1, policy_version 59420 (0.0007) +[2023-10-11 17:24:42,243][85176] Updated weights for policy 0, policy_version 58532 (0.0008) +[2023-10-11 17:24:42,627][85176] Updated weights for policy 0, policy_version 58542 (0.0007) +[2023-10-11 17:24:42,996][85176] Updated weights for policy 0, policy_version 58552 (0.0009) +[2023-10-11 17:24:45,948][85175] Updated weights for policy 1, policy_version 59430 (0.0008) +[2023-10-11 17:24:46,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 120815616. Throughput: 0: 1674.8, 1: 1711.7. Samples: 30219224. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:24:46,063][84230] Avg episode reward: [(0, '38.330'), (1, '42.200')] +[2023-10-11 17:24:46,311][85175] Updated weights for policy 1, policy_version 59440 (0.0008) +[2023-10-11 17:24:46,685][85175] Updated weights for policy 1, policy_version 59450 (0.0007) +[2023-10-11 17:24:47,094][85176] Updated weights for policy 0, policy_version 58562 (0.0009) +[2023-10-11 17:24:47,456][85176] Updated weights for policy 0, policy_version 58572 (0.0008) +[2023-10-11 17:24:47,823][85176] Updated weights for policy 0, policy_version 58582 (0.0009) +[2023-10-11 17:24:48,197][85176] Updated weights for policy 0, policy_version 58592 (0.0009) +[2023-10-11 17:24:50,625][85175] Updated weights for policy 1, policy_version 59460 (0.0007) +[2023-10-11 17:24:50,996][85175] Updated weights for policy 1, policy_version 59470 (0.0009) +[2023-10-11 17:24:51,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 120881152. 
Throughput: 0: 1664.2, 1: 1711.9. Samples: 30228358. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:24:51,063][84230] Avg episode reward: [(0, '32.850'), (1, '43.540')] +[2023-10-11 17:24:51,359][85175] Updated weights for policy 1, policy_version 59480 (0.0008) +[2023-10-11 17:24:52,328][85176] Updated weights for policy 0, policy_version 58602 (0.0009) +[2023-10-11 17:24:52,698][85176] Updated weights for policy 0, policy_version 58612 (0.0009) +[2023-10-11 17:24:53,073][85176] Updated weights for policy 0, policy_version 58622 (0.0010) +[2023-10-11 17:24:55,454][85175] Updated weights for policy 1, policy_version 59490 (0.0008) +[2023-10-11 17:24:55,815][85175] Updated weights for policy 1, policy_version 59500 (0.0010) +[2023-10-11 17:24:56,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 120946688. Throughput: 0: 1675.2, 1: 1708.1. Samples: 30248978. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:24:56,063][84230] Avg episode reward: [(0, '39.540'), (1, '41.120')] +[2023-10-11 17:24:56,180][85175] Updated weights for policy 1, policy_version 59510 (0.0009) +[2023-10-11 17:24:56,536][85175] Updated weights for policy 1, policy_version 59520 (0.0011) +[2023-10-11 17:24:56,963][85176] Updated weights for policy 0, policy_version 58632 (0.0007) +[2023-10-11 17:24:57,339][85176] Updated weights for policy 0, policy_version 58642 (0.0007) +[2023-10-11 17:24:57,707][85176] Updated weights for policy 0, policy_version 58652 (0.0008) +[2023-10-11 17:25:00,750][85175] Updated weights for policy 1, policy_version 59530 (0.0010) +[2023-10-11 17:25:01,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 121012224. Throughput: 0: 1676.0, 1: 1703.6. Samples: 30269624. 
Policy #0 lag: (min: 26.0, avg: 34.0, max: 58.0) +[2023-10-11 17:25:01,064][84230] Avg episode reward: [(0, '36.030'), (1, '44.630')] +[2023-10-11 17:25:01,130][85175] Updated weights for policy 1, policy_version 59540 (0.0009) +[2023-10-11 17:25:01,502][85175] Updated weights for policy 1, policy_version 59550 (0.0009) +[2023-10-11 17:25:01,736][85176] Updated weights for policy 0, policy_version 58662 (0.0008) +[2023-10-11 17:25:02,108][85176] Updated weights for policy 0, policy_version 58672 (0.0010) +[2023-10-11 17:25:02,489][85176] Updated weights for policy 0, policy_version 58682 (0.0007) +[2023-10-11 17:25:05,516][85175] Updated weights for policy 1, policy_version 59560 (0.0007) +[2023-10-11 17:25:05,889][85175] Updated weights for policy 1, policy_version 59570 (0.0009) +[2023-10-11 17:25:06,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 121077760. Throughput: 0: 1678.4, 1: 1705.6. Samples: 30278980. Policy #0 lag: (min: 26.0, avg: 34.0, max: 58.0) +[2023-10-11 17:25:06,064][84230] Avg episode reward: [(0, '41.860'), (1, '43.200')] +[2023-10-11 17:25:06,065][84801] Saving new best policy, reward=41.860! +[2023-10-11 17:25:06,258][85175] Updated weights for policy 1, policy_version 59580 (0.0008) +[2023-10-11 17:25:06,651][85176] Updated weights for policy 0, policy_version 58692 (0.0009) +[2023-10-11 17:25:07,035][85176] Updated weights for policy 0, policy_version 58702 (0.0010) +[2023-10-11 17:25:07,413][85176] Updated weights for policy 0, policy_version 58712 (0.0008) +[2023-10-11 17:25:10,397][85175] Updated weights for policy 1, policy_version 59590 (0.0010) +[2023-10-11 17:25:10,770][85175] Updated weights for policy 1, policy_version 59600 (0.0010) +[2023-10-11 17:25:11,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 121143296. Throughput: 0: 1679.1, 1: 1707.4. Samples: 30299624. 
Policy #0 lag: (min: 26.0, avg: 34.0, max: 58.0) +[2023-10-11 17:25:11,063][84230] Avg episode reward: [(0, '38.590'), (1, '42.860')] +[2023-10-11 17:25:11,131][85175] Updated weights for policy 1, policy_version 59610 (0.0008) +[2023-10-11 17:25:11,367][85176] Updated weights for policy 0, policy_version 58722 (0.0010) +[2023-10-11 17:25:11,740][85176] Updated weights for policy 0, policy_version 58732 (0.0008) +[2023-10-11 17:25:12,121][85176] Updated weights for policy 0, policy_version 58742 (0.0009) +[2023-10-11 17:25:12,489][85176] Updated weights for policy 0, policy_version 58752 (0.0009) +[2023-10-11 17:25:15,194][85175] Updated weights for policy 1, policy_version 59620 (0.0008) +[2023-10-11 17:25:15,556][85175] Updated weights for policy 1, policy_version 59630 (0.0010) +[2023-10-11 17:25:15,930][85175] Updated weights for policy 1, policy_version 59640 (0.0007) +[2023-10-11 17:25:16,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 121208832. Throughput: 0: 1686.4, 1: 1692.0. Samples: 30320182. Policy #0 lag: (min: 26.0, avg: 34.0, max: 58.0) +[2023-10-11 17:25:16,063][84230] Avg episode reward: [(0, '42.540'), (1, '41.580')] +[2023-10-11 17:25:16,073][84801] Saving new best policy, reward=42.540! 
+[2023-10-11 17:25:16,727][85176] Updated weights for policy 0, policy_version 58762 (0.0007) +[2023-10-11 17:25:17,094][85176] Updated weights for policy 0, policy_version 58772 (0.0009) +[2023-10-11 17:25:17,478][85176] Updated weights for policy 0, policy_version 58782 (0.0009) +[2023-10-11 17:25:19,701][85175] Updated weights for policy 1, policy_version 59650 (0.0008) +[2023-10-11 17:25:20,073][85175] Updated weights for policy 1, policy_version 59660 (0.0009) +[2023-10-11 17:25:20,447][85175] Updated weights for policy 1, policy_version 59670 (0.0011) +[2023-10-11 17:25:20,817][85175] Updated weights for policy 1, policy_version 59680 (0.0010) +[2023-10-11 17:25:21,062][84230] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 121307136. Throughput: 0: 1682.7, 1: 1705.3. Samples: 30329914. Policy #0 lag: (min: 26.0, avg: 34.0, max: 58.0) +[2023-10-11 17:25:21,063][84230] Avg episode reward: [(0, '37.230'), (1, '42.350')] +[2023-10-11 17:25:21,557][85176] Updated weights for policy 0, policy_version 58792 (0.0008) +[2023-10-11 17:25:21,931][85176] Updated weights for policy 0, policy_version 58802 (0.0008) +[2023-10-11 17:25:22,308][85176] Updated weights for policy 0, policy_version 58812 (0.0009) +[2023-10-11 17:25:24,795][85175] Updated weights for policy 1, policy_version 59690 (0.0008) +[2023-10-11 17:25:25,159][85175] Updated weights for policy 1, policy_version 59700 (0.0007) +[2023-10-11 17:25:25,527][85175] Updated weights for policy 1, policy_version 59710 (0.0008) +[2023-10-11 17:25:26,063][84230] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 121372672. Throughput: 0: 1680.3, 1: 1701.6. Samples: 30350378. 
Policy #0 lag: (min: 26.0, avg: 34.0, max: 58.0) +[2023-10-11 17:25:26,063][84230] Avg episode reward: [(0, '41.210'), (1, '44.130')] +[2023-10-11 17:25:26,282][85176] Updated weights for policy 0, policy_version 58822 (0.0012) +[2023-10-11 17:25:26,662][85176] Updated weights for policy 0, policy_version 58832 (0.0009) +[2023-10-11 17:25:27,040][85176] Updated weights for policy 0, policy_version 58842 (0.0009) +[2023-10-11 17:25:29,488][85175] Updated weights for policy 1, policy_version 59720 (0.0009) +[2023-10-11 17:25:29,849][85175] Updated weights for policy 1, policy_version 59730 (0.0010) +[2023-10-11 17:25:30,226][85175] Updated weights for policy 1, policy_version 59740 (0.0010) +[2023-10-11 17:25:30,941][85176] Updated weights for policy 0, policy_version 58852 (0.0009) +[2023-10-11 17:25:31,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 121438208. Throughput: 0: 1680.7, 1: 1678.4. Samples: 30370384. Policy #0 lag: (min: 26.0, avg: 34.0, max: 58.0) +[2023-10-11 17:25:31,063][84230] Avg episode reward: [(0, '36.930'), (1, '45.150')] +[2023-10-11 17:25:31,301][85176] Updated weights for policy 0, policy_version 58862 (0.0010) +[2023-10-11 17:25:31,669][85176] Updated weights for policy 0, policy_version 58872 (0.0011) +[2023-10-11 17:25:34,114][85175] Updated weights for policy 1, policy_version 59750 (0.0011) +[2023-10-11 17:25:34,480][85175] Updated weights for policy 1, policy_version 59760 (0.0009) +[2023-10-11 17:25:34,844][85175] Updated weights for policy 1, policy_version 59770 (0.0010) +[2023-10-11 17:25:35,840][85176] Updated weights for policy 0, policy_version 58882 (0.0008) +[2023-10-11 17:25:36,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 121503744. Throughput: 0: 1681.6, 1: 1706.3. Samples: 30380814. 
Policy #0 lag: (min: 26.0, avg: 34.0, max: 58.0) +[2023-10-11 17:25:36,063][84230] Avg episode reward: [(0, '39.700'), (1, '44.500')] +[2023-10-11 17:25:36,215][85176] Updated weights for policy 0, policy_version 58892 (0.0007) +[2023-10-11 17:25:36,590][85176] Updated weights for policy 0, policy_version 58902 (0.0009) +[2023-10-11 17:25:36,962][85176] Updated weights for policy 0, policy_version 58912 (0.0008) +[2023-10-11 17:25:38,979][85175] Updated weights for policy 1, policy_version 59780 (0.0009) +[2023-10-11 17:25:39,354][85175] Updated weights for policy 1, policy_version 59790 (0.0010) +[2023-10-11 17:25:39,714][85175] Updated weights for policy 1, policy_version 59800 (0.0010) +[2023-10-11 17:25:41,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 121569280. Throughput: 0: 1679.3, 1: 1691.5. Samples: 30400666. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:25:41,063][84230] Avg episode reward: [(0, '35.550'), (1, '42.550')] +[2023-10-11 17:25:41,235][85176] Updated weights for policy 0, policy_version 58922 (0.0008) +[2023-10-11 17:25:41,603][85176] Updated weights for policy 0, policy_version 58932 (0.0007) +[2023-10-11 17:25:41,982][85176] Updated weights for policy 0, policy_version 58942 (0.0007) +[2023-10-11 17:25:43,670][85175] Updated weights for policy 1, policy_version 59810 (0.0008) +[2023-10-11 17:25:44,038][85175] Updated weights for policy 1, policy_version 59820 (0.0008) +[2023-10-11 17:25:44,407][85175] Updated weights for policy 1, policy_version 59830 (0.0007) +[2023-10-11 17:25:44,770][85175] Updated weights for policy 1, policy_version 59840 (0.0010) +[2023-10-11 17:25:46,053][85176] Updated weights for policy 0, policy_version 58952 (0.0009) +[2023-10-11 17:25:46,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 121634816. Throughput: 0: 1679.8, 1: 1682.5. Samples: 30420928. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:25:46,063][84230] Avg episode reward: [(0, '39.440'), (1, '42.660')] +[2023-10-11 17:25:46,437][85176] Updated weights for policy 0, policy_version 58962 (0.0008) +[2023-10-11 17:25:46,807][85176] Updated weights for policy 0, policy_version 58972 (0.0008) +[2023-10-11 17:25:48,895][85175] Updated weights for policy 1, policy_version 59850 (0.0007) +[2023-10-11 17:25:49,259][85175] Updated weights for policy 1, policy_version 59860 (0.0007) +[2023-10-11 17:25:49,638][85175] Updated weights for policy 1, policy_version 59870 (0.0010) +[2023-10-11 17:25:50,803][85176] Updated weights for policy 0, policy_version 58982 (0.0009) +[2023-10-11 17:25:51,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 121700352. Throughput: 0: 1676.2, 1: 1708.2. Samples: 30431278. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:25:51,063][84230] Avg episode reward: [(0, '35.980'), (1, '43.610')] +[2023-10-11 17:25:51,180][85176] Updated weights for policy 0, policy_version 58992 (0.0007) +[2023-10-11 17:25:51,548][85176] Updated weights for policy 0, policy_version 59002 (0.0007) +[2023-10-11 17:25:53,754][85175] Updated weights for policy 1, policy_version 59880 (0.0008) +[2023-10-11 17:25:54,123][85175] Updated weights for policy 1, policy_version 59890 (0.0009) +[2023-10-11 17:25:54,490][85175] Updated weights for policy 1, policy_version 59900 (0.0010) +[2023-10-11 17:25:55,703][85176] Updated weights for policy 0, policy_version 59012 (0.0007) +[2023-10-11 17:25:56,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 121765888. Throughput: 0: 1686.2, 1: 1681.6. Samples: 30451176. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:25:56,063][84230] Avg episode reward: [(0, '39.400'), (1, '44.900')] +[2023-10-11 17:25:56,089][85176] Updated weights for policy 0, policy_version 59022 (0.0008) +[2023-10-11 17:25:56,455][85176] Updated weights for policy 0, policy_version 59032 (0.0009) +[2023-10-11 17:25:58,576][85175] Updated weights for policy 1, policy_version 59910 (0.0007) +[2023-10-11 17:25:58,948][85175] Updated weights for policy 1, policy_version 59920 (0.0010) +[2023-10-11 17:25:59,311][85175] Updated weights for policy 1, policy_version 59930 (0.0010) +[2023-10-11 17:26:00,473][85176] Updated weights for policy 0, policy_version 59042 (0.0010) +[2023-10-11 17:26:00,834][85176] Updated weights for policy 0, policy_version 59052 (0.0010) +[2023-10-11 17:26:01,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 121831424. Throughput: 0: 1677.5, 1: 1682.3. Samples: 30471370. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:26:01,064][84230] Avg episode reward: [(0, '37.970'), (1, '44.910')] +[2023-10-11 17:26:01,194][85176] Updated weights for policy 0, policy_version 59062 (0.0010) +[2023-10-11 17:26:01,565][85176] Updated weights for policy 0, policy_version 59072 (0.0011) +[2023-10-11 17:26:03,382][85175] Updated weights for policy 1, policy_version 59940 (0.0008) +[2023-10-11 17:26:03,751][85175] Updated weights for policy 1, policy_version 59950 (0.0009) +[2023-10-11 17:26:04,117][85175] Updated weights for policy 1, policy_version 59960 (0.0009) +[2023-10-11 17:26:05,570][85176] Updated weights for policy 0, policy_version 59082 (0.0008) +[2023-10-11 17:26:05,947][85176] Updated weights for policy 0, policy_version 59092 (0.0007) +[2023-10-11 17:26:06,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 121896960. Throughput: 0: 1682.4, 1: 1690.0. Samples: 30481674. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:26:06,063][84230] Avg episode reward: [(0, '37.490'), (1, '43.490')] +[2023-10-11 17:26:06,313][85176] Updated weights for policy 0, policy_version 59102 (0.0007) +[2023-10-11 17:26:08,062][85175] Updated weights for policy 1, policy_version 59970 (0.0009) +[2023-10-11 17:26:08,439][85175] Updated weights for policy 1, policy_version 59980 (0.0010) +[2023-10-11 17:26:08,806][85175] Updated weights for policy 1, policy_version 59990 (0.0009) +[2023-10-11 17:26:09,167][85175] Updated weights for policy 1, policy_version 60000 (0.0008) +[2023-10-11 17:26:10,416][85176] Updated weights for policy 0, policy_version 59112 (0.0009) +[2023-10-11 17:26:10,787][85176] Updated weights for policy 0, policy_version 59122 (0.0009) +[2023-10-11 17:26:11,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 121962496. Throughput: 0: 1689.0, 1: 1673.6. Samples: 30501692. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:26:11,063][84230] Avg episode reward: [(0, '36.900'), (1, '45.090')] +[2023-10-11 17:26:11,167][85176] Updated weights for policy 0, policy_version 59132 (0.0008) +[2023-10-11 17:26:13,195][85175] Updated weights for policy 1, policy_version 60010 (0.0008) +[2023-10-11 17:26:13,563][85175] Updated weights for policy 1, policy_version 60020 (0.0007) +[2023-10-11 17:26:13,927][85175] Updated weights for policy 1, policy_version 60030 (0.0009) +[2023-10-11 17:26:15,103][85176] Updated weights for policy 0, policy_version 59142 (0.0009) +[2023-10-11 17:26:15,475][85176] Updated weights for policy 0, policy_version 59152 (0.0009) +[2023-10-11 17:26:15,851][85176] Updated weights for policy 0, policy_version 59162 (0.0007) +[2023-10-11 17:26:16,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 122028032. Throughput: 0: 1670.2, 1: 1696.9. Samples: 30521902. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:26:16,063][84230] Avg episode reward: [(0, '40.140'), (1, '43.220')] +[2023-10-11 17:26:16,072][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000060032_61472768.pth... +[2023-10-11 17:26:16,083][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000059168_60588032.pth... +[2023-10-11 17:26:16,107][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000058464_59867136.pth +[2023-10-11 17:26:16,124][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000057600_58982400.pth +[2023-10-11 17:26:17,817][85175] Updated weights for policy 1, policy_version 60040 (0.0008) +[2023-10-11 17:26:18,175][85175] Updated weights for policy 1, policy_version 60050 (0.0007) +[2023-10-11 17:26:18,537][85175] Updated weights for policy 1, policy_version 60060 (0.0008) +[2023-10-11 17:26:19,681][85176] Updated weights for policy 0, policy_version 59172 (0.0008) +[2023-10-11 17:26:20,056][85176] Updated weights for policy 0, policy_version 59182 (0.0007) +[2023-10-11 17:26:20,420][85176] Updated weights for policy 0, policy_version 59192 (0.0007) +[2023-10-11 17:26:21,063][84230] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 122126336. Throughput: 0: 1689.7, 1: 1674.4. Samples: 30532200. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:26:21,064][84230] Avg episode reward: [(0, '38.670'), (1, '44.970')] +[2023-10-11 17:26:22,603][85175] Updated weights for policy 1, policy_version 60070 (0.0008) +[2023-10-11 17:26:22,966][85175] Updated weights for policy 1, policy_version 60080 (0.0008) +[2023-10-11 17:26:23,330][85175] Updated weights for policy 1, policy_version 60090 (0.0009) +[2023-10-11 17:26:24,546][85176] Updated weights for policy 0, policy_version 59202 (0.0007) +[2023-10-11 17:26:24,930][85176] Updated weights for policy 0, policy_version 59212 (0.0010) +[2023-10-11 17:26:25,296][85176] Updated weights for policy 0, policy_version 59222 (0.0009) +[2023-10-11 17:26:25,666][85176] Updated weights for policy 0, policy_version 59232 (0.0008) +[2023-10-11 17:26:26,062][84230] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 122191872. Throughput: 0: 1687.1, 1: 1687.6. Samples: 30552528. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:26:26,063][84230] Avg episode reward: [(0, '39.320'), (1, '43.620')] +[2023-10-11 17:26:27,428][85175] Updated weights for policy 1, policy_version 60100 (0.0008) +[2023-10-11 17:26:27,803][85175] Updated weights for policy 1, policy_version 60110 (0.0010) +[2023-10-11 17:26:28,167][85175] Updated weights for policy 1, policy_version 60120 (0.0009) +[2023-10-11 17:26:29,769][85176] Updated weights for policy 0, policy_version 59242 (0.0008) +[2023-10-11 17:26:30,151][85176] Updated weights for policy 0, policy_version 59252 (0.0008) +[2023-10-11 17:26:30,529][85176] Updated weights for policy 0, policy_version 59262 (0.0010) +[2023-10-11 17:26:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 122257408. Throughput: 0: 1662.0, 1: 1697.7. Samples: 30572114. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:26:31,064][84230] Avg episode reward: [(0, '39.110'), (1, '44.130')] +[2023-10-11 17:26:32,254][85175] Updated weights for policy 1, policy_version 60130 (0.0010) +[2023-10-11 17:26:32,618][85175] Updated weights for policy 1, policy_version 60140 (0.0008) +[2023-10-11 17:26:32,989][85175] Updated weights for policy 1, policy_version 60150 (0.0007) +[2023-10-11 17:26:33,352][85175] Updated weights for policy 1, policy_version 60160 (0.0007) +[2023-10-11 17:26:34,641][85176] Updated weights for policy 0, policy_version 59272 (0.0009) +[2023-10-11 17:26:35,010][85176] Updated weights for policy 0, policy_version 59282 (0.0008) +[2023-10-11 17:26:35,386][85176] Updated weights for policy 0, policy_version 59292 (0.0009) +[2023-10-11 17:26:36,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 122322944. Throughput: 0: 1689.2, 1: 1669.3. Samples: 30582408. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:26:36,063][84230] Avg episode reward: [(0, '38.940'), (1, '40.890')] +[2023-10-11 17:26:37,334][85175] Updated weights for policy 1, policy_version 60170 (0.0010) +[2023-10-11 17:26:37,705][85175] Updated weights for policy 1, policy_version 60180 (0.0008) +[2023-10-11 17:26:38,077][85175] Updated weights for policy 1, policy_version 60190 (0.0009) +[2023-10-11 17:26:39,446][85176] Updated weights for policy 0, policy_version 59302 (0.0008) +[2023-10-11 17:26:39,814][85176] Updated weights for policy 0, policy_version 59312 (0.0008) +[2023-10-11 17:26:40,186][85176] Updated weights for policy 0, policy_version 59322 (0.0008) +[2023-10-11 17:26:41,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 122388480. Throughput: 0: 1675.0, 1: 1701.1. Samples: 30603102. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:26:41,063][84230] Avg episode reward: [(0, '38.790'), (1, '43.230')] +[2023-10-11 17:26:42,211][85175] Updated weights for policy 1, policy_version 60200 (0.0009) +[2023-10-11 17:26:42,584][85175] Updated weights for policy 1, policy_version 60210 (0.0008) +[2023-10-11 17:26:42,947][85175] Updated weights for policy 1, policy_version 60220 (0.0011) +[2023-10-11 17:26:44,549][85176] Updated weights for policy 0, policy_version 59332 (0.0010) +[2023-10-11 17:26:44,935][85176] Updated weights for policy 0, policy_version 59342 (0.0007) +[2023-10-11 17:26:45,314][85176] Updated weights for policy 0, policy_version 59352 (0.0007) +[2023-10-11 17:26:46,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 122454016. Throughput: 0: 1660.6, 1: 1703.7. Samples: 30622762. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:26:46,064][84230] Avg episode reward: [(0, '37.910'), (1, '41.470')] +[2023-10-11 17:26:47,070][85175] Updated weights for policy 1, policy_version 60230 (0.0010) +[2023-10-11 17:26:47,433][85175] Updated weights for policy 1, policy_version 60240 (0.0008) +[2023-10-11 17:26:47,792][85175] Updated weights for policy 1, policy_version 60250 (0.0009) +[2023-10-11 17:26:49,452][85176] Updated weights for policy 0, policy_version 59362 (0.0008) +[2023-10-11 17:26:49,827][85176] Updated weights for policy 0, policy_version 59372 (0.0008) +[2023-10-11 17:26:50,187][85176] Updated weights for policy 0, policy_version 59382 (0.0010) +[2023-10-11 17:26:50,562][85176] Updated weights for policy 0, policy_version 59392 (0.0009) +[2023-10-11 17:26:51,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 122519552. Throughput: 0: 1681.6, 1: 1681.7. Samples: 30633024. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:26:51,064][84230] Avg episode reward: [(0, '40.930'), (1, '44.380')] +[2023-10-11 17:26:51,753][85175] Updated weights for policy 1, policy_version 60260 (0.0010) +[2023-10-11 17:26:52,116][85175] Updated weights for policy 1, policy_version 60270 (0.0007) +[2023-10-11 17:26:52,492][85175] Updated weights for policy 1, policy_version 60280 (0.0010) +[2023-10-11 17:26:54,675][85176] Updated weights for policy 0, policy_version 59402 (0.0007) +[2023-10-11 17:26:55,044][85176] Updated weights for policy 0, policy_version 59412 (0.0007) +[2023-10-11 17:26:55,423][85176] Updated weights for policy 0, policy_version 59422 (0.0007) +[2023-10-11 17:26:56,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 122585088. Throughput: 0: 1672.8, 1: 1708.4. Samples: 30653848. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:26:56,063][84230] Avg episode reward: [(0, '37.730'), (1, '42.250')] +[2023-10-11 17:26:56,445][85175] Updated weights for policy 1, policy_version 60290 (0.0009) +[2023-10-11 17:26:56,818][85175] Updated weights for policy 1, policy_version 60300 (0.0008) +[2023-10-11 17:26:57,184][85175] Updated weights for policy 1, policy_version 60310 (0.0008) +[2023-10-11 17:26:57,559][85175] Updated weights for policy 1, policy_version 60320 (0.0008) +[2023-10-11 17:26:59,374][85176] Updated weights for policy 0, policy_version 59432 (0.0009) +[2023-10-11 17:26:59,739][85176] Updated weights for policy 0, policy_version 59442 (0.0008) +[2023-10-11 17:27:00,114][85176] Updated weights for policy 0, policy_version 59452 (0.0009) +[2023-10-11 17:27:01,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 122650624. Throughput: 0: 1666.9, 1: 1711.3. Samples: 30673922. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:27:01,064][84230] Avg episode reward: [(0, '37.840'), (1, '43.610')] +[2023-10-11 17:27:01,515][85175] Updated weights for policy 1, policy_version 60330 (0.0011) +[2023-10-11 17:27:01,881][85175] Updated weights for policy 1, policy_version 60340 (0.0009) +[2023-10-11 17:27:02,244][85175] Updated weights for policy 1, policy_version 60350 (0.0008) +[2023-10-11 17:27:04,187][85176] Updated weights for policy 0, policy_version 59462 (0.0008) +[2023-10-11 17:27:04,560][85176] Updated weights for policy 0, policy_version 59472 (0.0007) +[2023-10-11 17:27:04,922][85176] Updated weights for policy 0, policy_version 59482 (0.0008) +[2023-10-11 17:27:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 122716160. Throughput: 0: 1675.8, 1: 1702.5. Samples: 30684224. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:27:06,063][84230] Avg episode reward: [(0, '36.060'), (1, '42.480')] +[2023-10-11 17:27:06,166][85175] Updated weights for policy 1, policy_version 60360 (0.0008) +[2023-10-11 17:27:06,534][85175] Updated weights for policy 1, policy_version 60370 (0.0009) +[2023-10-11 17:27:06,912][85175] Updated weights for policy 1, policy_version 60380 (0.0008) +[2023-10-11 17:27:08,931][85176] Updated weights for policy 0, policy_version 59492 (0.0009) +[2023-10-11 17:27:09,300][85176] Updated weights for policy 0, policy_version 59502 (0.0009) +[2023-10-11 17:27:09,677][85176] Updated weights for policy 0, policy_version 59512 (0.0009) +[2023-10-11 17:27:10,965][85175] Updated weights for policy 1, policy_version 60390 (0.0009) +[2023-10-11 17:27:11,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 122781696. Throughput: 0: 1659.7, 1: 1710.9. Samples: 30704206. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:27:11,063][84230] Avg episode reward: [(0, '39.620'), (1, '44.490')] +[2023-10-11 17:27:11,330][85175] Updated weights for policy 1, policy_version 60400 (0.0010) +[2023-10-11 17:27:11,705][85175] Updated weights for policy 1, policy_version 60410 (0.0007) +[2023-10-11 17:27:13,631][85176] Updated weights for policy 0, policy_version 59522 (0.0009) +[2023-10-11 17:27:14,002][85176] Updated weights for policy 0, policy_version 59532 (0.0008) +[2023-10-11 17:27:14,368][85176] Updated weights for policy 0, policy_version 59542 (0.0010) +[2023-10-11 17:27:14,739][85176] Updated weights for policy 0, policy_version 59552 (0.0007) +[2023-10-11 17:27:15,662][85175] Updated weights for policy 1, policy_version 60420 (0.0007) +[2023-10-11 17:27:16,025][85175] Updated weights for policy 1, policy_version 60430 (0.0008) +[2023-10-11 17:27:16,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 122847232. Throughput: 0: 1675.7, 1: 1719.8. Samples: 30724914. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:27:16,063][84230] Avg episode reward: [(0, '40.310'), (1, '43.160')] +[2023-10-11 17:27:16,395][85175] Updated weights for policy 1, policy_version 60440 (0.0009) +[2023-10-11 17:27:18,692][85176] Updated weights for policy 0, policy_version 59562 (0.0007) +[2023-10-11 17:27:19,066][85176] Updated weights for policy 0, policy_version 59572 (0.0010) +[2023-10-11 17:27:19,436][85176] Updated weights for policy 0, policy_version 59582 (0.0012) +[2023-10-11 17:27:20,452][85175] Updated weights for policy 1, policy_version 60450 (0.0008) +[2023-10-11 17:27:20,818][85175] Updated weights for policy 1, policy_version 60460 (0.0008) +[2023-10-11 17:27:21,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 122912768. Throughput: 0: 1671.9, 1: 1721.3. Samples: 30735100. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:27:21,063][84230] Avg episode reward: [(0, '40.780'), (1, '43.560')] +[2023-10-11 17:27:21,179][85175] Updated weights for policy 1, policy_version 60470 (0.0009) +[2023-10-11 17:27:21,542][85175] Updated weights for policy 1, policy_version 60480 (0.0007) +[2023-10-11 17:27:23,409][85176] Updated weights for policy 0, policy_version 59592 (0.0011) +[2023-10-11 17:27:23,773][85176] Updated weights for policy 0, policy_version 59602 (0.0011) +[2023-10-11 17:27:24,152][85176] Updated weights for policy 0, policy_version 59612 (0.0010) +[2023-10-11 17:27:25,522][85175] Updated weights for policy 1, policy_version 60490 (0.0011) +[2023-10-11 17:27:25,889][85175] Updated weights for policy 1, policy_version 60500 (0.0009) +[2023-10-11 17:27:26,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 122978304. Throughput: 0: 1660.0, 1: 1714.8. Samples: 30754966. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:27:26,063][84230] Avg episode reward: [(0, '39.020'), (1, '43.740')] +[2023-10-11 17:27:26,263][85175] Updated weights for policy 1, policy_version 60510 (0.0010) +[2023-10-11 17:27:28,387][85176] Updated weights for policy 0, policy_version 59622 (0.0010) +[2023-10-11 17:27:28,758][85176] Updated weights for policy 0, policy_version 59632 (0.0007) +[2023-10-11 17:27:29,128][85176] Updated weights for policy 0, policy_version 59642 (0.0007) +[2023-10-11 17:27:30,460][85175] Updated weights for policy 1, policy_version 60520 (0.0009) +[2023-10-11 17:27:30,835][85175] Updated weights for policy 1, policy_version 60530 (0.0009) +[2023-10-11 17:27:31,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 123043840. Throughput: 0: 1685.1, 1: 1708.2. Samples: 30775460. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:27:31,063][84230] Avg episode reward: [(0, '35.620'), (1, '43.860')] +[2023-10-11 17:27:31,201][85175] Updated weights for policy 1, policy_version 60540 (0.0010) +[2023-10-11 17:27:33,233][85176] Updated weights for policy 0, policy_version 59652 (0.0008) +[2023-10-11 17:27:33,614][85176] Updated weights for policy 0, policy_version 59662 (0.0007) +[2023-10-11 17:27:33,987][85176] Updated weights for policy 0, policy_version 59672 (0.0010) +[2023-10-11 17:27:35,247][85175] Updated weights for policy 1, policy_version 60550 (0.0009) +[2023-10-11 17:27:35,624][85175] Updated weights for policy 1, policy_version 60560 (0.0009) +[2023-10-11 17:27:35,982][85175] Updated weights for policy 1, policy_version 60570 (0.0007) +[2023-10-11 17:27:36,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 123109376. Throughput: 0: 1676.5, 1: 1716.9. Samples: 30785728. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-11 17:27:36,063][84230] Avg episode reward: [(0, '35.920'), (1, '45.360')] +[2023-10-11 17:27:38,043][85176] Updated weights for policy 0, policy_version 59682 (0.0011) +[2023-10-11 17:27:38,418][85176] Updated weights for policy 0, policy_version 59692 (0.0010) +[2023-10-11 17:27:38,797][85176] Updated weights for policy 0, policy_version 59702 (0.0009) +[2023-10-11 17:27:39,164][85176] Updated weights for policy 0, policy_version 59712 (0.0009) +[2023-10-11 17:27:39,878][85175] Updated weights for policy 1, policy_version 60580 (0.0007) +[2023-10-11 17:27:40,238][85175] Updated weights for policy 1, policy_version 60590 (0.0008) +[2023-10-11 17:27:40,616][85175] Updated weights for policy 1, policy_version 60600 (0.0008) +[2023-10-11 17:27:41,063][84230] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 123207680. Throughput: 0: 1662.3, 1: 1712.2. Samples: 30805700. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-11 17:27:41,064][84230] Avg episode reward: [(0, '35.970'), (1, '44.750')] +[2023-10-11 17:27:43,258][85176] Updated weights for policy 0, policy_version 59722 (0.0010) +[2023-10-11 17:27:43,640][85176] Updated weights for policy 0, policy_version 59732 (0.0010) +[2023-10-11 17:27:44,008][85176] Updated weights for policy 0, policy_version 59742 (0.0007) +[2023-10-11 17:27:44,463][85175] Updated weights for policy 1, policy_version 60610 (0.0010) +[2023-10-11 17:27:44,825][85175] Updated weights for policy 1, policy_version 60620 (0.0010) +[2023-10-11 17:27:45,194][85175] Updated weights for policy 1, policy_version 60630 (0.0010) +[2023-10-11 17:27:45,559][85175] Updated weights for policy 1, policy_version 60640 (0.0008) +[2023-10-11 17:27:46,063][84230] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 123273216. Throughput: 0: 1687.7, 1: 1684.6. Samples: 30825674. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-11 17:27:46,064][84230] Avg episode reward: [(0, '39.160'), (1, '40.740')] +[2023-10-11 17:27:48,056][85176] Updated weights for policy 0, policy_version 59752 (0.0009) +[2023-10-11 17:27:48,437][85176] Updated weights for policy 0, policy_version 59762 (0.0009) +[2023-10-11 17:27:48,818][85176] Updated weights for policy 0, policy_version 59772 (0.0010) +[2023-10-11 17:27:49,558][85175] Updated weights for policy 1, policy_version 60650 (0.0007) +[2023-10-11 17:27:49,919][85175] Updated weights for policy 1, policy_version 60660 (0.0010) +[2023-10-11 17:27:50,286][85175] Updated weights for policy 1, policy_version 60670 (0.0009) +[2023-10-11 17:27:51,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 123338752. Throughput: 0: 1669.6, 1: 1715.0. Samples: 30836532. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-11 17:27:51,064][84230] Avg episode reward: [(0, '40.970'), (1, '42.440')] +[2023-10-11 17:27:52,890][85176] Updated weights for policy 0, policy_version 59782 (0.0011) +[2023-10-11 17:27:53,263][85176] Updated weights for policy 0, policy_version 59792 (0.0011) +[2023-10-11 17:27:53,648][85176] Updated weights for policy 0, policy_version 59802 (0.0010) +[2023-10-11 17:27:54,345][85175] Updated weights for policy 1, policy_version 60680 (0.0009) +[2023-10-11 17:27:54,717][85175] Updated weights for policy 1, policy_version 60690 (0.0008) +[2023-10-11 17:27:55,080][85175] Updated weights for policy 1, policy_version 60700 (0.0010) +[2023-10-11 17:27:56,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 123404288. Throughput: 0: 1675.4, 1: 1701.2. Samples: 30856150. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-11 17:27:56,063][84230] Avg episode reward: [(0, '39.780'), (1, '41.420')] +[2023-10-11 17:27:57,596][85176] Updated weights for policy 0, policy_version 59812 (0.0008) +[2023-10-11 17:27:57,956][85176] Updated weights for policy 0, policy_version 59822 (0.0010) +[2023-10-11 17:27:58,331][85176] Updated weights for policy 0, policy_version 59832 (0.0011) +[2023-10-11 17:27:59,163][85175] Updated weights for policy 1, policy_version 60710 (0.0011) +[2023-10-11 17:27:59,522][85175] Updated weights for policy 1, policy_version 60720 (0.0008) +[2023-10-11 17:27:59,894][85175] Updated weights for policy 1, policy_version 60730 (0.0011) +[2023-10-11 17:28:01,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 123469824. Throughput: 0: 1683.1, 1: 1676.8. Samples: 30876108. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-11 17:28:01,064][84230] Avg episode reward: [(0, '37.940'), (1, '47.460')] +[2023-10-11 17:28:02,386][85176] Updated weights for policy 0, policy_version 59842 (0.0009) +[2023-10-11 17:28:02,773][85176] Updated weights for policy 0, policy_version 59852 (0.0009) +[2023-10-11 17:28:03,142][85176] Updated weights for policy 0, policy_version 59862 (0.0008) +[2023-10-11 17:28:03,513][85176] Updated weights for policy 0, policy_version 59872 (0.0010) +[2023-10-11 17:28:03,986][85175] Updated weights for policy 1, policy_version 60740 (0.0009) +[2023-10-11 17:28:04,355][85175] Updated weights for policy 1, policy_version 60750 (0.0010) +[2023-10-11 17:28:04,728][85175] Updated weights for policy 1, policy_version 60760 (0.0010) +[2023-10-11 17:28:06,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 123535360. Throughput: 0: 1661.1, 1: 1702.2. Samples: 30886448. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-11 17:28:06,064][84230] Avg episode reward: [(0, '36.990'), (1, '43.260')] +[2023-10-11 17:28:07,598][85176] Updated weights for policy 0, policy_version 59882 (0.0008) +[2023-10-11 17:28:07,972][85176] Updated weights for policy 0, policy_version 59892 (0.0010) +[2023-10-11 17:28:08,346][85176] Updated weights for policy 0, policy_version 59902 (0.0007) +[2023-10-11 17:28:08,670][85175] Updated weights for policy 1, policy_version 60770 (0.0007) +[2023-10-11 17:28:09,051][85175] Updated weights for policy 1, policy_version 60780 (0.0009) +[2023-10-11 17:28:09,415][85175] Updated weights for policy 1, policy_version 60790 (0.0011) +[2023-10-11 17:28:09,785][85175] Updated weights for policy 1, policy_version 60800 (0.0009) +[2023-10-11 17:28:11,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 123600896. Throughput: 0: 1678.6, 1: 1685.3. Samples: 30906342. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) +[2023-10-11 17:28:11,063][84230] Avg episode reward: [(0, '37.650'), (1, '47.330')] +[2023-10-11 17:28:12,405][85176] Updated weights for policy 0, policy_version 59912 (0.0009) +[2023-10-11 17:28:12,782][85176] Updated weights for policy 0, policy_version 59922 (0.0010) +[2023-10-11 17:28:13,149][85176] Updated weights for policy 0, policy_version 59932 (0.0008) +[2023-10-11 17:28:14,042][85175] Updated weights for policy 1, policy_version 60810 (0.0008) +[2023-10-11 17:28:14,420][85175] Updated weights for policy 1, policy_version 60820 (0.0008) +[2023-10-11 17:28:14,787][85175] Updated weights for policy 1, policy_version 60830 (0.0009) +[2023-10-11 17:28:16,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 123666432. Throughput: 0: 1673.5, 1: 1684.7. Samples: 30926576. Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) +[2023-10-11 17:28:16,063][84230] Avg episode reward: [(0, '39.570'), (1, '41.590')] +[2023-10-11 17:28:16,072][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000059936_61374464.pth... +[2023-10-11 17:28:16,073][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000060832_62291968.pth... 
+[2023-10-11 17:28:16,108][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000058368_59768832.pth +[2023-10-11 17:28:16,117][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000059232_60653568.pth +[2023-10-11 17:28:17,096][85176] Updated weights for policy 0, policy_version 59942 (0.0009) +[2023-10-11 17:28:17,472][85176] Updated weights for policy 0, policy_version 59952 (0.0011) +[2023-10-11 17:28:17,840][85176] Updated weights for policy 0, policy_version 59962 (0.0011) +[2023-10-11 17:28:18,760][85175] Updated weights for policy 1, policy_version 60840 (0.0009) +[2023-10-11 17:28:19,128][85175] Updated weights for policy 1, policy_version 60850 (0.0010) +[2023-10-11 17:28:19,498][85175] Updated weights for policy 1, policy_version 60860 (0.0010) +[2023-10-11 17:28:21,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 123731968. Throughput: 0: 1657.2, 1: 1702.9. Samples: 30936936. Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) +[2023-10-11 17:28:21,063][84230] Avg episode reward: [(0, '36.750'), (1, '47.700')] +[2023-10-11 17:28:21,976][85176] Updated weights for policy 0, policy_version 59972 (0.0008) +[2023-10-11 17:28:22,351][85176] Updated weights for policy 0, policy_version 59982 (0.0008) +[2023-10-11 17:28:22,724][85176] Updated weights for policy 0, policy_version 59992 (0.0008) +[2023-10-11 17:28:23,364][85175] Updated weights for policy 1, policy_version 60870 (0.0010) +[2023-10-11 17:28:23,736][85175] Updated weights for policy 1, policy_version 60880 (0.0007) +[2023-10-11 17:28:24,094][85175] Updated weights for policy 1, policy_version 60890 (0.0008) +[2023-10-11 17:28:26,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 123797504. Throughput: 0: 1681.5, 1: 1672.4. Samples: 30956626. 
Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) +[2023-10-11 17:28:26,064][84230] Avg episode reward: [(0, '40.590'), (1, '43.160')] +[2023-10-11 17:28:27,030][85176] Updated weights for policy 0, policy_version 60002 (0.0007) +[2023-10-11 17:28:27,448][85176] Updated weights for policy 0, policy_version 60012 (0.0007) +[2023-10-11 17:28:27,816][85176] Updated weights for policy 0, policy_version 60022 (0.0007) +[2023-10-11 17:28:28,038][85175] Updated weights for policy 1, policy_version 60900 (0.0008) +[2023-10-11 17:28:28,188][85176] Updated weights for policy 0, policy_version 60032 (0.0007) +[2023-10-11 17:28:28,402][85175] Updated weights for policy 1, policy_version 60910 (0.0008) +[2023-10-11 17:28:28,773][85175] Updated weights for policy 1, policy_version 60920 (0.0010) +[2023-10-11 17:28:31,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 123863040. Throughput: 0: 1669.2, 1: 1700.0. Samples: 30977288. Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) +[2023-10-11 17:28:31,064][84230] Avg episode reward: [(0, '36.740'), (1, '48.780')] +[2023-10-11 17:28:32,405][85176] Updated weights for policy 0, policy_version 60042 (0.0009) +[2023-10-11 17:28:32,778][85176] Updated weights for policy 0, policy_version 60052 (0.0007) +[2023-10-11 17:28:32,786][85175] Updated weights for policy 1, policy_version 60930 (0.0008) +[2023-10-11 17:28:33,144][85176] Updated weights for policy 0, policy_version 60062 (0.0008) +[2023-10-11 17:28:33,158][85175] Updated weights for policy 1, policy_version 60940 (0.0008) +[2023-10-11 17:28:33,534][85175] Updated weights for policy 1, policy_version 60950 (0.0008) +[2023-10-11 17:28:33,895][85175] Updated weights for policy 1, policy_version 60960 (0.0010) +[2023-10-11 17:28:36,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 123928576. Throughput: 0: 1657.5, 1: 1682.4. Samples: 30986830. 
Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) +[2023-10-11 17:28:36,064][84230] Avg episode reward: [(0, '41.040'), (1, '42.560')] +[2023-10-11 17:28:37,155][85176] Updated weights for policy 0, policy_version 60072 (0.0007) +[2023-10-11 17:28:37,526][85176] Updated weights for policy 0, policy_version 60082 (0.0011) +[2023-10-11 17:28:37,896][85176] Updated weights for policy 0, policy_version 60092 (0.0007) +[2023-10-11 17:28:38,025][85175] Updated weights for policy 1, policy_version 60970 (0.0009) +[2023-10-11 17:28:38,396][85175] Updated weights for policy 1, policy_version 60980 (0.0007) +[2023-10-11 17:28:38,763][85175] Updated weights for policy 1, policy_version 60990 (0.0007) +[2023-10-11 17:28:41,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 123994112. Throughput: 0: 1674.5, 1: 1683.5. Samples: 31007258. Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) +[2023-10-11 17:28:41,063][84230] Avg episode reward: [(0, '37.610'), (1, '45.930')] +[2023-10-11 17:28:41,930][85176] Updated weights for policy 0, policy_version 60102 (0.0007) +[2023-10-11 17:28:42,300][85176] Updated weights for policy 0, policy_version 60112 (0.0007) +[2023-10-11 17:28:42,678][85176] Updated weights for policy 0, policy_version 60122 (0.0008) +[2023-10-11 17:28:42,707][85175] Updated weights for policy 1, policy_version 61000 (0.0007) +[2023-10-11 17:28:43,072][85175] Updated weights for policy 1, policy_version 61010 (0.0008) +[2023-10-11 17:28:43,442][85175] Updated weights for policy 1, policy_version 61020 (0.0009) +[2023-10-11 17:28:46,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 124059648. Throughput: 0: 1675.5, 1: 1707.2. Samples: 31028328. 
Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) +[2023-10-11 17:28:46,064][84230] Avg episode reward: [(0, '39.260'), (1, '41.830')] +[2023-10-11 17:28:46,864][85176] Updated weights for policy 0, policy_version 60132 (0.0008) +[2023-10-11 17:28:47,241][85176] Updated weights for policy 0, policy_version 60142 (0.0009) +[2023-10-11 17:28:47,510][85175] Updated weights for policy 1, policy_version 61030 (0.0009) +[2023-10-11 17:28:47,610][85176] Updated weights for policy 0, policy_version 60152 (0.0009) +[2023-10-11 17:28:47,878][85175] Updated weights for policy 1, policy_version 61040 (0.0009) +[2023-10-11 17:28:48,255][85175] Updated weights for policy 1, policy_version 61050 (0.0010) +[2023-10-11 17:28:51,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 124125184. Throughput: 0: 1673.9, 1: 1681.2. Samples: 31037424. Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) +[2023-10-11 17:28:51,064][84230] Avg episode reward: [(0, '38.410'), (1, '46.420')] +[2023-10-11 17:28:51,703][85176] Updated weights for policy 0, policy_version 60162 (0.0010) +[2023-10-11 17:28:52,090][85176] Updated weights for policy 0, policy_version 60172 (0.0009) +[2023-10-11 17:28:52,189][85175] Updated weights for policy 1, policy_version 61060 (0.0007) +[2023-10-11 17:28:52,466][85176] Updated weights for policy 0, policy_version 60182 (0.0009) +[2023-10-11 17:28:52,563][85175] Updated weights for policy 1, policy_version 61070 (0.0007) +[2023-10-11 17:28:52,830][85176] Updated weights for policy 0, policy_version 60192 (0.0009) +[2023-10-11 17:28:52,932][85175] Updated weights for policy 1, policy_version 61080 (0.0007) +[2023-10-11 17:28:56,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 124190720. Throughput: 0: 1672.6, 1: 1700.0. Samples: 31058112. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:28:56,063][84230] Avg episode reward: [(0, '38.030'), (1, '40.800')] +[2023-10-11 17:28:56,964][85176] Updated weights for policy 0, policy_version 60202 (0.0008) +[2023-10-11 17:28:57,012][85175] Updated weights for policy 1, policy_version 61090 (0.0007) +[2023-10-11 17:28:57,324][85176] Updated weights for policy 0, policy_version 60212 (0.0008) +[2023-10-11 17:28:57,381][85175] Updated weights for policy 1, policy_version 61100 (0.0008) +[2023-10-11 17:28:57,705][85176] Updated weights for policy 0, policy_version 60222 (0.0009) +[2023-10-11 17:28:57,744][85175] Updated weights for policy 1, policy_version 61110 (0.0009) +[2023-10-11 17:28:58,112][85175] Updated weights for policy 1, policy_version 61120 (0.0009) +[2023-10-11 17:29:01,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 124256256. Throughput: 0: 1672.3, 1: 1707.8. Samples: 31078678. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:29:01,064][84230] Avg episode reward: [(0, '40.640'), (1, '44.730')] +[2023-10-11 17:29:01,728][85176] Updated weights for policy 0, policy_version 60232 (0.0007) +[2023-10-11 17:29:02,104][85176] Updated weights for policy 0, policy_version 60242 (0.0007) +[2023-10-11 17:29:02,104][85175] Updated weights for policy 1, policy_version 61130 (0.0008) +[2023-10-11 17:29:02,464][85176] Updated weights for policy 0, policy_version 60252 (0.0009) +[2023-10-11 17:29:02,475][85175] Updated weights for policy 1, policy_version 61140 (0.0010) +[2023-10-11 17:29:02,842][85175] Updated weights for policy 1, policy_version 61150 (0.0009) +[2023-10-11 17:29:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 124321792. Throughput: 0: 1673.1, 1: 1680.0. Samples: 31087828. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:29:06,063][84230] Avg episode reward: [(0, '38.400'), (1, '41.010')] +[2023-10-11 17:29:06,642][85176] Updated weights for policy 0, policy_version 60262 (0.0009) +[2023-10-11 17:29:06,704][85175] Updated weights for policy 1, policy_version 61160 (0.0009) +[2023-10-11 17:29:07,018][85176] Updated weights for policy 0, policy_version 60272 (0.0010) +[2023-10-11 17:29:07,079][85175] Updated weights for policy 1, policy_version 61170 (0.0009) +[2023-10-11 17:29:07,394][85176] Updated weights for policy 0, policy_version 60282 (0.0010) +[2023-10-11 17:29:07,451][85175] Updated weights for policy 1, policy_version 61180 (0.0007) +[2023-10-11 17:29:11,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 124387328. Throughput: 0: 1666.5, 1: 1708.8. Samples: 31108512. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:29:11,064][84230] Avg episode reward: [(0, '40.640'), (1, '44.850')] +[2023-10-11 17:29:11,550][85176] Updated weights for policy 0, policy_version 60292 (0.0007) +[2023-10-11 17:29:11,564][85175] Updated weights for policy 1, policy_version 61190 (0.0008) +[2023-10-11 17:29:11,932][85175] Updated weights for policy 1, policy_version 61200 (0.0008) +[2023-10-11 17:29:11,934][85176] Updated weights for policy 0, policy_version 60302 (0.0007) +[2023-10-11 17:29:12,303][85175] Updated weights for policy 1, policy_version 61210 (0.0008) +[2023-10-11 17:29:12,314][85176] Updated weights for policy 0, policy_version 60312 (0.0010) +[2023-10-11 17:29:16,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 124452864. Throughput: 0: 1670.7, 1: 1703.6. Samples: 31129130. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:29:16,063][84230] Avg episode reward: [(0, '34.100'), (1, '41.750')] +[2023-10-11 17:29:16,333][85176] Updated weights for policy 0, policy_version 60322 (0.0010) +[2023-10-11 17:29:16,398][85175] Updated weights for policy 1, policy_version 61220 (0.0007) +[2023-10-11 17:29:16,697][85176] Updated weights for policy 0, policy_version 60332 (0.0009) +[2023-10-11 17:29:16,764][85175] Updated weights for policy 1, policy_version 61230 (0.0008) +[2023-10-11 17:29:17,077][85176] Updated weights for policy 0, policy_version 60342 (0.0009) +[2023-10-11 17:29:17,139][85175] Updated weights for policy 1, policy_version 61240 (0.0007) +[2023-10-11 17:29:17,446][85176] Updated weights for policy 0, policy_version 60352 (0.0010) +[2023-10-11 17:29:21,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 124518400. Throughput: 0: 1672.1, 1: 1692.2. Samples: 31138224. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:29:21,063][84230] Avg episode reward: [(0, '39.470'), (1, '44.750')] +[2023-10-11 17:29:21,084][85175] Updated weights for policy 1, policy_version 61250 (0.0008) +[2023-10-11 17:29:21,460][85175] Updated weights for policy 1, policy_version 61260 (0.0008) +[2023-10-11 17:29:21,588][85176] Updated weights for policy 0, policy_version 60362 (0.0008) +[2023-10-11 17:29:21,835][85175] Updated weights for policy 1, policy_version 61270 (0.0008) +[2023-10-11 17:29:21,962][85176] Updated weights for policy 0, policy_version 60372 (0.0008) +[2023-10-11 17:29:22,212][85175] Updated weights for policy 1, policy_version 61280 (0.0008) +[2023-10-11 17:29:22,339][85176] Updated weights for policy 0, policy_version 60382 (0.0009) +[2023-10-11 17:29:26,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 124583936. Throughput: 0: 1662.7, 1: 1707.1. Samples: 31158896. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:29:26,063][84230] Avg episode reward: [(0, '35.480'), (1, '41.240')] +[2023-10-11 17:29:26,083][85175] Updated weights for policy 1, policy_version 61290 (0.0009) +[2023-10-11 17:29:26,398][85176] Updated weights for policy 0, policy_version 60392 (0.0009) +[2023-10-11 17:29:26,457][85175] Updated weights for policy 1, policy_version 61300 (0.0008) +[2023-10-11 17:29:26,765][85176] Updated weights for policy 0, policy_version 60402 (0.0009) +[2023-10-11 17:29:26,815][85175] Updated weights for policy 1, policy_version 61310 (0.0008) +[2023-10-11 17:29:27,131][85176] Updated weights for policy 0, policy_version 60412 (0.0009) +[2023-10-11 17:29:30,771][85175] Updated weights for policy 1, policy_version 61320 (0.0008) +[2023-10-11 17:29:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 124649472. Throughput: 0: 1662.6, 1: 1705.7. Samples: 31179904. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:29:31,063][84230] Avg episode reward: [(0, '42.520'), (1, '45.740')] +[2023-10-11 17:29:31,103][85176] Updated weights for policy 0, policy_version 60422 (0.0008) +[2023-10-11 17:29:31,140][85175] Updated weights for policy 1, policy_version 61330 (0.0009) +[2023-10-11 17:29:31,477][85176] Updated weights for policy 0, policy_version 60432 (0.0009) +[2023-10-11 17:29:31,500][85175] Updated weights for policy 1, policy_version 61340 (0.0009) +[2023-10-11 17:29:31,847][85176] Updated weights for policy 0, policy_version 60442 (0.0007) +[2023-10-11 17:29:35,670][85175] Updated weights for policy 1, policy_version 61350 (0.0009) +[2023-10-11 17:29:35,872][85176] Updated weights for policy 0, policy_version 60452 (0.0008) +[2023-10-11 17:29:36,031][85175] Updated weights for policy 1, policy_version 61360 (0.0007) +[2023-10-11 17:29:36,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 124715008. 
Throughput: 0: 1665.3, 1: 1705.3. Samples: 31189098. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:29:36,063][84230] Avg episode reward: [(0, '39.500'), (1, '41.750')] +[2023-10-11 17:29:36,245][85176] Updated weights for policy 0, policy_version 60462 (0.0008) +[2023-10-11 17:29:36,394][85175] Updated weights for policy 1, policy_version 61370 (0.0008) +[2023-10-11 17:29:36,613][85176] Updated weights for policy 0, policy_version 60472 (0.0009) +[2023-10-11 17:29:40,413][85175] Updated weights for policy 1, policy_version 61380 (0.0009) +[2023-10-11 17:29:40,754][85176] Updated weights for policy 0, policy_version 60482 (0.0009) +[2023-10-11 17:29:40,774][85175] Updated weights for policy 1, policy_version 61390 (0.0009) +[2023-10-11 17:29:41,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 124780544. Throughput: 0: 1667.5, 1: 1704.1. Samples: 31209834. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:29:41,063][84230] Avg episode reward: [(0, '43.360'), (1, '45.270')] +[2023-10-11 17:29:41,117][85176] Updated weights for policy 0, policy_version 60492 (0.0008) +[2023-10-11 17:29:41,139][85175] Updated weights for policy 1, policy_version 61400 (0.0009) +[2023-10-11 17:29:41,493][85176] Updated weights for policy 0, policy_version 60502 (0.0007) +[2023-10-11 17:29:41,869][84801] Saving new best policy, reward=43.360! +[2023-10-11 17:29:41,871][85176] Updated weights for policy 0, policy_version 60512 (0.0010) +[2023-10-11 17:29:45,054][85175] Updated weights for policy 1, policy_version 61410 (0.0009) +[2023-10-11 17:29:45,411][85175] Updated weights for policy 1, policy_version 61420 (0.0010) +[2023-10-11 17:29:45,774][85175] Updated weights for policy 1, policy_version 61430 (0.0009) +[2023-10-11 17:29:45,944][85176] Updated weights for policy 0, policy_version 60522 (0.0007) +[2023-10-11 17:29:46,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). 
Total num frames: 124846080. Throughput: 0: 1667.9, 1: 1697.6. Samples: 31230124. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:29:46,064][84230] Avg episode reward: [(0, '36.600'), (1, '43.380')] +[2023-10-11 17:29:46,147][85175] Updated weights for policy 1, policy_version 61440 (0.0008) +[2023-10-11 17:29:46,316][85176] Updated weights for policy 0, policy_version 60532 (0.0009) +[2023-10-11 17:29:46,686][85176] Updated weights for policy 0, policy_version 60542 (0.0007) +[2023-10-11 17:29:50,389][85175] Updated weights for policy 1, policy_version 61450 (0.0009) +[2023-10-11 17:29:50,744][85176] Updated weights for policy 0, policy_version 60552 (0.0007) +[2023-10-11 17:29:50,756][85175] Updated weights for policy 1, policy_version 61460 (0.0007) +[2023-10-11 17:29:51,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 124911616. Throughput: 0: 1668.5, 1: 1707.3. Samples: 31239740. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:29:51,063][84230] Avg episode reward: [(0, '39.710'), (1, '45.080')] +[2023-10-11 17:29:51,118][85176] Updated weights for policy 0, policy_version 60562 (0.0008) +[2023-10-11 17:29:51,129][85175] Updated weights for policy 1, policy_version 61470 (0.0008) +[2023-10-11 17:29:51,503][85176] Updated weights for policy 0, policy_version 60572 (0.0009) +[2023-10-11 17:29:55,247][85175] Updated weights for policy 1, policy_version 61480 (0.0008) +[2023-10-11 17:29:55,558][85176] Updated weights for policy 0, policy_version 60582 (0.0009) +[2023-10-11 17:29:55,622][85175] Updated weights for policy 1, policy_version 61490 (0.0007) +[2023-10-11 17:29:55,942][85176] Updated weights for policy 0, policy_version 60592 (0.0008) +[2023-10-11 17:29:55,992][85175] Updated weights for policy 1, policy_version 61500 (0.0009) +[2023-10-11 17:29:56,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 124977152. 
Throughput: 0: 1669.3, 1: 1698.4. Samples: 31260058. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:29:56,063][84230] Avg episode reward: [(0, '39.120'), (1, '42.890')] +[2023-10-11 17:29:56,313][85176] Updated weights for policy 0, policy_version 60602 (0.0007) +[2023-10-11 17:29:59,934][85175] Updated weights for policy 1, policy_version 61510 (0.0009) +[2023-10-11 17:30:00,299][85175] Updated weights for policy 1, policy_version 61520 (0.0008) +[2023-10-11 17:30:00,366][85176] Updated weights for policy 0, policy_version 60612 (0.0008) +[2023-10-11 17:30:00,658][85175] Updated weights for policy 1, policy_version 61530 (0.0008) +[2023-10-11 17:30:00,746][85176] Updated weights for policy 0, policy_version 60622 (0.0007) +[2023-10-11 17:30:01,062][84230] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 125075456. Throughput: 0: 1666.4, 1: 1683.2. Samples: 31279860. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:30:01,063][84230] Avg episode reward: [(0, '44.270'), (1, '42.770')] +[2023-10-11 17:30:01,117][85176] Updated weights for policy 0, policy_version 60632 (0.0007) +[2023-10-11 17:30:01,412][84801] Saving new best policy, reward=44.270! +[2023-10-11 17:30:04,796][85175] Updated weights for policy 1, policy_version 61540 (0.0008) +[2023-10-11 17:30:05,152][85175] Updated weights for policy 1, policy_version 61550 (0.0009) +[2023-10-11 17:30:05,170][85176] Updated weights for policy 0, policy_version 60642 (0.0008) +[2023-10-11 17:30:05,515][85175] Updated weights for policy 1, policy_version 61560 (0.0009) +[2023-10-11 17:30:05,535][85176] Updated weights for policy 0, policy_version 60652 (0.0007) +[2023-10-11 17:30:05,914][85176] Updated weights for policy 0, policy_version 60662 (0.0008) +[2023-10-11 17:30:06,062][84230] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 125140992. Throughput: 0: 1675.2, 1: 1703.6. Samples: 31290272. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:30:06,063][84230] Avg episode reward: [(0, '41.590'), (1, '40.420')] +[2023-10-11 17:30:06,287][85176] Updated weights for policy 0, policy_version 60672 (0.0010) +[2023-10-11 17:30:09,365][85175] Updated weights for policy 1, policy_version 61570 (0.0009) +[2023-10-11 17:30:09,727][85175] Updated weights for policy 1, policy_version 61580 (0.0007) +[2023-10-11 17:30:10,091][85175] Updated weights for policy 1, policy_version 61590 (0.0008) +[2023-10-11 17:30:10,432][85176] Updated weights for policy 0, policy_version 60682 (0.0007) +[2023-10-11 17:30:10,464][85175] Updated weights for policy 1, policy_version 61600 (0.0009) +[2023-10-11 17:30:10,807][85176] Updated weights for policy 0, policy_version 60692 (0.0009) +[2023-10-11 17:30:11,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 125206528. Throughput: 0: 1681.9, 1: 1691.6. Samples: 31310704. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:30:11,063][84230] Avg episode reward: [(0, '41.860'), (1, '43.820')] +[2023-10-11 17:30:11,172][85176] Updated weights for policy 0, policy_version 60702 (0.0008) +[2023-10-11 17:30:14,542][85175] Updated weights for policy 1, policy_version 61610 (0.0010) +[2023-10-11 17:30:14,908][85175] Updated weights for policy 1, policy_version 61620 (0.0010) +[2023-10-11 17:30:15,271][85175] Updated weights for policy 1, policy_version 61630 (0.0007) +[2023-10-11 17:30:15,373][85176] Updated weights for policy 0, policy_version 60712 (0.0008) +[2023-10-11 17:30:15,745][85176] Updated weights for policy 0, policy_version 60722 (0.0010) +[2023-10-11 17:30:16,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 125272064. Throughput: 0: 1666.0, 1: 1664.0. Samples: 31329756. 
Policy #0 lag: (min: 3.0, avg: 7.6, max: 35.0) +[2023-10-11 17:30:16,063][84230] Avg episode reward: [(0, '37.430'), (1, '44.350')] +[2023-10-11 17:30:16,073][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000061632_63111168.pth... +[2023-10-11 17:30:16,112][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000060032_61472768.pth +[2023-10-11 17:30:16,113][85176] Updated weights for policy 0, policy_version 60732 (0.0010) +[2023-10-11 17:30:16,262][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000060736_62193664.pth... +[2023-10-11 17:30:16,301][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000059168_60588032.pth +[2023-10-11 17:30:19,368][85175] Updated weights for policy 1, policy_version 61640 (0.0009) +[2023-10-11 17:30:19,731][85175] Updated weights for policy 1, policy_version 61650 (0.0007) +[2023-10-11 17:30:20,113][85175] Updated weights for policy 1, policy_version 61660 (0.0008) +[2023-10-11 17:30:20,259][85176] Updated weights for policy 0, policy_version 60742 (0.0008) +[2023-10-11 17:30:20,640][85176] Updated weights for policy 0, policy_version 60752 (0.0008) +[2023-10-11 17:30:21,010][85176] Updated weights for policy 0, policy_version 60762 (0.0008) +[2023-10-11 17:30:21,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 125337600. Throughput: 0: 1674.6, 1: 1690.7. Samples: 31340536. 
Policy #0 lag: (min: 3.0, avg: 7.6, max: 35.0) +[2023-10-11 17:30:21,063][84230] Avg episode reward: [(0, '43.400'), (1, '46.410')] +[2023-10-11 17:30:24,123][85175] Updated weights for policy 1, policy_version 61670 (0.0008) +[2023-10-11 17:30:24,491][85175] Updated weights for policy 1, policy_version 61680 (0.0008) +[2023-10-11 17:30:24,863][85175] Updated weights for policy 1, policy_version 61690 (0.0008) +[2023-10-11 17:30:25,024][85176] Updated weights for policy 0, policy_version 60772 (0.0007) +[2023-10-11 17:30:25,400][85176] Updated weights for policy 0, policy_version 60782 (0.0007) +[2023-10-11 17:30:25,764][85176] Updated weights for policy 0, policy_version 60792 (0.0007) +[2023-10-11 17:30:26,062][84230] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 125435904. Throughput: 0: 1675.9, 1: 1679.0. Samples: 31360806. Policy #0 lag: (min: 3.0, avg: 7.6, max: 35.0) +[2023-10-11 17:30:26,063][84230] Avg episode reward: [(0, '41.020'), (1, '44.150')] +[2023-10-11 17:30:28,754][85175] Updated weights for policy 1, policy_version 61700 (0.0009) +[2023-10-11 17:30:29,118][85175] Updated weights for policy 1, policy_version 61710 (0.0009) +[2023-10-11 17:30:29,482][85175] Updated weights for policy 1, policy_version 61720 (0.0010) +[2023-10-11 17:30:29,798][85176] Updated weights for policy 0, policy_version 60802 (0.0008) +[2023-10-11 17:30:30,162][85176] Updated weights for policy 0, policy_version 60812 (0.0007) +[2023-10-11 17:30:30,544][85176] Updated weights for policy 0, policy_version 60822 (0.0007) +[2023-10-11 17:30:30,915][85176] Updated weights for policy 0, policy_version 60832 (0.0008) +[2023-10-11 17:30:31,062][84230] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 125501440. Throughput: 0: 1659.2, 1: 1680.8. Samples: 31380424. 
Policy #0 lag: (min: 3.0, avg: 7.6, max: 35.0) +[2023-10-11 17:30:31,063][84230] Avg episode reward: [(0, '43.870'), (1, '44.870')] +[2023-10-11 17:30:33,501][85175] Updated weights for policy 1, policy_version 61730 (0.0010) +[2023-10-11 17:30:33,880][85175] Updated weights for policy 1, policy_version 61740 (0.0009) +[2023-10-11 17:30:34,242][85175] Updated weights for policy 1, policy_version 61750 (0.0009) +[2023-10-11 17:30:34,603][85175] Updated weights for policy 1, policy_version 61760 (0.0010) +[2023-10-11 17:30:34,952][85176] Updated weights for policy 0, policy_version 60842 (0.0010) +[2023-10-11 17:30:35,320][85176] Updated weights for policy 0, policy_version 60852 (0.0008) +[2023-10-11 17:30:35,687][85176] Updated weights for policy 0, policy_version 60862 (0.0007) +[2023-10-11 17:30:36,063][84230] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 125566976. Throughput: 0: 1680.6, 1: 1701.4. Samples: 31391930. Policy #0 lag: (min: 3.0, avg: 7.6, max: 35.0) +[2023-10-11 17:30:36,063][84230] Avg episode reward: [(0, '39.780'), (1, '43.120')] +[2023-10-11 17:30:38,664][85175] Updated weights for policy 1, policy_version 61770 (0.0008) +[2023-10-11 17:30:39,029][85175] Updated weights for policy 1, policy_version 61780 (0.0007) +[2023-10-11 17:30:39,405][85175] Updated weights for policy 1, policy_version 61790 (0.0010) +[2023-10-11 17:30:39,767][85176] Updated weights for policy 0, policy_version 60872 (0.0008) +[2023-10-11 17:30:40,133][85176] Updated weights for policy 0, policy_version 60882 (0.0008) +[2023-10-11 17:30:40,505][85176] Updated weights for policy 0, policy_version 60892 (0.0010) +[2023-10-11 17:30:41,063][84230] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 125632512. Throughput: 0: 1681.1, 1: 1682.7. Samples: 31411428. 
Policy #0 lag: (min: 3.0, avg: 7.6, max: 35.0) +[2023-10-11 17:30:41,063][84230] Avg episode reward: [(0, '42.220'), (1, '44.330')] +[2023-10-11 17:30:43,553][85175] Updated weights for policy 1, policy_version 61800 (0.0007) +[2023-10-11 17:30:43,913][85175] Updated weights for policy 1, policy_version 61810 (0.0007) +[2023-10-11 17:30:44,280][85175] Updated weights for policy 1, policy_version 61820 (0.0010) +[2023-10-11 17:30:44,582][85176] Updated weights for policy 0, policy_version 60902 (0.0010) +[2023-10-11 17:30:44,958][85176] Updated weights for policy 0, policy_version 60912 (0.0008) +[2023-10-11 17:30:45,332][85176] Updated weights for policy 0, policy_version 60922 (0.0009) +[2023-10-11 17:30:46,063][84230] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 125698048. Throughput: 0: 1663.7, 1: 1699.5. Samples: 31431204. Policy #0 lag: (min: 3.0, avg: 7.6, max: 35.0) +[2023-10-11 17:30:46,064][84230] Avg episode reward: [(0, '38.540'), (1, '43.010')] +[2023-10-11 17:30:48,173][85175] Updated weights for policy 1, policy_version 61830 (0.0009) +[2023-10-11 17:30:48,542][85175] Updated weights for policy 1, policy_version 61840 (0.0009) +[2023-10-11 17:30:48,915][85175] Updated weights for policy 1, policy_version 61850 (0.0009) +[2023-10-11 17:30:49,360][85176] Updated weights for policy 0, policy_version 60932 (0.0009) +[2023-10-11 17:30:49,744][85176] Updated weights for policy 0, policy_version 60942 (0.0008) +[2023-10-11 17:30:50,120][85176] Updated weights for policy 0, policy_version 60952 (0.0008) +[2023-10-11 17:30:51,062][84230] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 125763584. Throughput: 0: 1684.8, 1: 1694.4. Samples: 31442338. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:30:51,063][84230] Avg episode reward: [(0, '41.480'), (1, '44.960')] +[2023-10-11 17:30:53,110][85175] Updated weights for policy 1, policy_version 61860 (0.0009) +[2023-10-11 17:30:53,475][85175] Updated weights for policy 1, policy_version 61870 (0.0007) +[2023-10-11 17:30:53,841][85175] Updated weights for policy 1, policy_version 61880 (0.0008) +[2023-10-11 17:30:54,121][85176] Updated weights for policy 0, policy_version 60962 (0.0008) +[2023-10-11 17:30:54,497][85176] Updated weights for policy 0, policy_version 60972 (0.0007) +[2023-10-11 17:30:54,869][85176] Updated weights for policy 0, policy_version 60982 (0.0008) +[2023-10-11 17:30:55,229][85176] Updated weights for policy 0, policy_version 60992 (0.0008) +[2023-10-11 17:30:56,063][84230] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 125829120. Throughput: 0: 1671.8, 1: 1686.8. Samples: 31461840. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:30:56,063][84230] Avg episode reward: [(0, '38.180'), (1, '42.450')] +[2023-10-11 17:30:57,772][85175] Updated weights for policy 1, policy_version 61890 (0.0008) +[2023-10-11 17:30:58,135][85175] Updated weights for policy 1, policy_version 61900 (0.0007) +[2023-10-11 17:30:58,514][85175] Updated weights for policy 1, policy_version 61910 (0.0009) +[2023-10-11 17:30:58,874][85175] Updated weights for policy 1, policy_version 61920 (0.0008) +[2023-10-11 17:30:59,263][85176] Updated weights for policy 0, policy_version 61002 (0.0007) +[2023-10-11 17:30:59,635][85176] Updated weights for policy 0, policy_version 61012 (0.0008) +[2023-10-11 17:31:00,015][85176] Updated weights for policy 0, policy_version 61022 (0.0010) +[2023-10-11 17:31:01,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 125894656. Throughput: 0: 1669.7, 1: 1712.9. Samples: 31481976. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:31:01,063][84230] Avg episode reward: [(0, '43.420'), (1, '43.690')] +[2023-10-11 17:31:02,882][85175] Updated weights for policy 1, policy_version 61930 (0.0007) +[2023-10-11 17:31:03,248][85175] Updated weights for policy 1, policy_version 61940 (0.0009) +[2023-10-11 17:31:03,622][85175] Updated weights for policy 1, policy_version 61950 (0.0009) +[2023-10-11 17:31:04,046][85176] Updated weights for policy 0, policy_version 61032 (0.0009) +[2023-10-11 17:31:04,421][85176] Updated weights for policy 0, policy_version 61042 (0.0009) +[2023-10-11 17:31:04,799][85176] Updated weights for policy 0, policy_version 61052 (0.0007) +[2023-10-11 17:31:06,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 125960192. Throughput: 0: 1688.7, 1: 1690.4. Samples: 31492596. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:31:06,064][84230] Avg episode reward: [(0, '39.000'), (1, '39.290')] +[2023-10-11 17:31:07,716][85175] Updated weights for policy 1, policy_version 61960 (0.0008) +[2023-10-11 17:31:08,087][85175] Updated weights for policy 1, policy_version 61970 (0.0008) +[2023-10-11 17:31:08,454][85175] Updated weights for policy 1, policy_version 61980 (0.0007) +[2023-10-11 17:31:09,016][85176] Updated weights for policy 0, policy_version 61062 (0.0008) +[2023-10-11 17:31:09,386][85176] Updated weights for policy 0, policy_version 61072 (0.0008) +[2023-10-11 17:31:09,764][85176] Updated weights for policy 0, policy_version 61082 (0.0009) +[2023-10-11 17:31:11,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 126025728. Throughput: 0: 1672.7, 1: 1695.2. Samples: 31512360. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:31:11,063][84230] Avg episode reward: [(0, '41.510'), (1, '42.720')] +[2023-10-11 17:31:12,473][85175] Updated weights for policy 1, policy_version 61990 (0.0008) +[2023-10-11 17:31:12,841][85175] Updated weights for policy 1, policy_version 62000 (0.0009) +[2023-10-11 17:31:13,204][85175] Updated weights for policy 1, policy_version 62010 (0.0007) +[2023-10-11 17:31:13,892][85176] Updated weights for policy 0, policy_version 61092 (0.0010) +[2023-10-11 17:31:14,265][85176] Updated weights for policy 0, policy_version 61102 (0.0008) +[2023-10-11 17:31:14,640][85176] Updated weights for policy 0, policy_version 61112 (0.0007) +[2023-10-11 17:31:16,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 126091264. Throughput: 0: 1677.6, 1: 1709.3. Samples: 31532836. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:31:16,064][84230] Avg episode reward: [(0, '36.950'), (1, '40.350')] +[2023-10-11 17:31:17,149][85175] Updated weights for policy 1, policy_version 62020 (0.0009) +[2023-10-11 17:31:17,519][85175] Updated weights for policy 1, policy_version 62030 (0.0009) +[2023-10-11 17:31:17,880][85175] Updated weights for policy 1, policy_version 62040 (0.0007) +[2023-10-11 17:31:18,608][85176] Updated weights for policy 0, policy_version 61122 (0.0007) +[2023-10-11 17:31:18,979][85176] Updated weights for policy 0, policy_version 61132 (0.0008) +[2023-10-11 17:31:19,350][85176] Updated weights for policy 0, policy_version 61142 (0.0009) +[2023-10-11 17:31:19,720][85176] Updated weights for policy 0, policy_version 61152 (0.0010) +[2023-10-11 17:31:21,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 126156800. Throughput: 0: 1683.3, 1: 1680.5. Samples: 31543302. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:31:21,063][84230] Avg episode reward: [(0, '39.230'), (1, '45.850')] +[2023-10-11 17:31:21,819][85175] Updated weights for policy 1, policy_version 62050 (0.0008) +[2023-10-11 17:31:22,187][85175] Updated weights for policy 1, policy_version 62060 (0.0009) +[2023-10-11 17:31:22,554][85175] Updated weights for policy 1, policy_version 62070 (0.0010) +[2023-10-11 17:31:22,925][85175] Updated weights for policy 1, policy_version 62080 (0.0007) +[2023-10-11 17:31:23,773][85176] Updated weights for policy 0, policy_version 61162 (0.0007) +[2023-10-11 17:31:24,142][85176] Updated weights for policy 0, policy_version 61172 (0.0009) +[2023-10-11 17:31:24,522][85176] Updated weights for policy 0, policy_version 61182 (0.0008) +[2023-10-11 17:31:26,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 126222336. Throughput: 0: 1657.2, 1: 1711.4. Samples: 31563012. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:31:26,064][84230] Avg episode reward: [(0, '39.570'), (1, '39.610')] +[2023-10-11 17:31:26,928][85175] Updated weights for policy 1, policy_version 62090 (0.0009) +[2023-10-11 17:31:27,297][85175] Updated weights for policy 1, policy_version 62100 (0.0008) +[2023-10-11 17:31:27,664][85175] Updated weights for policy 1, policy_version 62110 (0.0008) +[2023-10-11 17:31:28,457][85176] Updated weights for policy 0, policy_version 61192 (0.0010) +[2023-10-11 17:31:28,834][85176] Updated weights for policy 0, policy_version 61202 (0.0008) +[2023-10-11 17:31:29,206][85176] Updated weights for policy 0, policy_version 61212 (0.0008) +[2023-10-11 17:31:31,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 126287872. Throughput: 0: 1680.8, 1: 1705.1. Samples: 31583566. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 17:31:31,063][84230] Avg episode reward: [(0, '43.370'), (1, '45.630')] +[2023-10-11 17:31:31,822][85175] Updated weights for policy 1, policy_version 62120 (0.0008) +[2023-10-11 17:31:32,199][85175] Updated weights for policy 1, policy_version 62130 (0.0010) +[2023-10-11 17:31:32,567][85175] Updated weights for policy 1, policy_version 62140 (0.0011) +[2023-10-11 17:31:33,268][85176] Updated weights for policy 0, policy_version 61222 (0.0009) +[2023-10-11 17:31:33,631][85176] Updated weights for policy 0, policy_version 61232 (0.0008) +[2023-10-11 17:31:33,998][85176] Updated weights for policy 0, policy_version 61242 (0.0008) +[2023-10-11 17:31:36,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 126353408. Throughput: 0: 1667.6, 1: 1690.8. Samples: 31593468. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 17:31:36,063][84230] Avg episode reward: [(0, '40.300'), (1, '42.160')] +[2023-10-11 17:31:36,554][85175] Updated weights for policy 1, policy_version 62150 (0.0008) +[2023-10-11 17:31:36,918][85175] Updated weights for policy 1, policy_version 62160 (0.0007) +[2023-10-11 17:31:37,279][85175] Updated weights for policy 1, policy_version 62170 (0.0007) +[2023-10-11 17:31:38,188][85176] Updated weights for policy 0, policy_version 61252 (0.0008) +[2023-10-11 17:31:38,577][85176] Updated weights for policy 0, policy_version 61262 (0.0009) +[2023-10-11 17:31:38,948][85176] Updated weights for policy 0, policy_version 61272 (0.0010) +[2023-10-11 17:31:41,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 126418944. Throughput: 0: 1664.0, 1: 1707.1. Samples: 31613536. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 17:31:41,063][84230] Avg episode reward: [(0, '40.940'), (1, '45.360')] +[2023-10-11 17:31:41,091][85175] Updated weights for policy 1, policy_version 62180 (0.0009) +[2023-10-11 17:31:41,467][85175] Updated weights for policy 1, policy_version 62190 (0.0008) +[2023-10-11 17:31:41,838][85175] Updated weights for policy 1, policy_version 62200 (0.0008) +[2023-10-11 17:31:43,049][85176] Updated weights for policy 0, policy_version 61282 (0.0008) +[2023-10-11 17:31:43,424][85176] Updated weights for policy 0, policy_version 61292 (0.0010) +[2023-10-11 17:31:43,799][85176] Updated weights for policy 0, policy_version 61302 (0.0011) +[2023-10-11 17:31:44,171][85176] Updated weights for policy 0, policy_version 61312 (0.0009) +[2023-10-11 17:31:45,906][85175] Updated weights for policy 1, policy_version 62210 (0.0009) +[2023-10-11 17:31:46,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 126484480. Throughput: 0: 1684.1, 1: 1709.2. Samples: 31634676. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 17:31:46,063][84230] Avg episode reward: [(0, '39.220'), (1, '41.220')] +[2023-10-11 17:31:46,267][85175] Updated weights for policy 1, policy_version 62220 (0.0009) +[2023-10-11 17:31:46,641][85175] Updated weights for policy 1, policy_version 62230 (0.0008) +[2023-10-11 17:31:47,002][85175] Updated weights for policy 1, policy_version 62240 (0.0009) +[2023-10-11 17:31:48,250][85176] Updated weights for policy 0, policy_version 61322 (0.0008) +[2023-10-11 17:31:48,619][85176] Updated weights for policy 0, policy_version 61332 (0.0008) +[2023-10-11 17:31:48,998][85176] Updated weights for policy 0, policy_version 61342 (0.0009) +[2023-10-11 17:31:51,000][85175] Updated weights for policy 1, policy_version 62250 (0.0007) +[2023-10-11 17:31:51,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 126550016. 
Throughput: 0: 1666.3, 1: 1703.3. Samples: 31644224. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 17:31:51,063][84230] Avg episode reward: [(0, '41.740'), (1, '45.430')] +[2023-10-11 17:31:51,376][85175] Updated weights for policy 1, policy_version 62260 (0.0009) +[2023-10-11 17:31:51,739][85175] Updated weights for policy 1, policy_version 62270 (0.0007) +[2023-10-11 17:31:53,145][85176] Updated weights for policy 0, policy_version 61352 (0.0010) +[2023-10-11 17:31:53,519][85176] Updated weights for policy 0, policy_version 61362 (0.0008) +[2023-10-11 17:31:53,890][85176] Updated weights for policy 0, policy_version 61372 (0.0009) +[2023-10-11 17:31:55,557][85175] Updated weights for policy 1, policy_version 62280 (0.0007) +[2023-10-11 17:31:55,935][85175] Updated weights for policy 1, policy_version 62290 (0.0008) +[2023-10-11 17:31:56,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 126615552. Throughput: 0: 1665.6, 1: 1716.8. Samples: 31664566. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 17:31:56,063][84230] Avg episode reward: [(0, '39.740'), (1, '40.820')] +[2023-10-11 17:31:56,310][85175] Updated weights for policy 1, policy_version 62300 (0.0007) +[2023-10-11 17:31:57,927][85176] Updated weights for policy 0, policy_version 61382 (0.0011) +[2023-10-11 17:31:58,304][85176] Updated weights for policy 0, policy_version 61392 (0.0007) +[2023-10-11 17:31:58,675][85176] Updated weights for policy 0, policy_version 61402 (0.0009) +[2023-10-11 17:32:00,379][85175] Updated weights for policy 1, policy_version 62310 (0.0009) +[2023-10-11 17:32:00,750][85175] Updated weights for policy 1, policy_version 62320 (0.0008) +[2023-10-11 17:32:01,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 126681088. Throughput: 0: 1677.7, 1: 1704.6. Samples: 31685036. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 17:32:01,063][84230] Avg episode reward: [(0, '42.090'), (1, '44.930')] +[2023-10-11 17:32:01,127][85175] Updated weights for policy 1, policy_version 62330 (0.0009) +[2023-10-11 17:32:02,841][85176] Updated weights for policy 0, policy_version 61412 (0.0009) +[2023-10-11 17:32:03,213][85176] Updated weights for policy 0, policy_version 61422 (0.0008) +[2023-10-11 17:32:03,598][85176] Updated weights for policy 0, policy_version 61432 (0.0008) +[2023-10-11 17:32:05,194][85175] Updated weights for policy 1, policy_version 62340 (0.0009) +[2023-10-11 17:32:05,558][85175] Updated weights for policy 1, policy_version 62350 (0.0009) +[2023-10-11 17:32:05,921][85175] Updated weights for policy 1, policy_version 62360 (0.0010) +[2023-10-11 17:32:06,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 126746624. Throughput: 0: 1659.6, 1: 1712.4. Samples: 31695040. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 17:32:06,063][84230] Avg episode reward: [(0, '41.750'), (1, '39.720')] +[2023-10-11 17:32:07,403][85176] Updated weights for policy 0, policy_version 61442 (0.0008) +[2023-10-11 17:32:07,772][85176] Updated weights for policy 0, policy_version 61452 (0.0010) +[2023-10-11 17:32:08,133][85176] Updated weights for policy 0, policy_version 61462 (0.0007) +[2023-10-11 17:32:08,511][85176] Updated weights for policy 0, policy_version 61472 (0.0011) +[2023-10-11 17:32:09,782][85175] Updated weights for policy 1, policy_version 62370 (0.0009) +[2023-10-11 17:32:10,146][85175] Updated weights for policy 1, policy_version 62380 (0.0009) +[2023-10-11 17:32:10,519][85175] Updated weights for policy 1, policy_version 62390 (0.0010) +[2023-10-11 17:32:10,890][85175] Updated weights for policy 1, policy_version 62400 (0.0007) +[2023-10-11 17:32:11,063][84230] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 126844928. 
Throughput: 0: 1680.4, 1: 1712.0. Samples: 31715672. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:32:11,064][84230] Avg episode reward: [(0, '43.850'), (1, '45.050')] +[2023-10-11 17:32:12,645][85176] Updated weights for policy 0, policy_version 61482 (0.0010) +[2023-10-11 17:32:13,015][85176] Updated weights for policy 0, policy_version 61492 (0.0011) +[2023-10-11 17:32:13,384][85176] Updated weights for policy 0, policy_version 61502 (0.0009) +[2023-10-11 17:32:14,776][85175] Updated weights for policy 1, policy_version 62410 (0.0008) +[2023-10-11 17:32:15,150][85175] Updated weights for policy 1, policy_version 62420 (0.0007) +[2023-10-11 17:32:15,522][85175] Updated weights for policy 1, policy_version 62430 (0.0007) +[2023-10-11 17:32:16,062][84230] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 126910464. Throughput: 0: 1681.0, 1: 1697.8. Samples: 31735614. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:32:16,063][84230] Avg episode reward: [(0, '40.480'), (1, '40.930')] +[2023-10-11 17:32:16,072][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000062432_63930368.pth... +[2023-10-11 17:32:16,073][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000061504_62980096.pth... 
+[2023-10-11 17:32:16,109][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000059936_61374464.pth +[2023-10-11 17:32:16,111][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000060832_62291968.pth +[2023-10-11 17:32:17,415][85176] Updated weights for policy 0, policy_version 61512 (0.0009) +[2023-10-11 17:32:17,797][85176] Updated weights for policy 0, policy_version 61522 (0.0010) +[2023-10-11 17:32:18,158][85176] Updated weights for policy 0, policy_version 61532 (0.0010) +[2023-10-11 17:32:19,602][85175] Updated weights for policy 1, policy_version 62440 (0.0007) +[2023-10-11 17:32:19,981][85175] Updated weights for policy 1, policy_version 62450 (0.0009) +[2023-10-11 17:32:20,345][85175] Updated weights for policy 1, policy_version 62460 (0.0010) +[2023-10-11 17:32:21,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 126976000. Throughput: 0: 1662.1, 1: 1725.2. Samples: 31745898. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:32:21,063][84230] Avg episode reward: [(0, '39.370'), (1, '44.680')] +[2023-10-11 17:32:22,182][85176] Updated weights for policy 0, policy_version 61542 (0.0010) +[2023-10-11 17:32:22,552][85176] Updated weights for policy 0, policy_version 61552 (0.0009) +[2023-10-11 17:32:22,930][85176] Updated weights for policy 0, policy_version 61562 (0.0007) +[2023-10-11 17:32:24,179][85175] Updated weights for policy 1, policy_version 62470 (0.0008) +[2023-10-11 17:32:24,553][85175] Updated weights for policy 1, policy_version 62480 (0.0008) +[2023-10-11 17:32:24,911][85175] Updated weights for policy 1, policy_version 62490 (0.0009) +[2023-10-11 17:32:26,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 127041536. Throughput: 0: 1681.8, 1: 1710.4. Samples: 31766182. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:32:26,063][84230] Avg episode reward: [(0, '38.820'), (1, '41.100')] +[2023-10-11 17:32:26,989][85176] Updated weights for policy 0, policy_version 61572 (0.0010) +[2023-10-11 17:32:27,374][85176] Updated weights for policy 0, policy_version 61582 (0.0007) +[2023-10-11 17:32:27,754][85176] Updated weights for policy 0, policy_version 61592 (0.0008) +[2023-10-11 17:32:29,023][85175] Updated weights for policy 1, policy_version 62500 (0.0008) +[2023-10-11 17:32:29,393][85175] Updated weights for policy 1, policy_version 62510 (0.0009) +[2023-10-11 17:32:29,760][85175] Updated weights for policy 1, policy_version 62520 (0.0009) +[2023-10-11 17:32:31,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 127107072. Throughput: 0: 1673.2, 1: 1690.2. Samples: 31786030. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:32:31,063][84230] Avg episode reward: [(0, '37.170'), (1, '44.310')] +[2023-10-11 17:32:31,936][85176] Updated weights for policy 0, policy_version 61602 (0.0010) +[2023-10-11 17:32:32,297][85176] Updated weights for policy 0, policy_version 61612 (0.0009) +[2023-10-11 17:32:32,676][85176] Updated weights for policy 0, policy_version 61622 (0.0011) +[2023-10-11 17:32:33,043][85176] Updated weights for policy 0, policy_version 61632 (0.0011) +[2023-10-11 17:32:33,759][85175] Updated weights for policy 1, policy_version 62530 (0.0010) +[2023-10-11 17:32:34,126][85175] Updated weights for policy 1, policy_version 62540 (0.0010) +[2023-10-11 17:32:34,492][85175] Updated weights for policy 1, policy_version 62550 (0.0008) +[2023-10-11 17:32:34,860][85175] Updated weights for policy 1, policy_version 62560 (0.0008) +[2023-10-11 17:32:36,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 127172608. Throughput: 0: 1658.8, 1: 1720.4. Samples: 31796290. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:32:36,063][84230] Avg episode reward: [(0, '38.230'), (1, '42.090')] +[2023-10-11 17:32:37,242][85176] Updated weights for policy 0, policy_version 61642 (0.0008) +[2023-10-11 17:32:37,615][85176] Updated weights for policy 0, policy_version 61652 (0.0009) +[2023-10-11 17:32:37,991][85176] Updated weights for policy 0, policy_version 61662 (0.0009) +[2023-10-11 17:32:38,696][85175] Updated weights for policy 1, policy_version 62570 (0.0009) +[2023-10-11 17:32:39,069][85175] Updated weights for policy 1, policy_version 62580 (0.0010) +[2023-10-11 17:32:39,441][85175] Updated weights for policy 1, policy_version 62590 (0.0010) +[2023-10-11 17:32:41,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 127238144. Throughput: 0: 1679.1, 1: 1693.6. Samples: 31816336. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:32:41,063][84230] Avg episode reward: [(0, '40.970'), (1, '45.920')] +[2023-10-11 17:32:42,082][85176] Updated weights for policy 0, policy_version 61672 (0.0010) +[2023-10-11 17:32:42,453][85176] Updated weights for policy 0, policy_version 61682 (0.0011) +[2023-10-11 17:32:42,820][85176] Updated weights for policy 0, policy_version 61692 (0.0011) +[2023-10-11 17:32:43,445][85175] Updated weights for policy 1, policy_version 62600 (0.0007) +[2023-10-11 17:32:43,808][85175] Updated weights for policy 1, policy_version 62610 (0.0007) +[2023-10-11 17:32:44,182][85175] Updated weights for policy 1, policy_version 62620 (0.0009) +[2023-10-11 17:32:46,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 127303680. Throughput: 0: 1677.4, 1: 1703.7. Samples: 31837186. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:32:46,063][84230] Avg episode reward: [(0, '42.570'), (1, '43.470')] +[2023-10-11 17:32:46,789][85176] Updated weights for policy 0, policy_version 61702 (0.0008) +[2023-10-11 17:32:47,168][85176] Updated weights for policy 0, policy_version 61712 (0.0008) +[2023-10-11 17:32:47,534][85176] Updated weights for policy 0, policy_version 61722 (0.0007) +[2023-10-11 17:32:48,270][85175] Updated weights for policy 1, policy_version 62630 (0.0009) +[2023-10-11 17:32:48,641][85175] Updated weights for policy 1, policy_version 62640 (0.0009) +[2023-10-11 17:32:49,013][85175] Updated weights for policy 1, policy_version 62650 (0.0008) +[2023-10-11 17:32:51,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 127369216. Throughput: 0: 1669.0, 1: 1713.5. Samples: 31847252. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:32:51,063][84230] Avg episode reward: [(0, '42.030'), (1, '44.790')] +[2023-10-11 17:32:51,560][85176] Updated weights for policy 0, policy_version 61732 (0.0009) +[2023-10-11 17:32:51,931][85176] Updated weights for policy 0, policy_version 61742 (0.0008) +[2023-10-11 17:32:52,317][85176] Updated weights for policy 0, policy_version 61752 (0.0008) +[2023-10-11 17:32:52,960][85175] Updated weights for policy 1, policy_version 62660 (0.0008) +[2023-10-11 17:32:53,330][85175] Updated weights for policy 1, policy_version 62670 (0.0007) +[2023-10-11 17:32:53,701][85175] Updated weights for policy 1, policy_version 62680 (0.0009) +[2023-10-11 17:32:56,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 127434752. Throughput: 0: 1677.1, 1: 1694.9. Samples: 31867412. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:32:56,063][84230] Avg episode reward: [(0, '42.340'), (1, '42.220')] +[2023-10-11 17:32:56,312][85176] Updated weights for policy 0, policy_version 61762 (0.0010) +[2023-10-11 17:32:56,683][85176] Updated weights for policy 0, policy_version 61772 (0.0011) +[2023-10-11 17:32:57,054][85176] Updated weights for policy 0, policy_version 61782 (0.0009) +[2023-10-11 17:32:57,429][85176] Updated weights for policy 0, policy_version 61792 (0.0010) +[2023-10-11 17:32:57,610][85175] Updated weights for policy 1, policy_version 62690 (0.0009) +[2023-10-11 17:32:57,976][85175] Updated weights for policy 1, policy_version 62700 (0.0010) +[2023-10-11 17:32:58,350][85175] Updated weights for policy 1, policy_version 62710 (0.0009) +[2023-10-11 17:32:58,710][85175] Updated weights for policy 1, policy_version 62720 (0.0010) +[2023-10-11 17:33:01,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 127500288. Throughput: 0: 1679.4, 1: 1718.0. Samples: 31888496. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:33:01,063][84230] Avg episode reward: [(0, '41.520'), (1, '40.960')] +[2023-10-11 17:33:01,367][85176] Updated weights for policy 0, policy_version 61802 (0.0008) +[2023-10-11 17:33:01,739][85176] Updated weights for policy 0, policy_version 61812 (0.0009) +[2023-10-11 17:33:02,106][85176] Updated weights for policy 0, policy_version 61822 (0.0008) +[2023-10-11 17:33:02,857][85175] Updated weights for policy 1, policy_version 62730 (0.0010) +[2023-10-11 17:33:03,224][85175] Updated weights for policy 1, policy_version 62740 (0.0011) +[2023-10-11 17:33:03,595][85175] Updated weights for policy 1, policy_version 62750 (0.0007) +[2023-10-11 17:33:06,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 127565824. Throughput: 0: 1682.9, 1: 1690.1. Samples: 31897682. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:33:06,064][84230] Avg episode reward: [(0, '44.270'), (1, '42.290')] +[2023-10-11 17:33:06,336][85176] Updated weights for policy 0, policy_version 61832 (0.0008) +[2023-10-11 17:33:06,719][85176] Updated weights for policy 0, policy_version 61842 (0.0010) +[2023-10-11 17:33:07,085][85176] Updated weights for policy 0, policy_version 61852 (0.0011) +[2023-10-11 17:33:07,755][85175] Updated weights for policy 1, policy_version 62760 (0.0008) +[2023-10-11 17:33:08,115][85175] Updated weights for policy 1, policy_version 62770 (0.0008) +[2023-10-11 17:33:08,491][85175] Updated weights for policy 1, policy_version 62780 (0.0009) +[2023-10-11 17:33:11,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 127631360. Throughput: 0: 1682.2, 1: 1694.9. Samples: 31918154. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:33:11,064][84230] Avg episode reward: [(0, '42.830'), (1, '42.710')] +[2023-10-11 17:33:11,229][85176] Updated weights for policy 0, policy_version 61862 (0.0009) +[2023-10-11 17:33:11,609][85176] Updated weights for policy 0, policy_version 61872 (0.0007) +[2023-10-11 17:33:11,979][85176] Updated weights for policy 0, policy_version 61882 (0.0007) +[2023-10-11 17:33:12,422][85175] Updated weights for policy 1, policy_version 62790 (0.0008) +[2023-10-11 17:33:12,793][85175] Updated weights for policy 1, policy_version 62800 (0.0009) +[2023-10-11 17:33:13,149][85175] Updated weights for policy 1, policy_version 62810 (0.0009) +[2023-10-11 17:33:16,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 127696896. Throughput: 0: 1687.7, 1: 1716.3. Samples: 31939210. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:33:16,064][84230] Avg episode reward: [(0, '42.550'), (1, '43.370')] +[2023-10-11 17:33:16,116][85176] Updated weights for policy 0, policy_version 61892 (0.0010) +[2023-10-11 17:33:16,501][85176] Updated weights for policy 0, policy_version 61902 (0.0008) +[2023-10-11 17:33:16,876][85176] Updated weights for policy 0, policy_version 61912 (0.0007) +[2023-10-11 17:33:16,995][85175] Updated weights for policy 1, policy_version 62820 (0.0008) +[2023-10-11 17:33:17,358][85175] Updated weights for policy 1, policy_version 62830 (0.0008) +[2023-10-11 17:33:17,735][85175] Updated weights for policy 1, policy_version 62840 (0.0009) +[2023-10-11 17:33:21,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 127762432. Throughput: 0: 1685.0, 1: 1690.1. Samples: 31948170. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:33:21,063][84230] Avg episode reward: [(0, '40.950'), (1, '43.370')] +[2023-10-11 17:33:21,068][85176] Updated weights for policy 0, policy_version 61922 (0.0008) +[2023-10-11 17:33:21,439][85176] Updated weights for policy 0, policy_version 61932 (0.0008) +[2023-10-11 17:33:21,804][85176] Updated weights for policy 0, policy_version 61942 (0.0009) +[2023-10-11 17:33:21,856][85175] Updated weights for policy 1, policy_version 62850 (0.0009) +[2023-10-11 17:33:22,180][85176] Updated weights for policy 0, policy_version 61952 (0.0007) +[2023-10-11 17:33:22,231][85175] Updated weights for policy 1, policy_version 62860 (0.0009) +[2023-10-11 17:33:22,598][85175] Updated weights for policy 1, policy_version 62870 (0.0007) +[2023-10-11 17:33:22,960][85175] Updated weights for policy 1, policy_version 62880 (0.0008) +[2023-10-11 17:33:26,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 127827968. Throughput: 0: 1676.0, 1: 1713.4. Samples: 31968860. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:33:26,064][84230] Avg episode reward: [(0, '37.720'), (1, '44.380')] +[2023-10-11 17:33:26,218][85176] Updated weights for policy 0, policy_version 61962 (0.0007) +[2023-10-11 17:33:26,586][85176] Updated weights for policy 0, policy_version 61972 (0.0007) +[2023-10-11 17:33:26,905][85175] Updated weights for policy 1, policy_version 62890 (0.0007) +[2023-10-11 17:33:26,965][85176] Updated weights for policy 0, policy_version 61982 (0.0007) +[2023-10-11 17:33:27,266][85175] Updated weights for policy 1, policy_version 62900 (0.0010) +[2023-10-11 17:33:27,638][85175] Updated weights for policy 1, policy_version 62910 (0.0009) +[2023-10-11 17:33:30,999][85176] Updated weights for policy 0, policy_version 61992 (0.0009) +[2023-10-11 17:33:31,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 127893504. Throughput: 0: 1680.2, 1: 1712.5. Samples: 31989860. Policy #0 lag: (min: 7.0, avg: 9.5, max: 39.0) +[2023-10-11 17:33:31,063][84230] Avg episode reward: [(0, '38.930'), (1, '45.120')] +[2023-10-11 17:33:31,382][85176] Updated weights for policy 0, policy_version 62002 (0.0008) +[2023-10-11 17:33:31,695][85175] Updated weights for policy 1, policy_version 62920 (0.0009) +[2023-10-11 17:33:31,751][85176] Updated weights for policy 0, policy_version 62012 (0.0007) +[2023-10-11 17:33:32,058][85175] Updated weights for policy 1, policy_version 62930 (0.0009) +[2023-10-11 17:33:32,430][85175] Updated weights for policy 1, policy_version 62940 (0.0010) +[2023-10-11 17:33:35,534][85176] Updated weights for policy 0, policy_version 62022 (0.0008) +[2023-10-11 17:33:35,898][85176] Updated weights for policy 0, policy_version 62032 (0.0008) +[2023-10-11 17:33:36,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 127959040. Throughput: 0: 1681.1, 1: 1689.9. Samples: 31998948. 
Policy #0 lag: (min: 7.0, avg: 9.5, max: 39.0) +[2023-10-11 17:33:36,063][84230] Avg episode reward: [(0, '42.520'), (1, '42.890')] +[2023-10-11 17:33:36,279][85176] Updated weights for policy 0, policy_version 62042 (0.0009) +[2023-10-11 17:33:36,661][85175] Updated weights for policy 1, policy_version 62950 (0.0009) +[2023-10-11 17:33:37,023][85175] Updated weights for policy 1, policy_version 62960 (0.0009) +[2023-10-11 17:33:37,401][85175] Updated weights for policy 1, policy_version 62970 (0.0008) +[2023-10-11 17:33:40,589][85176] Updated weights for policy 0, policy_version 62052 (0.0010) +[2023-10-11 17:33:40,966][85176] Updated weights for policy 0, policy_version 62062 (0.0009) +[2023-10-11 17:33:41,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 128024576. Throughput: 0: 1681.2, 1: 1711.5. Samples: 32020088. Policy #0 lag: (min: 7.0, avg: 9.5, max: 39.0) +[2023-10-11 17:33:41,064][84230] Avg episode reward: [(0, '42.770'), (1, '44.580')] +[2023-10-11 17:33:41,200][85175] Updated weights for policy 1, policy_version 62980 (0.0008) +[2023-10-11 17:33:41,344][85176] Updated weights for policy 0, policy_version 62072 (0.0009) +[2023-10-11 17:33:41,563][85175] Updated weights for policy 1, policy_version 62990 (0.0008) +[2023-10-11 17:33:41,931][85175] Updated weights for policy 1, policy_version 63000 (0.0011) +[2023-10-11 17:33:45,614][85176] Updated weights for policy 0, policy_version 62082 (0.0008) +[2023-10-11 17:33:45,961][85175] Updated weights for policy 1, policy_version 63010 (0.0010) +[2023-10-11 17:33:45,996][85176] Updated weights for policy 0, policy_version 62092 (0.0008) +[2023-10-11 17:33:46,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 128090112. Throughput: 0: 1677.4, 1: 1707.2. Samples: 32040802. 
Policy #0 lag: (min: 7.0, avg: 9.5, max: 39.0) +[2023-10-11 17:33:46,064][84230] Avg episode reward: [(0, '43.510'), (1, '42.800')] +[2023-10-11 17:33:46,318][85175] Updated weights for policy 1, policy_version 63020 (0.0010) +[2023-10-11 17:33:46,360][85176] Updated weights for policy 0, policy_version 62102 (0.0009) +[2023-10-11 17:33:46,683][85175] Updated weights for policy 1, policy_version 63030 (0.0010) +[2023-10-11 17:33:46,738][85176] Updated weights for policy 0, policy_version 62112 (0.0009) +[2023-10-11 17:33:47,049][85175] Updated weights for policy 1, policy_version 63040 (0.0008) +[2023-10-11 17:33:50,649][85176] Updated weights for policy 0, policy_version 62122 (0.0007) +[2023-10-11 17:33:51,026][85176] Updated weights for policy 0, policy_version 62132 (0.0007) +[2023-10-11 17:33:51,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 128155648. Throughput: 0: 1674.8, 1: 1707.8. Samples: 32049900. Policy #0 lag: (min: 7.0, avg: 9.5, max: 39.0) +[2023-10-11 17:33:51,063][84230] Avg episode reward: [(0, '41.430'), (1, '44.640')] +[2023-10-11 17:33:51,185][85175] Updated weights for policy 1, policy_version 63050 (0.0009) +[2023-10-11 17:33:51,396][85176] Updated weights for policy 0, policy_version 62142 (0.0009) +[2023-10-11 17:33:51,570][85175] Updated weights for policy 1, policy_version 63060 (0.0009) +[2023-10-11 17:33:51,933][85175] Updated weights for policy 1, policy_version 63070 (0.0008) +[2023-10-11 17:33:55,242][85176] Updated weights for policy 0, policy_version 62152 (0.0007) +[2023-10-11 17:33:55,621][85176] Updated weights for policy 0, policy_version 62162 (0.0008) +[2023-10-11 17:33:55,943][85175] Updated weights for policy 1, policy_version 63080 (0.0009) +[2023-10-11 17:33:55,987][85176] Updated weights for policy 0, policy_version 62172 (0.0009) +[2023-10-11 17:33:56,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 128221184. 
Throughput: 0: 1677.6, 1: 1712.8. Samples: 32070720. Policy #0 lag: (min: 7.0, avg: 9.5, max: 39.0) +[2023-10-11 17:33:56,063][84230] Avg episode reward: [(0, '43.090'), (1, '41.160')] +[2023-10-11 17:33:56,306][85175] Updated weights for policy 1, policy_version 63090 (0.0008) +[2023-10-11 17:33:56,673][85175] Updated weights for policy 1, policy_version 63100 (0.0009) +[2023-10-11 17:34:00,101][85176] Updated weights for policy 0, policy_version 62182 (0.0009) +[2023-10-11 17:34:00,475][85176] Updated weights for policy 0, policy_version 62192 (0.0008) +[2023-10-11 17:34:00,647][85175] Updated weights for policy 1, policy_version 63110 (0.0007) +[2023-10-11 17:34:00,844][85176] Updated weights for policy 0, policy_version 62202 (0.0008) +[2023-10-11 17:34:01,018][85175] Updated weights for policy 1, policy_version 63120 (0.0010) +[2023-10-11 17:34:01,063][84230] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 128319488. Throughput: 0: 1657.3, 1: 1710.3. Samples: 32090752. Policy #0 lag: (min: 7.0, avg: 9.5, max: 39.0) +[2023-10-11 17:34:01,063][84230] Avg episode reward: [(0, '40.990'), (1, '45.410')] +[2023-10-11 17:34:01,379][85175] Updated weights for policy 1, policy_version 63130 (0.0012) +[2023-10-11 17:34:04,891][85176] Updated weights for policy 0, policy_version 62212 (0.0008) +[2023-10-11 17:34:05,268][85176] Updated weights for policy 0, policy_version 62222 (0.0010) +[2023-10-11 17:34:05,454][85175] Updated weights for policy 1, policy_version 63140 (0.0010) +[2023-10-11 17:34:05,648][85176] Updated weights for policy 0, policy_version 62232 (0.0008) +[2023-10-11 17:34:05,824][85175] Updated weights for policy 1, policy_version 63150 (0.0007) +[2023-10-11 17:34:06,062][84230] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 128385024. Throughput: 0: 1682.3, 1: 1708.3. Samples: 32100746. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:34:06,063][84230] Avg episode reward: [(0, '43.040'), (1, '43.300')] +[2023-10-11 17:34:06,190][85175] Updated weights for policy 1, policy_version 63160 (0.0009) +[2023-10-11 17:34:09,614][85176] Updated weights for policy 0, policy_version 62242 (0.0007) +[2023-10-11 17:34:09,990][85176] Updated weights for policy 0, policy_version 62252 (0.0008) +[2023-10-11 17:34:10,147][85175] Updated weights for policy 1, policy_version 63170 (0.0009) +[2023-10-11 17:34:10,361][85176] Updated weights for policy 0, policy_version 62262 (0.0007) +[2023-10-11 17:34:10,519][85175] Updated weights for policy 1, policy_version 63180 (0.0009) +[2023-10-11 17:34:10,732][85176] Updated weights for policy 0, policy_version 62272 (0.0008) +[2023-10-11 17:34:10,880][85175] Updated weights for policy 1, policy_version 63190 (0.0007) +[2023-10-11 17:34:11,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 128450560. Throughput: 0: 1683.3, 1: 1707.9. Samples: 32121462. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:34:11,063][84230] Avg episode reward: [(0, '42.580'), (1, '47.370')] +[2023-10-11 17:34:11,250][85175] Updated weights for policy 1, policy_version 63200 (0.0007) +[2023-10-11 17:34:14,939][85176] Updated weights for policy 0, policy_version 62282 (0.0007) +[2023-10-11 17:34:15,317][85176] Updated weights for policy 0, policy_version 62292 (0.0008) +[2023-10-11 17:34:15,370][85175] Updated weights for policy 1, policy_version 63210 (0.0007) +[2023-10-11 17:34:15,690][85176] Updated weights for policy 0, policy_version 62302 (0.0008) +[2023-10-11 17:34:15,738][85175] Updated weights for policy 1, policy_version 63220 (0.0011) +[2023-10-11 17:34:16,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 128516096. Throughput: 0: 1657.6, 1: 1692.9. Samples: 32140632. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:34:16,063][84230] Avg episode reward: [(0, '45.080'), (1, '43.210')] +[2023-10-11 17:34:16,073][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000062304_63799296.pth... +[2023-10-11 17:34:16,103][85175] Updated weights for policy 1, policy_version 63230 (0.0007) +[2023-10-11 17:34:16,113][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000060736_62193664.pth +[2023-10-11 17:34:16,119][84801] Saving new best policy, reward=45.080! +[2023-10-11 17:34:16,164][84801] Saving a milestone ./train_atari/atari_frostbite_APPO/checkpoint_p0/milestones/checkpoint_000062304_63799296.pth +[2023-10-11 17:34:16,176][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000063232_64749568.pth... +[2023-10-11 17:34:16,214][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000061632_63111168.pth +[2023-10-11 17:34:16,219][85000] Saving a milestone ./train_atari/atari_frostbite_APPO/checkpoint_p1/milestones/checkpoint_000063232_64749568.pth +[2023-10-11 17:34:19,926][85176] Updated weights for policy 0, policy_version 62312 (0.0008) +[2023-10-11 17:34:20,135][85175] Updated weights for policy 1, policy_version 63240 (0.0008) +[2023-10-11 17:34:20,303][85176] Updated weights for policy 0, policy_version 62322 (0.0009) +[2023-10-11 17:34:20,507][85175] Updated weights for policy 1, policy_version 63250 (0.0008) +[2023-10-11 17:34:20,670][85176] Updated weights for policy 0, policy_version 62332 (0.0009) +[2023-10-11 17:34:20,883][85175] Updated weights for policy 1, policy_version 63260 (0.0008) +[2023-10-11 17:34:21,063][84230] Fps is (10 sec: 16383.4, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 128614400. Throughput: 0: 1678.9, 1: 1705.6. Samples: 32151254. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:34:21,064][84230] Avg episode reward: [(0, '42.090'), (1, '45.070')] +[2023-10-11 17:34:24,681][85176] Updated weights for policy 0, policy_version 62342 (0.0009) +[2023-10-11 17:34:25,051][85175] Updated weights for policy 1, policy_version 63270 (0.0008) +[2023-10-11 17:34:25,061][85176] Updated weights for policy 0, policy_version 62352 (0.0009) +[2023-10-11 17:34:25,420][85175] Updated weights for policy 1, policy_version 63280 (0.0007) +[2023-10-11 17:34:25,424][85176] Updated weights for policy 0, policy_version 62362 (0.0007) +[2023-10-11 17:34:25,785][85175] Updated weights for policy 1, policy_version 63290 (0.0007) +[2023-10-11 17:34:26,063][84230] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 128679936. Throughput: 0: 1674.0, 1: 1696.4. Samples: 32171760. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:34:26,064][84230] Avg episode reward: [(0, '41.760'), (1, '42.560')] +[2023-10-11 17:34:29,653][85176] Updated weights for policy 0, policy_version 62372 (0.0009) +[2023-10-11 17:34:29,712][85175] Updated weights for policy 1, policy_version 63300 (0.0008) +[2023-10-11 17:34:30,020][85176] Updated weights for policy 0, policy_version 62382 (0.0007) +[2023-10-11 17:34:30,082][85175] Updated weights for policy 1, policy_version 63310 (0.0008) +[2023-10-11 17:34:30,386][85176] Updated weights for policy 0, policy_version 62392 (0.0009) +[2023-10-11 17:34:30,446][85175] Updated weights for policy 1, policy_version 63320 (0.0008) +[2023-10-11 17:34:31,063][84230] Fps is (10 sec: 13107.5, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 128745472. Throughput: 0: 1650.6, 1: 1678.3. Samples: 32190600. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:34:31,063][84230] Avg episode reward: [(0, '39.900'), (1, '44.860')] +[2023-10-11 17:34:34,406][85176] Updated weights for policy 0, policy_version 62402 (0.0007) +[2023-10-11 17:34:34,481][85175] Updated weights for policy 1, policy_version 63330 (0.0008) +[2023-10-11 17:34:34,787][85176] Updated weights for policy 0, policy_version 62412 (0.0007) +[2023-10-11 17:34:34,844][85175] Updated weights for policy 1, policy_version 63340 (0.0008) +[2023-10-11 17:34:35,156][85176] Updated weights for policy 0, policy_version 62422 (0.0008) +[2023-10-11 17:34:35,210][85175] Updated weights for policy 1, policy_version 63350 (0.0007) +[2023-10-11 17:34:35,527][85176] Updated weights for policy 0, policy_version 62432 (0.0007) +[2023-10-11 17:34:35,568][85175] Updated weights for policy 1, policy_version 63360 (0.0008) +[2023-10-11 17:34:36,063][84230] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 128811008. Throughput: 0: 1678.7, 1: 1698.8. Samples: 32201884. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:34:36,063][84230] Avg episode reward: [(0, '43.590'), (1, '44.470')] +[2023-10-11 17:34:39,736][85176] Updated weights for policy 0, policy_version 62442 (0.0009) +[2023-10-11 17:34:39,836][85175] Updated weights for policy 1, policy_version 63370 (0.0008) +[2023-10-11 17:34:40,119][85176] Updated weights for policy 0, policy_version 62452 (0.0009) +[2023-10-11 17:34:40,197][85175] Updated weights for policy 1, policy_version 63380 (0.0010) +[2023-10-11 17:34:40,493][85176] Updated weights for policy 0, policy_version 62462 (0.0010) +[2023-10-11 17:34:40,564][85175] Updated weights for policy 1, policy_version 63390 (0.0008) +[2023-10-11 17:34:41,063][84230] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 128876544. Throughput: 0: 1669.8, 1: 1691.9. Samples: 32221996. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:34:41,064][84230] Avg episode reward: [(0, '42.620'), (1, '44.790')] +[2023-10-11 17:34:44,472][85175] Updated weights for policy 1, policy_version 63400 (0.0009) +[2023-10-11 17:34:44,647][85176] Updated weights for policy 0, policy_version 62472 (0.0009) +[2023-10-11 17:34:44,826][85175] Updated weights for policy 1, policy_version 63410 (0.0007) +[2023-10-11 17:34:45,018][85176] Updated weights for policy 0, policy_version 62482 (0.0007) +[2023-10-11 17:34:45,194][85175] Updated weights for policy 1, policy_version 63420 (0.0007) +[2023-10-11 17:34:45,383][85176] Updated weights for policy 0, policy_version 62492 (0.0009) +[2023-10-11 17:34:46,063][84230] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 128942080. Throughput: 0: 1661.6, 1: 1659.7. Samples: 32240212. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:34:46,063][84230] Avg episode reward: [(0, '45.440'), (1, '43.150')] +[2023-10-11 17:34:46,073][84801] Saving new best policy, reward=45.440! +[2023-10-11 17:34:49,350][85175] Updated weights for policy 1, policy_version 63430 (0.0009) +[2023-10-11 17:34:49,468][85176] Updated weights for policy 0, policy_version 62502 (0.0009) +[2023-10-11 17:34:49,714][85175] Updated weights for policy 1, policy_version 63440 (0.0007) +[2023-10-11 17:34:49,851][85176] Updated weights for policy 0, policy_version 62512 (0.0007) +[2023-10-11 17:34:50,075][85175] Updated weights for policy 1, policy_version 63450 (0.0008) +[2023-10-11 17:34:50,222][85176] Updated weights for policy 0, policy_version 62522 (0.0008) +[2023-10-11 17:34:51,063][84230] Fps is (10 sec: 13107.6, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 129007616. Throughput: 0: 1671.3, 1: 1685.1. Samples: 32251784. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:34:51,063][84230] Avg episode reward: [(0, '41.740'), (1, '44.670')] +[2023-10-11 17:34:54,145][85175] Updated weights for policy 1, policy_version 63460 (0.0008) +[2023-10-11 17:34:54,273][85176] Updated weights for policy 0, policy_version 62532 (0.0007) +[2023-10-11 17:34:54,507][85175] Updated weights for policy 1, policy_version 63470 (0.0007) +[2023-10-11 17:34:54,633][85176] Updated weights for policy 0, policy_version 62542 (0.0007) +[2023-10-11 17:34:54,877][85175] Updated weights for policy 1, policy_version 63480 (0.0007) +[2023-10-11 17:34:55,013][85176] Updated weights for policy 0, policy_version 62552 (0.0009) +[2023-10-11 17:34:56,063][84230] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 129073152. Throughput: 0: 1656.8, 1: 1670.2. Samples: 32271174. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:34:56,063][84230] Avg episode reward: [(0, '43.860'), (1, '42.320')] +[2023-10-11 17:34:59,075][85175] Updated weights for policy 1, policy_version 63490 (0.0009) +[2023-10-11 17:34:59,167][85176] Updated weights for policy 0, policy_version 62562 (0.0008) +[2023-10-11 17:34:59,444][85175] Updated weights for policy 1, policy_version 63500 (0.0008) +[2023-10-11 17:34:59,531][85176] Updated weights for policy 0, policy_version 62572 (0.0010) +[2023-10-11 17:34:59,811][85175] Updated weights for policy 1, policy_version 63510 (0.0009) +[2023-10-11 17:34:59,897][85176] Updated weights for policy 0, policy_version 62582 (0.0008) +[2023-10-11 17:35:00,176][85175] Updated weights for policy 1, policy_version 63520 (0.0009) +[2023-10-11 17:35:00,264][85176] Updated weights for policy 0, policy_version 62592 (0.0009) +[2023-10-11 17:35:01,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 129138688. Throughput: 0: 1657.8, 1: 1659.7. Samples: 32289918. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:35:01,064][84230] Avg episode reward: [(0, '41.380'), (1, '46.600')] +[2023-10-11 17:35:04,137][85175] Updated weights for policy 1, policy_version 63530 (0.0009) +[2023-10-11 17:35:04,279][85176] Updated weights for policy 0, policy_version 62602 (0.0010) +[2023-10-11 17:35:04,500][85175] Updated weights for policy 1, policy_version 63540 (0.0008) +[2023-10-11 17:35:04,654][85176] Updated weights for policy 0, policy_version 62612 (0.0008) +[2023-10-11 17:35:04,866][85175] Updated weights for policy 1, policy_version 63550 (0.0009) +[2023-10-11 17:35:05,031][85176] Updated weights for policy 0, policy_version 62622 (0.0008) +[2023-10-11 17:35:06,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 129204224. Throughput: 0: 1664.2, 1: 1680.5. Samples: 32301766. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:35:06,064][84230] Avg episode reward: [(0, '42.350'), (1, '41.350')] +[2023-10-11 17:35:08,703][85175] Updated weights for policy 1, policy_version 63560 (0.0007) +[2023-10-11 17:35:09,069][85175] Updated weights for policy 1, policy_version 63570 (0.0009) +[2023-10-11 17:35:09,220][85176] Updated weights for policy 0, policy_version 62632 (0.0008) +[2023-10-11 17:35:09,441][85175] Updated weights for policy 1, policy_version 63580 (0.0008) +[2023-10-11 17:35:09,591][85176] Updated weights for policy 0, policy_version 62642 (0.0007) +[2023-10-11 17:35:09,972][85176] Updated weights for policy 0, policy_version 62652 (0.0008) +[2023-10-11 17:35:11,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 129269760. Throughput: 0: 1655.7, 1: 1661.8. Samples: 32321048. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:35:11,064][84230] Avg episode reward: [(0, '39.520'), (1, '45.160')] +[2023-10-11 17:35:13,506][85175] Updated weights for policy 1, policy_version 63590 (0.0008) +[2023-10-11 17:35:13,880][85175] Updated weights for policy 1, policy_version 63600 (0.0008) +[2023-10-11 17:35:14,192][85176] Updated weights for policy 0, policy_version 62662 (0.0009) +[2023-10-11 17:35:14,245][85175] Updated weights for policy 1, policy_version 63610 (0.0008) +[2023-10-11 17:35:14,561][85176] Updated weights for policy 0, policy_version 62672 (0.0010) +[2023-10-11 17:35:14,936][85176] Updated weights for policy 0, policy_version 62682 (0.0007) +[2023-10-11 17:35:16,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 129335296. Throughput: 0: 1664.9, 1: 1675.2. Samples: 32340904. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:35:16,063][84230] Avg episode reward: [(0, '44.140'), (1, '42.930')] +[2023-10-11 17:35:18,298][85175] Updated weights for policy 1, policy_version 63620 (0.0008) +[2023-10-11 17:35:18,665][85175] Updated weights for policy 1, policy_version 63630 (0.0007) +[2023-10-11 17:35:19,022][85175] Updated weights for policy 1, policy_version 63640 (0.0008) +[2023-10-11 17:35:19,052][85176] Updated weights for policy 0, policy_version 62692 (0.0008) +[2023-10-11 17:35:19,419][85176] Updated weights for policy 0, policy_version 62702 (0.0009) +[2023-10-11 17:35:19,796][85176] Updated weights for policy 0, policy_version 62712 (0.0011) +[2023-10-11 17:35:21,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 129400832. Throughput: 0: 1667.8, 1: 1671.3. Samples: 32352146. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:35:21,063][84230] Avg episode reward: [(0, '41.490'), (1, '45.190')] +[2023-10-11 17:35:23,254][85175] Updated weights for policy 1, policy_version 63650 (0.0009) +[2023-10-11 17:35:23,630][85175] Updated weights for policy 1, policy_version 63660 (0.0009) +[2023-10-11 17:35:23,939][85176] Updated weights for policy 0, policy_version 62722 (0.0010) +[2023-10-11 17:35:23,992][85175] Updated weights for policy 1, policy_version 63670 (0.0008) +[2023-10-11 17:35:24,317][85176] Updated weights for policy 0, policy_version 62732 (0.0009) +[2023-10-11 17:35:24,367][85175] Updated weights for policy 1, policy_version 63680 (0.0007) +[2023-10-11 17:35:24,682][85176] Updated weights for policy 0, policy_version 62742 (0.0007) +[2023-10-11 17:35:25,056][85176] Updated weights for policy 0, policy_version 62752 (0.0008) +[2023-10-11 17:35:26,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 129466368. Throughput: 0: 1654.1, 1: 1657.4. Samples: 32371014. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:35:26,063][84230] Avg episode reward: [(0, '43.510'), (1, '44.560')] +[2023-10-11 17:35:28,534][85175] Updated weights for policy 1, policy_version 63690 (0.0008) +[2023-10-11 17:35:28,906][85175] Updated weights for policy 1, policy_version 63700 (0.0008) +[2023-10-11 17:35:29,048][85176] Updated weights for policy 0, policy_version 62762 (0.0008) +[2023-10-11 17:35:29,267][85175] Updated weights for policy 1, policy_version 63710 (0.0009) +[2023-10-11 17:35:29,414][85176] Updated weights for policy 0, policy_version 62772 (0.0009) +[2023-10-11 17:35:29,789][85176] Updated weights for policy 0, policy_version 62782 (0.0009) +[2023-10-11 17:35:31,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 129531904. Throughput: 0: 1667.4, 1: 1682.2. Samples: 32390946. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:35:31,063][84230] Avg episode reward: [(0, '42.010'), (1, '45.690')] +[2023-10-11 17:35:33,144][85175] Updated weights for policy 1, policy_version 63720 (0.0009) +[2023-10-11 17:35:33,512][85175] Updated weights for policy 1, policy_version 63730 (0.0008) +[2023-10-11 17:35:33,883][85175] Updated weights for policy 1, policy_version 63740 (0.0008) +[2023-10-11 17:35:33,949][85176] Updated weights for policy 0, policy_version 62792 (0.0008) +[2023-10-11 17:35:34,328][85176] Updated weights for policy 0, policy_version 62802 (0.0009) +[2023-10-11 17:35:34,694][85176] Updated weights for policy 0, policy_version 62812 (0.0007) +[2023-10-11 17:35:36,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 129597440. Throughput: 0: 1665.7, 1: 1673.6. Samples: 32402052. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:35:36,064][84230] Avg episode reward: [(0, '42.660'), (1, '45.300')] +[2023-10-11 17:35:37,875][85175] Updated weights for policy 1, policy_version 63750 (0.0009) +[2023-10-11 17:35:38,229][85175] Updated weights for policy 1, policy_version 63760 (0.0011) +[2023-10-11 17:35:38,589][85175] Updated weights for policy 1, policy_version 63770 (0.0009) +[2023-10-11 17:35:38,758][85176] Updated weights for policy 0, policy_version 62822 (0.0007) +[2023-10-11 17:35:39,133][85176] Updated weights for policy 0, policy_version 62832 (0.0009) +[2023-10-11 17:35:39,500][85176] Updated weights for policy 0, policy_version 62842 (0.0011) +[2023-10-11 17:35:41,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 129662976. Throughput: 0: 1657.2, 1: 1673.1. Samples: 32421036. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:35:41,063][84230] Avg episode reward: [(0, '41.910'), (1, '44.900')] +[2023-10-11 17:35:42,721][85175] Updated weights for policy 1, policy_version 63780 (0.0009) +[2023-10-11 17:35:43,095][85175] Updated weights for policy 1, policy_version 63790 (0.0009) +[2023-10-11 17:35:43,457][85175] Updated weights for policy 1, policy_version 63800 (0.0007) +[2023-10-11 17:35:43,624][85176] Updated weights for policy 0, policy_version 62852 (0.0009) +[2023-10-11 17:35:43,986][85176] Updated weights for policy 0, policy_version 62862 (0.0008) +[2023-10-11 17:35:44,361][85176] Updated weights for policy 0, policy_version 62872 (0.0011) +[2023-10-11 17:35:46,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 129728512. Throughput: 0: 1673.2, 1: 1692.8. Samples: 32441388. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:35:46,064][84230] Avg episode reward: [(0, '40.820'), (1, '43.850')] +[2023-10-11 17:35:47,708][85175] Updated weights for policy 1, policy_version 63810 (0.0008) +[2023-10-11 17:35:48,071][85175] Updated weights for policy 1, policy_version 63820 (0.0007) +[2023-10-11 17:35:48,326][85176] Updated weights for policy 0, policy_version 62882 (0.0009) +[2023-10-11 17:35:48,438][85175] Updated weights for policy 1, policy_version 63830 (0.0009) +[2023-10-11 17:35:48,707][85176] Updated weights for policy 0, policy_version 62892 (0.0008) +[2023-10-11 17:35:48,805][85175] Updated weights for policy 1, policy_version 63840 (0.0009) +[2023-10-11 17:35:49,075][85176] Updated weights for policy 0, policy_version 62902 (0.0007) +[2023-10-11 17:35:49,455][85176] Updated weights for policy 0, policy_version 62912 (0.0008) +[2023-10-11 17:35:51,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 129794048. Throughput: 0: 1665.0, 1: 1669.4. Samples: 32451814. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:35:51,063][84230] Avg episode reward: [(0, '43.180'), (1, '42.920')] +[2023-10-11 17:35:52,974][85175] Updated weights for policy 1, policy_version 63850 (0.0007) +[2023-10-11 17:35:53,335][85175] Updated weights for policy 1, policy_version 63860 (0.0007) +[2023-10-11 17:35:53,531][85176] Updated weights for policy 0, policy_version 62922 (0.0011) +[2023-10-11 17:35:53,707][85175] Updated weights for policy 1, policy_version 63870 (0.0008) +[2023-10-11 17:35:53,901][85176] Updated weights for policy 0, policy_version 62932 (0.0008) +[2023-10-11 17:35:54,283][85176] Updated weights for policy 0, policy_version 62942 (0.0009) +[2023-10-11 17:35:56,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 129859584. Throughput: 0: 1655.8, 1: 1681.5. Samples: 32471224. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:35:56,063][84230] Avg episode reward: [(0, '42.730'), (1, '44.850')] +[2023-10-11 17:35:57,632][85175] Updated weights for policy 1, policy_version 63880 (0.0009) +[2023-10-11 17:35:58,006][85175] Updated weights for policy 1, policy_version 63890 (0.0007) +[2023-10-11 17:35:58,383][85175] Updated weights for policy 1, policy_version 63900 (0.0007) +[2023-10-11 17:35:58,408][85176] Updated weights for policy 0, policy_version 62952 (0.0007) +[2023-10-11 17:35:58,778][85176] Updated weights for policy 0, policy_version 62962 (0.0008) +[2023-10-11 17:35:59,144][85176] Updated weights for policy 0, policy_version 62972 (0.0007) +[2023-10-11 17:36:01,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 129925120. Throughput: 0: 1673.3, 1: 1687.0. Samples: 32492118. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:36:01,063][84230] Avg episode reward: [(0, '43.100'), (1, '42.430')] +[2023-10-11 17:36:02,319][85175] Updated weights for policy 1, policy_version 63910 (0.0009) +[2023-10-11 17:36:02,697][85175] Updated weights for policy 1, policy_version 63920 (0.0007) +[2023-10-11 17:36:03,044][85176] Updated weights for policy 0, policy_version 62982 (0.0008) +[2023-10-11 17:36:03,062][85175] Updated weights for policy 1, policy_version 63930 (0.0007) +[2023-10-11 17:36:03,417][85176] Updated weights for policy 0, policy_version 62992 (0.0008) +[2023-10-11 17:36:03,795][85176] Updated weights for policy 0, policy_version 63002 (0.0009) +[2023-10-11 17:36:06,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 129990656. Throughput: 0: 1656.2, 1: 1666.4. Samples: 32501660. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:36:06,064][84230] Avg episode reward: [(0, '41.770'), (1, '43.650')] +[2023-10-11 17:36:07,084][85175] Updated weights for policy 1, policy_version 63940 (0.0007) +[2023-10-11 17:36:07,451][85175] Updated weights for policy 1, policy_version 63950 (0.0008) +[2023-10-11 17:36:07,828][85175] Updated weights for policy 1, policy_version 63960 (0.0008) +[2023-10-11 17:36:07,856][85176] Updated weights for policy 0, policy_version 63012 (0.0008) +[2023-10-11 17:36:08,230][85176] Updated weights for policy 0, policy_version 63022 (0.0007) +[2023-10-11 17:36:08,610][85176] Updated weights for policy 0, policy_version 63032 (0.0010) +[2023-10-11 17:36:11,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 130056192. Throughput: 0: 1666.0, 1: 1689.4. Samples: 32522006. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:36:11,064][84230] Avg episode reward: [(0, '41.440'), (1, '43.220')] +[2023-10-11 17:36:11,849][85175] Updated weights for policy 1, policy_version 63970 (0.0008) +[2023-10-11 17:36:12,212][85175] Updated weights for policy 1, policy_version 63980 (0.0008) +[2023-10-11 17:36:12,576][85175] Updated weights for policy 1, policy_version 63990 (0.0008) +[2023-10-11 17:36:12,617][85176] Updated weights for policy 0, policy_version 63042 (0.0008) +[2023-10-11 17:36:12,945][85175] Updated weights for policy 1, policy_version 64000 (0.0009) +[2023-10-11 17:36:12,985][85176] Updated weights for policy 0, policy_version 63052 (0.0007) +[2023-10-11 17:36:13,351][85176] Updated weights for policy 0, policy_version 63062 (0.0011) +[2023-10-11 17:36:13,722][85176] Updated weights for policy 0, policy_version 63072 (0.0011) +[2023-10-11 17:36:16,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 130121728. Throughput: 0: 1679.1, 1: 1696.8. Samples: 32542860. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 17:36:16,063][84230] Avg episode reward: [(0, '39.670'), (1, '43.900')] +[2023-10-11 17:36:16,071][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000064000_65536000.pth... +[2023-10-11 17:36:16,072][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000063072_64585728.pth... 
+[2023-10-11 17:36:16,102][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000062432_63930368.pth +[2023-10-11 17:36:16,115][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000061504_62980096.pth +[2023-10-11 17:36:17,107][85175] Updated weights for policy 1, policy_version 64010 (0.0007) +[2023-10-11 17:36:17,483][85175] Updated weights for policy 1, policy_version 64020 (0.0007) +[2023-10-11 17:36:17,805][85176] Updated weights for policy 0, policy_version 63082 (0.0007) +[2023-10-11 17:36:17,850][85175] Updated weights for policy 1, policy_version 64030 (0.0007) +[2023-10-11 17:36:18,181][85176] Updated weights for policy 0, policy_version 63092 (0.0007) +[2023-10-11 17:36:18,549][85176] Updated weights for policy 0, policy_version 63102 (0.0007) +[2023-10-11 17:36:21,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 130187264. Throughput: 0: 1655.0, 1: 1678.7. Samples: 32552066. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 17:36:21,063][84230] Avg episode reward: [(0, '41.790'), (1, '41.570')] +[2023-10-11 17:36:21,852][85175] Updated weights for policy 1, policy_version 64040 (0.0009) +[2023-10-11 17:36:22,220][85175] Updated weights for policy 1, policy_version 64050 (0.0007) +[2023-10-11 17:36:22,593][85175] Updated weights for policy 1, policy_version 64060 (0.0008) +[2023-10-11 17:36:22,772][85176] Updated weights for policy 0, policy_version 63112 (0.0009) +[2023-10-11 17:36:23,149][85176] Updated weights for policy 0, policy_version 63122 (0.0008) +[2023-10-11 17:36:23,517][85176] Updated weights for policy 0, policy_version 63132 (0.0009) +[2023-10-11 17:36:26,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 130252800. Throughput: 0: 1675.2, 1: 1692.4. Samples: 32572576. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 17:36:26,064][84230] Avg episode reward: [(0, '40.240'), (1, '45.970')] +[2023-10-11 17:36:26,608][85175] Updated weights for policy 1, policy_version 64070 (0.0008) +[2023-10-11 17:36:26,990][85175] Updated weights for policy 1, policy_version 64080 (0.0008) +[2023-10-11 17:36:27,354][85175] Updated weights for policy 1, policy_version 64090 (0.0008) +[2023-10-11 17:36:27,574][85176] Updated weights for policy 0, policy_version 63142 (0.0010) +[2023-10-11 17:36:27,963][85176] Updated weights for policy 0, policy_version 63152 (0.0011) +[2023-10-11 17:36:28,338][85176] Updated weights for policy 0, policy_version 63162 (0.0011) +[2023-10-11 17:36:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 130318336. Throughput: 0: 1680.2, 1: 1699.5. Samples: 32593476. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 17:36:31,063][84230] Avg episode reward: [(0, '40.750'), (1, '42.690')] +[2023-10-11 17:36:31,175][85175] Updated weights for policy 1, policy_version 64100 (0.0007) +[2023-10-11 17:36:31,539][85175] Updated weights for policy 1, policy_version 64110 (0.0007) +[2023-10-11 17:36:31,906][85175] Updated weights for policy 1, policy_version 64120 (0.0008) +[2023-10-11 17:36:32,442][85176] Updated weights for policy 0, policy_version 63172 (0.0010) +[2023-10-11 17:36:32,814][85176] Updated weights for policy 0, policy_version 63182 (0.0010) +[2023-10-11 17:36:33,197][85176] Updated weights for policy 0, policy_version 63192 (0.0009) +[2023-10-11 17:36:35,912][85175] Updated weights for policy 1, policy_version 64130 (0.0007) +[2023-10-11 17:36:36,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 130383872. Throughput: 0: 1656.0, 1: 1699.9. Samples: 32602826. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 17:36:36,063][84230] Avg episode reward: [(0, '39.600'), (1, '43.800')] +[2023-10-11 17:36:36,280][85175] Updated weights for policy 1, policy_version 64140 (0.0007) +[2023-10-11 17:36:36,634][85175] Updated weights for policy 1, policy_version 64150 (0.0009) +[2023-10-11 17:36:37,011][85175] Updated weights for policy 1, policy_version 64160 (0.0008) +[2023-10-11 17:36:37,248][85176] Updated weights for policy 0, policy_version 63202 (0.0008) +[2023-10-11 17:36:37,628][85176] Updated weights for policy 0, policy_version 63212 (0.0009) +[2023-10-11 17:36:38,009][85176] Updated weights for policy 0, policy_version 63222 (0.0008) +[2023-10-11 17:36:38,383][85176] Updated weights for policy 0, policy_version 63232 (0.0007) +[2023-10-11 17:36:41,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 130449408. Throughput: 0: 1678.1, 1: 1707.6. Samples: 32623584. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 17:36:41,063][84230] Avg episode reward: [(0, '39.590'), (1, '40.280')] +[2023-10-11 17:36:41,148][85175] Updated weights for policy 1, policy_version 64170 (0.0007) +[2023-10-11 17:36:41,514][85175] Updated weights for policy 1, policy_version 64180 (0.0007) +[2023-10-11 17:36:41,882][85175] Updated weights for policy 1, policy_version 64190 (0.0008) +[2023-10-11 17:36:42,495][85176] Updated weights for policy 0, policy_version 63242 (0.0009) +[2023-10-11 17:36:42,873][85176] Updated weights for policy 0, policy_version 63252 (0.0007) +[2023-10-11 17:36:43,249][85176] Updated weights for policy 0, policy_version 63262 (0.0007) +[2023-10-11 17:36:45,888][85175] Updated weights for policy 1, policy_version 64200 (0.0008) +[2023-10-11 17:36:46,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 130514944. Throughput: 0: 1669.1, 1: 1703.9. Samples: 32643900. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 17:36:46,064][84230] Avg episode reward: [(0, '42.510'), (1, '45.610')] +[2023-10-11 17:36:46,250][85175] Updated weights for policy 1, policy_version 64210 (0.0007) +[2023-10-11 17:36:46,615][85175] Updated weights for policy 1, policy_version 64220 (0.0008) +[2023-10-11 17:36:47,329][85176] Updated weights for policy 0, policy_version 63272 (0.0010) +[2023-10-11 17:36:47,701][85176] Updated weights for policy 0, policy_version 63282 (0.0008) +[2023-10-11 17:36:48,072][85176] Updated weights for policy 0, policy_version 63292 (0.0009) +[2023-10-11 17:36:50,632][85175] Updated weights for policy 1, policy_version 64230 (0.0010) +[2023-10-11 17:36:50,994][85175] Updated weights for policy 1, policy_version 64240 (0.0010) +[2023-10-11 17:36:51,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 130580480. Throughput: 0: 1659.5, 1: 1706.3. Samples: 32653120. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 17:36:51,063][84230] Avg episode reward: [(0, '39.600'), (1, '43.380')] +[2023-10-11 17:36:51,361][85175] Updated weights for policy 1, policy_version 64250 (0.0010) +[2023-10-11 17:36:52,119][85176] Updated weights for policy 0, policy_version 63302 (0.0008) +[2023-10-11 17:36:52,490][85176] Updated weights for policy 0, policy_version 63312 (0.0007) +[2023-10-11 17:36:52,862][85176] Updated weights for policy 0, policy_version 63322 (0.0009) +[2023-10-11 17:36:55,308][85175] Updated weights for policy 1, policy_version 64260 (0.0009) +[2023-10-11 17:36:55,683][85175] Updated weights for policy 1, policy_version 64270 (0.0010) +[2023-10-11 17:36:56,043][85175] Updated weights for policy 1, policy_version 64280 (0.0007) +[2023-10-11 17:36:56,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 130646016. Throughput: 0: 1669.2, 1: 1706.9. Samples: 32673932. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 17:36:56,063][84230] Avg episode reward: [(0, '39.580'), (1, '46.040')] +[2023-10-11 17:36:56,968][85176] Updated weights for policy 0, policy_version 63332 (0.0009) +[2023-10-11 17:36:57,333][85176] Updated weights for policy 0, policy_version 63342 (0.0010) +[2023-10-11 17:36:57,713][85176] Updated weights for policy 0, policy_version 63352 (0.0011) +[2023-10-11 17:37:00,046][85175] Updated weights for policy 1, policy_version 64290 (0.0008) +[2023-10-11 17:37:00,425][85175] Updated weights for policy 1, policy_version 64300 (0.0011) +[2023-10-11 17:37:00,798][85175] Updated weights for policy 1, policy_version 64310 (0.0007) +[2023-10-11 17:37:01,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 130711552. Throughput: 0: 1673.1, 1: 1696.8. Samples: 32694504. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:37:01,064][84230] Avg episode reward: [(0, '38.390'), (1, '43.030')] +[2023-10-11 17:37:01,166][85175] Updated weights for policy 1, policy_version 64320 (0.0010) +[2023-10-11 17:37:01,524][85176] Updated weights for policy 0, policy_version 63362 (0.0008) +[2023-10-11 17:37:01,900][85176] Updated weights for policy 0, policy_version 63372 (0.0008) +[2023-10-11 17:37:02,273][85176] Updated weights for policy 0, policy_version 63382 (0.0008) +[2023-10-11 17:37:02,645][85176] Updated weights for policy 0, policy_version 63392 (0.0007) +[2023-10-11 17:37:05,367][85175] Updated weights for policy 1, policy_version 64330 (0.0008) +[2023-10-11 17:37:05,730][85175] Updated weights for policy 1, policy_version 64340 (0.0011) +[2023-10-11 17:37:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 130777088. Throughput: 0: 1674.8, 1: 1710.0. Samples: 32704380. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:37:06,063][84230] Avg episode reward: [(0, '38.630'), (1, '46.160')] +[2023-10-11 17:37:06,108][85175] Updated weights for policy 1, policy_version 64350 (0.0009) +[2023-10-11 17:37:06,479][85176] Updated weights for policy 0, policy_version 63402 (0.0007) +[2023-10-11 17:37:06,841][85176] Updated weights for policy 0, policy_version 63412 (0.0007) +[2023-10-11 17:37:07,221][85176] Updated weights for policy 0, policy_version 63422 (0.0007) +[2023-10-11 17:37:09,957][85175] Updated weights for policy 1, policy_version 64360 (0.0008) +[2023-10-11 17:37:10,315][85175] Updated weights for policy 1, policy_version 64370 (0.0007) +[2023-10-11 17:37:10,684][85175] Updated weights for policy 1, policy_version 64380 (0.0008) +[2023-10-11 17:37:11,063][84230] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 130875392. Throughput: 0: 1688.7, 1: 1708.5. Samples: 32725450. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:37:11,063][84230] Avg episode reward: [(0, '40.140'), (1, '43.040')] +[2023-10-11 17:37:11,388][85176] Updated weights for policy 0, policy_version 63432 (0.0007) +[2023-10-11 17:37:11,760][85176] Updated weights for policy 0, policy_version 63442 (0.0007) +[2023-10-11 17:37:12,131][85176] Updated weights for policy 0, policy_version 63452 (0.0010) +[2023-10-11 17:37:14,660][85175] Updated weights for policy 1, policy_version 64390 (0.0011) +[2023-10-11 17:37:15,040][85175] Updated weights for policy 1, policy_version 64400 (0.0008) +[2023-10-11 17:37:15,405][85175] Updated weights for policy 1, policy_version 64410 (0.0011) +[2023-10-11 17:37:16,063][84230] Fps is (10 sec: 16383.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 130940928. Throughput: 0: 1692.3, 1: 1678.4. Samples: 32745160. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:37:16,064][84230] Avg episode reward: [(0, '36.690'), (1, '45.060')] +[2023-10-11 17:37:16,447][85176] Updated weights for policy 0, policy_version 63462 (0.0010) +[2023-10-11 17:37:16,833][85176] Updated weights for policy 0, policy_version 63472 (0.0009) +[2023-10-11 17:37:17,211][85176] Updated weights for policy 0, policy_version 63482 (0.0008) +[2023-10-11 17:37:19,560][85175] Updated weights for policy 1, policy_version 64420 (0.0009) +[2023-10-11 17:37:19,927][85175] Updated weights for policy 1, policy_version 64430 (0.0008) +[2023-10-11 17:37:20,296][85175] Updated weights for policy 1, policy_version 64440 (0.0008) +[2023-10-11 17:37:21,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 131006464. Throughput: 0: 1688.0, 1: 1697.5. Samples: 32755170. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:37:21,063][84230] Avg episode reward: [(0, '39.040'), (1, '41.080')] +[2023-10-11 17:37:21,235][85176] Updated weights for policy 0, policy_version 63492 (0.0008) +[2023-10-11 17:37:21,605][85176] Updated weights for policy 0, policy_version 63502 (0.0010) +[2023-10-11 17:37:21,976][85176] Updated weights for policy 0, policy_version 63512 (0.0010) +[2023-10-11 17:37:24,257][85175] Updated weights for policy 1, policy_version 64450 (0.0008) +[2023-10-11 17:37:24,627][85175] Updated weights for policy 1, policy_version 64460 (0.0008) +[2023-10-11 17:37:24,994][85175] Updated weights for policy 1, policy_version 64470 (0.0009) +[2023-10-11 17:37:25,353][85175] Updated weights for policy 1, policy_version 64480 (0.0007) +[2023-10-11 17:37:25,996][85176] Updated weights for policy 0, policy_version 63522 (0.0009) +[2023-10-11 17:37:26,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 131072000. Throughput: 0: 1683.9, 1: 1692.6. Samples: 32775526. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:37:26,064][84230] Avg episode reward: [(0, '34.130'), (1, '45.760')] +[2023-10-11 17:37:26,358][85176] Updated weights for policy 0, policy_version 63532 (0.0009) +[2023-10-11 17:37:26,741][85176] Updated weights for policy 0, policy_version 63542 (0.0010) +[2023-10-11 17:37:27,124][85176] Updated weights for policy 0, policy_version 63552 (0.0010) +[2023-10-11 17:37:29,523][85175] Updated weights for policy 1, policy_version 64490 (0.0009) +[2023-10-11 17:37:29,885][85175] Updated weights for policy 1, policy_version 64500 (0.0008) +[2023-10-11 17:37:30,250][85175] Updated weights for policy 1, policy_version 64510 (0.0008) +[2023-10-11 17:37:31,013][85176] Updated weights for policy 0, policy_version 63562 (0.0010) +[2023-10-11 17:37:31,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 131137536. Throughput: 0: 1698.9, 1: 1671.9. Samples: 32795586. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:37:31,064][84230] Avg episode reward: [(0, '40.910'), (1, '43.470')] +[2023-10-11 17:37:31,391][85176] Updated weights for policy 0, policy_version 63572 (0.0009) +[2023-10-11 17:37:31,756][85176] Updated weights for policy 0, policy_version 63582 (0.0008) +[2023-10-11 17:37:34,308][85175] Updated weights for policy 1, policy_version 64520 (0.0009) +[2023-10-11 17:37:34,670][85175] Updated weights for policy 1, policy_version 64530 (0.0008) +[2023-10-11 17:37:35,032][85175] Updated weights for policy 1, policy_version 64540 (0.0007) +[2023-10-11 17:37:35,745][85176] Updated weights for policy 0, policy_version 63592 (0.0010) +[2023-10-11 17:37:36,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 131203072. Throughput: 0: 1694.3, 1: 1701.8. Samples: 32805944. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:37:36,063][84230] Avg episode reward: [(0, '39.480'), (1, '47.600')] +[2023-10-11 17:37:36,128][85176] Updated weights for policy 0, policy_version 63602 (0.0011) +[2023-10-11 17:37:36,494][85176] Updated weights for policy 0, policy_version 63612 (0.0010) +[2023-10-11 17:37:39,017][85175] Updated weights for policy 1, policy_version 64550 (0.0007) +[2023-10-11 17:37:39,385][85175] Updated weights for policy 1, policy_version 64560 (0.0009) +[2023-10-11 17:37:39,745][85175] Updated weights for policy 1, policy_version 64570 (0.0007) +[2023-10-11 17:37:40,779][85176] Updated weights for policy 0, policy_version 63622 (0.0009) +[2023-10-11 17:37:41,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 131268608. Throughput: 0: 1688.0, 1: 1688.0. Samples: 32825850. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:37:41,064][84230] Avg episode reward: [(0, '41.260'), (1, '42.560')] +[2023-10-11 17:37:41,139][85176] Updated weights for policy 0, policy_version 63632 (0.0007) +[2023-10-11 17:37:41,509][85176] Updated weights for policy 0, policy_version 63642 (0.0008) +[2023-10-11 17:37:43,724][85175] Updated weights for policy 1, policy_version 64580 (0.0008) +[2023-10-11 17:37:44,093][85175] Updated weights for policy 1, policy_version 64590 (0.0009) +[2023-10-11 17:37:44,464][85175] Updated weights for policy 1, policy_version 64600 (0.0008) +[2023-10-11 17:37:45,825][85176] Updated weights for policy 0, policy_version 63652 (0.0007) +[2023-10-11 17:37:46,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 131334144. Throughput: 0: 1687.4, 1: 1685.6. Samples: 32846292. 
Policy #0 lag: (min: 2.0, avg: 6.9, max: 34.0) +[2023-10-11 17:37:46,064][84230] Avg episode reward: [(0, '36.790'), (1, '47.630')] +[2023-10-11 17:37:46,206][85176] Updated weights for policy 0, policy_version 63662 (0.0008) +[2023-10-11 17:37:46,583][85176] Updated weights for policy 0, policy_version 63672 (0.0010) +[2023-10-11 17:37:48,405][85175] Updated weights for policy 1, policy_version 64610 (0.0008) +[2023-10-11 17:37:48,764][85175] Updated weights for policy 1, policy_version 64620 (0.0010) +[2023-10-11 17:37:49,131][85175] Updated weights for policy 1, policy_version 64630 (0.0008) +[2023-10-11 17:37:49,507][85175] Updated weights for policy 1, policy_version 64640 (0.0011) +[2023-10-11 17:37:50,651][85176] Updated weights for policy 0, policy_version 63682 (0.0009) +[2023-10-11 17:37:51,030][85176] Updated weights for policy 0, policy_version 63692 (0.0008) +[2023-10-11 17:37:51,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 131399680. Throughput: 0: 1680.7, 1: 1698.7. Samples: 32856452. Policy #0 lag: (min: 2.0, avg: 6.9, max: 34.0) +[2023-10-11 17:37:51,063][84230] Avg episode reward: [(0, '40.220'), (1, '39.590')] +[2023-10-11 17:37:51,415][85176] Updated weights for policy 0, policy_version 63702 (0.0008) +[2023-10-11 17:37:51,783][85176] Updated weights for policy 0, policy_version 63712 (0.0007) +[2023-10-11 17:37:53,566][85175] Updated weights for policy 1, policy_version 64650 (0.0007) +[2023-10-11 17:37:53,933][85175] Updated weights for policy 1, policy_version 64660 (0.0008) +[2023-10-11 17:37:54,302][85175] Updated weights for policy 1, policy_version 64670 (0.0010) +[2023-10-11 17:37:55,819][85176] Updated weights for policy 0, policy_version 63722 (0.0008) +[2023-10-11 17:37:56,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 131465216. Throughput: 0: 1669.0, 1: 1677.2. Samples: 32876030. 
Policy #0 lag: (min: 2.0, avg: 6.9, max: 34.0) +[2023-10-11 17:37:56,063][84230] Avg episode reward: [(0, '40.970'), (1, '46.550')] +[2023-10-11 17:37:56,191][85176] Updated weights for policy 0, policy_version 63732 (0.0008) +[2023-10-11 17:37:56,554][85176] Updated weights for policy 0, policy_version 63742 (0.0010) +[2023-10-11 17:37:58,315][85175] Updated weights for policy 1, policy_version 64680 (0.0009) +[2023-10-11 17:37:58,688][85175] Updated weights for policy 1, policy_version 64690 (0.0007) +[2023-10-11 17:37:59,054][85175] Updated weights for policy 1, policy_version 64700 (0.0009) +[2023-10-11 17:38:00,612][85176] Updated weights for policy 0, policy_version 63752 (0.0011) +[2023-10-11 17:38:00,992][85176] Updated weights for policy 0, policy_version 63762 (0.0011) +[2023-10-11 17:38:01,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 131530752. Throughput: 0: 1660.8, 1: 1703.7. Samples: 32896558. Policy #0 lag: (min: 2.0, avg: 6.9, max: 34.0) +[2023-10-11 17:38:01,063][84230] Avg episode reward: [(0, '44.500'), (1, '41.010')] +[2023-10-11 17:38:01,352][85176] Updated weights for policy 0, policy_version 63772 (0.0008) +[2023-10-11 17:38:02,964][85175] Updated weights for policy 1, policy_version 64710 (0.0007) +[2023-10-11 17:38:03,334][85175] Updated weights for policy 1, policy_version 64720 (0.0007) +[2023-10-11 17:38:03,711][85175] Updated weights for policy 1, policy_version 64730 (0.0007) +[2023-10-11 17:38:05,615][85176] Updated weights for policy 0, policy_version 63782 (0.0007) +[2023-10-11 17:38:06,008][85176] Updated weights for policy 0, policy_version 63792 (0.0009) +[2023-10-11 17:38:06,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 131596288. Throughput: 0: 1671.4, 1: 1695.0. Samples: 32906658. 
Policy #0 lag: (min: 2.0, avg: 6.9, max: 34.0) +[2023-10-11 17:38:06,064][84230] Avg episode reward: [(0, '40.250'), (1, '46.750')] +[2023-10-11 17:38:06,385][85176] Updated weights for policy 0, policy_version 63802 (0.0009) +[2023-10-11 17:38:07,690][85175] Updated weights for policy 1, policy_version 64740 (0.0007) +[2023-10-11 17:38:08,057][85175] Updated weights for policy 1, policy_version 64750 (0.0007) +[2023-10-11 17:38:08,427][85175] Updated weights for policy 1, policy_version 64760 (0.0007) +[2023-10-11 17:38:10,444][85176] Updated weights for policy 0, policy_version 63812 (0.0008) +[2023-10-11 17:38:10,821][85176] Updated weights for policy 0, policy_version 63822 (0.0010) +[2023-10-11 17:38:11,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 131661824. Throughput: 0: 1672.9, 1: 1693.2. Samples: 32927000. Policy #0 lag: (min: 2.0, avg: 6.9, max: 34.0) +[2023-10-11 17:38:11,063][84230] Avg episode reward: [(0, '43.390'), (1, '41.180')] +[2023-10-11 17:38:11,198][85176] Updated weights for policy 0, policy_version 63832 (0.0008) +[2023-10-11 17:38:12,455][85175] Updated weights for policy 1, policy_version 64770 (0.0007) +[2023-10-11 17:38:12,831][85175] Updated weights for policy 1, policy_version 64780 (0.0009) +[2023-10-11 17:38:13,203][85175] Updated weights for policy 1, policy_version 64790 (0.0009) +[2023-10-11 17:38:13,580][85175] Updated weights for policy 1, policy_version 64800 (0.0010) +[2023-10-11 17:38:15,270][85176] Updated weights for policy 0, policy_version 63842 (0.0007) +[2023-10-11 17:38:15,645][85176] Updated weights for policy 0, policy_version 63852 (0.0009) +[2023-10-11 17:38:16,018][85176] Updated weights for policy 0, policy_version 63862 (0.0009) +[2023-10-11 17:38:16,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 131727360. Throughput: 0: 1655.0, 1: 1720.8. Samples: 32947494. 
Policy #0 lag: (min: 2.0, avg: 6.9, max: 34.0) +[2023-10-11 17:38:16,063][84230] Avg episode reward: [(0, '38.060'), (1, '46.490')] +[2023-10-11 17:38:16,073][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000064800_66355200.pth... +[2023-10-11 17:38:16,107][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000063232_64749568.pth +[2023-10-11 17:38:16,390][85176] Updated weights for policy 0, policy_version 63872 (0.0008) +[2023-10-11 17:38:16,390][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000063872_65404928.pth... +[2023-10-11 17:38:16,430][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000062304_63799296.pth +[2023-10-11 17:38:17,626][85175] Updated weights for policy 1, policy_version 64810 (0.0008) +[2023-10-11 17:38:17,991][85175] Updated weights for policy 1, policy_version 64820 (0.0008) +[2023-10-11 17:38:18,360][85175] Updated weights for policy 1, policy_version 64830 (0.0008) +[2023-10-11 17:38:20,367][85176] Updated weights for policy 0, policy_version 63882 (0.0007) +[2023-10-11 17:38:20,729][85176] Updated weights for policy 0, policy_version 63892 (0.0007) +[2023-10-11 17:38:21,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 131792896. Throughput: 0: 1667.3, 1: 1690.3. Samples: 32957036. 
Policy #0 lag: (min: 2.0, avg: 6.9, max: 34.0) +[2023-10-11 17:38:21,063][84230] Avg episode reward: [(0, '41.030'), (1, '42.950')] +[2023-10-11 17:38:21,098][85176] Updated weights for policy 0, policy_version 63902 (0.0008) +[2023-10-11 17:38:22,421][85175] Updated weights for policy 1, policy_version 64840 (0.0008) +[2023-10-11 17:38:22,799][85175] Updated weights for policy 1, policy_version 64850 (0.0007) +[2023-10-11 17:38:23,169][85175] Updated weights for policy 1, policy_version 64860 (0.0008) +[2023-10-11 17:38:25,138][85176] Updated weights for policy 0, policy_version 63912 (0.0007) +[2023-10-11 17:38:25,517][85176] Updated weights for policy 0, policy_version 63922 (0.0008) +[2023-10-11 17:38:25,886][85176] Updated weights for policy 0, policy_version 63932 (0.0008) +[2023-10-11 17:38:26,063][84230] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 131891200. Throughput: 0: 1671.6, 1: 1702.6. Samples: 32977688. Policy #0 lag: (min: 31.0, avg: 31.6, max: 47.0) +[2023-10-11 17:38:26,063][84230] Avg episode reward: [(0, '39.980'), (1, '47.790')] +[2023-10-11 17:38:27,154][85175] Updated weights for policy 1, policy_version 64870 (0.0007) +[2023-10-11 17:38:27,522][85175] Updated weights for policy 1, policy_version 64880 (0.0009) +[2023-10-11 17:38:27,884][85175] Updated weights for policy 1, policy_version 64890 (0.0007) +[2023-10-11 17:38:30,008][85176] Updated weights for policy 0, policy_version 63942 (0.0009) +[2023-10-11 17:38:30,381][85176] Updated weights for policy 0, policy_version 63952 (0.0009) +[2023-10-11 17:38:30,759][85176] Updated weights for policy 0, policy_version 63962 (0.0008) +[2023-10-11 17:38:31,063][84230] Fps is (10 sec: 16383.8, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 131956736. Throughput: 0: 1648.0, 1: 1715.4. Samples: 32997644. 
Policy #0 lag: (min: 31.0, avg: 31.6, max: 47.0) +[2023-10-11 17:38:31,063][84230] Avg episode reward: [(0, '43.220'), (1, '42.180')] +[2023-10-11 17:38:31,771][85175] Updated weights for policy 1, policy_version 64900 (0.0008) +[2023-10-11 17:38:32,144][85175] Updated weights for policy 1, policy_version 64910 (0.0009) +[2023-10-11 17:38:32,515][85175] Updated weights for policy 1, policy_version 64920 (0.0007) +[2023-10-11 17:38:34,900][85176] Updated weights for policy 0, policy_version 63972 (0.0009) +[2023-10-11 17:38:35,277][85176] Updated weights for policy 0, policy_version 63982 (0.0008) +[2023-10-11 17:38:35,642][85176] Updated weights for policy 0, policy_version 63992 (0.0008) +[2023-10-11 17:38:36,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 132022272. Throughput: 0: 1668.4, 1: 1688.6. Samples: 33007520. Policy #0 lag: (min: 31.0, avg: 31.6, max: 47.0) +[2023-10-11 17:38:36,064][84230] Avg episode reward: [(0, '41.890'), (1, '45.220')] +[2023-10-11 17:38:36,439][85175] Updated weights for policy 1, policy_version 64930 (0.0007) +[2023-10-11 17:38:36,809][85175] Updated weights for policy 1, policy_version 64940 (0.0009) +[2023-10-11 17:38:37,170][85175] Updated weights for policy 1, policy_version 64950 (0.0009) +[2023-10-11 17:38:37,541][85175] Updated weights for policy 1, policy_version 64960 (0.0009) +[2023-10-11 17:38:39,659][85176] Updated weights for policy 0, policy_version 64002 (0.0008) +[2023-10-11 17:38:40,032][85176] Updated weights for policy 0, policy_version 64012 (0.0007) +[2023-10-11 17:38:40,396][85176] Updated weights for policy 0, policy_version 64022 (0.0008) +[2023-10-11 17:38:40,777][85176] Updated weights for policy 0, policy_version 64032 (0.0011) +[2023-10-11 17:38:41,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 132087808. Throughput: 0: 1671.8, 1: 1718.9. Samples: 33028610. 
Policy #0 lag: (min: 31.0, avg: 31.6, max: 47.0) +[2023-10-11 17:38:41,064][84230] Avg episode reward: [(0, '41.520'), (1, '43.090')] +[2023-10-11 17:38:41,531][85175] Updated weights for policy 1, policy_version 64970 (0.0007) +[2023-10-11 17:38:41,905][85175] Updated weights for policy 1, policy_version 64980 (0.0008) +[2023-10-11 17:38:42,266][85175] Updated weights for policy 1, policy_version 64990 (0.0007) +[2023-10-11 17:38:44,994][85176] Updated weights for policy 0, policy_version 64042 (0.0009) +[2023-10-11 17:38:45,361][85176] Updated weights for policy 0, policy_version 64052 (0.0008) +[2023-10-11 17:38:45,738][85176] Updated weights for policy 0, policy_version 64062 (0.0010) +[2023-10-11 17:38:46,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 132153344. Throughput: 0: 1651.9, 1: 1715.4. Samples: 33048086. Policy #0 lag: (min: 31.0, avg: 31.6, max: 47.0) +[2023-10-11 17:38:46,064][84230] Avg episode reward: [(0, '40.690'), (1, '45.370')] +[2023-10-11 17:38:46,448][85175] Updated weights for policy 1, policy_version 65000 (0.0009) +[2023-10-11 17:38:46,830][85175] Updated weights for policy 1, policy_version 65010 (0.0010) +[2023-10-11 17:38:47,190][85175] Updated weights for policy 1, policy_version 65020 (0.0009) +[2023-10-11 17:38:49,920][85176] Updated weights for policy 0, policy_version 64072 (0.0007) +[2023-10-11 17:38:50,292][85176] Updated weights for policy 0, policy_version 64082 (0.0008) +[2023-10-11 17:38:50,658][85176] Updated weights for policy 0, policy_version 64092 (0.0008) +[2023-10-11 17:38:51,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 132218880. Throughput: 0: 1670.3, 1: 1693.9. Samples: 33058046. 
Policy #0 lag: (min: 31.0, avg: 31.6, max: 47.0) +[2023-10-11 17:38:51,064][84230] Avg episode reward: [(0, '41.350'), (1, '44.410')] +[2023-10-11 17:38:51,264][85175] Updated weights for policy 1, policy_version 65030 (0.0008) +[2023-10-11 17:38:51,630][85175] Updated weights for policy 1, policy_version 65040 (0.0007) +[2023-10-11 17:38:51,998][85175] Updated weights for policy 1, policy_version 65050 (0.0009) +[2023-10-11 17:38:54,733][85176] Updated weights for policy 0, policy_version 64102 (0.0007) +[2023-10-11 17:38:55,116][85176] Updated weights for policy 0, policy_version 64112 (0.0008) +[2023-10-11 17:38:55,501][85176] Updated weights for policy 0, policy_version 64122 (0.0007) +[2023-10-11 17:38:55,952][85175] Updated weights for policy 1, policy_version 65060 (0.0008) +[2023-10-11 17:38:56,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 132284416. Throughput: 0: 1671.6, 1: 1698.7. Samples: 33078662. Policy #0 lag: (min: 31.0, avg: 31.6, max: 47.0) +[2023-10-11 17:38:56,064][84230] Avg episode reward: [(0, '42.060'), (1, '44.000')] +[2023-10-11 17:38:56,326][85175] Updated weights for policy 1, policy_version 65070 (0.0007) +[2023-10-11 17:38:56,686][85175] Updated weights for policy 1, policy_version 65080 (0.0008) +[2023-10-11 17:38:59,478][85176] Updated weights for policy 0, policy_version 64132 (0.0008) +[2023-10-11 17:38:59,845][85176] Updated weights for policy 0, policy_version 64142 (0.0008) +[2023-10-11 17:39:00,216][85176] Updated weights for policy 0, policy_version 64152 (0.0008) +[2023-10-11 17:39:00,742][85175] Updated weights for policy 1, policy_version 65090 (0.0008) +[2023-10-11 17:39:01,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 132349952. Throughput: 0: 1656.7, 1: 1701.5. Samples: 33098614. 
Policy #0 lag: (min: 31.0, avg: 31.6, max: 47.0) +[2023-10-11 17:39:01,063][84230] Avg episode reward: [(0, '43.050'), (1, '44.930')] +[2023-10-11 17:39:01,109][85175] Updated weights for policy 1, policy_version 65100 (0.0009) +[2023-10-11 17:39:01,477][85175] Updated weights for policy 1, policy_version 65110 (0.0010) +[2023-10-11 17:39:01,846][85175] Updated weights for policy 1, policy_version 65120 (0.0010) +[2023-10-11 17:39:04,142][85176] Updated weights for policy 0, policy_version 64162 (0.0008) +[2023-10-11 17:39:04,519][85176] Updated weights for policy 0, policy_version 64172 (0.0009) +[2023-10-11 17:39:04,898][85176] Updated weights for policy 0, policy_version 64182 (0.0011) +[2023-10-11 17:39:05,266][85176] Updated weights for policy 0, policy_version 64192 (0.0010) +[2023-10-11 17:39:06,022][85175] Updated weights for policy 1, policy_version 65130 (0.0008) +[2023-10-11 17:39:06,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 132415488. Throughput: 0: 1675.9, 1: 1698.4. Samples: 33108882. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:39:06,063][84230] Avg episode reward: [(0, '44.470'), (1, '43.410')] +[2023-10-11 17:39:06,394][85175] Updated weights for policy 1, policy_version 65140 (0.0009) +[2023-10-11 17:39:06,756][85175] Updated weights for policy 1, policy_version 65150 (0.0009) +[2023-10-11 17:39:09,408][85176] Updated weights for policy 0, policy_version 64202 (0.0009) +[2023-10-11 17:39:09,775][85176] Updated weights for policy 0, policy_version 64212 (0.0007) +[2023-10-11 17:39:10,158][85176] Updated weights for policy 0, policy_version 64222 (0.0008) +[2023-10-11 17:39:10,697][85175] Updated weights for policy 1, policy_version 65160 (0.0008) +[2023-10-11 17:39:11,058][85175] Updated weights for policy 1, policy_version 65170 (0.0009) +[2023-10-11 17:39:11,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 132481024. 
Throughput: 0: 1662.5, 1: 1702.4. Samples: 33129106. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:39:11,063][84230] Avg episode reward: [(0, '42.700'), (1, '44.470')] +[2023-10-11 17:39:11,424][85175] Updated weights for policy 1, policy_version 65180 (0.0007) +[2023-10-11 17:39:14,053][85176] Updated weights for policy 0, policy_version 64232 (0.0009) +[2023-10-11 17:39:14,421][85176] Updated weights for policy 0, policy_version 64242 (0.0009) +[2023-10-11 17:39:14,790][85176] Updated weights for policy 0, policy_version 64252 (0.0008) +[2023-10-11 17:39:15,487][85175] Updated weights for policy 1, policy_version 65190 (0.0008) +[2023-10-11 17:39:15,857][85175] Updated weights for policy 1, policy_version 65200 (0.0007) +[2023-10-11 17:39:16,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 132546560. Throughput: 0: 1673.6, 1: 1693.9. Samples: 33149184. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:39:16,064][84230] Avg episode reward: [(0, '45.320'), (1, '43.890')] +[2023-10-11 17:39:16,236][85175] Updated weights for policy 1, policy_version 65210 (0.0007) +[2023-10-11 17:39:18,922][85176] Updated weights for policy 0, policy_version 64262 (0.0007) +[2023-10-11 17:39:19,283][85176] Updated weights for policy 0, policy_version 64272 (0.0008) +[2023-10-11 17:39:19,663][85176] Updated weights for policy 0, policy_version 64282 (0.0007) +[2023-10-11 17:39:20,231][85175] Updated weights for policy 1, policy_version 65220 (0.0007) +[2023-10-11 17:39:20,588][85175] Updated weights for policy 1, policy_version 65230 (0.0009) +[2023-10-11 17:39:20,959][85175] Updated weights for policy 1, policy_version 65240 (0.0011) +[2023-10-11 17:39:21,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 132612096. Throughput: 0: 1685.3, 1: 1702.2. Samples: 33159956. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:39:21,063][84230] Avg episode reward: [(0, '43.040'), (1, '45.750')] +[2023-10-11 17:39:23,688][85176] Updated weights for policy 0, policy_version 64292 (0.0007) +[2023-10-11 17:39:24,066][85176] Updated weights for policy 0, policy_version 64302 (0.0009) +[2023-10-11 17:39:24,430][85176] Updated weights for policy 0, policy_version 64312 (0.0009) +[2023-10-11 17:39:24,951][85175] Updated weights for policy 1, policy_version 65250 (0.0010) +[2023-10-11 17:39:25,319][85175] Updated weights for policy 1, policy_version 65260 (0.0009) +[2023-10-11 17:39:25,693][85175] Updated weights for policy 1, policy_version 65270 (0.0009) +[2023-10-11 17:39:26,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 132677632. Throughput: 0: 1663.2, 1: 1695.9. Samples: 33179768. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:39:26,064][84230] Avg episode reward: [(0, '46.060'), (1, '42.000')] +[2023-10-11 17:39:26,065][84801] Saving new best policy, reward=46.060! +[2023-10-11 17:39:26,070][85175] Updated weights for policy 1, policy_version 65280 (0.0007) +[2023-10-11 17:39:28,718][85176] Updated weights for policy 0, policy_version 64322 (0.0008) +[2023-10-11 17:39:29,088][85176] Updated weights for policy 0, policy_version 64332 (0.0008) +[2023-10-11 17:39:29,457][85176] Updated weights for policy 0, policy_version 64342 (0.0008) +[2023-10-11 17:39:29,826][85176] Updated weights for policy 0, policy_version 64352 (0.0008) +[2023-10-11 17:39:29,990][85175] Updated weights for policy 1, policy_version 65290 (0.0008) +[2023-10-11 17:39:30,356][85175] Updated weights for policy 1, policy_version 65300 (0.0010) +[2023-10-11 17:39:30,719][85175] Updated weights for policy 1, policy_version 65310 (0.0007) +[2023-10-11 17:39:31,063][84230] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 132775936. Throughput: 0: 1685.2, 1: 1681.6. 
Samples: 33199592. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:39:31,063][84230] Avg episode reward: [(0, '41.290'), (1, '46.310')] +[2023-10-11 17:39:33,948][85176] Updated weights for policy 0, policy_version 64362 (0.0008) +[2023-10-11 17:39:34,328][85176] Updated weights for policy 0, policy_version 64372 (0.0008) +[2023-10-11 17:39:34,703][85176] Updated weights for policy 0, policy_version 64382 (0.0008) +[2023-10-11 17:39:34,860][85175] Updated weights for policy 1, policy_version 65320 (0.0008) +[2023-10-11 17:39:35,231][85175] Updated weights for policy 1, policy_version 65330 (0.0009) +[2023-10-11 17:39:35,605][85175] Updated weights for policy 1, policy_version 65340 (0.0008) +[2023-10-11 17:39:36,063][84230] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 132841472. Throughput: 0: 1687.6, 1: 1706.3. Samples: 33210770. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:39:36,064][84230] Avg episode reward: [(0, '41.900'), (1, '42.400')] +[2023-10-11 17:39:38,736][85176] Updated weights for policy 0, policy_version 64392 (0.0008) +[2023-10-11 17:39:39,109][85176] Updated weights for policy 0, policy_version 64402 (0.0008) +[2023-10-11 17:39:39,357][85175] Updated weights for policy 1, policy_version 65350 (0.0008) +[2023-10-11 17:39:39,488][85176] Updated weights for policy 0, policy_version 64412 (0.0009) +[2023-10-11 17:39:39,738][85175] Updated weights for policy 1, policy_version 65360 (0.0009) +[2023-10-11 17:39:40,112][85175] Updated weights for policy 1, policy_version 65370 (0.0009) +[2023-10-11 17:39:41,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 132907008. Throughput: 0: 1664.1, 1: 1705.0. Samples: 33230272. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:39:41,063][84230] Avg episode reward: [(0, '42.050'), (1, '47.820')] +[2023-10-11 17:39:43,540][85176] Updated weights for policy 0, policy_version 64422 (0.0008) +[2023-10-11 17:39:43,933][85176] Updated weights for policy 0, policy_version 64432 (0.0008) +[2023-10-11 17:39:44,167][85175] Updated weights for policy 1, policy_version 65380 (0.0008) +[2023-10-11 17:39:44,305][85176] Updated weights for policy 0, policy_version 64442 (0.0008) +[2023-10-11 17:39:44,524][85175] Updated weights for policy 1, policy_version 65390 (0.0008) +[2023-10-11 17:39:44,896][85175] Updated weights for policy 1, policy_version 65400 (0.0007) +[2023-10-11 17:39:46,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 132972544. Throughput: 0: 1685.8, 1: 1681.2. Samples: 33250128. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:39:46,064][84230] Avg episode reward: [(0, '41.860'), (1, '44.460')] +[2023-10-11 17:39:48,369][85176] Updated weights for policy 0, policy_version 64452 (0.0009) +[2023-10-11 17:39:48,738][85176] Updated weights for policy 0, policy_version 64462 (0.0011) +[2023-10-11 17:39:48,793][85175] Updated weights for policy 1, policy_version 65410 (0.0009) +[2023-10-11 17:39:49,107][85176] Updated weights for policy 0, policy_version 64472 (0.0008) +[2023-10-11 17:39:49,163][85175] Updated weights for policy 1, policy_version 65420 (0.0007) +[2023-10-11 17:39:49,523][85175] Updated weights for policy 1, policy_version 65430 (0.0008) +[2023-10-11 17:39:49,896][85175] Updated weights for policy 1, policy_version 65440 (0.0007) +[2023-10-11 17:39:51,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 133038080. Throughput: 0: 1675.3, 1: 1719.6. Samples: 33261652. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:39:51,063][84230] Avg episode reward: [(0, '43.240'), (1, '47.780')] +[2023-10-11 17:39:53,023][85176] Updated weights for policy 0, policy_version 64482 (0.0008) +[2023-10-11 17:39:53,396][85176] Updated weights for policy 0, policy_version 64492 (0.0009) +[2023-10-11 17:39:53,771][85176] Updated weights for policy 0, policy_version 64502 (0.0007) +[2023-10-11 17:39:53,984][85175] Updated weights for policy 1, policy_version 65450 (0.0008) +[2023-10-11 17:39:54,146][85176] Updated weights for policy 0, policy_version 64512 (0.0008) +[2023-10-11 17:39:54,352][85175] Updated weights for policy 1, policy_version 65460 (0.0007) +[2023-10-11 17:39:54,722][85175] Updated weights for policy 1, policy_version 65470 (0.0008) +[2023-10-11 17:39:56,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 133103616. Throughput: 0: 1675.5, 1: 1692.4. Samples: 33280664. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:39:56,063][84230] Avg episode reward: [(0, '41.870'), (1, '44.060')] +[2023-10-11 17:39:58,211][85176] Updated weights for policy 0, policy_version 64522 (0.0009) +[2023-10-11 17:39:58,595][85176] Updated weights for policy 0, policy_version 64532 (0.0007) +[2023-10-11 17:39:58,779][85175] Updated weights for policy 1, policy_version 65480 (0.0008) +[2023-10-11 17:39:58,953][85176] Updated weights for policy 0, policy_version 64542 (0.0009) +[2023-10-11 17:39:59,141][85175] Updated weights for policy 1, policy_version 65490 (0.0008) +[2023-10-11 17:39:59,497][85175] Updated weights for policy 1, policy_version 65500 (0.0008) +[2023-10-11 17:40:01,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 133169152. Throughput: 0: 1682.1, 1: 1690.4. Samples: 33300946. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:40:01,063][84230] Avg episode reward: [(0, '43.330'), (1, '47.440')] +[2023-10-11 17:40:03,081][85176] Updated weights for policy 0, policy_version 64552 (0.0011) +[2023-10-11 17:40:03,459][85176] Updated weights for policy 0, policy_version 64562 (0.0009) +[2023-10-11 17:40:03,720][85175] Updated weights for policy 1, policy_version 65510 (0.0008) +[2023-10-11 17:40:03,839][85176] Updated weights for policy 0, policy_version 64572 (0.0009) +[2023-10-11 17:40:04,077][85175] Updated weights for policy 1, policy_version 65520 (0.0009) +[2023-10-11 17:40:04,442][85175] Updated weights for policy 1, policy_version 65530 (0.0010) +[2023-10-11 17:40:06,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 133234688. Throughput: 0: 1663.2, 1: 1707.6. Samples: 33311642. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:40:06,064][84230] Avg episode reward: [(0, '41.860'), (1, '44.630')] +[2023-10-11 17:40:07,797][85176] Updated weights for policy 0, policy_version 64582 (0.0009) +[2023-10-11 17:40:08,171][85176] Updated weights for policy 0, policy_version 64592 (0.0009) +[2023-10-11 17:40:08,561][85176] Updated weights for policy 0, policy_version 64602 (0.0007) +[2023-10-11 17:40:08,656][85175] Updated weights for policy 1, policy_version 65540 (0.0008) +[2023-10-11 17:40:09,022][85175] Updated weights for policy 1, policy_version 65550 (0.0009) +[2023-10-11 17:40:09,399][85175] Updated weights for policy 1, policy_version 65560 (0.0008) +[2023-10-11 17:40:11,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 133300224. Throughput: 0: 1677.7, 1: 1682.4. Samples: 33330974. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:40:11,063][84230] Avg episode reward: [(0, '42.380'), (1, '47.080')] +[2023-10-11 17:40:12,636][85176] Updated weights for policy 0, policy_version 64612 (0.0009) +[2023-10-11 17:40:13,021][85176] Updated weights for policy 0, policy_version 64622 (0.0008) +[2023-10-11 17:40:13,332][85175] Updated weights for policy 1, policy_version 65570 (0.0008) +[2023-10-11 17:40:13,394][85176] Updated weights for policy 0, policy_version 64632 (0.0009) +[2023-10-11 17:40:13,691][85175] Updated weights for policy 1, policy_version 65580 (0.0007) +[2023-10-11 17:40:14,059][85175] Updated weights for policy 1, policy_version 65590 (0.0008) +[2023-10-11 17:40:14,423][85175] Updated weights for policy 1, policy_version 65600 (0.0009) +[2023-10-11 17:40:16,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 133365760. Throughput: 0: 1680.7, 1: 1696.3. Samples: 33351554. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:40:16,063][84230] Avg episode reward: [(0, '41.000'), (1, '44.040')] +[2023-10-11 17:40:16,073][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000064640_66191360.pth... +[2023-10-11 17:40:16,073][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000065600_67174400.pth... 
+[2023-10-11 17:40:16,113][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000063072_64585728.pth +[2023-10-11 17:40:16,113][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000064000_65536000.pth +[2023-10-11 17:40:17,579][85176] Updated weights for policy 0, policy_version 64642 (0.0008) +[2023-10-11 17:40:17,954][85176] Updated weights for policy 0, policy_version 64652 (0.0008) +[2023-10-11 17:40:18,314][85176] Updated weights for policy 0, policy_version 64662 (0.0008) +[2023-10-11 17:40:18,338][85175] Updated weights for policy 1, policy_version 65610 (0.0007) +[2023-10-11 17:40:18,677][85176] Updated weights for policy 0, policy_version 64672 (0.0008) +[2023-10-11 17:40:18,709][85175] Updated weights for policy 1, policy_version 65620 (0.0008) +[2023-10-11 17:40:19,070][85175] Updated weights for policy 1, policy_version 65630 (0.0009) +[2023-10-11 17:40:21,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 133431296. Throughput: 0: 1656.0, 1: 1695.7. Samples: 33361596. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:40:21,063][84230] Avg episode reward: [(0, '42.110'), (1, '45.140')] +[2023-10-11 17:40:22,642][85176] Updated weights for policy 0, policy_version 64682 (0.0009) +[2023-10-11 17:40:23,013][85176] Updated weights for policy 0, policy_version 64692 (0.0010) +[2023-10-11 17:40:23,229][85175] Updated weights for policy 1, policy_version 65640 (0.0008) +[2023-10-11 17:40:23,381][85176] Updated weights for policy 0, policy_version 64702 (0.0008) +[2023-10-11 17:40:23,598][85175] Updated weights for policy 1, policy_version 65650 (0.0008) +[2023-10-11 17:40:23,969][85175] Updated weights for policy 1, policy_version 65660 (0.0008) +[2023-10-11 17:40:26,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 133496832. Throughput: 0: 1681.8, 1: 1681.2. Samples: 33381610. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:40:26,063][84230] Avg episode reward: [(0, '42.090'), (1, '44.180')] +[2023-10-11 17:40:27,528][85176] Updated weights for policy 0, policy_version 64712 (0.0010) +[2023-10-11 17:40:27,880][85175] Updated weights for policy 1, policy_version 65670 (0.0008) +[2023-10-11 17:40:27,886][85176] Updated weights for policy 0, policy_version 64722 (0.0008) +[2023-10-11 17:40:28,254][85175] Updated weights for policy 1, policy_version 65680 (0.0009) +[2023-10-11 17:40:28,259][85176] Updated weights for policy 0, policy_version 64732 (0.0007) +[2023-10-11 17:40:28,625][85175] Updated weights for policy 1, policy_version 65690 (0.0009) +[2023-10-11 17:40:31,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 133562368. Throughput: 0: 1684.4, 1: 1699.9. Samples: 33402420. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:40:31,063][84230] Avg episode reward: [(0, '41.090'), (1, '46.620')] +[2023-10-11 17:40:32,249][85176] Updated weights for policy 0, policy_version 64742 (0.0009) +[2023-10-11 17:40:32,485][85175] Updated weights for policy 1, policy_version 65700 (0.0010) +[2023-10-11 17:40:32,619][85176] Updated weights for policy 0, policy_version 64752 (0.0007) +[2023-10-11 17:40:32,860][85175] Updated weights for policy 1, policy_version 65710 (0.0009) +[2023-10-11 17:40:33,003][85176] Updated weights for policy 0, policy_version 64762 (0.0008) +[2023-10-11 17:40:33,228][85175] Updated weights for policy 1, policy_version 65720 (0.0008) +[2023-10-11 17:40:36,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 133627904. Throughput: 0: 1662.9, 1: 1673.1. Samples: 33411772. 
Policy #0 lag: (min: 31.0, avg: 37.2, max: 63.0) +[2023-10-11 17:40:36,063][84230] Avg episode reward: [(0, '41.330'), (1, '44.910')] +[2023-10-11 17:40:37,092][85176] Updated weights for policy 0, policy_version 64772 (0.0008) +[2023-10-11 17:40:37,253][85175] Updated weights for policy 1, policy_version 65730 (0.0007) +[2023-10-11 17:40:37,457][85176] Updated weights for policy 0, policy_version 64782 (0.0009) +[2023-10-11 17:40:37,621][85175] Updated weights for policy 1, policy_version 65740 (0.0008) +[2023-10-11 17:40:37,837][85176] Updated weights for policy 0, policy_version 64792 (0.0009) +[2023-10-11 17:40:37,979][85175] Updated weights for policy 1, policy_version 65750 (0.0008) +[2023-10-11 17:40:38,345][85175] Updated weights for policy 1, policy_version 65760 (0.0010) +[2023-10-11 17:40:41,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 133693440. Throughput: 0: 1677.4, 1: 1695.5. Samples: 33432444. Policy #0 lag: (min: 31.0, avg: 37.2, max: 63.0) +[2023-10-11 17:40:41,063][84230] Avg episode reward: [(0, '43.070'), (1, '45.170')] +[2023-10-11 17:40:41,836][85176] Updated weights for policy 0, policy_version 64802 (0.0010) +[2023-10-11 17:40:42,209][85176] Updated weights for policy 0, policy_version 64812 (0.0009) +[2023-10-11 17:40:42,582][85176] Updated weights for policy 0, policy_version 64822 (0.0007) +[2023-10-11 17:40:42,632][85175] Updated weights for policy 1, policy_version 65770 (0.0011) +[2023-10-11 17:40:42,955][85176] Updated weights for policy 0, policy_version 64832 (0.0008) +[2023-10-11 17:40:43,001][85175] Updated weights for policy 1, policy_version 65780 (0.0008) +[2023-10-11 17:40:43,364][85175] Updated weights for policy 1, policy_version 65790 (0.0009) +[2023-10-11 17:40:46,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 133758976. Throughput: 0: 1683.3, 1: 1699.6. Samples: 33453178. 
Policy #0 lag: (min: 31.0, avg: 37.2, max: 63.0) +[2023-10-11 17:40:46,064][84230] Avg episode reward: [(0, '42.690'), (1, '43.350')] +[2023-10-11 17:40:47,108][85176] Updated weights for policy 0, policy_version 64842 (0.0009) +[2023-10-11 17:40:47,457][85175] Updated weights for policy 1, policy_version 65800 (0.0008) +[2023-10-11 17:40:47,489][85176] Updated weights for policy 0, policy_version 64852 (0.0009) +[2023-10-11 17:40:47,826][85175] Updated weights for policy 1, policy_version 65810 (0.0008) +[2023-10-11 17:40:47,851][85176] Updated weights for policy 0, policy_version 64862 (0.0008) +[2023-10-11 17:40:48,187][85175] Updated weights for policy 1, policy_version 65820 (0.0009) +[2023-10-11 17:40:51,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 133824512. Throughput: 0: 1669.5, 1: 1677.5. Samples: 33462256. Policy #0 lag: (min: 31.0, avg: 37.2, max: 63.0) +[2023-10-11 17:40:51,063][84230] Avg episode reward: [(0, '43.740'), (1, '42.140')] +[2023-10-11 17:40:51,946][85176] Updated weights for policy 0, policy_version 64872 (0.0009) +[2023-10-11 17:40:52,231][85175] Updated weights for policy 1, policy_version 65830 (0.0007) +[2023-10-11 17:40:52,322][85176] Updated weights for policy 0, policy_version 64882 (0.0009) +[2023-10-11 17:40:52,595][85175] Updated weights for policy 1, policy_version 65840 (0.0007) +[2023-10-11 17:40:52,695][85176] Updated weights for policy 0, policy_version 64892 (0.0009) +[2023-10-11 17:40:52,963][85175] Updated weights for policy 1, policy_version 65850 (0.0009) +[2023-10-11 17:40:56,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 133890048. Throughput: 0: 1674.9, 1: 1700.5. Samples: 33482868. 
Policy #0 lag: (min: 31.0, avg: 37.2, max: 63.0) +[2023-10-11 17:40:56,063][84230] Avg episode reward: [(0, '42.290'), (1, '44.300')] +[2023-10-11 17:40:56,750][85176] Updated weights for policy 0, policy_version 64902 (0.0009) +[2023-10-11 17:40:56,977][85175] Updated weights for policy 1, policy_version 65860 (0.0009) +[2023-10-11 17:40:57,120][85176] Updated weights for policy 0, policy_version 64912 (0.0009) +[2023-10-11 17:40:57,343][85175] Updated weights for policy 1, policy_version 65870 (0.0009) +[2023-10-11 17:40:57,487][85176] Updated weights for policy 0, policy_version 64922 (0.0009) +[2023-10-11 17:40:57,710][85175] Updated weights for policy 1, policy_version 65880 (0.0009) +[2023-10-11 17:41:01,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 133955584. Throughput: 0: 1675.7, 1: 1704.9. Samples: 33503682. Policy #0 lag: (min: 31.0, avg: 37.2, max: 63.0) +[2023-10-11 17:41:01,063][84230] Avg episode reward: [(0, '44.460'), (1, '44.660')] +[2023-10-11 17:41:01,497][85176] Updated weights for policy 0, policy_version 64932 (0.0009) +[2023-10-11 17:41:01,578][85175] Updated weights for policy 1, policy_version 65890 (0.0009) +[2023-10-11 17:41:01,867][85176] Updated weights for policy 0, policy_version 64942 (0.0009) +[2023-10-11 17:41:01,938][85175] Updated weights for policy 1, policy_version 65900 (0.0008) +[2023-10-11 17:41:02,243][85176] Updated weights for policy 0, policy_version 64952 (0.0008) +[2023-10-11 17:41:02,303][85175] Updated weights for policy 1, policy_version 65910 (0.0008) +[2023-10-11 17:41:02,675][85175] Updated weights for policy 1, policy_version 65920 (0.0009) +[2023-10-11 17:41:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 134021120. Throughput: 0: 1674.5, 1: 1686.9. Samples: 33512860. 
Policy #0 lag: (min: 31.0, avg: 37.2, max: 63.0) +[2023-10-11 17:41:06,063][84230] Avg episode reward: [(0, '39.760'), (1, '44.870')] +[2023-10-11 17:41:06,409][85176] Updated weights for policy 0, policy_version 64962 (0.0008) +[2023-10-11 17:41:06,748][85175] Updated weights for policy 1, policy_version 65930 (0.0009) +[2023-10-11 17:41:06,789][85176] Updated weights for policy 0, policy_version 64972 (0.0008) +[2023-10-11 17:41:07,124][85175] Updated weights for policy 1, policy_version 65940 (0.0008) +[2023-10-11 17:41:07,153][85176] Updated weights for policy 0, policy_version 64982 (0.0007) +[2023-10-11 17:41:07,481][85175] Updated weights for policy 1, policy_version 65950 (0.0008) +[2023-10-11 17:41:07,530][85176] Updated weights for policy 0, policy_version 64992 (0.0007) +[2023-10-11 17:41:11,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 134086656. Throughput: 0: 1675.7, 1: 1701.2. Samples: 33533574. Policy #0 lag: (min: 31.0, avg: 37.2, max: 63.0) +[2023-10-11 17:41:11,063][84230] Avg episode reward: [(0, '42.360'), (1, '40.440')] +[2023-10-11 17:41:11,578][85175] Updated weights for policy 1, policy_version 65960 (0.0008) +[2023-10-11 17:41:11,679][85176] Updated weights for policy 0, policy_version 65002 (0.0008) +[2023-10-11 17:41:11,943][85175] Updated weights for policy 1, policy_version 65970 (0.0009) +[2023-10-11 17:41:12,048][85176] Updated weights for policy 0, policy_version 65012 (0.0008) +[2023-10-11 17:41:12,310][85175] Updated weights for policy 1, policy_version 65980 (0.0008) +[2023-10-11 17:41:12,422][85176] Updated weights for policy 0, policy_version 65022 (0.0009) +[2023-10-11 17:41:16,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 134152192. Throughput: 0: 1673.3, 1: 1701.8. Samples: 33554298. 
Policy #0 lag: (min: 31.0, avg: 37.2, max: 63.0) +[2023-10-11 17:41:16,063][84230] Avg episode reward: [(0, '38.540'), (1, '45.200')] +[2023-10-11 17:41:16,362][85175] Updated weights for policy 1, policy_version 65990 (0.0010) +[2023-10-11 17:41:16,605][85176] Updated weights for policy 0, policy_version 65032 (0.0008) +[2023-10-11 17:41:16,752][85175] Updated weights for policy 1, policy_version 66000 (0.0009) +[2023-10-11 17:41:16,970][85176] Updated weights for policy 0, policy_version 65042 (0.0007) +[2023-10-11 17:41:17,118][85175] Updated weights for policy 1, policy_version 66010 (0.0009) +[2023-10-11 17:41:17,336][85176] Updated weights for policy 0, policy_version 65052 (0.0008) +[2023-10-11 17:41:21,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 134217728. Throughput: 0: 1676.5, 1: 1689.4. Samples: 33563240. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 17:41:21,063][84230] Avg episode reward: [(0, '43.820'), (1, '40.830')] +[2023-10-11 17:41:21,071][85175] Updated weights for policy 1, policy_version 66020 (0.0007) +[2023-10-11 17:41:21,431][85175] Updated weights for policy 1, policy_version 66030 (0.0009) +[2023-10-11 17:41:21,533][85176] Updated weights for policy 0, policy_version 65062 (0.0009) +[2023-10-11 17:41:21,803][85175] Updated weights for policy 1, policy_version 66040 (0.0008) +[2023-10-11 17:41:21,907][85176] Updated weights for policy 0, policy_version 65072 (0.0008) +[2023-10-11 17:41:22,290][85176] Updated weights for policy 0, policy_version 65082 (0.0009) +[2023-10-11 17:41:25,745][85175] Updated weights for policy 1, policy_version 66050 (0.0008) +[2023-10-11 17:41:26,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 134283264. Throughput: 0: 1666.0, 1: 1687.6. Samples: 33583356. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 17:41:26,064][84230] Avg episode reward: [(0, '42.060'), (1, '45.710')] +[2023-10-11 17:41:26,116][85175] Updated weights for policy 1, policy_version 66060 (0.0008) +[2023-10-11 17:41:26,449][85176] Updated weights for policy 0, policy_version 65092 (0.0009) +[2023-10-11 17:41:26,488][85175] Updated weights for policy 1, policy_version 66070 (0.0007) +[2023-10-11 17:41:26,814][85176] Updated weights for policy 0, policy_version 65102 (0.0008) +[2023-10-11 17:41:26,852][85175] Updated weights for policy 1, policy_version 66080 (0.0009) +[2023-10-11 17:41:27,187][85176] Updated weights for policy 0, policy_version 65112 (0.0008) +[2023-10-11 17:41:30,919][85175] Updated weights for policy 1, policy_version 66090 (0.0009) +[2023-10-11 17:41:31,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 134348800. Throughput: 0: 1661.2, 1: 1693.4. Samples: 33604138. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 17:41:31,064][84230] Avg episode reward: [(0, '44.150'), (1, '41.970')] +[2023-10-11 17:41:31,145][85176] Updated weights for policy 0, policy_version 65122 (0.0009) +[2023-10-11 17:41:31,286][85175] Updated weights for policy 1, policy_version 66100 (0.0009) +[2023-10-11 17:41:31,518][85176] Updated weights for policy 0, policy_version 65132 (0.0010) +[2023-10-11 17:41:31,637][85175] Updated weights for policy 1, policy_version 66110 (0.0007) +[2023-10-11 17:41:31,889][85176] Updated weights for policy 0, policy_version 65142 (0.0010) +[2023-10-11 17:41:32,255][85176] Updated weights for policy 0, policy_version 65152 (0.0011) +[2023-10-11 17:41:35,844][85175] Updated weights for policy 1, policy_version 66120 (0.0007) +[2023-10-11 17:41:36,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 134414336. Throughput: 0: 1663.2, 1: 1689.6. Samples: 33613128. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 17:41:36,063][84230] Avg episode reward: [(0, '40.710'), (1, '43.950')] +[2023-10-11 17:41:36,215][85175] Updated weights for policy 1, policy_version 66130 (0.0007) +[2023-10-11 17:41:36,274][85176] Updated weights for policy 0, policy_version 65162 (0.0009) +[2023-10-11 17:41:36,585][85175] Updated weights for policy 1, policy_version 66140 (0.0008) +[2023-10-11 17:41:36,644][85176] Updated weights for policy 0, policy_version 65172 (0.0009) +[2023-10-11 17:41:37,009][85176] Updated weights for policy 0, policy_version 65182 (0.0009) +[2023-10-11 17:41:40,680][85175] Updated weights for policy 1, policy_version 66150 (0.0010) +[2023-10-11 17:41:41,053][85175] Updated weights for policy 1, policy_version 66160 (0.0009) +[2023-10-11 17:41:41,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 134479872. Throughput: 0: 1666.0, 1: 1686.9. Samples: 33633750. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 17:41:41,064][84230] Avg episode reward: [(0, '40.680'), (1, '41.310')] +[2023-10-11 17:41:41,255][85176] Updated weights for policy 0, policy_version 65192 (0.0009) +[2023-10-11 17:41:41,425][85175] Updated weights for policy 1, policy_version 66170 (0.0007) +[2023-10-11 17:41:41,632][85176] Updated weights for policy 0, policy_version 65202 (0.0009) +[2023-10-11 17:41:42,008][85176] Updated weights for policy 0, policy_version 65212 (0.0010) +[2023-10-11 17:41:45,424][85175] Updated weights for policy 1, policy_version 66180 (0.0008) +[2023-10-11 17:41:45,790][85175] Updated weights for policy 1, policy_version 66190 (0.0007) +[2023-10-11 17:41:46,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 134545408. Throughput: 0: 1663.6, 1: 1681.2. Samples: 33654200. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 17:41:46,063][84230] Avg episode reward: [(0, '38.890'), (1, '45.640')] +[2023-10-11 17:41:46,153][85175] Updated weights for policy 1, policy_version 66200 (0.0007) +[2023-10-11 17:41:46,156][85176] Updated weights for policy 0, policy_version 65222 (0.0008) +[2023-10-11 17:41:46,534][85176] Updated weights for policy 0, policy_version 65232 (0.0009) +[2023-10-11 17:41:46,904][85176] Updated weights for policy 0, policy_version 65242 (0.0009) +[2023-10-11 17:41:50,081][85175] Updated weights for policy 1, policy_version 66210 (0.0007) +[2023-10-11 17:41:50,440][85175] Updated weights for policy 1, policy_version 66220 (0.0009) +[2023-10-11 17:41:50,815][85175] Updated weights for policy 1, policy_version 66230 (0.0010) +[2023-10-11 17:41:51,057][85176] Updated weights for policy 0, policy_version 65252 (0.0010) +[2023-10-11 17:41:51,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 134610944. Throughput: 0: 1661.7, 1: 1684.9. Samples: 33663458. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 17:41:51,063][84230] Avg episode reward: [(0, '42.750'), (1, '42.020')] +[2023-10-11 17:41:51,181][85175] Updated weights for policy 1, policy_version 66240 (0.0007) +[2023-10-11 17:41:51,433][85176] Updated weights for policy 0, policy_version 65262 (0.0009) +[2023-10-11 17:41:51,800][85176] Updated weights for policy 0, policy_version 65272 (0.0009) +[2023-10-11 17:41:55,111][85175] Updated weights for policy 1, policy_version 66250 (0.0009) +[2023-10-11 17:41:55,484][85175] Updated weights for policy 1, policy_version 66260 (0.0009) +[2023-10-11 17:41:55,805][85176] Updated weights for policy 0, policy_version 65282 (0.0008) +[2023-10-11 17:41:55,860][85175] Updated weights for policy 1, policy_version 66270 (0.0008) +[2023-10-11 17:41:56,063][84230] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 134709248. 
Throughput: 0: 1656.8, 1: 1693.3. Samples: 33684326. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 17:41:56,063][84230] Avg episode reward: [(0, '39.840'), (1, '45.690')] +[2023-10-11 17:41:56,169][85176] Updated weights for policy 0, policy_version 65292 (0.0009) +[2023-10-11 17:41:56,546][85176] Updated weights for policy 0, policy_version 65302 (0.0008) +[2023-10-11 17:41:56,912][85176] Updated weights for policy 0, policy_version 65312 (0.0009) +[2023-10-11 17:41:59,845][85175] Updated weights for policy 1, policy_version 66280 (0.0010) +[2023-10-11 17:42:00,219][85175] Updated weights for policy 1, policy_version 66290 (0.0007) +[2023-10-11 17:42:00,588][85175] Updated weights for policy 1, policy_version 66300 (0.0008) +[2023-10-11 17:42:01,063][84230] Fps is (10 sec: 16383.8, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 134774784. Throughput: 0: 1662.1, 1: 1670.3. Samples: 33704258. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 17:42:01,063][84230] Avg episode reward: [(0, '42.170'), (1, '42.280')] +[2023-10-11 17:42:01,067][85176] Updated weights for policy 0, policy_version 65322 (0.0010) +[2023-10-11 17:42:01,447][85176] Updated weights for policy 0, policy_version 65332 (0.0008) +[2023-10-11 17:42:01,809][85176] Updated weights for policy 0, policy_version 65342 (0.0009) +[2023-10-11 17:42:04,709][85175] Updated weights for policy 1, policy_version 66310 (0.0008) +[2023-10-11 17:42:05,106][85175] Updated weights for policy 1, policy_version 66320 (0.0007) +[2023-10-11 17:42:05,477][85175] Updated weights for policy 1, policy_version 66330 (0.0008) +[2023-10-11 17:42:05,889][85176] Updated weights for policy 0, policy_version 65352 (0.0008) +[2023-10-11 17:42:06,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 134840320. Throughput: 0: 1660.8, 1: 1700.1. Samples: 33714484. 
Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) +[2023-10-11 17:42:06,063][84230] Avg episode reward: [(0, '40.460'), (1, '45.360')] +[2023-10-11 17:42:06,264][85176] Updated weights for policy 0, policy_version 65362 (0.0008) +[2023-10-11 17:42:06,633][85176] Updated weights for policy 0, policy_version 65372 (0.0010) +[2023-10-11 17:42:09,562][85175] Updated weights for policy 1, policy_version 66340 (0.0007) +[2023-10-11 17:42:09,933][85175] Updated weights for policy 1, policy_version 66350 (0.0009) +[2023-10-11 17:42:10,317][85175] Updated weights for policy 1, policy_version 66360 (0.0009) +[2023-10-11 17:42:10,722][85176] Updated weights for policy 0, policy_version 65382 (0.0008) +[2023-10-11 17:42:11,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.5). Total num frames: 134905856. Throughput: 0: 1671.7, 1: 1697.1. Samples: 33734950. Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) +[2023-10-11 17:42:11,063][84230] Avg episode reward: [(0, '40.680'), (1, '41.030')] +[2023-10-11 17:42:11,105][85176] Updated weights for policy 0, policy_version 65392 (0.0007) +[2023-10-11 17:42:11,485][85176] Updated weights for policy 0, policy_version 65402 (0.0007) +[2023-10-11 17:42:14,376][85175] Updated weights for policy 1, policy_version 66370 (0.0007) +[2023-10-11 17:42:14,750][85175] Updated weights for policy 1, policy_version 66380 (0.0007) +[2023-10-11 17:42:15,114][85175] Updated weights for policy 1, policy_version 66390 (0.0007) +[2023-10-11 17:42:15,452][85176] Updated weights for policy 0, policy_version 65412 (0.0008) +[2023-10-11 17:42:15,480][85175] Updated weights for policy 1, policy_version 66400 (0.0009) +[2023-10-11 17:42:15,828][85176] Updated weights for policy 0, policy_version 65422 (0.0007) +[2023-10-11 17:42:16,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 134971392. Throughput: 0: 1665.7, 1: 1668.4. Samples: 33754176. 
Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) +[2023-10-11 17:42:16,063][84230] Avg episode reward: [(0, '41.140'), (1, '44.690')] +[2023-10-11 17:42:16,073][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000066400_67993600.pth... +[2023-10-11 17:42:16,104][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000064800_66355200.pth +[2023-10-11 17:42:16,197][85176] Updated weights for policy 0, policy_version 65432 (0.0011) +[2023-10-11 17:42:16,500][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000065440_67010560.pth... +[2023-10-11 17:42:16,530][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000063872_65404928.pth +[2023-10-11 17:42:19,472][85175] Updated weights for policy 1, policy_version 66410 (0.0009) +[2023-10-11 17:42:19,836][85175] Updated weights for policy 1, policy_version 66420 (0.0009) +[2023-10-11 17:42:20,208][85175] Updated weights for policy 1, policy_version 66430 (0.0008) +[2023-10-11 17:42:20,394][85176] Updated weights for policy 0, policy_version 65442 (0.0010) +[2023-10-11 17:42:20,777][85176] Updated weights for policy 0, policy_version 65452 (0.0009) +[2023-10-11 17:42:21,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 135036928. Throughput: 0: 1670.0, 1: 1698.2. Samples: 33764698. 
Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) +[2023-10-11 17:42:21,063][84230] Avg episode reward: [(0, '40.310'), (1, '44.790')] +[2023-10-11 17:42:21,140][85176] Updated weights for policy 0, policy_version 65462 (0.0009) +[2023-10-11 17:42:21,513][85176] Updated weights for policy 0, policy_version 65472 (0.0008) +[2023-10-11 17:42:24,209][85175] Updated weights for policy 1, policy_version 66440 (0.0009) +[2023-10-11 17:42:24,578][85175] Updated weights for policy 1, policy_version 66450 (0.0007) +[2023-10-11 17:42:24,948][85175] Updated weights for policy 1, policy_version 66460 (0.0008) +[2023-10-11 17:42:25,612][85176] Updated weights for policy 0, policy_version 65482 (0.0008) +[2023-10-11 17:42:25,990][85176] Updated weights for policy 0, policy_version 65492 (0.0007) +[2023-10-11 17:42:26,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 135102464. Throughput: 0: 1667.7, 1: 1691.3. Samples: 33784906. Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) +[2023-10-11 17:42:26,063][84230] Avg episode reward: [(0, '41.810'), (1, '46.570')] +[2023-10-11 17:42:26,361][85176] Updated weights for policy 0, policy_version 65502 (0.0007) +[2023-10-11 17:42:28,980][85175] Updated weights for policy 1, policy_version 66470 (0.0010) +[2023-10-11 17:42:29,339][85175] Updated weights for policy 1, policy_version 66480 (0.0008) +[2023-10-11 17:42:29,704][85175] Updated weights for policy 1, policy_version 66490 (0.0008) +[2023-10-11 17:42:30,347][85176] Updated weights for policy 0, policy_version 65512 (0.0009) +[2023-10-11 17:42:30,730][85176] Updated weights for policy 0, policy_version 65522 (0.0007) +[2023-10-11 17:42:31,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 135168000. Throughput: 0: 1660.8, 1: 1682.2. Samples: 33804638. 
Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) +[2023-10-11 17:42:31,063][84230] Avg episode reward: [(0, '40.460'), (1, '43.410')] +[2023-10-11 17:42:31,102][85176] Updated weights for policy 0, policy_version 65532 (0.0007) +[2023-10-11 17:42:33,761][85175] Updated weights for policy 1, policy_version 66500 (0.0009) +[2023-10-11 17:42:34,122][85175] Updated weights for policy 1, policy_version 66510 (0.0010) +[2023-10-11 17:42:34,493][85175] Updated weights for policy 1, policy_version 66520 (0.0011) +[2023-10-11 17:42:35,114][85176] Updated weights for policy 0, policy_version 65542 (0.0007) +[2023-10-11 17:42:35,487][85176] Updated weights for policy 0, policy_version 65552 (0.0007) +[2023-10-11 17:42:35,862][85176] Updated weights for policy 0, policy_version 65562 (0.0008) +[2023-10-11 17:42:36,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 135233536. Throughput: 0: 1675.7, 1: 1707.1. Samples: 33815684. Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) +[2023-10-11 17:42:36,064][84230] Avg episode reward: [(0, '43.760'), (1, '45.920')] +[2023-10-11 17:42:38,516][85175] Updated weights for policy 1, policy_version 66530 (0.0009) +[2023-10-11 17:42:38,892][85175] Updated weights for policy 1, policy_version 66540 (0.0009) +[2023-10-11 17:42:39,252][85175] Updated weights for policy 1, policy_version 66550 (0.0009) +[2023-10-11 17:42:39,627][85175] Updated weights for policy 1, policy_version 66560 (0.0008) +[2023-10-11 17:42:39,970][85176] Updated weights for policy 0, policy_version 65572 (0.0007) +[2023-10-11 17:42:40,342][85176] Updated weights for policy 0, policy_version 65582 (0.0008) +[2023-10-11 17:42:40,711][85176] Updated weights for policy 0, policy_version 65592 (0.0008) +[2023-10-11 17:42:41,063][84230] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 135331840. Throughput: 0: 1681.1, 1: 1679.4. Samples: 33835546. 
Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) +[2023-10-11 17:42:41,063][84230] Avg episode reward: [(0, '41.890'), (1, '44.210')] +[2023-10-11 17:42:43,570][85175] Updated weights for policy 1, policy_version 66570 (0.0009) +[2023-10-11 17:42:43,939][85175] Updated weights for policy 1, policy_version 66580 (0.0010) +[2023-10-11 17:42:44,314][85175] Updated weights for policy 1, policy_version 66590 (0.0009) +[2023-10-11 17:42:44,839][85176] Updated weights for policy 0, policy_version 65602 (0.0010) +[2023-10-11 17:42:45,209][85176] Updated weights for policy 0, policy_version 65612 (0.0007) +[2023-10-11 17:42:45,574][85176] Updated weights for policy 0, policy_version 65622 (0.0007) +[2023-10-11 17:42:45,948][85176] Updated weights for policy 0, policy_version 65632 (0.0007) +[2023-10-11 17:42:46,063][84230] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 135397376. Throughput: 0: 1662.7, 1: 1700.6. Samples: 33855606. Policy #0 lag: (min: 24.0, avg: 50.9, max: 56.0) +[2023-10-11 17:42:46,064][84230] Avg episode reward: [(0, '43.980'), (1, '46.780')] +[2023-10-11 17:42:48,427][85175] Updated weights for policy 1, policy_version 66600 (0.0008) +[2023-10-11 17:42:48,795][85175] Updated weights for policy 1, policy_version 66610 (0.0010) +[2023-10-11 17:42:49,161][85175] Updated weights for policy 1, policy_version 66620 (0.0007) +[2023-10-11 17:42:49,848][85176] Updated weights for policy 0, policy_version 65642 (0.0007) +[2023-10-11 17:42:50,217][85176] Updated weights for policy 0, policy_version 65652 (0.0007) +[2023-10-11 17:42:50,605][85176] Updated weights for policy 0, policy_version 65662 (0.0008) +[2023-10-11 17:42:51,063][84230] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 135462912. Throughput: 0: 1683.6, 1: 1693.3. Samples: 33866444. 
Policy #0 lag: (min: 24.0, avg: 50.9, max: 56.0) +[2023-10-11 17:42:51,064][84230] Avg episode reward: [(0, '42.830'), (1, '45.590')] +[2023-10-11 17:42:53,201][85175] Updated weights for policy 1, policy_version 66630 (0.0010) +[2023-10-11 17:42:53,572][85175] Updated weights for policy 1, policy_version 66640 (0.0011) +[2023-10-11 17:42:53,945][85175] Updated weights for policy 1, policy_version 66650 (0.0008) +[2023-10-11 17:42:54,769][85176] Updated weights for policy 0, policy_version 65672 (0.0010) +[2023-10-11 17:42:55,127][85176] Updated weights for policy 0, policy_version 65682 (0.0009) +[2023-10-11 17:42:55,505][85176] Updated weights for policy 0, policy_version 65692 (0.0007) +[2023-10-11 17:42:56,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 135528448. Throughput: 0: 1680.3, 1: 1682.7. Samples: 33886288. Policy #0 lag: (min: 24.0, avg: 50.9, max: 56.0) +[2023-10-11 17:42:56,064][84230] Avg episode reward: [(0, '43.600'), (1, '43.700')] +[2023-10-11 17:42:58,215][85175] Updated weights for policy 1, policy_version 66660 (0.0008) +[2023-10-11 17:42:58,614][85175] Updated weights for policy 1, policy_version 66670 (0.0008) +[2023-10-11 17:42:58,982][85175] Updated weights for policy 1, policy_version 66680 (0.0007) +[2023-10-11 17:42:59,660][85176] Updated weights for policy 0, policy_version 65702 (0.0010) +[2023-10-11 17:43:00,048][85176] Updated weights for policy 0, policy_version 65712 (0.0010) +[2023-10-11 17:43:00,426][85176] Updated weights for policy 0, policy_version 65722 (0.0008) +[2023-10-11 17:43:01,063][84230] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 135593984. Throughput: 0: 1662.8, 1: 1704.1. Samples: 33905686. 
Policy #0 lag: (min: 24.0, avg: 50.9, max: 56.0) +[2023-10-11 17:43:01,063][84230] Avg episode reward: [(0, '43.570'), (1, '44.920')] +[2023-10-11 17:43:02,928][85175] Updated weights for policy 1, policy_version 66690 (0.0008) +[2023-10-11 17:43:03,300][85175] Updated weights for policy 1, policy_version 66700 (0.0007) +[2023-10-11 17:43:03,669][85175] Updated weights for policy 1, policy_version 66710 (0.0007) +[2023-10-11 17:43:04,040][85175] Updated weights for policy 1, policy_version 66720 (0.0009) +[2023-10-11 17:43:04,479][85176] Updated weights for policy 0, policy_version 65732 (0.0009) +[2023-10-11 17:43:04,866][85176] Updated weights for policy 0, policy_version 65742 (0.0009) +[2023-10-11 17:43:05,237][85176] Updated weights for policy 0, policy_version 65752 (0.0007) +[2023-10-11 17:43:06,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 135659520. Throughput: 0: 1686.7, 1: 1689.1. Samples: 33916610. Policy #0 lag: (min: 24.0, avg: 50.9, max: 56.0) +[2023-10-11 17:43:06,064][84230] Avg episode reward: [(0, '44.280'), (1, '43.110')] +[2023-10-11 17:43:07,869][85175] Updated weights for policy 1, policy_version 66730 (0.0008) +[2023-10-11 17:43:08,244][85175] Updated weights for policy 1, policy_version 66740 (0.0008) +[2023-10-11 17:43:08,616][85175] Updated weights for policy 1, policy_version 66750 (0.0008) +[2023-10-11 17:43:09,278][85176] Updated weights for policy 0, policy_version 65762 (0.0007) +[2023-10-11 17:43:09,653][85176] Updated weights for policy 0, policy_version 65772 (0.0007) +[2023-10-11 17:43:10,035][85176] Updated weights for policy 0, policy_version 65782 (0.0008) +[2023-10-11 17:43:10,398][85176] Updated weights for policy 0, policy_version 65792 (0.0009) +[2023-10-11 17:43:11,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 135725056. Throughput: 0: 1682.2, 1: 1685.9. Samples: 33936468. 
Policy #0 lag: (min: 24.0, avg: 50.9, max: 56.0) +[2023-10-11 17:43:11,063][84230] Avg episode reward: [(0, '44.650'), (1, '43.580')] +[2023-10-11 17:43:12,601][85175] Updated weights for policy 1, policy_version 66760 (0.0010) +[2023-10-11 17:43:12,971][85175] Updated weights for policy 1, policy_version 66770 (0.0009) +[2023-10-11 17:43:13,340][85175] Updated weights for policy 1, policy_version 66780 (0.0007) +[2023-10-11 17:43:14,358][85176] Updated weights for policy 0, policy_version 65802 (0.0009) +[2023-10-11 17:43:14,732][85176] Updated weights for policy 0, policy_version 65812 (0.0009) +[2023-10-11 17:43:15,103][85176] Updated weights for policy 0, policy_version 65822 (0.0009) +[2023-10-11 17:43:16,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 135790592. Throughput: 0: 1671.5, 1: 1703.3. Samples: 33956504. Policy #0 lag: (min: 24.0, avg: 50.9, max: 56.0) +[2023-10-11 17:43:16,063][84230] Avg episode reward: [(0, '44.470'), (1, '42.280')] +[2023-10-11 17:43:17,346][85175] Updated weights for policy 1, policy_version 66790 (0.0007) +[2023-10-11 17:43:17,716][85175] Updated weights for policy 1, policy_version 66800 (0.0007) +[2023-10-11 17:43:18,090][85175] Updated weights for policy 1, policy_version 66810 (0.0009) +[2023-10-11 17:43:19,359][85176] Updated weights for policy 0, policy_version 65832 (0.0009) +[2023-10-11 17:43:19,735][85176] Updated weights for policy 0, policy_version 65842 (0.0009) +[2023-10-11 17:43:20,113][85176] Updated weights for policy 0, policy_version 65852 (0.0008) +[2023-10-11 17:43:21,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 135856128. Throughput: 0: 1684.0, 1: 1674.3. Samples: 33966804. 
Policy #0 lag: (min: 24.0, avg: 50.9, max: 56.0) +[2023-10-11 17:43:21,063][84230] Avg episode reward: [(0, '43.570'), (1, '43.830')] +[2023-10-11 17:43:22,237][85175] Updated weights for policy 1, policy_version 66820 (0.0008) +[2023-10-11 17:43:22,605][85175] Updated weights for policy 1, policy_version 66830 (0.0009) +[2023-10-11 17:43:22,972][85175] Updated weights for policy 1, policy_version 66840 (0.0008) +[2023-10-11 17:43:24,096][85176] Updated weights for policy 0, policy_version 65862 (0.0008) +[2023-10-11 17:43:24,467][85176] Updated weights for policy 0, policy_version 65872 (0.0007) +[2023-10-11 17:43:24,843][85176] Updated weights for policy 0, policy_version 65882 (0.0008) +[2023-10-11 17:43:26,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 135921664. Throughput: 0: 1662.7, 1: 1696.6. Samples: 33986712. Policy #0 lag: (min: 24.0, avg: 50.9, max: 56.0) +[2023-10-11 17:43:26,063][84230] Avg episode reward: [(0, '45.750'), (1, '43.150')] +[2023-10-11 17:43:27,034][85175] Updated weights for policy 1, policy_version 66850 (0.0009) +[2023-10-11 17:43:27,405][85175] Updated weights for policy 1, policy_version 66860 (0.0008) +[2023-10-11 17:43:27,774][85175] Updated weights for policy 1, policy_version 66870 (0.0010) +[2023-10-11 17:43:28,152][85175] Updated weights for policy 1, policy_version 66880 (0.0010) +[2023-10-11 17:43:28,847][85176] Updated weights for policy 0, policy_version 65892 (0.0009) +[2023-10-11 17:43:29,225][85176] Updated weights for policy 0, policy_version 65902 (0.0010) +[2023-10-11 17:43:29,603][85176] Updated weights for policy 0, policy_version 65912 (0.0009) +[2023-10-11 17:43:31,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 135987200. Throughput: 0: 1668.5, 1: 1694.5. Samples: 34006940. 
Policy #0 lag: (min: 31.0, avg: 46.7, max: 63.0) +[2023-10-11 17:43:31,064][84230] Avg episode reward: [(0, '44.650'), (1, '47.360')] +[2023-10-11 17:43:32,274][85175] Updated weights for policy 1, policy_version 66890 (0.0008) +[2023-10-11 17:43:32,651][85175] Updated weights for policy 1, policy_version 66900 (0.0008) +[2023-10-11 17:43:33,021][85175] Updated weights for policy 1, policy_version 66910 (0.0007) +[2023-10-11 17:43:33,628][85176] Updated weights for policy 0, policy_version 65922 (0.0011) +[2023-10-11 17:43:33,991][85176] Updated weights for policy 0, policy_version 65932 (0.0009) +[2023-10-11 17:43:34,370][85176] Updated weights for policy 0, policy_version 65942 (0.0008) +[2023-10-11 17:43:34,733][85176] Updated weights for policy 0, policy_version 65952 (0.0007) +[2023-10-11 17:43:36,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 136052736. Throughput: 0: 1677.1, 1: 1677.0. Samples: 34017380. Policy #0 lag: (min: 31.0, avg: 46.7, max: 63.0) +[2023-10-11 17:43:36,063][84230] Avg episode reward: [(0, '45.440'), (1, '44.180')] +[2023-10-11 17:43:37,006][85175] Updated weights for policy 1, policy_version 66920 (0.0008) +[2023-10-11 17:43:37,374][85175] Updated weights for policy 1, policy_version 66930 (0.0008) +[2023-10-11 17:43:37,753][85175] Updated weights for policy 1, policy_version 66940 (0.0008) +[2023-10-11 17:43:38,771][85176] Updated weights for policy 0, policy_version 65962 (0.0008) +[2023-10-11 17:43:39,138][85176] Updated weights for policy 0, policy_version 65972 (0.0010) +[2023-10-11 17:43:39,513][85176] Updated weights for policy 0, policy_version 65982 (0.0010) +[2023-10-11 17:43:41,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 136118272. Throughput: 0: 1652.8, 1: 1700.9. Samples: 34037206. 
Policy #0 lag: (min: 31.0, avg: 46.7, max: 63.0) +[2023-10-11 17:43:41,064][84230] Avg episode reward: [(0, '43.160'), (1, '48.430')] +[2023-10-11 17:43:41,587][85175] Updated weights for policy 1, policy_version 66950 (0.0009) +[2023-10-11 17:43:41,960][85175] Updated weights for policy 1, policy_version 66960 (0.0010) +[2023-10-11 17:43:42,324][85175] Updated weights for policy 1, policy_version 66970 (0.0008) +[2023-10-11 17:43:43,658][85176] Updated weights for policy 0, policy_version 65992 (0.0010) +[2023-10-11 17:43:44,035][85176] Updated weights for policy 0, policy_version 66002 (0.0008) +[2023-10-11 17:43:44,404][85176] Updated weights for policy 0, policy_version 66012 (0.0008) +[2023-10-11 17:43:46,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 136183808. Throughput: 0: 1675.7, 1: 1711.8. Samples: 34058124. Policy #0 lag: (min: 31.0, avg: 46.7, max: 63.0) +[2023-10-11 17:43:46,063][84230] Avg episode reward: [(0, '47.620'), (1, '43.790')] +[2023-10-11 17:43:46,072][84801] Saving new best policy, reward=47.620! +[2023-10-11 17:43:46,324][85175] Updated weights for policy 1, policy_version 66980 (0.0008) +[2023-10-11 17:43:46,721][85175] Updated weights for policy 1, policy_version 66990 (0.0008) +[2023-10-11 17:43:47,084][85175] Updated weights for policy 1, policy_version 67000 (0.0009) +[2023-10-11 17:43:48,684][85176] Updated weights for policy 0, policy_version 66022 (0.0009) +[2023-10-11 17:43:49,058][85176] Updated weights for policy 0, policy_version 66032 (0.0010) +[2023-10-11 17:43:49,431][85176] Updated weights for policy 0, policy_version 66042 (0.0008) +[2023-10-11 17:43:51,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 136249344. Throughput: 0: 1668.6, 1: 1695.3. Samples: 34067984. 
Policy #0 lag: (min: 31.0, avg: 46.7, max: 63.0) +[2023-10-11 17:43:51,064][84230] Avg episode reward: [(0, '43.670'), (1, '46.730')] +[2023-10-11 17:43:51,085][85175] Updated weights for policy 1, policy_version 67010 (0.0009) +[2023-10-11 17:43:51,451][85175] Updated weights for policy 1, policy_version 67020 (0.0010) +[2023-10-11 17:43:51,813][85175] Updated weights for policy 1, policy_version 67030 (0.0008) +[2023-10-11 17:43:52,180][85175] Updated weights for policy 1, policy_version 67040 (0.0011) +[2023-10-11 17:43:53,551][85176] Updated weights for policy 0, policy_version 66052 (0.0009) +[2023-10-11 17:43:53,920][85176] Updated weights for policy 0, policy_version 66062 (0.0011) +[2023-10-11 17:43:54,294][85176] Updated weights for policy 0, policy_version 66072 (0.0008) +[2023-10-11 17:43:56,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 136314880. Throughput: 0: 1652.7, 1: 1710.1. Samples: 34087792. Policy #0 lag: (min: 31.0, avg: 46.7, max: 63.0) +[2023-10-11 17:43:56,063][84230] Avg episode reward: [(0, '46.730'), (1, '42.660')] +[2023-10-11 17:43:56,221][85175] Updated weights for policy 1, policy_version 67050 (0.0008) +[2023-10-11 17:43:56,580][85175] Updated weights for policy 1, policy_version 67060 (0.0008) +[2023-10-11 17:43:56,945][85175] Updated weights for policy 1, policy_version 67070 (0.0009) +[2023-10-11 17:43:58,483][85176] Updated weights for policy 0, policy_version 66082 (0.0008) +[2023-10-11 17:43:58,858][85176] Updated weights for policy 0, policy_version 66092 (0.0008) +[2023-10-11 17:43:59,231][85176] Updated weights for policy 0, policy_version 66102 (0.0009) +[2023-10-11 17:43:59,608][85176] Updated weights for policy 0, policy_version 66112 (0.0010) +[2023-10-11 17:44:01,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 136380416. Throughput: 0: 1671.6, 1: 1701.4. Samples: 34108290. 
Policy #0 lag: (min: 31.0, avg: 46.7, max: 63.0) +[2023-10-11 17:44:01,063][84230] Avg episode reward: [(0, '39.630'), (1, '44.530')] +[2023-10-11 17:44:01,101][85175] Updated weights for policy 1, policy_version 67080 (0.0010) +[2023-10-11 17:44:01,470][85175] Updated weights for policy 1, policy_version 67090 (0.0011) +[2023-10-11 17:44:01,839][85175] Updated weights for policy 1, policy_version 67100 (0.0010) +[2023-10-11 17:44:03,541][85176] Updated weights for policy 0, policy_version 66122 (0.0009) +[2023-10-11 17:44:03,912][85176] Updated weights for policy 0, policy_version 66132 (0.0009) +[2023-10-11 17:44:04,289][85176] Updated weights for policy 0, policy_version 66142 (0.0009) +[2023-10-11 17:44:05,834][85175] Updated weights for policy 1, policy_version 67110 (0.0010) +[2023-10-11 17:44:06,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 136445952. Throughput: 0: 1668.0, 1: 1701.9. Samples: 34118450. Policy #0 lag: (min: 31.0, avg: 46.7, max: 63.0) +[2023-10-11 17:44:06,063][84230] Avg episode reward: [(0, '46.190'), (1, '40.710')] +[2023-10-11 17:44:06,208][85175] Updated weights for policy 1, policy_version 67120 (0.0009) +[2023-10-11 17:44:06,569][85175] Updated weights for policy 1, policy_version 67130 (0.0008) +[2023-10-11 17:44:08,282][85176] Updated weights for policy 0, policy_version 66152 (0.0007) +[2023-10-11 17:44:08,653][85176] Updated weights for policy 0, policy_version 66162 (0.0008) +[2023-10-11 17:44:09,028][85176] Updated weights for policy 0, policy_version 66172 (0.0009) +[2023-10-11 17:44:10,757][85175] Updated weights for policy 1, policy_version 67140 (0.0010) +[2023-10-11 17:44:11,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.5). Total num frames: 136511488. Throughput: 0: 1668.1, 1: 1704.0. Samples: 34138456. 
Policy #0 lag: (min: 31.0, avg: 46.7, max: 63.0) +[2023-10-11 17:44:11,063][84230] Avg episode reward: [(0, '40.680'), (1, '44.330')] +[2023-10-11 17:44:11,120][85175] Updated weights for policy 1, policy_version 67150 (0.0008) +[2023-10-11 17:44:11,496][85175] Updated weights for policy 1, policy_version 67160 (0.0009) +[2023-10-11 17:44:13,033][85176] Updated weights for policy 0, policy_version 66182 (0.0007) +[2023-10-11 17:44:13,397][85176] Updated weights for policy 0, policy_version 66192 (0.0007) +[2023-10-11 17:44:13,776][85176] Updated weights for policy 0, policy_version 66202 (0.0009) +[2023-10-11 17:44:15,289][85175] Updated weights for policy 1, policy_version 67170 (0.0009) +[2023-10-11 17:44:15,663][85175] Updated weights for policy 1, policy_version 67180 (0.0007) +[2023-10-11 17:44:16,038][85175] Updated weights for policy 1, policy_version 67190 (0.0007) +[2023-10-11 17:44:16,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 136577024. Throughput: 0: 1679.7, 1: 1702.1. Samples: 34159122. Policy #0 lag: (min: 2.0, avg: 3.8, max: 29.0) +[2023-10-11 17:44:16,063][84230] Avg episode reward: [(0, '46.430'), (1, '44.110')] +[2023-10-11 17:44:16,075][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000066208_67796992.pth... +[2023-10-11 17:44:16,109][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000064640_66191360.pth +[2023-10-11 17:44:16,396][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000067200_68812800.pth... 
+[2023-10-11 17:44:16,396][85175] Updated weights for policy 1, policy_version 67200 (0.0008) +[2023-10-11 17:44:16,436][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000065600_67174400.pth +[2023-10-11 17:44:17,783][85176] Updated weights for policy 0, policy_version 66212 (0.0008) +[2023-10-11 17:44:18,159][85176] Updated weights for policy 0, policy_version 66222 (0.0009) +[2023-10-11 17:44:18,530][85176] Updated weights for policy 0, policy_version 66232 (0.0008) +[2023-10-11 17:44:20,425][85175] Updated weights for policy 1, policy_version 67210 (0.0008) +[2023-10-11 17:44:20,788][85175] Updated weights for policy 1, policy_version 67220 (0.0008) +[2023-10-11 17:44:21,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 136642560. Throughput: 0: 1660.1, 1: 1710.2. Samples: 34169042. Policy #0 lag: (min: 2.0, avg: 3.8, max: 29.0) +[2023-10-11 17:44:21,063][84230] Avg episode reward: [(0, '40.500'), (1, '46.020')] +[2023-10-11 17:44:21,151][85175] Updated weights for policy 1, policy_version 67230 (0.0009) +[2023-10-11 17:44:22,585][85176] Updated weights for policy 0, policy_version 66242 (0.0008) +[2023-10-11 17:44:22,948][85176] Updated weights for policy 0, policy_version 66252 (0.0007) +[2023-10-11 17:44:23,317][85176] Updated weights for policy 0, policy_version 66262 (0.0010) +[2023-10-11 17:44:23,691][85176] Updated weights for policy 0, policy_version 66272 (0.0010) +[2023-10-11 17:44:25,078][85175] Updated weights for policy 1, policy_version 67240 (0.0009) +[2023-10-11 17:44:25,450][85175] Updated weights for policy 1, policy_version 67250 (0.0007) +[2023-10-11 17:44:25,819][85175] Updated weights for policy 1, policy_version 67260 (0.0007) +[2023-10-11 17:44:26,063][84230] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 136740864. Throughput: 0: 1682.0, 1: 1703.6. Samples: 34189554. 
Policy #0 lag: (min: 2.0, avg: 3.8, max: 29.0) +[2023-10-11 17:44:26,063][84230] Avg episode reward: [(0, '46.090'), (1, '44.200')] +[2023-10-11 17:44:27,520][85176] Updated weights for policy 0, policy_version 66282 (0.0008) +[2023-10-11 17:44:27,898][85176] Updated weights for policy 0, policy_version 66292 (0.0009) +[2023-10-11 17:44:28,279][85176] Updated weights for policy 0, policy_version 66302 (0.0011) +[2023-10-11 17:44:29,918][85175] Updated weights for policy 1, policy_version 67270 (0.0007) +[2023-10-11 17:44:30,283][85175] Updated weights for policy 1, policy_version 67280 (0.0008) +[2023-10-11 17:44:30,653][85175] Updated weights for policy 1, policy_version 67290 (0.0008) +[2023-10-11 17:44:31,063][84230] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 136806400. Throughput: 0: 1685.5, 1: 1677.6. Samples: 34209462. Policy #0 lag: (min: 2.0, avg: 3.8, max: 29.0) +[2023-10-11 17:44:31,063][84230] Avg episode reward: [(0, '40.260'), (1, '44.960')] +[2023-10-11 17:44:32,304][85176] Updated weights for policy 0, policy_version 66312 (0.0009) +[2023-10-11 17:44:32,670][85176] Updated weights for policy 0, policy_version 66322 (0.0010) +[2023-10-11 17:44:33,050][85176] Updated weights for policy 0, policy_version 66332 (0.0007) +[2023-10-11 17:44:34,821][85175] Updated weights for policy 1, policy_version 67300 (0.0008) +[2023-10-11 17:44:35,216][85175] Updated weights for policy 1, policy_version 67310 (0.0007) +[2023-10-11 17:44:35,583][85175] Updated weights for policy 1, policy_version 67320 (0.0008) +[2023-10-11 17:44:36,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 136871936. Throughput: 0: 1665.5, 1: 1699.3. Samples: 34219402. 
Policy #0 lag: (min: 2.0, avg: 3.8, max: 29.0) +[2023-10-11 17:44:36,064][84230] Avg episode reward: [(0, '46.830'), (1, '42.100')] +[2023-10-11 17:44:37,378][85176] Updated weights for policy 0, policy_version 66342 (0.0008) +[2023-10-11 17:44:37,767][85176] Updated weights for policy 0, policy_version 66352 (0.0009) +[2023-10-11 17:44:38,134][85176] Updated weights for policy 0, policy_version 66362 (0.0009) +[2023-10-11 17:44:39,507][85175] Updated weights for policy 1, policy_version 67330 (0.0007) +[2023-10-11 17:44:39,875][85175] Updated weights for policy 1, policy_version 67340 (0.0007) +[2023-10-11 17:44:40,243][85175] Updated weights for policy 1, policy_version 67350 (0.0008) +[2023-10-11 17:44:40,609][85175] Updated weights for policy 1, policy_version 67360 (0.0008) +[2023-10-11 17:44:41,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 136937472. Throughput: 0: 1679.3, 1: 1694.7. Samples: 34239620. Policy #0 lag: (min: 2.0, avg: 3.8, max: 29.0) +[2023-10-11 17:44:41,064][84230] Avg episode reward: [(0, '43.920'), (1, '43.910')] +[2023-10-11 17:44:42,415][85176] Updated weights for policy 0, policy_version 66372 (0.0011) +[2023-10-11 17:44:42,800][85176] Updated weights for policy 0, policy_version 66382 (0.0010) +[2023-10-11 17:44:43,166][85176] Updated weights for policy 0, policy_version 66392 (0.0010) +[2023-10-11 17:44:44,616][85175] Updated weights for policy 1, policy_version 67370 (0.0009) +[2023-10-11 17:44:44,988][85175] Updated weights for policy 1, policy_version 67380 (0.0008) +[2023-10-11 17:44:45,358][85175] Updated weights for policy 1, policy_version 67390 (0.0008) +[2023-10-11 17:44:46,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 137003008. Throughput: 0: 1682.1, 1: 1674.5. Samples: 34259338. 
Policy #0 lag: (min: 2.0, avg: 3.8, max: 29.0) +[2023-10-11 17:44:46,064][84230] Avg episode reward: [(0, '47.360'), (1, '46.540')] +[2023-10-11 17:44:47,017][85176] Updated weights for policy 0, policy_version 66402 (0.0008) +[2023-10-11 17:44:47,385][85176] Updated weights for policy 0, policy_version 66412 (0.0007) +[2023-10-11 17:44:47,756][85176] Updated weights for policy 0, policy_version 66422 (0.0008) +[2023-10-11 17:44:48,123][85176] Updated weights for policy 0, policy_version 66432 (0.0009) +[2023-10-11 17:44:49,360][85175] Updated weights for policy 1, policy_version 67400 (0.0010) +[2023-10-11 17:44:49,732][85175] Updated weights for policy 1, policy_version 67410 (0.0009) +[2023-10-11 17:44:50,097][85175] Updated weights for policy 1, policy_version 67420 (0.0010) +[2023-10-11 17:44:51,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 137068544. Throughput: 0: 1661.0, 1: 1704.7. Samples: 34269908. Policy #0 lag: (min: 2.0, avg: 3.8, max: 29.0) +[2023-10-11 17:44:51,063][84230] Avg episode reward: [(0, '41.450'), (1, '43.080')] +[2023-10-11 17:44:52,088][85176] Updated weights for policy 0, policy_version 66442 (0.0007) +[2023-10-11 17:44:52,464][85176] Updated weights for policy 0, policy_version 66452 (0.0007) +[2023-10-11 17:44:52,836][85176] Updated weights for policy 0, policy_version 66462 (0.0008) +[2023-10-11 17:44:54,094][85175] Updated weights for policy 1, policy_version 67430 (0.0010) +[2023-10-11 17:44:54,458][85175] Updated weights for policy 1, policy_version 67440 (0.0009) +[2023-10-11 17:44:54,828][85175] Updated weights for policy 1, policy_version 67450 (0.0010) +[2023-10-11 17:44:56,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 137134080. Throughput: 0: 1680.5, 1: 1688.8. Samples: 34290074. 
Policy #0 lag: (min: 2.0, avg: 3.8, max: 29.0) +[2023-10-11 17:44:56,063][84230] Avg episode reward: [(0, '46.220'), (1, '44.240')] +[2023-10-11 17:44:56,897][85176] Updated weights for policy 0, policy_version 66472 (0.0009) +[2023-10-11 17:44:57,270][85176] Updated weights for policy 0, policy_version 66482 (0.0009) +[2023-10-11 17:44:57,653][85176] Updated weights for policy 0, policy_version 66492 (0.0008) +[2023-10-11 17:44:58,851][85175] Updated weights for policy 1, policy_version 67460 (0.0008) +[2023-10-11 17:44:59,224][85175] Updated weights for policy 1, policy_version 67470 (0.0008) +[2023-10-11 17:44:59,587][85175] Updated weights for policy 1, policy_version 67480 (0.0008) +[2023-10-11 17:45:01,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 137199616. Throughput: 0: 1683.8, 1: 1681.3. Samples: 34310554. Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) +[2023-10-11 17:45:01,064][84230] Avg episode reward: [(0, '41.370'), (1, '40.530')] +[2023-10-11 17:45:01,810][85176] Updated weights for policy 0, policy_version 66502 (0.0008) +[2023-10-11 17:45:02,180][85176] Updated weights for policy 0, policy_version 66512 (0.0010) +[2023-10-11 17:45:02,547][85176] Updated weights for policy 0, policy_version 66522 (0.0008) +[2023-10-11 17:45:03,441][85175] Updated weights for policy 1, policy_version 67490 (0.0009) +[2023-10-11 17:45:03,809][85175] Updated weights for policy 1, policy_version 67500 (0.0008) +[2023-10-11 17:45:04,176][85175] Updated weights for policy 1, policy_version 67510 (0.0009) +[2023-10-11 17:45:04,544][85175] Updated weights for policy 1, policy_version 67520 (0.0008) +[2023-10-11 17:45:06,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 137265152. Throughput: 0: 1673.4, 1: 1701.6. Samples: 34320918. 
Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) +[2023-10-11 17:45:06,063][84230] Avg episode reward: [(0, '43.240'), (1, '45.230')] +[2023-10-11 17:45:06,761][85176] Updated weights for policy 0, policy_version 66532 (0.0010) +[2023-10-11 17:45:07,130][85176] Updated weights for policy 0, policy_version 66542 (0.0011) +[2023-10-11 17:45:07,501][85176] Updated weights for policy 0, policy_version 66552 (0.0010) +[2023-10-11 17:45:08,767][85175] Updated weights for policy 1, policy_version 67530 (0.0010) +[2023-10-11 17:45:09,138][85175] Updated weights for policy 1, policy_version 67540 (0.0011) +[2023-10-11 17:45:09,507][85175] Updated weights for policy 1, policy_version 67550 (0.0010) +[2023-10-11 17:45:11,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 137330688. Throughput: 0: 1674.8, 1: 1682.4. Samples: 34340630. Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) +[2023-10-11 17:45:11,063][84230] Avg episode reward: [(0, '38.520'), (1, '39.910')] +[2023-10-11 17:45:11,577][85176] Updated weights for policy 0, policy_version 66562 (0.0010) +[2023-10-11 17:45:11,949][85176] Updated weights for policy 0, policy_version 66572 (0.0007) +[2023-10-11 17:45:12,329][85176] Updated weights for policy 0, policy_version 66582 (0.0007) +[2023-10-11 17:45:12,695][85176] Updated weights for policy 0, policy_version 66592 (0.0007) +[2023-10-11 17:45:13,339][85175] Updated weights for policy 1, policy_version 67560 (0.0007) +[2023-10-11 17:45:13,704][85175] Updated weights for policy 1, policy_version 67570 (0.0007) +[2023-10-11 17:45:14,063][85175] Updated weights for policy 1, policy_version 67580 (0.0008) +[2023-10-11 17:45:16,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 137396224. Throughput: 0: 1671.0, 1: 1701.3. Samples: 34361214. 
Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) +[2023-10-11 17:45:16,064][84230] Avg episode reward: [(0, '42.740'), (1, '46.230')] +[2023-10-11 17:45:16,913][85176] Updated weights for policy 0, policy_version 66602 (0.0007) +[2023-10-11 17:45:17,275][85176] Updated weights for policy 0, policy_version 66612 (0.0009) +[2023-10-11 17:45:17,653][85176] Updated weights for policy 0, policy_version 66622 (0.0010) +[2023-10-11 17:45:18,150][85175] Updated weights for policy 1, policy_version 67590 (0.0008) +[2023-10-11 17:45:18,516][85175] Updated weights for policy 1, policy_version 67600 (0.0007) +[2023-10-11 17:45:18,880][85175] Updated weights for policy 1, policy_version 67610 (0.0007) +[2023-10-11 17:45:21,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 137461760. Throughput: 0: 1672.1, 1: 1699.5. Samples: 34371122. Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) +[2023-10-11 17:45:21,063][84230] Avg episode reward: [(0, '42.650'), (1, '39.850')] +[2023-10-11 17:45:21,757][85176] Updated weights for policy 0, policy_version 66632 (0.0009) +[2023-10-11 17:45:22,125][85176] Updated weights for policy 0, policy_version 66642 (0.0009) +[2023-10-11 17:45:22,495][85176] Updated weights for policy 0, policy_version 66652 (0.0010) +[2023-10-11 17:45:22,923][85175] Updated weights for policy 1, policy_version 67620 (0.0009) +[2023-10-11 17:45:23,291][85175] Updated weights for policy 1, policy_version 67630 (0.0007) +[2023-10-11 17:45:23,664][85175] Updated weights for policy 1, policy_version 67640 (0.0010) +[2023-10-11 17:45:26,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 137527296. Throughput: 0: 1679.9, 1: 1691.6. Samples: 34391336. 
Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) +[2023-10-11 17:45:26,063][84230] Avg episode reward: [(0, '43.180'), (1, '45.210')] +[2023-10-11 17:45:26,528][85176] Updated weights for policy 0, policy_version 66662 (0.0008) +[2023-10-11 17:45:26,909][85176] Updated weights for policy 0, policy_version 66672 (0.0009) +[2023-10-11 17:45:27,286][85176] Updated weights for policy 0, policy_version 66682 (0.0010) +[2023-10-11 17:45:27,544][85175] Updated weights for policy 1, policy_version 67650 (0.0009) +[2023-10-11 17:45:27,917][85175] Updated weights for policy 1, policy_version 67660 (0.0008) +[2023-10-11 17:45:28,282][85175] Updated weights for policy 1, policy_version 67670 (0.0009) +[2023-10-11 17:45:28,650][85175] Updated weights for policy 1, policy_version 67680 (0.0009) +[2023-10-11 17:45:31,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 137592832. Throughput: 0: 1673.8, 1: 1719.6. Samples: 34412040. Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) +[2023-10-11 17:45:31,064][84230] Avg episode reward: [(0, '42.210'), (1, '40.330')] +[2023-10-11 17:45:31,396][85176] Updated weights for policy 0, policy_version 66692 (0.0009) +[2023-10-11 17:45:31,772][85176] Updated weights for policy 0, policy_version 66702 (0.0009) +[2023-10-11 17:45:32,134][85176] Updated weights for policy 0, policy_version 66712 (0.0009) +[2023-10-11 17:45:32,562][85175] Updated weights for policy 1, policy_version 67690 (0.0008) +[2023-10-11 17:45:32,938][85175] Updated weights for policy 1, policy_version 67700 (0.0009) +[2023-10-11 17:45:33,306][85175] Updated weights for policy 1, policy_version 67710 (0.0008) +[2023-10-11 17:45:36,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 137658368. Throughput: 0: 1672.8, 1: 1691.5. Samples: 34421302. 
Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) +[2023-10-11 17:45:36,063][84230] Avg episode reward: [(0, '44.370'), (1, '46.350')] +[2023-10-11 17:45:36,178][85176] Updated weights for policy 0, policy_version 66722 (0.0008) +[2023-10-11 17:45:36,548][85176] Updated weights for policy 0, policy_version 66732 (0.0011) +[2023-10-11 17:45:36,927][85176] Updated weights for policy 0, policy_version 66742 (0.0008) +[2023-10-11 17:45:37,262][85175] Updated weights for policy 1, policy_version 67720 (0.0008) +[2023-10-11 17:45:37,303][85176] Updated weights for policy 0, policy_version 66752 (0.0008) +[2023-10-11 17:45:37,632][85175] Updated weights for policy 1, policy_version 67730 (0.0007) +[2023-10-11 17:45:37,997][85175] Updated weights for policy 1, policy_version 67740 (0.0008) +[2023-10-11 17:45:41,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 137723904. Throughput: 0: 1670.1, 1: 1705.8. Samples: 34441990. Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) +[2023-10-11 17:45:41,063][84230] Avg episode reward: [(0, '45.100'), (1, '39.820')] +[2023-10-11 17:45:41,393][85176] Updated weights for policy 0, policy_version 66762 (0.0008) +[2023-10-11 17:45:41,770][85176] Updated weights for policy 0, policy_version 66772 (0.0008) +[2023-10-11 17:45:41,995][85175] Updated weights for policy 1, policy_version 67750 (0.0009) +[2023-10-11 17:45:42,145][85176] Updated weights for policy 0, policy_version 66782 (0.0009) +[2023-10-11 17:45:42,359][85175] Updated weights for policy 1, policy_version 67760 (0.0009) +[2023-10-11 17:45:42,732][85175] Updated weights for policy 1, policy_version 67770 (0.0010) +[2023-10-11 17:45:46,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 137789440. Throughput: 0: 1666.5, 1: 1725.3. Samples: 34463182. 
Policy #0 lag: (min: 29.0, avg: 31.6, max: 61.0) +[2023-10-11 17:45:46,063][84230] Avg episode reward: [(0, '46.020'), (1, '45.030')] +[2023-10-11 17:45:46,092][85176] Updated weights for policy 0, policy_version 66792 (0.0008) +[2023-10-11 17:45:46,474][85176] Updated weights for policy 0, policy_version 66802 (0.0008) +[2023-10-11 17:45:46,696][85175] Updated weights for policy 1, policy_version 67780 (0.0010) +[2023-10-11 17:45:46,842][85176] Updated weights for policy 0, policy_version 66812 (0.0007) +[2023-10-11 17:45:47,066][85175] Updated weights for policy 1, policy_version 67790 (0.0008) +[2023-10-11 17:45:47,427][85175] Updated weights for policy 1, policy_version 67800 (0.0008) +[2023-10-11 17:45:51,046][85176] Updated weights for policy 0, policy_version 66822 (0.0010) +[2023-10-11 17:45:51,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 137854976. Throughput: 0: 1668.7, 1: 1699.2. Samples: 34472474. Policy #0 lag: (min: 29.0, avg: 31.6, max: 61.0) +[2023-10-11 17:45:51,063][84230] Avg episode reward: [(0, '42.850'), (1, '37.220')] +[2023-10-11 17:45:51,344][85175] Updated weights for policy 1, policy_version 67810 (0.0010) +[2023-10-11 17:45:51,416][85176] Updated weights for policy 0, policy_version 66832 (0.0010) +[2023-10-11 17:45:51,717][85175] Updated weights for policy 1, policy_version 67820 (0.0008) +[2023-10-11 17:45:51,794][85176] Updated weights for policy 0, policy_version 66842 (0.0009) +[2023-10-11 17:45:52,084][85175] Updated weights for policy 1, policy_version 67830 (0.0009) +[2023-10-11 17:45:52,455][85175] Updated weights for policy 1, policy_version 67840 (0.0011) +[2023-10-11 17:45:55,884][85176] Updated weights for policy 0, policy_version 66852 (0.0007) +[2023-10-11 17:45:56,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 137920512. Throughput: 0: 1673.1, 1: 1720.3. Samples: 34493338. 
Policy #0 lag: (min: 29.0, avg: 31.6, max: 61.0) +[2023-10-11 17:45:56,064][84230] Avg episode reward: [(0, '43.500'), (1, '45.550')] +[2023-10-11 17:45:56,260][85176] Updated weights for policy 0, policy_version 66862 (0.0008) +[2023-10-11 17:45:56,511][85175] Updated weights for policy 1, policy_version 67850 (0.0009) +[2023-10-11 17:45:56,631][85176] Updated weights for policy 0, policy_version 66872 (0.0009) +[2023-10-11 17:45:56,875][85175] Updated weights for policy 1, policy_version 67860 (0.0008) +[2023-10-11 17:45:57,248][85175] Updated weights for policy 1, policy_version 67870 (0.0008) +[2023-10-11 17:46:00,695][85176] Updated weights for policy 0, policy_version 66882 (0.0009) +[2023-10-11 17:46:01,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 137986048. Throughput: 0: 1676.5, 1: 1721.3. Samples: 34514116. Policy #0 lag: (min: 29.0, avg: 31.6, max: 61.0) +[2023-10-11 17:46:01,063][84230] Avg episode reward: [(0, '41.160'), (1, '39.130')] +[2023-10-11 17:46:01,077][85176] Updated weights for policy 0, policy_version 66892 (0.0008) +[2023-10-11 17:46:01,320][85175] Updated weights for policy 1, policy_version 67880 (0.0008) +[2023-10-11 17:46:01,440][85176] Updated weights for policy 0, policy_version 66902 (0.0008) +[2023-10-11 17:46:01,693][85175] Updated weights for policy 1, policy_version 67890 (0.0008) +[2023-10-11 17:46:01,806][85176] Updated weights for policy 0, policy_version 66912 (0.0009) +[2023-10-11 17:46:02,058][85175] Updated weights for policy 1, policy_version 67900 (0.0009) +[2023-10-11 17:46:05,956][85176] Updated weights for policy 0, policy_version 66922 (0.0007) +[2023-10-11 17:46:05,967][85175] Updated weights for policy 1, policy_version 67910 (0.0008) +[2023-10-11 17:46:06,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 138051584. Throughput: 0: 1672.2, 1: 1705.8. Samples: 34523130. 
Policy #0 lag: (min: 29.0, avg: 31.6, max: 61.0) +[2023-10-11 17:46:06,063][84230] Avg episode reward: [(0, '43.850'), (1, '45.900')] +[2023-10-11 17:46:06,330][85176] Updated weights for policy 0, policy_version 66932 (0.0007) +[2023-10-11 17:46:06,336][85175] Updated weights for policy 1, policy_version 67920 (0.0008) +[2023-10-11 17:46:06,698][85176] Updated weights for policy 0, policy_version 66942 (0.0010) +[2023-10-11 17:46:06,702][85175] Updated weights for policy 1, policy_version 67930 (0.0008) +[2023-10-11 17:46:10,765][85176] Updated weights for policy 0, policy_version 66952 (0.0008) +[2023-10-11 17:46:10,785][85175] Updated weights for policy 1, policy_version 67940 (0.0009) +[2023-10-11 17:46:11,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 138117120. Throughput: 0: 1669.8, 1: 1716.4. Samples: 34543716. Policy #0 lag: (min: 29.0, avg: 31.6, max: 61.0) +[2023-10-11 17:46:11,063][84230] Avg episode reward: [(0, '46.240'), (1, '41.650')] +[2023-10-11 17:46:11,134][85176] Updated weights for policy 0, policy_version 66962 (0.0007) +[2023-10-11 17:46:11,180][85175] Updated weights for policy 1, policy_version 67950 (0.0008) +[2023-10-11 17:46:11,505][85176] Updated weights for policy 0, policy_version 66972 (0.0010) +[2023-10-11 17:46:11,544][85175] Updated weights for policy 1, policy_version 67960 (0.0007) +[2023-10-11 17:46:15,505][85175] Updated weights for policy 1, policy_version 67970 (0.0007) +[2023-10-11 17:46:15,683][85176] Updated weights for policy 0, policy_version 66982 (0.0007) +[2023-10-11 17:46:15,874][85175] Updated weights for policy 1, policy_version 67980 (0.0009) +[2023-10-11 17:46:16,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 138182656. Throughput: 0: 1666.2, 1: 1714.1. Samples: 34564156. 
Policy #0 lag: (min: 29.0, avg: 31.6, max: 61.0) +[2023-10-11 17:46:16,063][84230] Avg episode reward: [(0, '44.890'), (1, '47.060')] +[2023-10-11 17:46:16,075][85176] Updated weights for policy 0, policy_version 66992 (0.0007) +[2023-10-11 17:46:16,251][85175] Updated weights for policy 1, policy_version 67990 (0.0008) +[2023-10-11 17:46:16,443][85176] Updated weights for policy 0, policy_version 67002 (0.0008) +[2023-10-11 17:46:16,615][85175] Updated weights for policy 1, policy_version 68000 (0.0007) +[2023-10-11 17:46:16,616][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000068000_69632000.pth... +[2023-10-11 17:46:16,646][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000066400_67993600.pth +[2023-10-11 17:46:16,663][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000067008_68616192.pth... +[2023-10-11 17:46:16,692][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000065440_67010560.pth +[2023-10-11 17:46:20,435][85176] Updated weights for policy 0, policy_version 67012 (0.0008) +[2023-10-11 17:46:20,705][85175] Updated weights for policy 1, policy_version 68010 (0.0007) +[2023-10-11 17:46:20,811][85176] Updated weights for policy 0, policy_version 67022 (0.0008) +[2023-10-11 17:46:21,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 138248192. Throughput: 0: 1667.2, 1: 1710.5. Samples: 34573300. 
Policy #0 lag: (min: 29.0, avg: 31.6, max: 61.0) +[2023-10-11 17:46:21,064][84230] Avg episode reward: [(0, '46.250'), (1, '41.170')] +[2023-10-11 17:46:21,068][85175] Updated weights for policy 1, policy_version 68020 (0.0007) +[2023-10-11 17:46:21,185][85176] Updated weights for policy 0, policy_version 67032 (0.0008) +[2023-10-11 17:46:21,450][85175] Updated weights for policy 1, policy_version 68030 (0.0009) +[2023-10-11 17:46:25,310][85176] Updated weights for policy 0, policy_version 67042 (0.0007) +[2023-10-11 17:46:25,574][85175] Updated weights for policy 1, policy_version 68040 (0.0007) +[2023-10-11 17:46:25,685][85176] Updated weights for policy 0, policy_version 67052 (0.0007) +[2023-10-11 17:46:25,935][85175] Updated weights for policy 1, policy_version 68050 (0.0007) +[2023-10-11 17:46:26,052][85176] Updated weights for policy 0, policy_version 67062 (0.0008) +[2023-10-11 17:46:26,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 138313728. Throughput: 0: 1666.0, 1: 1708.4. Samples: 34593838. Policy #0 lag: (min: 29.0, avg: 31.6, max: 61.0) +[2023-10-11 17:46:26,063][84230] Avg episode reward: [(0, '40.270'), (1, '45.610')] +[2023-10-11 17:46:26,306][85175] Updated weights for policy 1, policy_version 68060 (0.0008) +[2023-10-11 17:46:26,419][85176] Updated weights for policy 0, policy_version 67072 (0.0009) +[2023-10-11 17:46:30,060][85175] Updated weights for policy 1, policy_version 68070 (0.0008) +[2023-10-11 17:46:30,434][85175] Updated weights for policy 1, policy_version 68080 (0.0007) +[2023-10-11 17:46:30,660][85176] Updated weights for policy 0, policy_version 67082 (0.0011) +[2023-10-11 17:46:30,804][85175] Updated weights for policy 1, policy_version 68090 (0.0008) +[2023-10-11 17:46:31,027][85176] Updated weights for policy 0, policy_version 67092 (0.0009) +[2023-10-11 17:46:31,062][84230] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 138412032. 
Throughput: 0: 1654.3, 1: 1694.0. Samples: 34613852. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:46:31,063][84230] Avg episode reward: [(0, '42.390'), (1, '41.860')] +[2023-10-11 17:46:31,412][85176] Updated weights for policy 0, policy_version 67102 (0.0008) +[2023-10-11 17:46:34,879][85175] Updated weights for policy 1, policy_version 68100 (0.0010) +[2023-10-11 17:46:35,248][85175] Updated weights for policy 1, policy_version 68110 (0.0011) +[2023-10-11 17:46:35,612][85175] Updated weights for policy 1, policy_version 68120 (0.0009) +[2023-10-11 17:46:35,657][85176] Updated weights for policy 0, policy_version 67112 (0.0008) +[2023-10-11 17:46:36,029][85176] Updated weights for policy 0, policy_version 67122 (0.0010) +[2023-10-11 17:46:36,062][84230] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 138477568. Throughput: 0: 1659.8, 1: 1707.0. Samples: 34623980. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:46:36,063][84230] Avg episode reward: [(0, '39.870'), (1, '45.640')] +[2023-10-11 17:46:36,397][85176] Updated weights for policy 0, policy_version 67132 (0.0010) +[2023-10-11 17:46:39,387][85175] Updated weights for policy 1, policy_version 68130 (0.0009) +[2023-10-11 17:46:39,755][85175] Updated weights for policy 1, policy_version 68140 (0.0007) +[2023-10-11 17:46:40,120][85175] Updated weights for policy 1, policy_version 68150 (0.0007) +[2023-10-11 17:46:40,491][85175] Updated weights for policy 1, policy_version 68160 (0.0008) +[2023-10-11 17:46:40,652][85176] Updated weights for policy 0, policy_version 67142 (0.0007) +[2023-10-11 17:46:41,025][85176] Updated weights for policy 0, policy_version 67152 (0.0007) +[2023-10-11 17:46:41,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 138543104. Throughput: 0: 1654.5, 1: 1708.5. Samples: 34644670. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:46:41,063][84230] Avg episode reward: [(0, '44.180'), (1, '43.180')] +[2023-10-11 17:46:41,406][85176] Updated weights for policy 0, policy_version 67162 (0.0010) +[2023-10-11 17:46:44,707][85175] Updated weights for policy 1, policy_version 68170 (0.0011) +[2023-10-11 17:46:45,072][85175] Updated weights for policy 1, policy_version 68180 (0.0008) +[2023-10-11 17:46:45,366][85176] Updated weights for policy 0, policy_version 67172 (0.0011) +[2023-10-11 17:46:45,439][85175] Updated weights for policy 1, policy_version 68190 (0.0007) +[2023-10-11 17:46:45,749][85176] Updated weights for policy 0, policy_version 67182 (0.0010) +[2023-10-11 17:46:46,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 138608640. Throughput: 0: 1648.1, 1: 1681.4. Samples: 34663942. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:46:46,064][84230] Avg episode reward: [(0, '40.390'), (1, '43.810')] +[2023-10-11 17:46:46,112][85176] Updated weights for policy 0, policy_version 67192 (0.0008) +[2023-10-11 17:46:49,507][85175] Updated weights for policy 1, policy_version 68200 (0.0008) +[2023-10-11 17:46:49,890][85175] Updated weights for policy 1, policy_version 68210 (0.0008) +[2023-10-11 17:46:50,255][85175] Updated weights for policy 1, policy_version 68220 (0.0010) +[2023-10-11 17:46:50,257][85176] Updated weights for policy 0, policy_version 67202 (0.0007) +[2023-10-11 17:46:50,627][85176] Updated weights for policy 0, policy_version 67212 (0.0008) +[2023-10-11 17:46:51,008][85176] Updated weights for policy 0, policy_version 67222 (0.0008) +[2023-10-11 17:46:51,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 138674176. Throughput: 0: 1660.2, 1: 1711.5. Samples: 34674856. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:46:51,063][84230] Avg episode reward: [(0, '46.340'), (1, '41.670')] +[2023-10-11 17:46:51,381][85176] Updated weights for policy 0, policy_version 67232 (0.0007) +[2023-10-11 17:46:54,191][85175] Updated weights for policy 1, policy_version 68230 (0.0009) +[2023-10-11 17:46:54,561][85175] Updated weights for policy 1, policy_version 68240 (0.0010) +[2023-10-11 17:46:54,922][85175] Updated weights for policy 1, policy_version 68250 (0.0007) +[2023-10-11 17:46:55,429][85176] Updated weights for policy 0, policy_version 67242 (0.0008) +[2023-10-11 17:46:55,795][85176] Updated weights for policy 0, policy_version 67252 (0.0009) +[2023-10-11 17:46:56,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 138739712. Throughput: 0: 1663.4, 1: 1701.2. Samples: 34695122. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:46:56,063][84230] Avg episode reward: [(0, '44.050'), (1, '44.400')] +[2023-10-11 17:46:56,171][85176] Updated weights for policy 0, policy_version 67262 (0.0009) +[2023-10-11 17:46:59,009][85175] Updated weights for policy 1, policy_version 68260 (0.0009) +[2023-10-11 17:46:59,390][85175] Updated weights for policy 1, policy_version 68270 (0.0008) +[2023-10-11 17:46:59,761][85175] Updated weights for policy 1, policy_version 68280 (0.0009) +[2023-10-11 17:47:00,181][85176] Updated weights for policy 0, policy_version 67272 (0.0008) +[2023-10-11 17:47:00,547][85176] Updated weights for policy 0, policy_version 67282 (0.0009) +[2023-10-11 17:47:00,923][85176] Updated weights for policy 0, policy_version 67292 (0.0008) +[2023-10-11 17:47:01,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 138805248. Throughput: 0: 1652.0, 1: 1682.2. Samples: 34714196. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:47:01,064][84230] Avg episode reward: [(0, '44.840'), (1, '43.910')] +[2023-10-11 17:47:03,851][85175] Updated weights for policy 1, policy_version 68290 (0.0010) +[2023-10-11 17:47:04,216][85175] Updated weights for policy 1, policy_version 68300 (0.0008) +[2023-10-11 17:47:04,584][85175] Updated weights for policy 1, policy_version 68310 (0.0007) +[2023-10-11 17:47:04,953][85175] Updated weights for policy 1, policy_version 68320 (0.0007) +[2023-10-11 17:47:04,957][85176] Updated weights for policy 0, policy_version 67302 (0.0009) +[2023-10-11 17:47:05,334][85176] Updated weights for policy 0, policy_version 67312 (0.0009) +[2023-10-11 17:47:05,707][85176] Updated weights for policy 0, policy_version 67322 (0.0008) +[2023-10-11 17:47:06,063][84230] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 138903552. Throughput: 0: 1665.2, 1: 1708.8. Samples: 34725134. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:47:06,063][84230] Avg episode reward: [(0, '42.780'), (1, '44.210')] +[2023-10-11 17:47:08,945][85175] Updated weights for policy 1, policy_version 68330 (0.0008) +[2023-10-11 17:47:09,315][85175] Updated weights for policy 1, policy_version 68340 (0.0009) +[2023-10-11 17:47:09,684][85175] Updated weights for policy 1, policy_version 68350 (0.0008) +[2023-10-11 17:47:09,876][85176] Updated weights for policy 0, policy_version 67332 (0.0009) +[2023-10-11 17:47:10,240][85176] Updated weights for policy 0, policy_version 67342 (0.0008) +[2023-10-11 17:47:10,606][85176] Updated weights for policy 0, policy_version 67352 (0.0010) +[2023-10-11 17:47:11,063][84230] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 138969088. Throughput: 0: 1669.0, 1: 1687.6. Samples: 34744886. 
Policy #0 lag: (min: 31.0, avg: 40.9, max: 63.0) +[2023-10-11 17:47:11,063][84230] Avg episode reward: [(0, '44.380'), (1, '43.380')] +[2023-10-11 17:47:13,770][85175] Updated weights for policy 1, policy_version 68360 (0.0008) +[2023-10-11 17:47:14,140][85175] Updated weights for policy 1, policy_version 68370 (0.0008) +[2023-10-11 17:47:14,513][85176] Updated weights for policy 0, policy_version 67362 (0.0009) +[2023-10-11 17:47:14,515][85175] Updated weights for policy 1, policy_version 68380 (0.0008) +[2023-10-11 17:47:14,874][85176] Updated weights for policy 0, policy_version 67372 (0.0008) +[2023-10-11 17:47:15,248][85176] Updated weights for policy 0, policy_version 67382 (0.0008) +[2023-10-11 17:47:15,614][85176] Updated weights for policy 0, policy_version 67392 (0.0007) +[2023-10-11 17:47:16,063][84230] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 139034624. Throughput: 0: 1661.5, 1: 1688.7. Samples: 34764614. Policy #0 lag: (min: 31.0, avg: 40.9, max: 63.0) +[2023-10-11 17:47:16,064][84230] Avg episode reward: [(0, '40.850'), (1, '42.900')] +[2023-10-11 17:47:18,441][85175] Updated weights for policy 1, policy_version 68390 (0.0008) +[2023-10-11 17:47:18,814][85175] Updated weights for policy 1, policy_version 68400 (0.0009) +[2023-10-11 17:47:19,171][85175] Updated weights for policy 1, policy_version 68410 (0.0008) +[2023-10-11 17:47:19,631][85176] Updated weights for policy 0, policy_version 67402 (0.0010) +[2023-10-11 17:47:20,003][85176] Updated weights for policy 0, policy_version 67412 (0.0009) +[2023-10-11 17:47:20,369][85176] Updated weights for policy 0, policy_version 67422 (0.0009) +[2023-10-11 17:47:21,063][84230] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 139100160. Throughput: 0: 1687.3, 1: 1695.7. Samples: 34776214. 
Policy #0 lag: (min: 31.0, avg: 40.9, max: 63.0) +[2023-10-11 17:47:21,064][84230] Avg episode reward: [(0, '44.000'), (1, '43.270')] +[2023-10-11 17:47:23,363][85175] Updated weights for policy 1, policy_version 68420 (0.0009) +[2023-10-11 17:47:23,739][85175] Updated weights for policy 1, policy_version 68430 (0.0009) +[2023-10-11 17:47:24,108][85175] Updated weights for policy 1, policy_version 68440 (0.0008) +[2023-10-11 17:47:24,522][85176] Updated weights for policy 0, policy_version 67432 (0.0008) +[2023-10-11 17:47:24,894][85176] Updated weights for policy 0, policy_version 67442 (0.0009) +[2023-10-11 17:47:25,275][85176] Updated weights for policy 0, policy_version 67452 (0.0008) +[2023-10-11 17:47:26,062][84230] Fps is (10 sec: 13107.7, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 139165696. Throughput: 0: 1681.4, 1: 1670.4. Samples: 34795504. Policy #0 lag: (min: 31.0, avg: 40.9, max: 63.0) +[2023-10-11 17:47:26,063][84230] Avg episode reward: [(0, '40.940'), (1, '42.970')] +[2023-10-11 17:47:28,006][85175] Updated weights for policy 1, policy_version 68450 (0.0009) +[2023-10-11 17:47:28,367][85175] Updated weights for policy 1, policy_version 68460 (0.0008) +[2023-10-11 17:47:28,731][85175] Updated weights for policy 1, policy_version 68470 (0.0007) +[2023-10-11 17:47:29,100][85175] Updated weights for policy 1, policy_version 68480 (0.0008) +[2023-10-11 17:47:29,374][85176] Updated weights for policy 0, policy_version 67462 (0.0008) +[2023-10-11 17:47:29,752][85176] Updated weights for policy 0, policy_version 67472 (0.0007) +[2023-10-11 17:47:30,127][85176] Updated weights for policy 0, policy_version 67482 (0.0008) +[2023-10-11 17:47:31,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 139231232. Throughput: 0: 1669.4, 1: 1700.4. Samples: 34815584. 
Policy #0 lag: (min: 31.0, avg: 40.9, max: 63.0) +[2023-10-11 17:47:31,064][84230] Avg episode reward: [(0, '46.270'), (1, '45.340')] +[2023-10-11 17:47:33,186][85175] Updated weights for policy 1, policy_version 68490 (0.0007) +[2023-10-11 17:47:33,545][85175] Updated weights for policy 1, policy_version 68500 (0.0007) +[2023-10-11 17:47:33,910][85175] Updated weights for policy 1, policy_version 68510 (0.0007) +[2023-10-11 17:47:34,003][85176] Updated weights for policy 0, policy_version 67492 (0.0009) +[2023-10-11 17:47:34,373][85176] Updated weights for policy 0, policy_version 67502 (0.0009) +[2023-10-11 17:47:34,749][85176] Updated weights for policy 0, policy_version 67512 (0.0007) +[2023-10-11 17:47:36,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 139296768. Throughput: 0: 1690.0, 1: 1679.8. Samples: 34826500. Policy #0 lag: (min: 31.0, avg: 40.9, max: 63.0) +[2023-10-11 17:47:36,064][84230] Avg episode reward: [(0, '42.590'), (1, '44.950')] +[2023-10-11 17:47:37,986][85175] Updated weights for policy 1, policy_version 68520 (0.0009) +[2023-10-11 17:47:38,360][85175] Updated weights for policy 1, policy_version 68530 (0.0007) +[2023-10-11 17:47:38,725][85175] Updated weights for policy 1, policy_version 68540 (0.0009) +[2023-10-11 17:47:38,785][85176] Updated weights for policy 0, policy_version 67522 (0.0008) +[2023-10-11 17:47:39,155][85176] Updated weights for policy 0, policy_version 67532 (0.0009) +[2023-10-11 17:47:39,526][85176] Updated weights for policy 0, policy_version 67542 (0.0008) +[2023-10-11 17:47:39,904][85176] Updated weights for policy 0, policy_version 67552 (0.0009) +[2023-10-11 17:47:41,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 139362304. Throughput: 0: 1669.8, 1: 1681.5. Samples: 34845932. 
Policy #0 lag: (min: 31.0, avg: 40.9, max: 63.0) +[2023-10-11 17:47:41,063][84230] Avg episode reward: [(0, '45.500'), (1, '45.050')] +[2023-10-11 17:47:42,558][85175] Updated weights for policy 1, policy_version 68550 (0.0009) +[2023-10-11 17:47:42,925][85175] Updated weights for policy 1, policy_version 68560 (0.0009) +[2023-10-11 17:47:43,301][85175] Updated weights for policy 1, policy_version 68570 (0.0009) +[2023-10-11 17:47:43,931][85176] Updated weights for policy 0, policy_version 67562 (0.0007) +[2023-10-11 17:47:44,305][85176] Updated weights for policy 0, policy_version 67572 (0.0009) +[2023-10-11 17:47:44,687][85176] Updated weights for policy 0, policy_version 67582 (0.0007) +[2023-10-11 17:47:46,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 139427840. Throughput: 0: 1679.9, 1: 1707.7. Samples: 34866638. Policy #0 lag: (min: 31.0, avg: 40.9, max: 63.0) +[2023-10-11 17:47:46,064][84230] Avg episode reward: [(0, '42.360'), (1, '42.720')] +[2023-10-11 17:47:47,213][85175] Updated weights for policy 1, policy_version 68580 (0.0008) +[2023-10-11 17:47:47,594][85175] Updated weights for policy 1, policy_version 68590 (0.0008) +[2023-10-11 17:47:47,958][85175] Updated weights for policy 1, policy_version 68600 (0.0008) +[2023-10-11 17:47:48,916][85176] Updated weights for policy 0, policy_version 67592 (0.0009) +[2023-10-11 17:47:49,291][85176] Updated weights for policy 0, policy_version 67602 (0.0008) +[2023-10-11 17:47:49,667][85176] Updated weights for policy 0, policy_version 67612 (0.0008) +[2023-10-11 17:47:51,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 139493376. Throughput: 0: 1696.5, 1: 1682.7. Samples: 34877196. 
Policy #0 lag: (min: 31.0, avg: 40.9, max: 63.0) +[2023-10-11 17:47:51,063][84230] Avg episode reward: [(0, '45.380'), (1, '46.290')] +[2023-10-11 17:47:52,027][85175] Updated weights for policy 1, policy_version 68610 (0.0007) +[2023-10-11 17:47:52,394][85175] Updated weights for policy 1, policy_version 68620 (0.0008) +[2023-10-11 17:47:52,765][85175] Updated weights for policy 1, policy_version 68630 (0.0008) +[2023-10-11 17:47:53,134][85175] Updated weights for policy 1, policy_version 68640 (0.0009) +[2023-10-11 17:47:53,757][85176] Updated weights for policy 0, policy_version 67622 (0.0010) +[2023-10-11 17:47:54,123][85176] Updated weights for policy 0, policy_version 67632 (0.0008) +[2023-10-11 17:47:54,511][85176] Updated weights for policy 0, policy_version 67642 (0.0009) +[2023-10-11 17:47:56,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 139558912. Throughput: 0: 1668.6, 1: 1707.4. Samples: 34896806. Policy #0 lag: (min: 31.0, avg: 31.0, max: 34.0) +[2023-10-11 17:47:56,063][84230] Avg episode reward: [(0, '43.270'), (1, '44.480')] +[2023-10-11 17:47:57,036][85175] Updated weights for policy 1, policy_version 68650 (0.0008) +[2023-10-11 17:47:57,402][85175] Updated weights for policy 1, policy_version 68660 (0.0008) +[2023-10-11 17:47:57,776][85175] Updated weights for policy 1, policy_version 68670 (0.0007) +[2023-10-11 17:47:58,551][85176] Updated weights for policy 0, policy_version 67652 (0.0010) +[2023-10-11 17:47:58,923][85176] Updated weights for policy 0, policy_version 67662 (0.0010) +[2023-10-11 17:47:59,297][85176] Updated weights for policy 0, policy_version 67672 (0.0009) +[2023-10-11 17:48:01,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 139624448. Throughput: 0: 1685.3, 1: 1717.4. Samples: 34917734. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 34.0) +[2023-10-11 17:48:01,063][84230] Avg episode reward: [(0, '44.060'), (1, '47.410')] +[2023-10-11 17:48:01,708][85175] Updated weights for policy 1, policy_version 68680 (0.0010) +[2023-10-11 17:48:02,089][85175] Updated weights for policy 1, policy_version 68690 (0.0010) +[2023-10-11 17:48:02,456][85175] Updated weights for policy 1, policy_version 68700 (0.0011) +[2023-10-11 17:48:03,172][85176] Updated weights for policy 0, policy_version 67682 (0.0008) +[2023-10-11 17:48:03,540][85176] Updated weights for policy 0, policy_version 67692 (0.0009) +[2023-10-11 17:48:03,919][85176] Updated weights for policy 0, policy_version 67702 (0.0008) +[2023-10-11 17:48:04,285][85176] Updated weights for policy 0, policy_version 67712 (0.0007) +[2023-10-11 17:48:06,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 139689984. Throughput: 0: 1674.7, 1: 1693.7. Samples: 34927788. Policy #0 lag: (min: 31.0, avg: 31.0, max: 34.0) +[2023-10-11 17:48:06,063][84230] Avg episode reward: [(0, '41.140'), (1, '43.930')] +[2023-10-11 17:48:06,313][85175] Updated weights for policy 1, policy_version 68710 (0.0010) +[2023-10-11 17:48:06,674][85175] Updated weights for policy 1, policy_version 68720 (0.0010) +[2023-10-11 17:48:07,048][85175] Updated weights for policy 1, policy_version 68730 (0.0009) +[2023-10-11 17:48:08,386][85176] Updated weights for policy 0, policy_version 67722 (0.0008) +[2023-10-11 17:48:08,760][85176] Updated weights for policy 0, policy_version 67732 (0.0007) +[2023-10-11 17:48:09,135][85176] Updated weights for policy 0, policy_version 67742 (0.0007) +[2023-10-11 17:48:11,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 139755520. Throughput: 0: 1666.6, 1: 1714.3. Samples: 34947644. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 34.0) +[2023-10-11 17:48:11,064][84230] Avg episode reward: [(0, '42.720'), (1, '45.330')] +[2023-10-11 17:48:11,239][85175] Updated weights for policy 1, policy_version 68740 (0.0008) +[2023-10-11 17:48:11,600][85175] Updated weights for policy 1, policy_version 68750 (0.0008) +[2023-10-11 17:48:11,971][85175] Updated weights for policy 1, policy_version 68760 (0.0007) +[2023-10-11 17:48:13,112][85176] Updated weights for policy 0, policy_version 67752 (0.0007) +[2023-10-11 17:48:13,483][85176] Updated weights for policy 0, policy_version 67762 (0.0007) +[2023-10-11 17:48:13,856][85176] Updated weights for policy 0, policy_version 67772 (0.0008) +[2023-10-11 17:48:15,944][85175] Updated weights for policy 1, policy_version 68770 (0.0008) +[2023-10-11 17:48:16,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 139821056. Throughput: 0: 1679.8, 1: 1716.8. Samples: 34968432. Policy #0 lag: (min: 31.0, avg: 31.0, max: 34.0) +[2023-10-11 17:48:16,063][84230] Avg episode reward: [(0, '41.340'), (1, '41.950')] +[2023-10-11 17:48:16,071][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000067776_69402624.pth... +[2023-10-11 17:48:16,104][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000066208_67796992.pth +[2023-10-11 17:48:16,321][85175] Updated weights for policy 1, policy_version 68780 (0.0009) +[2023-10-11 17:48:16,681][85175] Updated weights for policy 1, policy_version 68790 (0.0008) +[2023-10-11 17:48:17,044][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000068800_70451200.pth... 
+[2023-10-11 17:48:17,049][85175] Updated weights for policy 1, policy_version 68800 (0.0008) +[2023-10-11 17:48:17,073][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000067200_68812800.pth +[2023-10-11 17:48:17,924][85176] Updated weights for policy 0, policy_version 67782 (0.0009) +[2023-10-11 17:48:18,294][85176] Updated weights for policy 0, policy_version 67792 (0.0011) +[2023-10-11 17:48:18,669][85176] Updated weights for policy 0, policy_version 67802 (0.0008) +[2023-10-11 17:48:21,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 139886592. Throughput: 0: 1662.1, 1: 1707.7. Samples: 34978140. Policy #0 lag: (min: 31.0, avg: 31.0, max: 34.0) +[2023-10-11 17:48:21,063][84230] Avg episode reward: [(0, '43.760'), (1, '43.960')] +[2023-10-11 17:48:21,193][85175] Updated weights for policy 1, policy_version 68810 (0.0010) +[2023-10-11 17:48:21,559][85175] Updated weights for policy 1, policy_version 68820 (0.0008) +[2023-10-11 17:48:21,932][85175] Updated weights for policy 1, policy_version 68830 (0.0008) +[2023-10-11 17:48:22,716][85176] Updated weights for policy 0, policy_version 67812 (0.0009) +[2023-10-11 17:48:23,091][85176] Updated weights for policy 0, policy_version 67822 (0.0011) +[2023-10-11 17:48:23,474][85176] Updated weights for policy 0, policy_version 67832 (0.0010) +[2023-10-11 17:48:25,972][85175] Updated weights for policy 1, policy_version 68840 (0.0009) +[2023-10-11 17:48:26,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 139952128. Throughput: 0: 1670.7, 1: 1717.5. Samples: 34998402. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 34.0) +[2023-10-11 17:48:26,064][84230] Avg episode reward: [(0, '43.190'), (1, '43.410')] +[2023-10-11 17:48:26,342][85175] Updated weights for policy 1, policy_version 68850 (0.0012) +[2023-10-11 17:48:26,710][85175] Updated weights for policy 1, policy_version 68860 (0.0010) +[2023-10-11 17:48:27,457][85176] Updated weights for policy 0, policy_version 67842 (0.0010) +[2023-10-11 17:48:27,835][85176] Updated weights for policy 0, policy_version 67852 (0.0011) +[2023-10-11 17:48:28,219][85176] Updated weights for policy 0, policy_version 67862 (0.0010) +[2023-10-11 17:48:28,585][85176] Updated weights for policy 0, policy_version 67872 (0.0010) +[2023-10-11 17:48:30,820][85175] Updated weights for policy 1, policy_version 68870 (0.0009) +[2023-10-11 17:48:31,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 140017664. Throughput: 0: 1683.5, 1: 1707.5. Samples: 35019230. Policy #0 lag: (min: 31.0, avg: 31.0, max: 34.0) +[2023-10-11 17:48:31,064][84230] Avg episode reward: [(0, '45.010'), (1, '46.000')] +[2023-10-11 17:48:31,180][85175] Updated weights for policy 1, policy_version 68880 (0.0008) +[2023-10-11 17:48:31,543][85175] Updated weights for policy 1, policy_version 68890 (0.0007) +[2023-10-11 17:48:32,781][85176] Updated weights for policy 0, policy_version 67882 (0.0009) +[2023-10-11 17:48:33,160][85176] Updated weights for policy 0, policy_version 67892 (0.0008) +[2023-10-11 17:48:33,539][85176] Updated weights for policy 0, policy_version 67902 (0.0009) +[2023-10-11 17:48:35,723][85175] Updated weights for policy 1, policy_version 68900 (0.0007) +[2023-10-11 17:48:36,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 140083200. Throughput: 0: 1657.0, 1: 1703.4. Samples: 35028416. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 34.0) +[2023-10-11 17:48:36,063][84230] Avg episode reward: [(0, '42.790'), (1, '43.700')] +[2023-10-11 17:48:36,121][85175] Updated weights for policy 1, policy_version 68910 (0.0008) +[2023-10-11 17:48:36,489][85175] Updated weights for policy 1, policy_version 68920 (0.0008) +[2023-10-11 17:48:37,620][85176] Updated weights for policy 0, policy_version 67912 (0.0009) +[2023-10-11 17:48:37,992][85176] Updated weights for policy 0, policy_version 67922 (0.0008) +[2023-10-11 17:48:38,360][85176] Updated weights for policy 0, policy_version 67932 (0.0007) +[2023-10-11 17:48:40,092][85175] Updated weights for policy 1, policy_version 68930 (0.0008) +[2023-10-11 17:48:40,459][85175] Updated weights for policy 1, policy_version 68940 (0.0009) +[2023-10-11 17:48:40,825][85175] Updated weights for policy 1, policy_version 68950 (0.0008) +[2023-10-11 17:48:41,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 140148736. Throughput: 0: 1676.2, 1: 1709.4. Samples: 35049158. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:48:41,063][84230] Avg episode reward: [(0, '46.190'), (1, '44.850')] +[2023-10-11 17:48:41,181][85175] Updated weights for policy 1, policy_version 68960 (0.0010) +[2023-10-11 17:48:42,473][85176] Updated weights for policy 0, policy_version 67942 (0.0009) +[2023-10-11 17:48:42,845][85176] Updated weights for policy 0, policy_version 67952 (0.0011) +[2023-10-11 17:48:43,212][85176] Updated weights for policy 0, policy_version 67962 (0.0010) +[2023-10-11 17:48:45,271][85175] Updated weights for policy 1, policy_version 68970 (0.0009) +[2023-10-11 17:48:45,638][85175] Updated weights for policy 1, policy_version 68980 (0.0009) +[2023-10-11 17:48:46,012][85175] Updated weights for policy 1, policy_version 68990 (0.0007) +[2023-10-11 17:48:46,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 140214272. 
Throughput: 0: 1682.8, 1: 1689.5. Samples: 35069486. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:48:46,063][84230] Avg episode reward: [(0, '44.920'), (1, '42.060')] +[2023-10-11 17:48:47,306][85176] Updated weights for policy 0, policy_version 67972 (0.0009) +[2023-10-11 17:48:47,673][85176] Updated weights for policy 0, policy_version 67982 (0.0008) +[2023-10-11 17:48:48,058][85176] Updated weights for policy 0, policy_version 67992 (0.0008) +[2023-10-11 17:48:50,138][85175] Updated weights for policy 1, policy_version 69000 (0.0010) +[2023-10-11 17:48:50,514][85175] Updated weights for policy 1, policy_version 69010 (0.0008) +[2023-10-11 17:48:50,877][85175] Updated weights for policy 1, policy_version 69020 (0.0008) +[2023-10-11 17:48:51,063][84230] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 140312576. Throughput: 0: 1661.7, 1: 1704.1. Samples: 35079252. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:48:51,063][84230] Avg episode reward: [(0, '45.990'), (1, '44.290')] +[2023-10-11 17:48:52,263][85176] Updated weights for policy 0, policy_version 68002 (0.0007) +[2023-10-11 17:48:52,642][85176] Updated weights for policy 0, policy_version 68012 (0.0008) +[2023-10-11 17:48:53,021][85176] Updated weights for policy 0, policy_version 68022 (0.0009) +[2023-10-11 17:48:53,391][85176] Updated weights for policy 0, policy_version 68032 (0.0009) +[2023-10-11 17:48:54,900][85175] Updated weights for policy 1, policy_version 69030 (0.0009) +[2023-10-11 17:48:55,266][85175] Updated weights for policy 1, policy_version 69040 (0.0008) +[2023-10-11 17:48:55,640][85175] Updated weights for policy 1, policy_version 69050 (0.0009) +[2023-10-11 17:48:56,063][84230] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 140378112. Throughput: 0: 1680.5, 1: 1699.8. Samples: 35099758. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:48:56,064][84230] Avg episode reward: [(0, '41.970'), (1, '42.580')] +[2023-10-11 17:48:57,291][85176] Updated weights for policy 0, policy_version 68042 (0.0007) +[2023-10-11 17:48:57,660][85176] Updated weights for policy 0, policy_version 68052 (0.0010) +[2023-10-11 17:48:58,037][85176] Updated weights for policy 0, policy_version 68062 (0.0007) +[2023-10-11 17:48:59,716][85175] Updated weights for policy 1, policy_version 69060 (0.0009) +[2023-10-11 17:49:00,089][85175] Updated weights for policy 1, policy_version 69070 (0.0009) +[2023-10-11 17:49:00,459][85175] Updated weights for policy 1, policy_version 69080 (0.0009) +[2023-10-11 17:49:01,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 140443648. Throughput: 0: 1689.6, 1: 1674.1. Samples: 35119800. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:49:01,063][84230] Avg episode reward: [(0, '44.430'), (1, '43.040')] +[2023-10-11 17:49:01,993][85176] Updated weights for policy 0, policy_version 68072 (0.0007) +[2023-10-11 17:49:02,361][85176] Updated weights for policy 0, policy_version 68082 (0.0009) +[2023-10-11 17:49:02,741][85176] Updated weights for policy 0, policy_version 68092 (0.0010) +[2023-10-11 17:49:04,515][85175] Updated weights for policy 1, policy_version 69090 (0.0009) +[2023-10-11 17:49:04,877][85175] Updated weights for policy 1, policy_version 69100 (0.0010) +[2023-10-11 17:49:05,240][85175] Updated weights for policy 1, policy_version 69110 (0.0009) +[2023-10-11 17:49:05,609][85175] Updated weights for policy 1, policy_version 69120 (0.0008) +[2023-10-11 17:49:06,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 140509184. Throughput: 0: 1682.9, 1: 1695.6. Samples: 35130172. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:49:06,063][84230] Avg episode reward: [(0, '42.580'), (1, '43.200')] +[2023-10-11 17:49:06,552][85176] Updated weights for policy 0, policy_version 68102 (0.0009) +[2023-10-11 17:49:06,931][85176] Updated weights for policy 0, policy_version 68112 (0.0009) +[2023-10-11 17:49:07,306][85176] Updated weights for policy 0, policy_version 68122 (0.0007) +[2023-10-11 17:49:09,625][85175] Updated weights for policy 1, policy_version 69130 (0.0009) +[2023-10-11 17:49:09,995][85175] Updated weights for policy 1, policy_version 69140 (0.0009) +[2023-10-11 17:49:10,357][85175] Updated weights for policy 1, policy_version 69150 (0.0008) +[2023-10-11 17:49:11,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 140574720. Throughput: 0: 1694.2, 1: 1690.7. Samples: 35150724. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:49:11,063][84230] Avg episode reward: [(0, '45.830'), (1, '42.800')] +[2023-10-11 17:49:11,397][85176] Updated weights for policy 0, policy_version 68132 (0.0009) +[2023-10-11 17:49:11,766][85176] Updated weights for policy 0, policy_version 68142 (0.0008) +[2023-10-11 17:49:12,146][85176] Updated weights for policy 0, policy_version 68152 (0.0010) +[2023-10-11 17:49:14,362][85175] Updated weights for policy 1, policy_version 69160 (0.0009) +[2023-10-11 17:49:14,727][85175] Updated weights for policy 1, policy_version 69170 (0.0007) +[2023-10-11 17:49:15,101][85175] Updated weights for policy 1, policy_version 69180 (0.0007) +[2023-10-11 17:49:16,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 140640256. Throughput: 0: 1692.6, 1: 1671.3. Samples: 35170606. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:49:16,064][84230] Avg episode reward: [(0, '43.120'), (1, '43.340')] +[2023-10-11 17:49:16,083][85176] Updated weights for policy 0, policy_version 68162 (0.0010) +[2023-10-11 17:49:16,457][85176] Updated weights for policy 0, policy_version 68172 (0.0009) +[2023-10-11 17:49:16,838][85176] Updated weights for policy 0, policy_version 68182 (0.0011) +[2023-10-11 17:49:17,194][85176] Updated weights for policy 0, policy_version 68192 (0.0009) +[2023-10-11 17:49:19,213][85175] Updated weights for policy 1, policy_version 69190 (0.0008) +[2023-10-11 17:49:19,572][85175] Updated weights for policy 1, policy_version 69200 (0.0008) +[2023-10-11 17:49:19,941][85175] Updated weights for policy 1, policy_version 69210 (0.0007) +[2023-10-11 17:49:21,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 140705792. Throughput: 0: 1687.5, 1: 1703.9. Samples: 35181028. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:49:21,063][84230] Avg episode reward: [(0, '46.060'), (1, '42.470')] +[2023-10-11 17:49:21,432][85176] Updated weights for policy 0, policy_version 68202 (0.0010) +[2023-10-11 17:49:21,796][85176] Updated weights for policy 0, policy_version 68212 (0.0007) +[2023-10-11 17:49:22,173][85176] Updated weights for policy 0, policy_version 68222 (0.0007) +[2023-10-11 17:49:23,815][85175] Updated weights for policy 1, policy_version 69220 (0.0010) +[2023-10-11 17:49:24,208][85175] Updated weights for policy 1, policy_version 69230 (0.0010) +[2023-10-11 17:49:24,573][85175] Updated weights for policy 1, policy_version 69240 (0.0007) +[2023-10-11 17:49:26,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 140771328. Throughput: 0: 1692.7, 1: 1680.1. Samples: 35200936. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-11 17:49:26,064][84230] Avg episode reward: [(0, '41.370'), (1, '44.770')] +[2023-10-11 17:49:26,447][85176] Updated weights for policy 0, policy_version 68232 (0.0007) +[2023-10-11 17:49:26,827][85176] Updated weights for policy 0, policy_version 68242 (0.0008) +[2023-10-11 17:49:27,200][85176] Updated weights for policy 0, policy_version 68252 (0.0009) +[2023-10-11 17:49:28,593][85175] Updated weights for policy 1, policy_version 69250 (0.0007) +[2023-10-11 17:49:28,958][85175] Updated weights for policy 1, policy_version 69260 (0.0009) +[2023-10-11 17:49:29,327][85175] Updated weights for policy 1, policy_version 69270 (0.0010) +[2023-10-11 17:49:29,691][85175] Updated weights for policy 1, policy_version 69280 (0.0007) +[2023-10-11 17:49:31,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 140836864. Throughput: 0: 1687.5, 1: 1689.7. Samples: 35221460. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-11 17:49:31,064][84230] Avg episode reward: [(0, '43.950'), (1, '43.990')] +[2023-10-11 17:49:31,325][85176] Updated weights for policy 0, policy_version 68262 (0.0007) +[2023-10-11 17:49:31,700][85176] Updated weights for policy 0, policy_version 68272 (0.0009) +[2023-10-11 17:49:32,070][85176] Updated weights for policy 0, policy_version 68282 (0.0007) +[2023-10-11 17:49:33,673][85175] Updated weights for policy 1, policy_version 69290 (0.0009) +[2023-10-11 17:49:34,048][85175] Updated weights for policy 1, policy_version 69300 (0.0008) +[2023-10-11 17:49:34,404][85175] Updated weights for policy 1, policy_version 69310 (0.0010) +[2023-10-11 17:49:36,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 140902400. Throughput: 0: 1687.3, 1: 1697.3. Samples: 35231562. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-11 17:49:36,063][84230] Avg episode reward: [(0, '41.800'), (1, '47.590')] +[2023-10-11 17:49:36,246][85176] Updated weights for policy 0, policy_version 68292 (0.0009) +[2023-10-11 17:49:36,608][85176] Updated weights for policy 0, policy_version 68302 (0.0007) +[2023-10-11 17:49:36,991][85176] Updated weights for policy 0, policy_version 68312 (0.0007) +[2023-10-11 17:49:38,303][85175] Updated weights for policy 1, policy_version 69320 (0.0007) +[2023-10-11 17:49:38,674][85175] Updated weights for policy 1, policy_version 69330 (0.0008) +[2023-10-11 17:49:39,037][85175] Updated weights for policy 1, policy_version 69340 (0.0007) +[2023-10-11 17:49:40,879][85176] Updated weights for policy 0, policy_version 68322 (0.0009) +[2023-10-11 17:49:41,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 140967936. Throughput: 0: 1692.5, 1: 1683.4. Samples: 35251674. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-11 17:49:41,063][84230] Avg episode reward: [(0, '44.810'), (1, '43.540')] +[2023-10-11 17:49:41,257][85176] Updated weights for policy 0, policy_version 68332 (0.0010) +[2023-10-11 17:49:41,639][85176] Updated weights for policy 0, policy_version 68342 (0.0011) +[2023-10-11 17:49:42,013][85176] Updated weights for policy 0, policy_version 68352 (0.0009) +[2023-10-11 17:49:42,998][85175] Updated weights for policy 1, policy_version 69350 (0.0009) +[2023-10-11 17:49:43,373][85175] Updated weights for policy 1, policy_version 69360 (0.0007) +[2023-10-11 17:49:43,755][85175] Updated weights for policy 1, policy_version 69370 (0.0008) +[2023-10-11 17:49:46,039][85176] Updated weights for policy 0, policy_version 68362 (0.0007) +[2023-10-11 17:49:46,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 141033472. Throughput: 0: 1687.3, 1: 1711.2. Samples: 35272734. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-11 17:49:46,063][84230] Avg episode reward: [(0, '42.090'), (1, '45.260')] +[2023-10-11 17:49:46,420][85176] Updated weights for policy 0, policy_version 68372 (0.0007) +[2023-10-11 17:49:46,792][85176] Updated weights for policy 0, policy_version 68382 (0.0008) +[2023-10-11 17:49:47,681][85175] Updated weights for policy 1, policy_version 69380 (0.0008) +[2023-10-11 17:49:48,055][85175] Updated weights for policy 1, policy_version 69390 (0.0009) +[2023-10-11 17:49:48,430][85175] Updated weights for policy 1, policy_version 69400 (0.0008) +[2023-10-11 17:49:50,720][85176] Updated weights for policy 0, policy_version 68392 (0.0007) +[2023-10-11 17:49:51,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 141099008. Throughput: 0: 1684.2, 1: 1696.2. Samples: 35282290. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-11 17:49:51,064][84230] Avg episode reward: [(0, '42.770'), (1, '42.870')] +[2023-10-11 17:49:51,093][85176] Updated weights for policy 0, policy_version 68402 (0.0007) +[2023-10-11 17:49:51,459][85176] Updated weights for policy 0, policy_version 68412 (0.0007) +[2023-10-11 17:49:52,405][85175] Updated weights for policy 1, policy_version 69410 (0.0009) +[2023-10-11 17:49:52,769][85175] Updated weights for policy 1, policy_version 69420 (0.0010) +[2023-10-11 17:49:53,132][85175] Updated weights for policy 1, policy_version 69430 (0.0010) +[2023-10-11 17:49:53,499][85175] Updated weights for policy 1, policy_version 69440 (0.0012) +[2023-10-11 17:49:55,243][85176] Updated weights for policy 0, policy_version 68422 (0.0009) +[2023-10-11 17:49:55,616][85176] Updated weights for policy 0, policy_version 68432 (0.0007) +[2023-10-11 17:49:55,989][85176] Updated weights for policy 0, policy_version 68442 (0.0009) +[2023-10-11 17:49:56,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 141164544. 
Throughput: 0: 1688.2, 1: 1694.0. Samples: 35302922. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-11 17:49:56,063][84230] Avg episode reward: [(0, '40.650'), (1, '46.650')] +[2023-10-11 17:49:57,533][85175] Updated weights for policy 1, policy_version 69450 (0.0010) +[2023-10-11 17:49:57,911][85175] Updated weights for policy 1, policy_version 69460 (0.0008) +[2023-10-11 17:49:58,283][85175] Updated weights for policy 1, policy_version 69470 (0.0009) +[2023-10-11 17:50:00,247][85176] Updated weights for policy 0, policy_version 68452 (0.0009) +[2023-10-11 17:50:00,611][85176] Updated weights for policy 0, policy_version 68462 (0.0010) +[2023-10-11 17:50:00,976][85176] Updated weights for policy 0, policy_version 68472 (0.0008) +[2023-10-11 17:50:01,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 141230080. Throughput: 0: 1673.3, 1: 1719.7. Samples: 35323288. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-11 17:50:01,063][84230] Avg episode reward: [(0, '46.740'), (1, '43.240')] +[2023-10-11 17:50:02,262][85175] Updated weights for policy 1, policy_version 69480 (0.0011) +[2023-10-11 17:50:02,642][85175] Updated weights for policy 1, policy_version 69490 (0.0009) +[2023-10-11 17:50:03,006][85175] Updated weights for policy 1, policy_version 69500 (0.0009) +[2023-10-11 17:50:05,221][85176] Updated weights for policy 0, policy_version 68482 (0.0009) +[2023-10-11 17:50:05,595][85176] Updated weights for policy 0, policy_version 68492 (0.0009) +[2023-10-11 17:50:05,975][85176] Updated weights for policy 0, policy_version 68502 (0.0009) +[2023-10-11 17:50:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 141295616. Throughput: 0: 1684.4, 1: 1689.7. Samples: 35332866. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) +[2023-10-11 17:50:06,063][84230] Avg episode reward: [(0, '42.450'), (1, '45.050')] +[2023-10-11 17:50:06,344][85176] Updated weights for policy 0, policy_version 68512 (0.0010) +[2023-10-11 17:50:07,225][85175] Updated weights for policy 1, policy_version 69510 (0.0007) +[2023-10-11 17:50:07,594][85175] Updated weights for policy 1, policy_version 69520 (0.0009) +[2023-10-11 17:50:07,966][85175] Updated weights for policy 1, policy_version 69530 (0.0008) +[2023-10-11 17:50:10,510][85176] Updated weights for policy 0, policy_version 68522 (0.0008) +[2023-10-11 17:50:10,876][85176] Updated weights for policy 0, policy_version 68532 (0.0008) +[2023-10-11 17:50:11,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 141361152. Throughput: 0: 1684.6, 1: 1699.2. Samples: 35353206. Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) +[2023-10-11 17:50:11,064][84230] Avg episode reward: [(0, '46.190'), (1, '43.370')] +[2023-10-11 17:50:11,244][85176] Updated weights for policy 0, policy_version 68542 (0.0009) +[2023-10-11 17:50:11,989][85175] Updated weights for policy 1, policy_version 69540 (0.0007) +[2023-10-11 17:50:12,390][85175] Updated weights for policy 1, policy_version 69550 (0.0010) +[2023-10-11 17:50:12,758][85175] Updated weights for policy 1, policy_version 69560 (0.0007) +[2023-10-11 17:50:15,363][85176] Updated weights for policy 0, policy_version 68552 (0.0008) +[2023-10-11 17:50:15,739][85176] Updated weights for policy 0, policy_version 68562 (0.0010) +[2023-10-11 17:50:16,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 141426688. Throughput: 0: 1669.2, 1: 1708.6. Samples: 35373460. 
Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) +[2023-10-11 17:50:16,064][84230] Avg episode reward: [(0, '43.700'), (1, '45.380')] +[2023-10-11 17:50:16,074][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000069568_71237632.pth... +[2023-10-11 17:50:16,112][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000068000_69632000.pth +[2023-10-11 17:50:16,112][85176] Updated weights for policy 0, policy_version 68572 (0.0008) +[2023-10-11 17:50:16,259][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000068576_70221824.pth... +[2023-10-11 17:50:16,299][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000067008_68616192.pth +[2023-10-11 17:50:16,802][85175] Updated weights for policy 1, policy_version 69570 (0.0007) +[2023-10-11 17:50:17,170][85175] Updated weights for policy 1, policy_version 69580 (0.0007) +[2023-10-11 17:50:17,534][85175] Updated weights for policy 1, policy_version 69590 (0.0008) +[2023-10-11 17:50:17,899][85175] Updated weights for policy 1, policy_version 69600 (0.0011) +[2023-10-11 17:50:20,155][85176] Updated weights for policy 0, policy_version 68582 (0.0010) +[2023-10-11 17:50:20,542][85176] Updated weights for policy 0, policy_version 68592 (0.0008) +[2023-10-11 17:50:20,912][85176] Updated weights for policy 0, policy_version 68602 (0.0007) +[2023-10-11 17:50:21,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 141492224. Throughput: 0: 1683.9, 1: 1688.0. Samples: 35383300. 
Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) +[2023-10-11 17:50:21,063][84230] Avg episode reward: [(0, '45.620'), (1, '44.020')] +[2023-10-11 17:50:22,003][85175] Updated weights for policy 1, policy_version 69610 (0.0011) +[2023-10-11 17:50:22,362][85175] Updated weights for policy 1, policy_version 69620 (0.0010) +[2023-10-11 17:50:22,730][85175] Updated weights for policy 1, policy_version 69630 (0.0011) +[2023-10-11 17:50:24,843][85176] Updated weights for policy 0, policy_version 68612 (0.0007) +[2023-10-11 17:50:25,216][85176] Updated weights for policy 0, policy_version 68622 (0.0009) +[2023-10-11 17:50:25,592][85176] Updated weights for policy 0, policy_version 68632 (0.0012) +[2023-10-11 17:50:26,062][84230] Fps is (10 sec: 16384.4, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 141590528. Throughput: 0: 1678.9, 1: 1704.1. Samples: 35403912. Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) +[2023-10-11 17:50:26,063][84230] Avg episode reward: [(0, '42.150'), (1, '45.860')] +[2023-10-11 17:50:26,766][85175] Updated weights for policy 1, policy_version 69640 (0.0009) +[2023-10-11 17:50:27,129][85175] Updated weights for policy 1, policy_version 69650 (0.0010) +[2023-10-11 17:50:27,508][85175] Updated weights for policy 1, policy_version 69660 (0.0010) +[2023-10-11 17:50:29,607][85176] Updated weights for policy 0, policy_version 68642 (0.0009) +[2023-10-11 17:50:29,988][85176] Updated weights for policy 0, policy_version 68652 (0.0009) +[2023-10-11 17:50:30,365][85176] Updated weights for policy 0, policy_version 68662 (0.0010) +[2023-10-11 17:50:30,744][85176] Updated weights for policy 0, policy_version 68672 (0.0009) +[2023-10-11 17:50:31,063][84230] Fps is (10 sec: 16383.8, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 141656064. Throughput: 0: 1656.2, 1: 1694.1. Samples: 35423498. 
Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) +[2023-10-11 17:50:31,063][84230] Avg episode reward: [(0, '46.270'), (1, '45.220')] +[2023-10-11 17:50:31,533][85175] Updated weights for policy 1, policy_version 69670 (0.0008) +[2023-10-11 17:50:31,901][85175] Updated weights for policy 1, policy_version 69680 (0.0008) +[2023-10-11 17:50:32,264][85175] Updated weights for policy 1, policy_version 69690 (0.0008) +[2023-10-11 17:50:35,077][85176] Updated weights for policy 0, policy_version 68682 (0.0010) +[2023-10-11 17:50:35,461][85176] Updated weights for policy 0, policy_version 68692 (0.0008) +[2023-10-11 17:50:35,837][85176] Updated weights for policy 0, policy_version 68702 (0.0008) +[2023-10-11 17:50:36,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 141721600. Throughput: 0: 1677.5, 1: 1687.8. Samples: 35433728. Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) +[2023-10-11 17:50:36,063][84230] Avg episode reward: [(0, '42.070'), (1, '46.020')] +[2023-10-11 17:50:36,253][85175] Updated weights for policy 1, policy_version 69700 (0.0007) +[2023-10-11 17:50:36,613][85175] Updated weights for policy 1, policy_version 69710 (0.0007) +[2023-10-11 17:50:36,974][85175] Updated weights for policy 1, policy_version 69720 (0.0008) +[2023-10-11 17:50:39,885][85176] Updated weights for policy 0, policy_version 68712 (0.0008) +[2023-10-11 17:50:40,247][85176] Updated weights for policy 0, policy_version 68722 (0.0009) +[2023-10-11 17:50:40,616][85176] Updated weights for policy 0, policy_version 68732 (0.0010) +[2023-10-11 17:50:41,058][85175] Updated weights for policy 1, policy_version 69730 (0.0008) +[2023-10-11 17:50:41,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 141787136. Throughput: 0: 1667.7, 1: 1692.0. Samples: 35454110. 
Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) +[2023-10-11 17:50:41,063][84230] Avg episode reward: [(0, '44.990'), (1, '45.260')] +[2023-10-11 17:50:41,425][85175] Updated weights for policy 1, policy_version 69740 (0.0007) +[2023-10-11 17:50:41,797][85175] Updated weights for policy 1, policy_version 69750 (0.0008) +[2023-10-11 17:50:42,160][85175] Updated weights for policy 1, policy_version 69760 (0.0008) +[2023-10-11 17:50:44,726][85176] Updated weights for policy 0, policy_version 68742 (0.0008) +[2023-10-11 17:50:45,100][85176] Updated weights for policy 0, policy_version 68752 (0.0009) +[2023-10-11 17:50:45,478][85176] Updated weights for policy 0, policy_version 68762 (0.0009) +[2023-10-11 17:50:46,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 141852672. Throughput: 0: 1655.1, 1: 1692.2. Samples: 35473920. Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) +[2023-10-11 17:50:46,064][84230] Avg episode reward: [(0, '41.110'), (1, '44.640')] +[2023-10-11 17:50:46,210][85175] Updated weights for policy 1, policy_version 69770 (0.0009) +[2023-10-11 17:50:46,584][85175] Updated weights for policy 1, policy_version 69780 (0.0009) +[2023-10-11 17:50:46,953][85175] Updated weights for policy 1, policy_version 69790 (0.0009) +[2023-10-11 17:50:49,581][85176] Updated weights for policy 0, policy_version 68772 (0.0009) +[2023-10-11 17:50:49,951][85176] Updated weights for policy 0, policy_version 68782 (0.0007) +[2023-10-11 17:50:50,327][85176] Updated weights for policy 0, policy_version 68792 (0.0008) +[2023-10-11 17:50:50,923][85175] Updated weights for policy 1, policy_version 69800 (0.0007) +[2023-10-11 17:50:51,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 141918208. Throughput: 0: 1668.6, 1: 1690.1. Samples: 35484008. 
Policy #0 lag: (min: 25.0, avg: 53.0, max: 56.0) +[2023-10-11 17:50:51,063][84230] Avg episode reward: [(0, '42.720'), (1, '43.270')] +[2023-10-11 17:50:51,297][85175] Updated weights for policy 1, policy_version 69810 (0.0008) +[2023-10-11 17:50:51,660][85175] Updated weights for policy 1, policy_version 69820 (0.0007) +[2023-10-11 17:50:54,514][85176] Updated weights for policy 0, policy_version 68802 (0.0008) +[2023-10-11 17:50:54,888][85176] Updated weights for policy 0, policy_version 68812 (0.0009) +[2023-10-11 17:50:55,266][85176] Updated weights for policy 0, policy_version 68822 (0.0008) +[2023-10-11 17:50:55,638][85176] Updated weights for policy 0, policy_version 68832 (0.0007) +[2023-10-11 17:50:55,706][85175] Updated weights for policy 1, policy_version 69830 (0.0008) +[2023-10-11 17:50:56,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 141983744. Throughput: 0: 1664.7, 1: 1699.7. Samples: 35504602. Policy #0 lag: (min: 25.0, avg: 53.0, max: 56.0) +[2023-10-11 17:50:56,063][84230] Avg episode reward: [(0, '40.330'), (1, '45.180')] +[2023-10-11 17:50:56,069][85175] Updated weights for policy 1, policy_version 69840 (0.0008) +[2023-10-11 17:50:56,430][85175] Updated weights for policy 1, policy_version 69850 (0.0010) +[2023-10-11 17:50:59,611][85176] Updated weights for policy 0, policy_version 68842 (0.0007) +[2023-10-11 17:50:59,989][85176] Updated weights for policy 0, policy_version 68852 (0.0009) +[2023-10-11 17:51:00,363][85176] Updated weights for policy 0, policy_version 68862 (0.0009) +[2023-10-11 17:51:00,409][85175] Updated weights for policy 1, policy_version 69860 (0.0009) +[2023-10-11 17:51:00,788][85175] Updated weights for policy 1, policy_version 69870 (0.0010) +[2023-10-11 17:51:01,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 142049280. Throughput: 0: 1655.1, 1: 1693.4. Samples: 35524140. 
Policy #0 lag: (min: 25.0, avg: 53.0, max: 56.0) +[2023-10-11 17:51:01,064][84230] Avg episode reward: [(0, '42.330'), (1, '43.870')] +[2023-10-11 17:51:01,157][85175] Updated weights for policy 1, policy_version 69880 (0.0008) +[2023-10-11 17:51:04,366][85176] Updated weights for policy 0, policy_version 68872 (0.0007) +[2023-10-11 17:51:04,751][85176] Updated weights for policy 0, policy_version 68882 (0.0008) +[2023-10-11 17:51:05,117][85176] Updated weights for policy 0, policy_version 68892 (0.0008) +[2023-10-11 17:51:05,208][85175] Updated weights for policy 1, policy_version 69890 (0.0009) +[2023-10-11 17:51:05,583][85175] Updated weights for policy 1, policy_version 69900 (0.0009) +[2023-10-11 17:51:05,948][85175] Updated weights for policy 1, policy_version 69910 (0.0011) +[2023-10-11 17:51:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 142114816. Throughput: 0: 1671.6, 1: 1698.0. Samples: 35534930. Policy #0 lag: (min: 25.0, avg: 53.0, max: 56.0) +[2023-10-11 17:51:06,063][84230] Avg episode reward: [(0, '44.330'), (1, '45.150')] +[2023-10-11 17:51:06,319][85175] Updated weights for policy 1, policy_version 69920 (0.0007) +[2023-10-11 17:51:09,208][85176] Updated weights for policy 0, policy_version 68902 (0.0008) +[2023-10-11 17:51:09,587][85176] Updated weights for policy 0, policy_version 68912 (0.0010) +[2023-10-11 17:51:09,954][85176] Updated weights for policy 0, policy_version 68922 (0.0007) +[2023-10-11 17:51:10,412][85175] Updated weights for policy 1, policy_version 69930 (0.0010) +[2023-10-11 17:51:10,775][85175] Updated weights for policy 1, policy_version 69940 (0.0007) +[2023-10-11 17:51:11,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 142180352. Throughput: 0: 1653.8, 1: 1700.3. Samples: 35554850. 
Policy #0 lag: (min: 25.0, avg: 53.0, max: 56.0) +[2023-10-11 17:51:11,064][84230] Avg episode reward: [(0, '43.520'), (1, '45.780')] +[2023-10-11 17:51:11,144][85175] Updated weights for policy 1, policy_version 69950 (0.0009) +[2023-10-11 17:51:13,959][85176] Updated weights for policy 0, policy_version 68932 (0.0010) +[2023-10-11 17:51:14,336][85176] Updated weights for policy 0, policy_version 68942 (0.0008) +[2023-10-11 17:51:14,709][85176] Updated weights for policy 0, policy_version 68952 (0.0007) +[2023-10-11 17:51:15,103][85175] Updated weights for policy 1, policy_version 69960 (0.0010) +[2023-10-11 17:51:15,481][85175] Updated weights for policy 1, policy_version 69970 (0.0009) +[2023-10-11 17:51:15,842][85175] Updated weights for policy 1, policy_version 69980 (0.0010) +[2023-10-11 17:51:16,063][84230] Fps is (10 sec: 16383.5, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 142278656. Throughput: 0: 1660.8, 1: 1691.1. Samples: 35574336. Policy #0 lag: (min: 25.0, avg: 53.0, max: 56.0) +[2023-10-11 17:51:16,064][84230] Avg episode reward: [(0, '44.030'), (1, '42.940')] +[2023-10-11 17:51:18,947][85176] Updated weights for policy 0, policy_version 68962 (0.0008) +[2023-10-11 17:51:19,322][85176] Updated weights for policy 0, policy_version 68972 (0.0009) +[2023-10-11 17:51:19,691][85176] Updated weights for policy 0, policy_version 68982 (0.0009) +[2023-10-11 17:51:19,792][85175] Updated weights for policy 1, policy_version 69990 (0.0010) +[2023-10-11 17:51:20,065][85176] Updated weights for policy 0, policy_version 68992 (0.0007) +[2023-10-11 17:51:20,147][85175] Updated weights for policy 1, policy_version 70000 (0.0008) +[2023-10-11 17:51:20,523][85175] Updated weights for policy 1, policy_version 70010 (0.0008) +[2023-10-11 17:51:21,063][84230] Fps is (10 sec: 16384.2, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 142344192. Throughput: 0: 1665.7, 1: 1706.5. Samples: 35585476. 
Policy #0 lag: (min: 25.0, avg: 53.0, max: 56.0) +[2023-10-11 17:51:21,063][84230] Avg episode reward: [(0, '43.010'), (1, '44.720')] +[2023-10-11 17:51:24,255][85176] Updated weights for policy 0, policy_version 69002 (0.0009) +[2023-10-11 17:51:24,630][85176] Updated weights for policy 0, policy_version 69012 (0.0007) +[2023-10-11 17:51:24,740][85175] Updated weights for policy 1, policy_version 70020 (0.0007) +[2023-10-11 17:51:25,007][85176] Updated weights for policy 0, policy_version 69022 (0.0007) +[2023-10-11 17:51:25,103][85175] Updated weights for policy 1, policy_version 70030 (0.0010) +[2023-10-11 17:51:25,479][85175] Updated weights for policy 1, policy_version 70040 (0.0008) +[2023-10-11 17:51:26,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 142409728. Throughput: 0: 1654.8, 1: 1703.8. Samples: 35605246. Policy #0 lag: (min: 25.0, avg: 53.0, max: 56.0) +[2023-10-11 17:51:26,063][84230] Avg episode reward: [(0, '46.540'), (1, '45.100')] +[2023-10-11 17:51:28,871][85176] Updated weights for policy 0, policy_version 69032 (0.0008) +[2023-10-11 17:51:29,241][85176] Updated weights for policy 0, policy_version 69042 (0.0007) +[2023-10-11 17:51:29,423][85175] Updated weights for policy 1, policy_version 70050 (0.0009) +[2023-10-11 17:51:29,613][85176] Updated weights for policy 0, policy_version 69052 (0.0008) +[2023-10-11 17:51:29,791][85175] Updated weights for policy 1, policy_version 70060 (0.0008) +[2023-10-11 17:51:30,166][85175] Updated weights for policy 1, policy_version 70070 (0.0009) +[2023-10-11 17:51:30,530][85175] Updated weights for policy 1, policy_version 70080 (0.0011) +[2023-10-11 17:51:31,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 142475264. Throughput: 0: 1669.5, 1: 1673.9. Samples: 35624370. 
Policy #0 lag: (min: 25.0, avg: 53.0, max: 56.0) +[2023-10-11 17:51:31,064][84230] Avg episode reward: [(0, '43.450'), (1, '45.000')] +[2023-10-11 17:51:33,591][85176] Updated weights for policy 0, policy_version 69062 (0.0008) +[2023-10-11 17:51:33,963][85176] Updated weights for policy 0, policy_version 69072 (0.0008) +[2023-10-11 17:51:34,330][85176] Updated weights for policy 0, policy_version 69082 (0.0008) +[2023-10-11 17:51:34,631][85175] Updated weights for policy 1, policy_version 70090 (0.0008) +[2023-10-11 17:51:34,997][85175] Updated weights for policy 1, policy_version 70100 (0.0008) +[2023-10-11 17:51:35,360][85175] Updated weights for policy 1, policy_version 70110 (0.0007) +[2023-10-11 17:51:36,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 142540800. Throughput: 0: 1668.8, 1: 1700.0. Samples: 35635604. Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) +[2023-10-11 17:51:36,063][84230] Avg episode reward: [(0, '43.510'), (1, '43.900')] +[2023-10-11 17:51:38,419][85176] Updated weights for policy 0, policy_version 69092 (0.0008) +[2023-10-11 17:51:38,797][85176] Updated weights for policy 0, policy_version 69102 (0.0009) +[2023-10-11 17:51:39,178][85176] Updated weights for policy 0, policy_version 69112 (0.0010) +[2023-10-11 17:51:39,466][85175] Updated weights for policy 1, policy_version 70120 (0.0007) +[2023-10-11 17:51:39,847][85175] Updated weights for policy 1, policy_version 70130 (0.0009) +[2023-10-11 17:51:40,205][85175] Updated weights for policy 1, policy_version 70140 (0.0009) +[2023-10-11 17:51:41,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 142606336. Throughput: 0: 1650.8, 1: 1691.7. Samples: 35655016. 
Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) +[2023-10-11 17:51:41,063][84230] Avg episode reward: [(0, '39.870'), (1, '47.850')] +[2023-10-11 17:51:43,274][85176] Updated weights for policy 0, policy_version 69122 (0.0009) +[2023-10-11 17:51:43,657][85176] Updated weights for policy 0, policy_version 69132 (0.0008) +[2023-10-11 17:51:44,020][85176] Updated weights for policy 0, policy_version 69142 (0.0011) +[2023-10-11 17:51:44,234][85175] Updated weights for policy 1, policy_version 70150 (0.0009) +[2023-10-11 17:51:44,404][85176] Updated weights for policy 0, policy_version 69152 (0.0008) +[2023-10-11 17:51:44,592][85175] Updated weights for policy 1, policy_version 70160 (0.0007) +[2023-10-11 17:51:44,965][85175] Updated weights for policy 1, policy_version 70170 (0.0007) +[2023-10-11 17:51:46,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 142671872. Throughput: 0: 1678.8, 1: 1673.5. Samples: 35674990. Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) +[2023-10-11 17:51:46,063][84230] Avg episode reward: [(0, '41.130'), (1, '44.700')] +[2023-10-11 17:51:48,682][85176] Updated weights for policy 0, policy_version 69162 (0.0008) +[2023-10-11 17:51:49,054][85176] Updated weights for policy 0, policy_version 69172 (0.0008) +[2023-10-11 17:51:49,066][85175] Updated weights for policy 1, policy_version 70180 (0.0007) +[2023-10-11 17:51:49,428][85176] Updated weights for policy 0, policy_version 69182 (0.0008) +[2023-10-11 17:51:49,468][85175] Updated weights for policy 1, policy_version 70190 (0.0009) +[2023-10-11 17:51:49,836][85175] Updated weights for policy 1, policy_version 70200 (0.0009) +[2023-10-11 17:51:51,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 142737408. Throughput: 0: 1665.6, 1: 1695.6. Samples: 35686184. 
Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) +[2023-10-11 17:51:51,063][84230] Avg episode reward: [(0, '42.680'), (1, '47.340')] +[2023-10-11 17:51:53,504][85176] Updated weights for policy 0, policy_version 69192 (0.0009) +[2023-10-11 17:51:53,872][85176] Updated weights for policy 0, policy_version 69202 (0.0009) +[2023-10-11 17:51:53,901][85175] Updated weights for policy 1, policy_version 70210 (0.0008) +[2023-10-11 17:51:54,248][85176] Updated weights for policy 0, policy_version 69212 (0.0010) +[2023-10-11 17:51:54,264][85175] Updated weights for policy 1, policy_version 70220 (0.0008) +[2023-10-11 17:51:54,644][85175] Updated weights for policy 1, policy_version 70230 (0.0007) +[2023-10-11 17:51:55,004][85175] Updated weights for policy 1, policy_version 70240 (0.0009) +[2023-10-11 17:51:56,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 142802944. Throughput: 0: 1657.2, 1: 1678.9. Samples: 35704972. Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) +[2023-10-11 17:51:56,063][84230] Avg episode reward: [(0, '44.050'), (1, '44.050')] +[2023-10-11 17:51:58,349][85176] Updated weights for policy 0, policy_version 69222 (0.0009) +[2023-10-11 17:51:58,723][85176] Updated weights for policy 0, policy_version 69232 (0.0007) +[2023-10-11 17:51:58,996][85175] Updated weights for policy 1, policy_version 70250 (0.0008) +[2023-10-11 17:51:59,089][85176] Updated weights for policy 0, policy_version 69242 (0.0007) +[2023-10-11 17:51:59,358][85175] Updated weights for policy 1, policy_version 70260 (0.0009) +[2023-10-11 17:51:59,733][85175] Updated weights for policy 1, policy_version 70270 (0.0008) +[2023-10-11 17:52:01,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 142868480. Throughput: 0: 1675.6, 1: 1680.1. Samples: 35725344. 
Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) +[2023-10-11 17:52:01,063][84230] Avg episode reward: [(0, '44.540'), (1, '46.450')] +[2023-10-11 17:52:03,211][85176] Updated weights for policy 0, policy_version 69252 (0.0008) +[2023-10-11 17:52:03,586][85176] Updated weights for policy 0, policy_version 69262 (0.0008) +[2023-10-11 17:52:03,679][85175] Updated weights for policy 1, policy_version 70280 (0.0009) +[2023-10-11 17:52:03,964][85176] Updated weights for policy 0, policy_version 69272 (0.0008) +[2023-10-11 17:52:04,055][85175] Updated weights for policy 1, policy_version 70290 (0.0009) +[2023-10-11 17:52:04,420][85175] Updated weights for policy 1, policy_version 70300 (0.0008) +[2023-10-11 17:52:06,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 142934016. Throughput: 0: 1661.3, 1: 1692.0. Samples: 35736378. Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) +[2023-10-11 17:52:06,064][84230] Avg episode reward: [(0, '42.030'), (1, '44.330')] +[2023-10-11 17:52:08,087][85176] Updated weights for policy 0, policy_version 69282 (0.0007) +[2023-10-11 17:52:08,460][85176] Updated weights for policy 0, policy_version 69292 (0.0009) +[2023-10-11 17:52:08,497][85175] Updated weights for policy 1, policy_version 70310 (0.0008) +[2023-10-11 17:52:08,832][85176] Updated weights for policy 0, policy_version 69302 (0.0008) +[2023-10-11 17:52:08,865][85175] Updated weights for policy 1, policy_version 70320 (0.0009) +[2023-10-11 17:52:09,213][85176] Updated weights for policy 0, policy_version 69312 (0.0008) +[2023-10-11 17:52:09,240][85175] Updated weights for policy 1, policy_version 70330 (0.0009) +[2023-10-11 17:52:11,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.5). Total num frames: 142999552. Throughput: 0: 1663.0, 1: 1671.7. Samples: 35755310. 
Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) +[2023-10-11 17:52:11,063][84230] Avg episode reward: [(0, '44.900'), (1, '45.370')] +[2023-10-11 17:52:13,234][85176] Updated weights for policy 0, policy_version 69322 (0.0007) +[2023-10-11 17:52:13,317][85175] Updated weights for policy 1, policy_version 70340 (0.0008) +[2023-10-11 17:52:13,608][85176] Updated weights for policy 0, policy_version 69332 (0.0008) +[2023-10-11 17:52:13,685][85175] Updated weights for policy 1, policy_version 70350 (0.0007) +[2023-10-11 17:52:13,972][85176] Updated weights for policy 0, policy_version 69342 (0.0008) +[2023-10-11 17:52:14,048][85175] Updated weights for policy 1, policy_version 70360 (0.0007) +[2023-10-11 17:52:16,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 143065088. Throughput: 0: 1677.3, 1: 1696.0. Samples: 35776168. Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) +[2023-10-11 17:52:16,063][84230] Avg episode reward: [(0, '44.490'), (1, '43.410')] +[2023-10-11 17:52:16,074][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000070368_72056832.pth... +[2023-10-11 17:52:16,075][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000069344_71008256.pth... 
+[2023-10-11 17:52:16,110][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000067776_69402624.pth +[2023-10-11 17:52:16,114][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000068800_70451200.pth +[2023-10-11 17:52:17,935][85175] Updated weights for policy 1, policy_version 70370 (0.0009) +[2023-10-11 17:52:18,075][85176] Updated weights for policy 0, policy_version 69352 (0.0008) +[2023-10-11 17:52:18,302][85175] Updated weights for policy 1, policy_version 70380 (0.0007) +[2023-10-11 17:52:18,452][85176] Updated weights for policy 0, policy_version 69362 (0.0007) +[2023-10-11 17:52:18,679][85175] Updated weights for policy 1, policy_version 70390 (0.0007) +[2023-10-11 17:52:18,816][85176] Updated weights for policy 0, policy_version 69372 (0.0009) +[2023-10-11 17:52:19,038][85175] Updated weights for policy 1, policy_version 70400 (0.0010) +[2023-10-11 17:52:21,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 143130624. Throughput: 0: 1662.0, 1: 1688.1. Samples: 35786360. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-11 17:52:21,063][84230] Avg episode reward: [(0, '45.600'), (1, '43.230')] +[2023-10-11 17:52:22,933][85176] Updated weights for policy 0, policy_version 69382 (0.0009) +[2023-10-11 17:52:23,233][85175] Updated weights for policy 1, policy_version 70410 (0.0009) +[2023-10-11 17:52:23,297][85176] Updated weights for policy 0, policy_version 69392 (0.0008) +[2023-10-11 17:52:23,592][85175] Updated weights for policy 1, policy_version 70420 (0.0008) +[2023-10-11 17:52:23,674][85176] Updated weights for policy 0, policy_version 69402 (0.0008) +[2023-10-11 17:52:23,966][85175] Updated weights for policy 1, policy_version 70430 (0.0009) +[2023-10-11 17:52:26,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 143196160. Throughput: 0: 1671.9, 1: 1676.9. Samples: 35805712. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-11 17:52:26,064][84230] Avg episode reward: [(0, '44.900'), (1, '41.570')] +[2023-10-11 17:52:27,854][85176] Updated weights for policy 0, policy_version 69412 (0.0009) +[2023-10-11 17:52:27,990][85175] Updated weights for policy 1, policy_version 70440 (0.0007) +[2023-10-11 17:52:28,225][85176] Updated weights for policy 0, policy_version 69422 (0.0007) +[2023-10-11 17:52:28,365][85175] Updated weights for policy 1, policy_version 70450 (0.0007) +[2023-10-11 17:52:28,595][85176] Updated weights for policy 0, policy_version 69432 (0.0007) +[2023-10-11 17:52:28,724][85175] Updated weights for policy 1, policy_version 70460 (0.0008) +[2023-10-11 17:52:31,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 143261696. Throughput: 0: 1668.5, 1: 1694.2. Samples: 35826314. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-11 17:52:31,064][84230] Avg episode reward: [(0, '42.180'), (1, '41.030')] +[2023-10-11 17:52:32,716][85175] Updated weights for policy 1, policy_version 70470 (0.0008) +[2023-10-11 17:52:32,783][85176] Updated weights for policy 0, policy_version 69442 (0.0009) +[2023-10-11 17:52:33,079][85175] Updated weights for policy 1, policy_version 70480 (0.0009) +[2023-10-11 17:52:33,147][85176] Updated weights for policy 0, policy_version 69452 (0.0007) +[2023-10-11 17:52:33,453][85175] Updated weights for policy 1, policy_version 70490 (0.0009) +[2023-10-11 17:52:33,525][85176] Updated weights for policy 0, policy_version 69462 (0.0009) +[2023-10-11 17:52:33,906][85176] Updated weights for policy 0, policy_version 69472 (0.0010) +[2023-10-11 17:52:36,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 143327232. Throughput: 0: 1655.0, 1: 1673.3. Samples: 35835958. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-11 17:52:36,064][84230] Avg episode reward: [(0, '43.750'), (1, '44.780')] +[2023-10-11 17:52:37,548][85175] Updated weights for policy 1, policy_version 70500 (0.0009) +[2023-10-11 17:52:37,938][85175] Updated weights for policy 1, policy_version 70510 (0.0008) +[2023-10-11 17:52:38,125][85176] Updated weights for policy 0, policy_version 69482 (0.0008) +[2023-10-11 17:52:38,303][85175] Updated weights for policy 1, policy_version 70520 (0.0009) +[2023-10-11 17:52:38,496][85176] Updated weights for policy 0, policy_version 69492 (0.0010) +[2023-10-11 17:52:38,860][85176] Updated weights for policy 0, policy_version 69502 (0.0007) +[2023-10-11 17:52:41,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 143392768. Throughput: 0: 1668.0, 1: 1686.9. Samples: 35855942. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-11 17:52:41,064][84230] Avg episode reward: [(0, '43.950'), (1, '43.690')] +[2023-10-11 17:52:42,219][85175] Updated weights for policy 1, policy_version 70530 (0.0009) +[2023-10-11 17:52:42,591][85175] Updated weights for policy 1, policy_version 70540 (0.0008) +[2023-10-11 17:52:42,885][85176] Updated weights for policy 0, policy_version 69512 (0.0008) +[2023-10-11 17:52:42,956][85175] Updated weights for policy 1, policy_version 70550 (0.0009) +[2023-10-11 17:52:43,263][85176] Updated weights for policy 0, policy_version 69522 (0.0007) +[2023-10-11 17:52:43,323][85175] Updated weights for policy 1, policy_version 70560 (0.0007) +[2023-10-11 17:52:43,637][85176] Updated weights for policy 0, policy_version 69532 (0.0008) +[2023-10-11 17:52:46,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 143458304. Throughput: 0: 1666.6, 1: 1698.6. Samples: 35876778. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-11 17:52:46,063][84230] Avg episode reward: [(0, '45.450'), (1, '45.540')] +[2023-10-11 17:52:47,393][85175] Updated weights for policy 1, policy_version 70570 (0.0008) +[2023-10-11 17:52:47,578][85176] Updated weights for policy 0, policy_version 69542 (0.0008) +[2023-10-11 17:52:47,760][85175] Updated weights for policy 1, policy_version 70580 (0.0008) +[2023-10-11 17:52:47,951][85176] Updated weights for policy 0, policy_version 69552 (0.0010) +[2023-10-11 17:52:48,137][85175] Updated weights for policy 1, policy_version 70590 (0.0009) +[2023-10-11 17:52:48,320][85176] Updated weights for policy 0, policy_version 69562 (0.0008) +[2023-10-11 17:52:51,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 143523840. Throughput: 0: 1653.9, 1: 1670.7. Samples: 35885984. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-11 17:52:51,063][84230] Avg episode reward: [(0, '42.690'), (1, '42.150')] +[2023-10-11 17:52:52,196][85175] Updated weights for policy 1, policy_version 70600 (0.0008) +[2023-10-11 17:52:52,450][85176] Updated weights for policy 0, policy_version 69572 (0.0009) +[2023-10-11 17:52:52,565][85175] Updated weights for policy 1, policy_version 70610 (0.0008) +[2023-10-11 17:52:52,832][85176] Updated weights for policy 0, policy_version 69582 (0.0009) +[2023-10-11 17:52:52,922][85175] Updated weights for policy 1, policy_version 70620 (0.0007) +[2023-10-11 17:52:53,207][85176] Updated weights for policy 0, policy_version 69592 (0.0008) +[2023-10-11 17:52:56,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 143589376. Throughput: 0: 1665.3, 1: 1695.1. Samples: 35906528. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-11 17:52:56,064][84230] Avg episode reward: [(0, '43.730'), (1, '45.280')] +[2023-10-11 17:52:57,042][85175] Updated weights for policy 1, policy_version 70630 (0.0008) +[2023-10-11 17:52:57,202][85176] Updated weights for policy 0, policy_version 69602 (0.0008) +[2023-10-11 17:52:57,408][85175] Updated weights for policy 1, policy_version 70640 (0.0009) +[2023-10-11 17:52:57,576][85176] Updated weights for policy 0, policy_version 69612 (0.0009) +[2023-10-11 17:52:57,772][85175] Updated weights for policy 1, policy_version 70650 (0.0009) +[2023-10-11 17:52:57,952][85176] Updated weights for policy 0, policy_version 69622 (0.0009) +[2023-10-11 17:52:58,332][85176] Updated weights for policy 0, policy_version 69632 (0.0007) +[2023-10-11 17:53:01,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 143654912. Throughput: 0: 1663.1, 1: 1696.8. Samples: 35927362. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-11 17:53:01,064][84230] Avg episode reward: [(0, '42.200'), (1, '43.060')] +[2023-10-11 17:53:01,705][85175] Updated weights for policy 1, policy_version 70660 (0.0008) +[2023-10-11 17:53:02,075][85175] Updated weights for policy 1, policy_version 70670 (0.0009) +[2023-10-11 17:53:02,357][85176] Updated weights for policy 0, policy_version 69642 (0.0008) +[2023-10-11 17:53:02,435][85175] Updated weights for policy 1, policy_version 70680 (0.0008) +[2023-10-11 17:53:02,727][85176] Updated weights for policy 0, policy_version 69652 (0.0009) +[2023-10-11 17:53:03,103][85176] Updated weights for policy 0, policy_version 69662 (0.0009) +[2023-10-11 17:53:06,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 143720448. Throughput: 0: 1654.5, 1: 1679.7. Samples: 35936402. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:53:06,064][84230] Avg episode reward: [(0, '43.970'), (1, '44.220')] +[2023-10-11 17:53:06,286][85175] Updated weights for policy 1, policy_version 70690 (0.0007) +[2023-10-11 17:53:06,644][85175] Updated weights for policy 1, policy_version 70700 (0.0008) +[2023-10-11 17:53:07,011][85175] Updated weights for policy 1, policy_version 70710 (0.0008) +[2023-10-11 17:53:07,132][85176] Updated weights for policy 0, policy_version 69672 (0.0009) +[2023-10-11 17:53:07,386][85175] Updated weights for policy 1, policy_version 70720 (0.0010) +[2023-10-11 17:53:07,500][85176] Updated weights for policy 0, policy_version 69682 (0.0008) +[2023-10-11 17:53:07,880][85176] Updated weights for policy 0, policy_version 69692 (0.0009) +[2023-10-11 17:53:11,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 143785984. Throughput: 0: 1671.5, 1: 1700.6. Samples: 35957458. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:53:11,063][84230] Avg episode reward: [(0, '44.480'), (1, '42.920')] +[2023-10-11 17:53:11,336][85175] Updated weights for policy 1, policy_version 70730 (0.0008) +[2023-10-11 17:53:11,698][85175] Updated weights for policy 1, policy_version 70740 (0.0009) +[2023-10-11 17:53:11,849][85176] Updated weights for policy 0, policy_version 69702 (0.0008) +[2023-10-11 17:53:12,071][85175] Updated weights for policy 1, policy_version 70750 (0.0010) +[2023-10-11 17:53:12,223][85176] Updated weights for policy 0, policy_version 69712 (0.0008) +[2023-10-11 17:53:12,600][85176] Updated weights for policy 0, policy_version 69722 (0.0010) +[2023-10-11 17:53:16,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 143851520. Throughput: 0: 1673.7, 1: 1702.8. Samples: 35978254. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:53:16,063][84230] Avg episode reward: [(0, '45.170'), (1, '45.280')] +[2023-10-11 17:53:16,277][85175] Updated weights for policy 1, policy_version 70760 (0.0009) +[2023-10-11 17:53:16,641][85175] Updated weights for policy 1, policy_version 70770 (0.0009) +[2023-10-11 17:53:16,814][85176] Updated weights for policy 0, policy_version 69732 (0.0009) +[2023-10-11 17:53:17,007][85175] Updated weights for policy 1, policy_version 70780 (0.0009) +[2023-10-11 17:53:17,186][85176] Updated weights for policy 0, policy_version 69742 (0.0008) +[2023-10-11 17:53:17,560][85176] Updated weights for policy 0, policy_version 69752 (0.0009) +[2023-10-11 17:53:21,031][85175] Updated weights for policy 1, policy_version 70790 (0.0008) +[2023-10-11 17:53:21,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 143917056. Throughput: 0: 1667.1, 1: 1696.3. Samples: 35987310. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:53:21,064][84230] Avg episode reward: [(0, '44.060'), (1, '42.120')] +[2023-10-11 17:53:21,394][85175] Updated weights for policy 1, policy_version 70800 (0.0009) +[2023-10-11 17:53:21,626][85176] Updated weights for policy 0, policy_version 69762 (0.0007) +[2023-10-11 17:53:21,761][85175] Updated weights for policy 1, policy_version 70810 (0.0007) +[2023-10-11 17:53:21,998][85176] Updated weights for policy 0, policy_version 69772 (0.0007) +[2023-10-11 17:53:22,385][85176] Updated weights for policy 0, policy_version 69782 (0.0007) +[2023-10-11 17:53:22,759][85176] Updated weights for policy 0, policy_version 69792 (0.0009) +[2023-10-11 17:53:25,748][85175] Updated weights for policy 1, policy_version 70820 (0.0008) +[2023-10-11 17:53:26,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 143982592. Throughput: 0: 1677.0, 1: 1704.5. Samples: 36008110. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:53:26,063][84230] Avg episode reward: [(0, '43.930'), (1, '47.110')] +[2023-10-11 17:53:26,125][85175] Updated weights for policy 1, policy_version 70830 (0.0007) +[2023-10-11 17:53:26,490][85175] Updated weights for policy 1, policy_version 70840 (0.0007) +[2023-10-11 17:53:26,981][85176] Updated weights for policy 0, policy_version 69802 (0.0009) +[2023-10-11 17:53:27,342][85176] Updated weights for policy 0, policy_version 69812 (0.0007) +[2023-10-11 17:53:27,715][85176] Updated weights for policy 0, policy_version 69822 (0.0009) +[2023-10-11 17:53:30,422][85175] Updated weights for policy 1, policy_version 70850 (0.0007) +[2023-10-11 17:53:30,781][85175] Updated weights for policy 1, policy_version 70860 (0.0008) +[2023-10-11 17:53:31,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 144048128. Throughput: 0: 1671.3, 1: 1700.4. Samples: 36028506. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:53:31,063][84230] Avg episode reward: [(0, '42.030'), (1, '44.640')] +[2023-10-11 17:53:31,152][85175] Updated weights for policy 1, policy_version 70870 (0.0009) +[2023-10-11 17:53:31,524][85175] Updated weights for policy 1, policy_version 70880 (0.0008) +[2023-10-11 17:53:31,800][85176] Updated weights for policy 0, policy_version 69832 (0.0007) +[2023-10-11 17:53:32,179][85176] Updated weights for policy 0, policy_version 69842 (0.0007) +[2023-10-11 17:53:32,544][85176] Updated weights for policy 0, policy_version 69852 (0.0007) +[2023-10-11 17:53:35,591][85175] Updated weights for policy 1, policy_version 70890 (0.0007) +[2023-10-11 17:53:35,962][85175] Updated weights for policy 1, policy_version 70900 (0.0007) +[2023-10-11 17:53:36,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 144113664. Throughput: 0: 1670.2, 1: 1703.4. Samples: 36037798. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:53:36,063][84230] Avg episode reward: [(0, '44.210'), (1, '47.190')] +[2023-10-11 17:53:36,323][85175] Updated weights for policy 1, policy_version 70910 (0.0007) +[2023-10-11 17:53:36,689][85176] Updated weights for policy 0, policy_version 69862 (0.0007) +[2023-10-11 17:53:37,063][85176] Updated weights for policy 0, policy_version 69872 (0.0008) +[2023-10-11 17:53:37,440][85176] Updated weights for policy 0, policy_version 69882 (0.0007) +[2023-10-11 17:53:40,292][85175] Updated weights for policy 1, policy_version 70920 (0.0009) +[2023-10-11 17:53:40,659][85175] Updated weights for policy 1, policy_version 70930 (0.0009) +[2023-10-11 17:53:41,033][85175] Updated weights for policy 1, policy_version 70940 (0.0008) +[2023-10-11 17:53:41,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 144179200. Throughput: 0: 1670.1, 1: 1708.1. Samples: 36058548. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:53:41,063][84230] Avg episode reward: [(0, '41.550'), (1, '44.750')] +[2023-10-11 17:53:41,472][85176] Updated weights for policy 0, policy_version 69892 (0.0008) +[2023-10-11 17:53:41,839][85176] Updated weights for policy 0, policy_version 69902 (0.0010) +[2023-10-11 17:53:42,219][85176] Updated weights for policy 0, policy_version 69912 (0.0009) +[2023-10-11 17:53:45,130][85175] Updated weights for policy 1, policy_version 70950 (0.0007) +[2023-10-11 17:53:45,502][85175] Updated weights for policy 1, policy_version 70960 (0.0007) +[2023-10-11 17:53:45,869][85175] Updated weights for policy 1, policy_version 70970 (0.0007) +[2023-10-11 17:53:46,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13329.3). Total num frames: 144244736. Throughput: 0: 1667.2, 1: 1695.8. Samples: 36078696. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 17:53:46,064][84230] Avg episode reward: [(0, '44.490'), (1, '45.720')] +[2023-10-11 17:53:46,456][85176] Updated weights for policy 0, policy_version 69922 (0.0009) +[2023-10-11 17:53:46,826][85176] Updated weights for policy 0, policy_version 69932 (0.0010) +[2023-10-11 17:53:47,203][85176] Updated weights for policy 0, policy_version 69942 (0.0010) +[2023-10-11 17:53:47,569][85176] Updated weights for policy 0, policy_version 69952 (0.0010) +[2023-10-11 17:53:49,707][85175] Updated weights for policy 1, policy_version 70980 (0.0008) +[2023-10-11 17:53:50,076][85175] Updated weights for policy 1, policy_version 70990 (0.0010) +[2023-10-11 17:53:50,442][85175] Updated weights for policy 1, policy_version 71000 (0.0009) +[2023-10-11 17:53:51,063][84230] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 144343040. Throughput: 0: 1667.6, 1: 1713.7. Samples: 36088560. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) +[2023-10-11 17:53:51,064][84230] Avg episode reward: [(0, '42.650'), (1, '44.210')] +[2023-10-11 17:53:51,587][85176] Updated weights for policy 0, policy_version 69962 (0.0008) +[2023-10-11 17:53:51,962][85176] Updated weights for policy 0, policy_version 69972 (0.0009) +[2023-10-11 17:53:52,338][85176] Updated weights for policy 0, policy_version 69982 (0.0008) +[2023-10-11 17:53:54,472][85175] Updated weights for policy 1, policy_version 71010 (0.0009) +[2023-10-11 17:53:54,834][85175] Updated weights for policy 1, policy_version 71020 (0.0009) +[2023-10-11 17:53:55,196][85175] Updated weights for policy 1, policy_version 71030 (0.0008) +[2023-10-11 17:53:55,565][85175] Updated weights for policy 1, policy_version 71040 (0.0008) +[2023-10-11 17:53:56,062][84230] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 144408576. Throughput: 0: 1668.1, 1: 1706.1. Samples: 36109300. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) +[2023-10-11 17:53:56,063][84230] Avg episode reward: [(0, '45.510'), (1, '43.890')] +[2023-10-11 17:53:56,334][85176] Updated weights for policy 0, policy_version 69992 (0.0007) +[2023-10-11 17:53:56,696][85176] Updated weights for policy 0, policy_version 70002 (0.0009) +[2023-10-11 17:53:57,076][85176] Updated weights for policy 0, policy_version 70012 (0.0008) +[2023-10-11 17:53:59,578][85175] Updated weights for policy 1, policy_version 71050 (0.0008) +[2023-10-11 17:53:59,950][85175] Updated weights for policy 1, policy_version 71060 (0.0007) +[2023-10-11 17:54:00,326][85175] Updated weights for policy 1, policy_version 71070 (0.0007) +[2023-10-11 17:54:01,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 144474112. Throughput: 0: 1668.0, 1: 1683.1. Samples: 36129052. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) +[2023-10-11 17:54:01,063][84230] Avg episode reward: [(0, '44.920'), (1, '43.080')] +[2023-10-11 17:54:01,250][85176] Updated weights for policy 0, policy_version 70022 (0.0009) +[2023-10-11 17:54:01,625][85176] Updated weights for policy 0, policy_version 70032 (0.0010) +[2023-10-11 17:54:01,992][85176] Updated weights for policy 0, policy_version 70042 (0.0009) +[2023-10-11 17:54:04,325][85175] Updated weights for policy 1, policy_version 71080 (0.0008) +[2023-10-11 17:54:04,696][85175] Updated weights for policy 1, policy_version 71090 (0.0007) +[2023-10-11 17:54:05,058][85175] Updated weights for policy 1, policy_version 71100 (0.0009) +[2023-10-11 17:54:06,051][85176] Updated weights for policy 0, policy_version 70052 (0.0008) +[2023-10-11 17:54:06,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 144539648. Throughput: 0: 1666.5, 1: 1718.8. Samples: 36139646. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) +[2023-10-11 17:54:06,063][84230] Avg episode reward: [(0, '45.240'), (1, '43.590')] +[2023-10-11 17:54:06,421][85176] Updated weights for policy 0, policy_version 70062 (0.0008) +[2023-10-11 17:54:06,794][85176] Updated weights for policy 0, policy_version 70072 (0.0009) +[2023-10-11 17:54:09,042][85175] Updated weights for policy 1, policy_version 71110 (0.0010) +[2023-10-11 17:54:09,412][85175] Updated weights for policy 1, policy_version 71120 (0.0009) +[2023-10-11 17:54:09,788][85175] Updated weights for policy 1, policy_version 71130 (0.0009) +[2023-10-11 17:54:10,826][85176] Updated weights for policy 0, policy_version 70082 (0.0008) +[2023-10-11 17:54:11,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 144605184. Throughput: 0: 1670.4, 1: 1700.9. Samples: 36159818. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) +[2023-10-11 17:54:11,063][84230] Avg episode reward: [(0, '42.840'), (1, '43.800')] +[2023-10-11 17:54:11,204][85176] Updated weights for policy 0, policy_version 70092 (0.0008) +[2023-10-11 17:54:11,573][85176] Updated weights for policy 0, policy_version 70102 (0.0008) +[2023-10-11 17:54:11,945][85176] Updated weights for policy 0, policy_version 70112 (0.0007) +[2023-10-11 17:54:13,771][85175] Updated weights for policy 1, policy_version 71140 (0.0009) +[2023-10-11 17:54:14,166][85175] Updated weights for policy 1, policy_version 71150 (0.0009) +[2023-10-11 17:54:14,548][85175] Updated weights for policy 1, policy_version 71160 (0.0007) +[2023-10-11 17:54:16,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 144670720. Throughput: 0: 1676.3, 1: 1688.6. Samples: 36179926. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) +[2023-10-11 17:54:16,063][84230] Avg episode reward: [(0, '41.990'), (1, '43.470')] +[2023-10-11 17:54:16,070][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000071168_72876032.pth... +[2023-10-11 17:54:16,099][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000069568_71237632.pth +[2023-10-11 17:54:16,103][85000] Saving a milestone ./train_atari/atari_frostbite_APPO/checkpoint_p1/milestones/checkpoint_000071168_72876032.pth +[2023-10-11 17:54:16,290][85176] Updated weights for policy 0, policy_version 70122 (0.0007) +[2023-10-11 17:54:16,656][85176] Updated weights for policy 0, policy_version 70132 (0.0010) +[2023-10-11 17:54:17,031][85176] Updated weights for policy 0, policy_version 70142 (0.0007) +[2023-10-11 17:54:17,104][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000070144_71827456.pth... +[2023-10-11 17:54:17,133][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000068576_70221824.pth +[2023-10-11 17:54:17,137][84801] Saving a milestone ./train_atari/atari_frostbite_APPO/checkpoint_p0/milestones/checkpoint_000070144_71827456.pth +[2023-10-11 17:54:18,518][85175] Updated weights for policy 1, policy_version 71170 (0.0009) +[2023-10-11 17:54:18,887][85175] Updated weights for policy 1, policy_version 71180 (0.0009) +[2023-10-11 17:54:19,256][85175] Updated weights for policy 1, policy_version 71190 (0.0011) +[2023-10-11 17:54:19,620][85175] Updated weights for policy 1, policy_version 71200 (0.0011) +[2023-10-11 17:54:21,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 144736256. Throughput: 0: 1672.3, 1: 1717.8. Samples: 36190352. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) +[2023-10-11 17:54:21,063][84230] Avg episode reward: [(0, '42.490'), (1, '43.590')] +[2023-10-11 17:54:21,099][85176] Updated weights for policy 0, policy_version 70152 (0.0009) +[2023-10-11 17:54:21,467][85176] Updated weights for policy 0, policy_version 70162 (0.0008) +[2023-10-11 17:54:21,827][85176] Updated weights for policy 0, policy_version 70172 (0.0008) +[2023-10-11 17:54:23,565][85175] Updated weights for policy 1, policy_version 71210 (0.0008) +[2023-10-11 17:54:23,944][85175] Updated weights for policy 1, policy_version 71220 (0.0009) +[2023-10-11 17:54:24,309][85175] Updated weights for policy 1, policy_version 71230 (0.0009) +[2023-10-11 17:54:25,857][85176] Updated weights for policy 0, policy_version 70182 (0.0008) +[2023-10-11 17:54:26,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 144801792. Throughput: 0: 1677.6, 1: 1690.3. Samples: 36210102. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) +[2023-10-11 17:54:26,063][84230] Avg episode reward: [(0, '43.990'), (1, '42.140')] +[2023-10-11 17:54:26,235][85176] Updated weights for policy 0, policy_version 70192 (0.0007) +[2023-10-11 17:54:26,596][85176] Updated weights for policy 0, policy_version 70202 (0.0008) +[2023-10-11 17:54:28,329][85175] Updated weights for policy 1, policy_version 71240 (0.0007) +[2023-10-11 17:54:28,696][85175] Updated weights for policy 1, policy_version 71250 (0.0009) +[2023-10-11 17:54:29,062][85175] Updated weights for policy 1, policy_version 71260 (0.0009) +[2023-10-11 17:54:30,744][85176] Updated weights for policy 0, policy_version 70212 (0.0007) +[2023-10-11 17:54:31,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 144867328. Throughput: 0: 1676.8, 1: 1699.7. Samples: 36230636. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) +[2023-10-11 17:54:31,063][84230] Avg episode reward: [(0, '43.220'), (1, '44.450')] +[2023-10-11 17:54:31,120][85176] Updated weights for policy 0, policy_version 70222 (0.0007) +[2023-10-11 17:54:31,484][85176] Updated weights for policy 0, policy_version 70232 (0.0007) +[2023-10-11 17:54:33,058][85175] Updated weights for policy 1, policy_version 71270 (0.0009) +[2023-10-11 17:54:33,424][85175] Updated weights for policy 1, policy_version 71280 (0.0009) +[2023-10-11 17:54:33,787][85175] Updated weights for policy 1, policy_version 71290 (0.0007) +[2023-10-11 17:54:35,369][85176] Updated weights for policy 0, policy_version 70242 (0.0008) +[2023-10-11 17:54:35,737][85176] Updated weights for policy 0, policy_version 70252 (0.0007) +[2023-10-11 17:54:36,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 144932864. Throughput: 0: 1678.7, 1: 1693.9. Samples: 36240326. Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) +[2023-10-11 17:54:36,063][84230] Avg episode reward: [(0, '43.960'), (1, '41.880')] +[2023-10-11 17:54:36,112][85176] Updated weights for policy 0, policy_version 70262 (0.0009) +[2023-10-11 17:54:36,483][85176] Updated weights for policy 0, policy_version 70272 (0.0008) +[2023-10-11 17:54:37,850][85175] Updated weights for policy 1, policy_version 71300 (0.0009) +[2023-10-11 17:54:38,224][85175] Updated weights for policy 1, policy_version 71310 (0.0008) +[2023-10-11 17:54:38,592][85175] Updated weights for policy 1, policy_version 71320 (0.0007) +[2023-10-11 17:54:40,515][85176] Updated weights for policy 0, policy_version 70282 (0.0010) +[2023-10-11 17:54:40,891][85176] Updated weights for policy 0, policy_version 70292 (0.0008) +[2023-10-11 17:54:41,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 144998400. Throughput: 0: 1678.1, 1: 1684.7. Samples: 36260626. 
Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) +[2023-10-11 17:54:41,063][84230] Avg episode reward: [(0, '42.880'), (1, '45.380')] +[2023-10-11 17:54:41,267][85176] Updated weights for policy 0, policy_version 70302 (0.0007) +[2023-10-11 17:54:42,348][85175] Updated weights for policy 1, policy_version 71330 (0.0008) +[2023-10-11 17:54:42,719][85175] Updated weights for policy 1, policy_version 71340 (0.0009) +[2023-10-11 17:54:43,086][85175] Updated weights for policy 1, policy_version 71350 (0.0009) +[2023-10-11 17:54:43,454][85175] Updated weights for policy 1, policy_version 71360 (0.0008) +[2023-10-11 17:54:45,403][85176] Updated weights for policy 0, policy_version 70312 (0.0010) +[2023-10-11 17:54:45,784][85176] Updated weights for policy 0, policy_version 70322 (0.0009) +[2023-10-11 17:54:46,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 145063936. Throughput: 0: 1667.0, 1: 1714.6. Samples: 36281224. Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) +[2023-10-11 17:54:46,063][84230] Avg episode reward: [(0, '45.120'), (1, '44.110')] +[2023-10-11 17:54:46,165][85176] Updated weights for policy 0, policy_version 70332 (0.0007) +[2023-10-11 17:54:47,482][85175] Updated weights for policy 1, policy_version 71370 (0.0008) +[2023-10-11 17:54:47,843][85175] Updated weights for policy 1, policy_version 71380 (0.0007) +[2023-10-11 17:54:48,206][85175] Updated weights for policy 1, policy_version 71390 (0.0008) +[2023-10-11 17:54:50,154][85176] Updated weights for policy 0, policy_version 70342 (0.0007) +[2023-10-11 17:54:50,525][85176] Updated weights for policy 0, policy_version 70352 (0.0009) +[2023-10-11 17:54:50,896][85176] Updated weights for policy 0, policy_version 70362 (0.0010) +[2023-10-11 17:54:51,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 145129472. Throughput: 0: 1682.8, 1: 1679.0. Samples: 36290928. 
Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) +[2023-10-11 17:54:51,063][84230] Avg episode reward: [(0, '44.620'), (1, '44.790')] +[2023-10-11 17:54:52,353][85175] Updated weights for policy 1, policy_version 71400 (0.0009) +[2023-10-11 17:54:52,725][85175] Updated weights for policy 1, policy_version 71410 (0.0011) +[2023-10-11 17:54:53,096][85175] Updated weights for policy 1, policy_version 71420 (0.0011) +[2023-10-11 17:54:54,917][85176] Updated weights for policy 0, policy_version 70372 (0.0009) +[2023-10-11 17:54:55,286][85176] Updated weights for policy 0, policy_version 70382 (0.0008) +[2023-10-11 17:54:55,662][85176] Updated weights for policy 0, policy_version 70392 (0.0011) +[2023-10-11 17:54:56,063][84230] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 145227776. Throughput: 0: 1683.9, 1: 1696.4. Samples: 36311930. Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) +[2023-10-11 17:54:56,064][84230] Avg episode reward: [(0, '41.370'), (1, '43.530')] +[2023-10-11 17:54:57,111][85175] Updated weights for policy 1, policy_version 71430 (0.0008) +[2023-10-11 17:54:57,473][85175] Updated weights for policy 1, policy_version 71440 (0.0007) +[2023-10-11 17:54:57,845][85175] Updated weights for policy 1, policy_version 71450 (0.0010) +[2023-10-11 17:54:59,693][85176] Updated weights for policy 0, policy_version 70402 (0.0010) +[2023-10-11 17:55:00,077][85176] Updated weights for policy 0, policy_version 70412 (0.0011) +[2023-10-11 17:55:00,444][85176] Updated weights for policy 0, policy_version 70422 (0.0008) +[2023-10-11 17:55:00,811][85176] Updated weights for policy 0, policy_version 70432 (0.0009) +[2023-10-11 17:55:01,062][84230] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 145293312. Throughput: 0: 1663.4, 1: 1713.4. Samples: 36331880. 
Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) +[2023-10-11 17:55:01,063][84230] Avg episode reward: [(0, '44.280'), (1, '43.610')] +[2023-10-11 17:55:01,928][85175] Updated weights for policy 1, policy_version 71460 (0.0007) +[2023-10-11 17:55:02,338][85175] Updated weights for policy 1, policy_version 71470 (0.0007) +[2023-10-11 17:55:02,713][85175] Updated weights for policy 1, policy_version 71480 (0.0008) +[2023-10-11 17:55:04,876][85176] Updated weights for policy 0, policy_version 70442 (0.0009) +[2023-10-11 17:55:05,245][85176] Updated weights for policy 0, policy_version 70452 (0.0010) +[2023-10-11 17:55:05,619][85176] Updated weights for policy 0, policy_version 70462 (0.0009) +[2023-10-11 17:55:06,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 145358848. Throughput: 0: 1694.7, 1: 1674.4. Samples: 36341960. Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) +[2023-10-11 17:55:06,063][84230] Avg episode reward: [(0, '43.720'), (1, '43.180')] +[2023-10-11 17:55:06,738][85175] Updated weights for policy 1, policy_version 71490 (0.0007) +[2023-10-11 17:55:07,105][85175] Updated weights for policy 1, policy_version 71500 (0.0009) +[2023-10-11 17:55:07,479][85175] Updated weights for policy 1, policy_version 71510 (0.0008) +[2023-10-11 17:55:07,841][85175] Updated weights for policy 1, policy_version 71520 (0.0007) +[2023-10-11 17:55:09,474][85176] Updated weights for policy 0, policy_version 70472 (0.0008) +[2023-10-11 17:55:09,840][85176] Updated weights for policy 0, policy_version 70482 (0.0008) +[2023-10-11 17:55:10,215][85176] Updated weights for policy 0, policy_version 70492 (0.0007) +[2023-10-11 17:55:11,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 145424384. Throughput: 0: 1687.0, 1: 1698.8. Samples: 36362460. 
Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) +[2023-10-11 17:55:11,063][84230] Avg episode reward: [(0, '46.940'), (1, '40.620')] +[2023-10-11 17:55:12,000][85175] Updated weights for policy 1, policy_version 71530 (0.0009) +[2023-10-11 17:55:12,369][85175] Updated weights for policy 1, policy_version 71540 (0.0007) +[2023-10-11 17:55:12,735][85175] Updated weights for policy 1, policy_version 71550 (0.0008) +[2023-10-11 17:55:14,087][85176] Updated weights for policy 0, policy_version 70502 (0.0009) +[2023-10-11 17:55:14,464][85176] Updated weights for policy 0, policy_version 70512 (0.0009) +[2023-10-11 17:55:14,837][85176] Updated weights for policy 0, policy_version 70522 (0.0008) +[2023-10-11 17:55:16,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 145489920. Throughput: 0: 1672.0, 1: 1698.6. Samples: 36382316. Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) +[2023-10-11 17:55:16,063][84230] Avg episode reward: [(0, '43.330'), (1, '42.560')] +[2023-10-11 17:55:16,703][85175] Updated weights for policy 1, policy_version 71560 (0.0008) +[2023-10-11 17:55:17,078][85175] Updated weights for policy 1, policy_version 71570 (0.0008) +[2023-10-11 17:55:17,452][85175] Updated weights for policy 1, policy_version 71580 (0.0008) +[2023-10-11 17:55:19,044][85176] Updated weights for policy 0, policy_version 70532 (0.0008) +[2023-10-11 17:55:19,415][85176] Updated weights for policy 0, policy_version 70542 (0.0010) +[2023-10-11 17:55:19,795][85176] Updated weights for policy 0, policy_version 70552 (0.0008) +[2023-10-11 17:55:21,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 145555456. Throughput: 0: 1699.8, 1: 1686.8. Samples: 36392726. 
Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) +[2023-10-11 17:55:21,064][84230] Avg episode reward: [(0, '43.240'), (1, '39.060')] +[2023-10-11 17:55:21,448][85175] Updated weights for policy 1, policy_version 71590 (0.0009) +[2023-10-11 17:55:21,819][85175] Updated weights for policy 1, policy_version 71600 (0.0007) +[2023-10-11 17:55:22,177][85175] Updated weights for policy 1, policy_version 71610 (0.0009) +[2023-10-11 17:55:23,954][85176] Updated weights for policy 0, policy_version 70562 (0.0010) +[2023-10-11 17:55:24,334][85176] Updated weights for policy 0, policy_version 70572 (0.0007) +[2023-10-11 17:55:24,707][85176] Updated weights for policy 0, policy_version 70582 (0.0007) +[2023-10-11 17:55:25,074][85176] Updated weights for policy 0, policy_version 70592 (0.0010) +[2023-10-11 17:55:26,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 145620992. Throughput: 0: 1677.8, 1: 1696.8. Samples: 36412480. Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) +[2023-10-11 17:55:26,063][84230] Avg episode reward: [(0, '40.950'), (1, '46.150')] +[2023-10-11 17:55:26,347][85175] Updated weights for policy 1, policy_version 71620 (0.0011) +[2023-10-11 17:55:26,706][85175] Updated weights for policy 1, policy_version 71630 (0.0011) +[2023-10-11 17:55:27,076][85175] Updated weights for policy 1, policy_version 71640 (0.0009) +[2023-10-11 17:55:29,123][85176] Updated weights for policy 0, policy_version 70602 (0.0010) +[2023-10-11 17:55:29,488][85176] Updated weights for policy 0, policy_version 70612 (0.0010) +[2023-10-11 17:55:29,861][85176] Updated weights for policy 0, policy_version 70622 (0.0010) +[2023-10-11 17:55:31,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 145686528. Throughput: 0: 1676.4, 1: 1686.8. Samples: 36432566. 
Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) +[2023-10-11 17:55:31,063][84230] Avg episode reward: [(0, '43.370'), (1, '44.130')] +[2023-10-11 17:55:31,213][85175] Updated weights for policy 1, policy_version 71650 (0.0008) +[2023-10-11 17:55:31,588][85175] Updated weights for policy 1, policy_version 71660 (0.0008) +[2023-10-11 17:55:31,965][85175] Updated weights for policy 1, policy_version 71670 (0.0007) +[2023-10-11 17:55:32,327][85175] Updated weights for policy 1, policy_version 71680 (0.0009) +[2023-10-11 17:55:33,750][85176] Updated weights for policy 0, policy_version 70632 (0.0009) +[2023-10-11 17:55:34,126][85176] Updated weights for policy 0, policy_version 70642 (0.0008) +[2023-10-11 17:55:34,503][85176] Updated weights for policy 0, policy_version 70652 (0.0008) +[2023-10-11 17:55:36,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 145752064. Throughput: 0: 1692.8, 1: 1684.8. Samples: 36442918. Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) +[2023-10-11 17:55:36,063][84230] Avg episode reward: [(0, '38.610'), (1, '47.360')] +[2023-10-11 17:55:36,454][85175] Updated weights for policy 1, policy_version 71690 (0.0007) +[2023-10-11 17:55:36,825][85175] Updated weights for policy 1, policy_version 71700 (0.0007) +[2023-10-11 17:55:37,185][85175] Updated weights for policy 1, policy_version 71710 (0.0007) +[2023-10-11 17:55:38,667][85176] Updated weights for policy 0, policy_version 70662 (0.0010) +[2023-10-11 17:55:39,032][85176] Updated weights for policy 0, policy_version 70672 (0.0009) +[2023-10-11 17:55:39,405][85176] Updated weights for policy 0, policy_version 70682 (0.0011) +[2023-10-11 17:55:41,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 145817600. Throughput: 0: 1663.4, 1: 1685.4. Samples: 36462626. 
Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) +[2023-10-11 17:55:41,064][84230] Avg episode reward: [(0, '45.090'), (1, '43.450')] +[2023-10-11 17:55:41,237][85175] Updated weights for policy 1, policy_version 71720 (0.0009) +[2023-10-11 17:55:41,609][85175] Updated weights for policy 1, policy_version 71730 (0.0009) +[2023-10-11 17:55:41,974][85175] Updated weights for policy 1, policy_version 71740 (0.0007) +[2023-10-11 17:55:43,648][85176] Updated weights for policy 0, policy_version 70692 (0.0009) +[2023-10-11 17:55:44,013][85176] Updated weights for policy 0, policy_version 70702 (0.0011) +[2023-10-11 17:55:44,392][85176] Updated weights for policy 0, policy_version 70712 (0.0009) +[2023-10-11 17:55:46,035][85175] Updated weights for policy 1, policy_version 71750 (0.0009) +[2023-10-11 17:55:46,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 145883136. Throughput: 0: 1683.1, 1: 1685.2. Samples: 36483452. Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) +[2023-10-11 17:55:46,063][84230] Avg episode reward: [(0, '40.760'), (1, '47.100')] +[2023-10-11 17:55:46,409][85175] Updated weights for policy 1, policy_version 71760 (0.0008) +[2023-10-11 17:55:46,770][85175] Updated weights for policy 1, policy_version 71770 (0.0009) +[2023-10-11 17:55:48,399][85176] Updated weights for policy 0, policy_version 70722 (0.0008) +[2023-10-11 17:55:48,776][85176] Updated weights for policy 0, policy_version 70732 (0.0009) +[2023-10-11 17:55:49,164][85176] Updated weights for policy 0, policy_version 70742 (0.0008) +[2023-10-11 17:55:49,525][85176] Updated weights for policy 0, policy_version 70752 (0.0007) +[2023-10-11 17:55:50,843][85175] Updated weights for policy 1, policy_version 71780 (0.0009) +[2023-10-11 17:55:51,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 145948672. Throughput: 0: 1681.8, 1: 1690.2. Samples: 36493702. 
Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) +[2023-10-11 17:55:51,064][84230] Avg episode reward: [(0, '46.330'), (1, '42.300')] +[2023-10-11 17:55:51,231][85175] Updated weights for policy 1, policy_version 71790 (0.0007) +[2023-10-11 17:55:51,603][85175] Updated weights for policy 1, policy_version 71800 (0.0010) +[2023-10-11 17:55:53,577][85176] Updated weights for policy 0, policy_version 70762 (0.0009) +[2023-10-11 17:55:53,953][85176] Updated weights for policy 0, policy_version 70772 (0.0008) +[2023-10-11 17:55:54,330][85176] Updated weights for policy 0, policy_version 70782 (0.0008) +[2023-10-11 17:55:55,389][85175] Updated weights for policy 1, policy_version 71810 (0.0010) +[2023-10-11 17:55:55,759][85175] Updated weights for policy 1, policy_version 71820 (0.0009) +[2023-10-11 17:55:56,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 146014208. Throughput: 0: 1664.8, 1: 1690.0. Samples: 36513428. Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) +[2023-10-11 17:55:56,063][84230] Avg episode reward: [(0, '41.860'), (1, '45.750')] +[2023-10-11 17:55:56,131][85175] Updated weights for policy 1, policy_version 71830 (0.0009) +[2023-10-11 17:55:56,497][85175] Updated weights for policy 1, policy_version 71840 (0.0007) +[2023-10-11 17:55:58,295][85176] Updated weights for policy 0, policy_version 70792 (0.0009) +[2023-10-11 17:55:58,670][85176] Updated weights for policy 0, policy_version 70802 (0.0008) +[2023-10-11 17:55:59,050][85176] Updated weights for policy 0, policy_version 70812 (0.0008) +[2023-10-11 17:56:00,713][85175] Updated weights for policy 1, policy_version 71850 (0.0007) +[2023-10-11 17:56:01,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 146079744. Throughput: 0: 1683.6, 1: 1684.5. Samples: 36533880. 
Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) +[2023-10-11 17:56:01,063][84230] Avg episode reward: [(0, '45.970'), (1, '43.860')] +[2023-10-11 17:56:01,091][85175] Updated weights for policy 1, policy_version 71860 (0.0008) +[2023-10-11 17:56:01,461][85175] Updated weights for policy 1, policy_version 71870 (0.0009) +[2023-10-11 17:56:03,071][85176] Updated weights for policy 0, policy_version 70822 (0.0008) +[2023-10-11 17:56:03,439][85176] Updated weights for policy 0, policy_version 70832 (0.0007) +[2023-10-11 17:56:03,822][85176] Updated weights for policy 0, policy_version 70842 (0.0008) +[2023-10-11 17:56:05,415][85175] Updated weights for policy 1, policy_version 71880 (0.0007) +[2023-10-11 17:56:05,795][85175] Updated weights for policy 1, policy_version 71890 (0.0008) +[2023-10-11 17:56:06,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 146145280. Throughput: 0: 1667.8, 1: 1689.7. Samples: 36543812. Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) +[2023-10-11 17:56:06,063][84230] Avg episode reward: [(0, '40.790'), (1, '44.650')] +[2023-10-11 17:56:06,171][85175] Updated weights for policy 1, policy_version 71900 (0.0009) +[2023-10-11 17:56:07,736][85176] Updated weights for policy 0, policy_version 70852 (0.0007) +[2023-10-11 17:56:08,116][85176] Updated weights for policy 0, policy_version 70862 (0.0009) +[2023-10-11 17:56:08,491][85176] Updated weights for policy 0, policy_version 70872 (0.0008) +[2023-10-11 17:56:10,164][85175] Updated weights for policy 1, policy_version 71910 (0.0009) +[2023-10-11 17:56:10,541][85175] Updated weights for policy 1, policy_version 71920 (0.0009) +[2023-10-11 17:56:10,916][85175] Updated weights for policy 1, policy_version 71930 (0.0010) +[2023-10-11 17:56:11,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 146210816. Throughput: 0: 1674.7, 1: 1694.0. Samples: 36564074. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-10-11 17:56:11,064][84230] Avg episode reward: [(0, '43.740'), (1, '44.050')] +[2023-10-11 17:56:12,609][85176] Updated weights for policy 0, policy_version 70882 (0.0008) +[2023-10-11 17:56:12,980][85176] Updated weights for policy 0, policy_version 70892 (0.0008) +[2023-10-11 17:56:13,361][85176] Updated weights for policy 0, policy_version 70902 (0.0007) +[2023-10-11 17:56:13,745][85176] Updated weights for policy 0, policy_version 70912 (0.0009) +[2023-10-11 17:56:14,948][85175] Updated weights for policy 1, policy_version 71940 (0.0008) +[2023-10-11 17:56:15,317][85175] Updated weights for policy 1, policy_version 71950 (0.0008) +[2023-10-11 17:56:15,694][85175] Updated weights for policy 1, policy_version 71960 (0.0010) +[2023-10-11 17:56:16,063][84230] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 146309120. Throughput: 0: 1685.9, 1: 1683.2. Samples: 36584174. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-10-11 17:56:16,063][84230] Avg episode reward: [(0, '44.050'), (1, '42.000')] +[2023-10-11 17:56:16,070][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000071968_73695232.pth... +[2023-10-11 17:56:16,070][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000070912_72613888.pth... 
+[2023-10-11 17:56:16,105][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000069344_71008256.pth +[2023-10-11 17:56:16,109][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000070368_72056832.pth +[2023-10-11 17:56:17,835][85176] Updated weights for policy 0, policy_version 70922 (0.0009) +[2023-10-11 17:56:18,216][85176] Updated weights for policy 0, policy_version 70932 (0.0009) +[2023-10-11 17:56:18,588][85176] Updated weights for policy 0, policy_version 70942 (0.0007) +[2023-10-11 17:56:19,639][85175] Updated weights for policy 1, policy_version 71970 (0.0008) +[2023-10-11 17:56:20,010][85175] Updated weights for policy 1, policy_version 71980 (0.0007) +[2023-10-11 17:56:20,373][85175] Updated weights for policy 1, policy_version 71990 (0.0009) +[2023-10-11 17:56:20,740][85175] Updated weights for policy 1, policy_version 72000 (0.0010) +[2023-10-11 17:56:21,062][84230] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 146374656. Throughput: 0: 1658.4, 1: 1702.7. Samples: 36594166. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-10-11 17:56:21,063][84230] Avg episode reward: [(0, '45.210'), (1, '43.180')] +[2023-10-11 17:56:22,710][85176] Updated weights for policy 0, policy_version 70952 (0.0010) +[2023-10-11 17:56:23,079][85176] Updated weights for policy 0, policy_version 70962 (0.0008) +[2023-10-11 17:56:23,448][85176] Updated weights for policy 0, policy_version 70972 (0.0008) +[2023-10-11 17:56:24,869][85175] Updated weights for policy 1, policy_version 72010 (0.0007) +[2023-10-11 17:56:25,234][85175] Updated weights for policy 1, policy_version 72020 (0.0009) +[2023-10-11 17:56:25,598][85175] Updated weights for policy 1, policy_version 72030 (0.0008) +[2023-10-11 17:56:26,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 146440192. Throughput: 0: 1682.7, 1: 1695.4. Samples: 36614638. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-10-11 17:56:26,063][84230] Avg episode reward: [(0, '45.310'), (1, '44.440')] +[2023-10-11 17:56:27,634][85176] Updated weights for policy 0, policy_version 70982 (0.0009) +[2023-10-11 17:56:28,007][85176] Updated weights for policy 0, policy_version 70992 (0.0010) +[2023-10-11 17:56:28,386][85176] Updated weights for policy 0, policy_version 71002 (0.0010) +[2023-10-11 17:56:29,532][85175] Updated weights for policy 1, policy_version 72040 (0.0011) +[2023-10-11 17:56:29,909][85175] Updated weights for policy 1, policy_version 72050 (0.0009) +[2023-10-11 17:56:30,273][85175] Updated weights for policy 1, policy_version 72060 (0.0011) +[2023-10-11 17:56:31,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 146505728. Throughput: 0: 1683.2, 1: 1666.6. Samples: 36634192. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-10-11 17:56:31,064][84230] Avg episode reward: [(0, '44.700'), (1, '45.120')] +[2023-10-11 17:56:32,375][85176] Updated weights for policy 0, policy_version 71012 (0.0010) +[2023-10-11 17:56:32,752][85176] Updated weights for policy 0, policy_version 71022 (0.0008) +[2023-10-11 17:56:33,126][85176] Updated weights for policy 0, policy_version 71032 (0.0010) +[2023-10-11 17:56:34,354][85175] Updated weights for policy 1, policy_version 72070 (0.0009) +[2023-10-11 17:56:34,726][85175] Updated weights for policy 1, policy_version 72080 (0.0008) +[2023-10-11 17:56:35,087][85175] Updated weights for policy 1, policy_version 72090 (0.0007) +[2023-10-11 17:56:36,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 146571264. Throughput: 0: 1658.2, 1: 1701.2. Samples: 36644874. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-10-11 17:56:36,064][84230] Avg episode reward: [(0, '44.000'), (1, '46.030')] +[2023-10-11 17:56:37,199][85176] Updated weights for policy 0, policy_version 71042 (0.0010) +[2023-10-11 17:56:37,576][85176] Updated weights for policy 0, policy_version 71052 (0.0008) +[2023-10-11 17:56:37,947][85176] Updated weights for policy 0, policy_version 71062 (0.0008) +[2023-10-11 17:56:38,331][85176] Updated weights for policy 0, policy_version 71072 (0.0007) +[2023-10-11 17:56:39,205][85175] Updated weights for policy 1, policy_version 72100 (0.0008) +[2023-10-11 17:56:39,611][85175] Updated weights for policy 1, policy_version 72110 (0.0008) +[2023-10-11 17:56:39,974][85175] Updated weights for policy 1, policy_version 72120 (0.0008) +[2023-10-11 17:56:41,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 146636800. Throughput: 0: 1683.2, 1: 1688.6. Samples: 36665162. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-10-11 17:56:41,063][84230] Avg episode reward: [(0, '44.140'), (1, '47.680')] +[2023-10-11 17:56:42,375][85176] Updated weights for policy 0, policy_version 71082 (0.0008) +[2023-10-11 17:56:42,741][85176] Updated weights for policy 0, policy_version 71092 (0.0007) +[2023-10-11 17:56:43,107][85176] Updated weights for policy 0, policy_version 71102 (0.0007) +[2023-10-11 17:56:43,738][85175] Updated weights for policy 1, policy_version 72130 (0.0011) +[2023-10-11 17:56:44,105][85175] Updated weights for policy 1, policy_version 72140 (0.0008) +[2023-10-11 17:56:44,478][85175] Updated weights for policy 1, policy_version 72150 (0.0010) +[2023-10-11 17:56:44,848][85175] Updated weights for policy 1, policy_version 72160 (0.0010) +[2023-10-11 17:56:46,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 146702336. Throughput: 0: 1679.6, 1: 1680.8. Samples: 36685096. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-10-11 17:56:46,063][84230] Avg episode reward: [(0, '43.970'), (1, '47.090')] +[2023-10-11 17:56:47,292][85176] Updated weights for policy 0, policy_version 71112 (0.0007) +[2023-10-11 17:56:47,669][85176] Updated weights for policy 0, policy_version 71122 (0.0008) +[2023-10-11 17:56:48,047][85176] Updated weights for policy 0, policy_version 71132 (0.0008) +[2023-10-11 17:56:48,915][85175] Updated weights for policy 1, policy_version 72170 (0.0008) +[2023-10-11 17:56:49,287][85175] Updated weights for policy 1, policy_version 72180 (0.0007) +[2023-10-11 17:56:49,648][85175] Updated weights for policy 1, policy_version 72190 (0.0007) +[2023-10-11 17:56:51,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 146767872. Throughput: 0: 1665.2, 1: 1702.4. Samples: 36695352. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) +[2023-10-11 17:56:51,063][84230] Avg episode reward: [(0, '44.240'), (1, '46.040')] +[2023-10-11 17:56:52,140][85176] Updated weights for policy 0, policy_version 71142 (0.0009) +[2023-10-11 17:56:52,506][85176] Updated weights for policy 0, policy_version 71152 (0.0008) +[2023-10-11 17:56:52,889][85176] Updated weights for policy 0, policy_version 71162 (0.0009) +[2023-10-11 17:56:53,594][85175] Updated weights for policy 1, policy_version 72200 (0.0009) +[2023-10-11 17:56:53,969][85175] Updated weights for policy 1, policy_version 72210 (0.0008) +[2023-10-11 17:56:54,329][85175] Updated weights for policy 1, policy_version 72220 (0.0009) +[2023-10-11 17:56:56,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 146833408. Throughput: 0: 1684.4, 1: 1679.4. Samples: 36715446. 
Policy #0 lag: (min: 25.0, avg: 29.9, max: 57.0) +[2023-10-11 17:56:56,063][84230] Avg episode reward: [(0, '43.460'), (1, '45.020')] +[2023-10-11 17:56:56,701][85176] Updated weights for policy 0, policy_version 71172 (0.0009) +[2023-10-11 17:56:57,070][85176] Updated weights for policy 0, policy_version 71182 (0.0007) +[2023-10-11 17:56:57,448][85176] Updated weights for policy 0, policy_version 71192 (0.0007) +[2023-10-11 17:56:58,406][85175] Updated weights for policy 1, policy_version 72230 (0.0008) +[2023-10-11 17:56:58,774][85175] Updated weights for policy 1, policy_version 72240 (0.0008) +[2023-10-11 17:56:59,147][85175] Updated weights for policy 1, policy_version 72250 (0.0008) +[2023-10-11 17:57:01,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 146898944. Throughput: 0: 1683.0, 1: 1695.8. Samples: 36736220. Policy #0 lag: (min: 25.0, avg: 29.9, max: 57.0) +[2023-10-11 17:57:01,063][84230] Avg episode reward: [(0, '41.040'), (1, '45.660')] +[2023-10-11 17:57:01,566][85176] Updated weights for policy 0, policy_version 71202 (0.0007) +[2023-10-11 17:57:01,940][85176] Updated weights for policy 0, policy_version 71212 (0.0007) +[2023-10-11 17:57:02,319][85176] Updated weights for policy 0, policy_version 71222 (0.0010) +[2023-10-11 17:57:02,690][85176] Updated weights for policy 0, policy_version 71232 (0.0008) +[2023-10-11 17:57:03,220][85175] Updated weights for policy 1, policy_version 72260 (0.0009) +[2023-10-11 17:57:03,582][85175] Updated weights for policy 1, policy_version 72270 (0.0007) +[2023-10-11 17:57:03,949][85175] Updated weights for policy 1, policy_version 72280 (0.0008) +[2023-10-11 17:57:06,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 146964480. Throughput: 0: 1679.6, 1: 1699.3. Samples: 36746216. 
Policy #0 lag: (min: 25.0, avg: 29.9, max: 57.0) +[2023-10-11 17:57:06,063][84230] Avg episode reward: [(0, '43.430'), (1, '45.990')] +[2023-10-11 17:57:06,785][85176] Updated weights for policy 0, policy_version 71242 (0.0010) +[2023-10-11 17:57:07,171][85176] Updated weights for policy 0, policy_version 71252 (0.0008) +[2023-10-11 17:57:07,548][85176] Updated weights for policy 0, policy_version 71262 (0.0008) +[2023-10-11 17:57:07,866][85175] Updated weights for policy 1, policy_version 72290 (0.0007) +[2023-10-11 17:57:08,236][85175] Updated weights for policy 1, policy_version 72300 (0.0007) +[2023-10-11 17:57:08,609][85175] Updated weights for policy 1, policy_version 72310 (0.0007) +[2023-10-11 17:57:08,983][85175] Updated weights for policy 1, policy_version 72320 (0.0008) +[2023-10-11 17:57:11,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 147030016. Throughput: 0: 1680.8, 1: 1689.6. Samples: 36766304. Policy #0 lag: (min: 25.0, avg: 29.9, max: 57.0) +[2023-10-11 17:57:11,064][84230] Avg episode reward: [(0, '43.090'), (1, '46.190')] +[2023-10-11 17:57:11,591][85176] Updated weights for policy 0, policy_version 71272 (0.0009) +[2023-10-11 17:57:11,959][85176] Updated weights for policy 0, policy_version 71282 (0.0007) +[2023-10-11 17:57:12,339][85176] Updated weights for policy 0, policy_version 71292 (0.0009) +[2023-10-11 17:57:12,976][85175] Updated weights for policy 1, policy_version 72330 (0.0008) +[2023-10-11 17:57:13,349][85175] Updated weights for policy 1, policy_version 72340 (0.0008) +[2023-10-11 17:57:13,711][85175] Updated weights for policy 1, policy_version 72350 (0.0007) +[2023-10-11 17:57:16,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 147095552. Throughput: 0: 1679.6, 1: 1723.5. Samples: 36787330. 
Policy #0 lag: (min: 25.0, avg: 29.9, max: 57.0) +[2023-10-11 17:57:16,063][84230] Avg episode reward: [(0, '43.790'), (1, '44.090')] +[2023-10-11 17:57:16,546][85176] Updated weights for policy 0, policy_version 71302 (0.0009) +[2023-10-11 17:57:16,917][85176] Updated weights for policy 0, policy_version 71312 (0.0011) +[2023-10-11 17:57:17,299][85176] Updated weights for policy 0, policy_version 71322 (0.0009) +[2023-10-11 17:57:17,734][85175] Updated weights for policy 1, policy_version 72360 (0.0009) +[2023-10-11 17:57:18,099][85175] Updated weights for policy 1, policy_version 72370 (0.0008) +[2023-10-11 17:57:18,466][85175] Updated weights for policy 1, policy_version 72380 (0.0007) +[2023-10-11 17:57:21,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 147161088. Throughput: 0: 1680.1, 1: 1692.1. Samples: 36796624. Policy #0 lag: (min: 25.0, avg: 29.9, max: 57.0) +[2023-10-11 17:57:21,064][84230] Avg episode reward: [(0, '42.720'), (1, '46.890')] +[2023-10-11 17:57:21,204][85176] Updated weights for policy 0, policy_version 71332 (0.0010) +[2023-10-11 17:57:21,575][85176] Updated weights for policy 0, policy_version 71342 (0.0009) +[2023-10-11 17:57:21,954][85176] Updated weights for policy 0, policy_version 71352 (0.0010) +[2023-10-11 17:57:22,512][85175] Updated weights for policy 1, policy_version 72390 (0.0009) +[2023-10-11 17:57:22,879][85175] Updated weights for policy 1, policy_version 72400 (0.0009) +[2023-10-11 17:57:23,246][85175] Updated weights for policy 1, policy_version 72410 (0.0009) +[2023-10-11 17:57:26,029][85176] Updated weights for policy 0, policy_version 71362 (0.0009) +[2023-10-11 17:57:26,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 147226624. Throughput: 0: 1678.9, 1: 1701.7. Samples: 36817292. 
Policy #0 lag: (min: 25.0, avg: 29.9, max: 57.0) +[2023-10-11 17:57:26,064][84230] Avg episode reward: [(0, '44.790'), (1, '43.870')] +[2023-10-11 17:57:26,403][85176] Updated weights for policy 0, policy_version 71372 (0.0007) +[2023-10-11 17:57:26,775][85176] Updated weights for policy 0, policy_version 71382 (0.0008) +[2023-10-11 17:57:27,145][85176] Updated weights for policy 0, policy_version 71392 (0.0010) +[2023-10-11 17:57:27,258][85175] Updated weights for policy 1, policy_version 72420 (0.0008) +[2023-10-11 17:57:27,649][85175] Updated weights for policy 1, policy_version 72430 (0.0009) +[2023-10-11 17:57:28,026][85175] Updated weights for policy 1, policy_version 72440 (0.0008) +[2023-10-11 17:57:31,063][84230] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 147292160. Throughput: 0: 1681.5, 1: 1714.5. Samples: 36837916. Policy #0 lag: (min: 25.0, avg: 29.9, max: 57.0) +[2023-10-11 17:57:31,063][84230] Avg episode reward: [(0, '44.900'), (1, '46.500')] +[2023-10-11 17:57:31,311][85176] Updated weights for policy 0, policy_version 71402 (0.0007) +[2023-10-11 17:57:31,693][85176] Updated weights for policy 0, policy_version 71412 (0.0009) +[2023-10-11 17:57:31,898][85175] Updated weights for policy 1, policy_version 72450 (0.0008) +[2023-10-11 17:57:32,066][85176] Updated weights for policy 0, policy_version 71422 (0.0008) +[2023-10-11 17:57:32,267][85175] Updated weights for policy 1, policy_version 72460 (0.0010) +[2023-10-11 17:57:32,648][85175] Updated weights for policy 1, policy_version 72470 (0.0010) +[2023-10-11 17:57:33,015][85175] Updated weights for policy 1, policy_version 72480 (0.0008) +[2023-10-11 17:57:36,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 147357696. Throughput: 0: 1680.0, 1: 1690.6. Samples: 36847026. 
Policy #0 lag: (min: 25.0, avg: 29.9, max: 57.0) +[2023-10-11 17:57:36,063][84230] Avg episode reward: [(0, '46.150'), (1, '43.850')] +[2023-10-11 17:57:36,171][85176] Updated weights for policy 0, policy_version 71432 (0.0010) +[2023-10-11 17:57:36,544][85176] Updated weights for policy 0, policy_version 71442 (0.0008) +[2023-10-11 17:57:36,928][85176] Updated weights for policy 0, policy_version 71452 (0.0008) +[2023-10-11 17:57:37,071][85175] Updated weights for policy 1, policy_version 72490 (0.0008) +[2023-10-11 17:57:37,439][85175] Updated weights for policy 1, policy_version 72500 (0.0009) +[2023-10-11 17:57:37,810][85175] Updated weights for policy 1, policy_version 72510 (0.0007) +[2023-10-11 17:57:40,867][85176] Updated weights for policy 0, policy_version 71462 (0.0009) +[2023-10-11 17:57:41,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 147423232. Throughput: 0: 1678.8, 1: 1706.9. Samples: 36867806. Policy #0 lag: (min: 25.0, avg: 29.9, max: 57.0) +[2023-10-11 17:57:41,063][84230] Avg episode reward: [(0, '45.090'), (1, '44.690')] +[2023-10-11 17:57:41,249][85176] Updated weights for policy 0, policy_version 71472 (0.0010) +[2023-10-11 17:57:41,628][85176] Updated weights for policy 0, policy_version 71482 (0.0007) +[2023-10-11 17:57:42,001][85175] Updated weights for policy 1, policy_version 72520 (0.0008) +[2023-10-11 17:57:42,374][85175] Updated weights for policy 1, policy_version 72530 (0.0007) +[2023-10-11 17:57:42,744][85175] Updated weights for policy 1, policy_version 72540 (0.0009) +[2023-10-11 17:57:45,754][85176] Updated weights for policy 0, policy_version 71492 (0.0007) +[2023-10-11 17:57:46,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 147488768. Throughput: 0: 1677.1, 1: 1704.2. Samples: 36888380. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 17:57:46,064][84230] Avg episode reward: [(0, '45.250'), (1, '43.630')] +[2023-10-11 17:57:46,121][85176] Updated weights for policy 0, policy_version 71502 (0.0009) +[2023-10-11 17:57:46,488][85176] Updated weights for policy 0, policy_version 71512 (0.0008) +[2023-10-11 17:57:46,745][85175] Updated weights for policy 1, policy_version 72550 (0.0008) +[2023-10-11 17:57:47,116][85175] Updated weights for policy 1, policy_version 72560 (0.0009) +[2023-10-11 17:57:47,481][85175] Updated weights for policy 1, policy_version 72570 (0.0009) +[2023-10-11 17:57:50,681][85176] Updated weights for policy 0, policy_version 71522 (0.0009) +[2023-10-11 17:57:51,050][85176] Updated weights for policy 0, policy_version 71532 (0.0008) +[2023-10-11 17:57:51,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 147554304. Throughput: 0: 1675.0, 1: 1682.3. Samples: 36897292. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 17:57:51,063][84230] Avg episode reward: [(0, '43.670'), (1, '43.830')] +[2023-10-11 17:57:51,418][85176] Updated weights for policy 0, policy_version 71542 (0.0008) +[2023-10-11 17:57:51,546][85175] Updated weights for policy 1, policy_version 72580 (0.0009) +[2023-10-11 17:57:51,793][85176] Updated weights for policy 0, policy_version 71552 (0.0008) +[2023-10-11 17:57:51,918][85175] Updated weights for policy 1, policy_version 72590 (0.0007) +[2023-10-11 17:57:52,285][85175] Updated weights for policy 1, policy_version 72600 (0.0008) +[2023-10-11 17:57:55,874][85176] Updated weights for policy 0, policy_version 71562 (0.0010) +[2023-10-11 17:57:56,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 147619840. Throughput: 0: 1673.7, 1: 1699.6. Samples: 36918102. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 17:57:56,063][84230] Avg episode reward: [(0, '43.500'), (1, '43.290')] +[2023-10-11 17:57:56,188][85175] Updated weights for policy 1, policy_version 72610 (0.0007) +[2023-10-11 17:57:56,247][85176] Updated weights for policy 0, policy_version 71572 (0.0009) +[2023-10-11 17:57:56,561][85175] Updated weights for policy 1, policy_version 72620 (0.0008) +[2023-10-11 17:57:56,614][85176] Updated weights for policy 0, policy_version 71582 (0.0008) +[2023-10-11 17:57:56,915][85175] Updated weights for policy 1, policy_version 72630 (0.0010) +[2023-10-11 17:57:57,287][85175] Updated weights for policy 1, policy_version 72640 (0.0008) +[2023-10-11 17:58:00,694][85176] Updated weights for policy 0, policy_version 71592 (0.0011) +[2023-10-11 17:58:01,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 147685376. Throughput: 0: 1666.2, 1: 1696.1. Samples: 36938636. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 17:58:01,063][84230] Avg episode reward: [(0, '42.270'), (1, '43.200')] +[2023-10-11 17:58:01,070][85176] Updated weights for policy 0, policy_version 71602 (0.0008) +[2023-10-11 17:58:01,184][85175] Updated weights for policy 1, policy_version 72650 (0.0009) +[2023-10-11 17:58:01,445][85176] Updated weights for policy 0, policy_version 71612 (0.0009) +[2023-10-11 17:58:01,556][85175] Updated weights for policy 1, policy_version 72660 (0.0007) +[2023-10-11 17:58:01,927][85175] Updated weights for policy 1, policy_version 72670 (0.0007) +[2023-10-11 17:58:05,456][85176] Updated weights for policy 0, policy_version 71622 (0.0007) +[2023-10-11 17:58:05,782][85175] Updated weights for policy 1, policy_version 72680 (0.0009) +[2023-10-11 17:58:05,828][85176] Updated weights for policy 0, policy_version 71632 (0.0007) +[2023-10-11 17:58:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 147750912. 
Throughput: 0: 1671.0, 1: 1696.2. Samples: 36948146. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 17:58:06,063][84230] Avg episode reward: [(0, '44.120'), (1, '44.790')] +[2023-10-11 17:58:06,153][85175] Updated weights for policy 1, policy_version 72690 (0.0007) +[2023-10-11 17:58:06,205][85176] Updated weights for policy 0, policy_version 71642 (0.0009) +[2023-10-11 17:58:06,515][85175] Updated weights for policy 1, policy_version 72700 (0.0007) +[2023-10-11 17:58:10,349][85176] Updated weights for policy 0, policy_version 71652 (0.0008) +[2023-10-11 17:58:10,681][85175] Updated weights for policy 1, policy_version 72710 (0.0008) +[2023-10-11 17:58:10,733][85176] Updated weights for policy 0, policy_version 71662 (0.0008) +[2023-10-11 17:58:11,053][85175] Updated weights for policy 1, policy_version 72720 (0.0008) +[2023-10-11 17:58:11,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 147816448. Throughput: 0: 1667.6, 1: 1698.8. Samples: 36968776. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 17:58:11,063][84230] Avg episode reward: [(0, '42.440'), (1, '43.250')] +[2023-10-11 17:58:11,097][85176] Updated weights for policy 0, policy_version 71672 (0.0008) +[2023-10-11 17:58:11,417][85175] Updated weights for policy 1, policy_version 72730 (0.0008) +[2023-10-11 17:58:15,260][85176] Updated weights for policy 0, policy_version 71682 (0.0008) +[2023-10-11 17:58:15,315][85175] Updated weights for policy 1, policy_version 72740 (0.0008) +[2023-10-11 17:58:15,653][85176] Updated weights for policy 0, policy_version 71692 (0.0008) +[2023-10-11 17:58:15,672][85175] Updated weights for policy 1, policy_version 72750 (0.0007) +[2023-10-11 17:58:16,027][85176] Updated weights for policy 0, policy_version 71702 (0.0007) +[2023-10-11 17:58:16,040][85175] Updated weights for policy 1, policy_version 72760 (0.0007) +[2023-10-11 17:58:16,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 147881984. Throughput: 0: 1653.5, 1: 1695.9. Samples: 36988640. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 17:58:16,063][84230] Avg episode reward: [(0, '45.270'), (1, '45.420')] +[2023-10-11 17:58:16,327][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000072768_74514432.pth... +[2023-10-11 17:58:16,358][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000071168_72876032.pth +[2023-10-11 17:58:16,400][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000071712_73433088.pth... 
+[2023-10-11 17:58:16,400][85176] Updated weights for policy 0, policy_version 71712 (0.0007) +[2023-10-11 17:58:16,429][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000070144_71827456.pth +[2023-10-11 17:58:20,246][85175] Updated weights for policy 1, policy_version 72770 (0.0007) +[2023-10-11 17:58:20,541][85176] Updated weights for policy 0, policy_version 71722 (0.0008) +[2023-10-11 17:58:20,608][85175] Updated weights for policy 1, policy_version 72780 (0.0008) +[2023-10-11 17:58:20,913][85176] Updated weights for policy 0, policy_version 71732 (0.0008) +[2023-10-11 17:58:20,979][85175] Updated weights for policy 1, policy_version 72790 (0.0009) +[2023-10-11 17:58:21,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 147947520. Throughput: 0: 1664.1, 1: 1700.4. Samples: 36998426. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 17:58:21,063][84230] Avg episode reward: [(0, '45.510'), (1, '40.880')] +[2023-10-11 17:58:21,284][85176] Updated weights for policy 0, policy_version 71742 (0.0007) +[2023-10-11 17:58:21,339][85175] Updated weights for policy 1, policy_version 72800 (0.0007) +[2023-10-11 17:58:25,370][85176] Updated weights for policy 0, policy_version 71752 (0.0007) +[2023-10-11 17:58:25,538][85175] Updated weights for policy 1, policy_version 72810 (0.0008) +[2023-10-11 17:58:25,734][85176] Updated weights for policy 0, policy_version 71762 (0.0007) +[2023-10-11 17:58:25,905][85175] Updated weights for policy 1, policy_version 72820 (0.0007) +[2023-10-11 17:58:26,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 148013056. Throughput: 0: 1658.0, 1: 1702.4. Samples: 37019024. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 17:58:26,063][84230] Avg episode reward: [(0, '43.750'), (1, '42.110')] +[2023-10-11 17:58:26,117][85176] Updated weights for policy 0, policy_version 71772 (0.0008) +[2023-10-11 17:58:26,269][85175] Updated weights for policy 1, policy_version 72830 (0.0007) +[2023-10-11 17:58:30,233][85176] Updated weights for policy 0, policy_version 71782 (0.0007) +[2023-10-11 17:58:30,392][85175] Updated weights for policy 1, policy_version 72840 (0.0009) +[2023-10-11 17:58:30,604][85176] Updated weights for policy 0, policy_version 71792 (0.0007) +[2023-10-11 17:58:30,768][85175] Updated weights for policy 1, policy_version 72850 (0.0007) +[2023-10-11 17:58:30,969][85176] Updated weights for policy 0, policy_version 71802 (0.0008) +[2023-10-11 17:58:31,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 148078592. Throughput: 0: 1653.7, 1: 1692.2. Samples: 37038946. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 17:58:31,063][84230] Avg episode reward: [(0, '43.040'), (1, '39.600')] +[2023-10-11 17:58:31,130][85175] Updated weights for policy 1, policy_version 72860 (0.0008) +[2023-10-11 17:58:35,193][85175] Updated weights for policy 1, policy_version 72870 (0.0008) +[2023-10-11 17:58:35,222][85176] Updated weights for policy 0, policy_version 71812 (0.0007) +[2023-10-11 17:58:35,568][85175] Updated weights for policy 1, policy_version 72880 (0.0009) +[2023-10-11 17:58:35,600][85176] Updated weights for policy 0, policy_version 71822 (0.0007) +[2023-10-11 17:58:35,929][85175] Updated weights for policy 1, policy_version 72890 (0.0009) +[2023-10-11 17:58:35,975][85176] Updated weights for policy 0, policy_version 71832 (0.0009) +[2023-10-11 17:58:36,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 148144128. Throughput: 0: 1669.1, 1: 1705.9. Samples: 37049164. 
Policy #0 lag: (min: 29.0, avg: 29.1, max: 34.0) +[2023-10-11 17:58:36,063][84230] Avg episode reward: [(0, '42.540'), (1, '43.210')] +[2023-10-11 17:58:39,952][85176] Updated weights for policy 0, policy_version 71842 (0.0009) +[2023-10-11 17:58:40,047][85175] Updated weights for policy 1, policy_version 72900 (0.0008) +[2023-10-11 17:58:40,317][85176] Updated weights for policy 0, policy_version 71852 (0.0010) +[2023-10-11 17:58:40,413][85175] Updated weights for policy 1, policy_version 72910 (0.0010) +[2023-10-11 17:58:40,701][85176] Updated weights for policy 0, policy_version 71862 (0.0007) +[2023-10-11 17:58:40,778][85175] Updated weights for policy 1, policy_version 72920 (0.0008) +[2023-10-11 17:58:41,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 148209664. Throughput: 0: 1669.7, 1: 1700.1. Samples: 37069746. Policy #0 lag: (min: 29.0, avg: 29.1, max: 34.0) +[2023-10-11 17:58:41,064][84230] Avg episode reward: [(0, '43.550'), (1, '43.610')] +[2023-10-11 17:58:41,064][85176] Updated weights for policy 0, policy_version 71872 (0.0007) +[2023-10-11 17:58:44,812][85175] Updated weights for policy 1, policy_version 72930 (0.0008) +[2023-10-11 17:58:45,109][85176] Updated weights for policy 0, policy_version 71882 (0.0008) +[2023-10-11 17:58:45,179][85175] Updated weights for policy 1, policy_version 72940 (0.0009) +[2023-10-11 17:58:45,485][85176] Updated weights for policy 0, policy_version 71892 (0.0007) +[2023-10-11 17:58:45,542][85175] Updated weights for policy 1, policy_version 72950 (0.0007) +[2023-10-11 17:58:45,852][85176] Updated weights for policy 0, policy_version 71902 (0.0007) +[2023-10-11 17:58:45,909][85175] Updated weights for policy 1, policy_version 72960 (0.0008) +[2023-10-11 17:58:46,063][84230] Fps is (10 sec: 19660.4, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 148340736. Throughput: 0: 1660.1, 1: 1677.3. Samples: 37088820. 
Policy #0 lag: (min: 29.0, avg: 29.1, max: 34.0) +[2023-10-11 17:58:46,064][84230] Avg episode reward: [(0, '44.440'), (1, '45.520')] +[2023-10-11 17:58:49,935][85176] Updated weights for policy 0, policy_version 71912 (0.0008) +[2023-10-11 17:58:49,972][85175] Updated weights for policy 1, policy_version 72970 (0.0008) +[2023-10-11 17:58:50,300][85176] Updated weights for policy 0, policy_version 71922 (0.0008) +[2023-10-11 17:58:50,335][85175] Updated weights for policy 1, policy_version 72980 (0.0008) +[2023-10-11 17:58:50,667][85176] Updated weights for policy 0, policy_version 71932 (0.0007) +[2023-10-11 17:58:50,710][85175] Updated weights for policy 1, policy_version 72990 (0.0008) +[2023-10-11 17:58:51,062][84230] Fps is (10 sec: 19661.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 148406272. Throughput: 0: 1671.4, 1: 1691.6. Samples: 37099480. Policy #0 lag: (min: 29.0, avg: 29.1, max: 34.0) +[2023-10-11 17:58:51,063][84230] Avg episode reward: [(0, '45.600'), (1, '43.880')] +[2023-10-11 17:58:54,688][85175] Updated weights for policy 1, policy_version 73000 (0.0007) +[2023-10-11 17:58:54,806][85176] Updated weights for policy 0, policy_version 71942 (0.0008) +[2023-10-11 17:58:55,046][85175] Updated weights for policy 1, policy_version 73010 (0.0008) +[2023-10-11 17:58:55,178][85176] Updated weights for policy 0, policy_version 71952 (0.0011) +[2023-10-11 17:58:55,407][85175] Updated weights for policy 1, policy_version 73020 (0.0008) +[2023-10-11 17:58:55,562][85176] Updated weights for policy 0, policy_version 71962 (0.0010) +[2023-10-11 17:58:56,063][84230] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 148471808. Throughput: 0: 1672.6, 1: 1691.6. Samples: 37120164. 
Policy #0 lag: (min: 29.0, avg: 29.1, max: 34.0) +[2023-10-11 17:58:56,064][84230] Avg episode reward: [(0, '45.010'), (1, '43.880')] +[2023-10-11 17:58:59,474][85175] Updated weights for policy 1, policy_version 73030 (0.0008) +[2023-10-11 17:58:59,554][85176] Updated weights for policy 0, policy_version 71972 (0.0007) +[2023-10-11 17:58:59,839][85175] Updated weights for policy 1, policy_version 73040 (0.0007) +[2023-10-11 17:58:59,923][85176] Updated weights for policy 0, policy_version 71982 (0.0007) +[2023-10-11 17:59:00,201][85175] Updated weights for policy 1, policy_version 73050 (0.0009) +[2023-10-11 17:59:00,298][85176] Updated weights for policy 0, policy_version 71992 (0.0007) +[2023-10-11 17:59:01,063][84230] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 148537344. Throughput: 0: 1659.8, 1: 1669.6. Samples: 37138462. Policy #0 lag: (min: 29.0, avg: 29.1, max: 34.0) +[2023-10-11 17:59:01,063][84230] Avg episode reward: [(0, '43.930'), (1, '45.160')] +[2023-10-11 17:59:04,276][85175] Updated weights for policy 1, policy_version 73060 (0.0007) +[2023-10-11 17:59:04,345][85176] Updated weights for policy 0, policy_version 72002 (0.0007) +[2023-10-11 17:59:04,662][85175] Updated weights for policy 1, policy_version 73070 (0.0008) +[2023-10-11 17:59:04,729][85176] Updated weights for policy 0, policy_version 72012 (0.0007) +[2023-10-11 17:59:05,028][85175] Updated weights for policy 1, policy_version 73080 (0.0008) +[2023-10-11 17:59:05,098][85176] Updated weights for policy 0, policy_version 72022 (0.0007) +[2023-10-11 17:59:05,471][85176] Updated weights for policy 0, policy_version 72032 (0.0007) +[2023-10-11 17:59:06,063][84230] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 148602880. Throughput: 0: 1681.9, 1: 1693.2. Samples: 37150308. 
Policy #0 lag: (min: 29.0, avg: 29.1, max: 34.0) +[2023-10-11 17:59:06,064][84230] Avg episode reward: [(0, '41.820'), (1, '41.730')] +[2023-10-11 17:59:09,003][85175] Updated weights for policy 1, policy_version 73090 (0.0008) +[2023-10-11 17:59:09,372][85175] Updated weights for policy 1, policy_version 73100 (0.0008) +[2023-10-11 17:59:09,444][85176] Updated weights for policy 0, policy_version 72042 (0.0007) +[2023-10-11 17:59:09,737][85175] Updated weights for policy 1, policy_version 73110 (0.0008) +[2023-10-11 17:59:09,805][85176] Updated weights for policy 0, policy_version 72052 (0.0009) +[2023-10-11 17:59:10,107][85175] Updated weights for policy 1, policy_version 73120 (0.0007) +[2023-10-11 17:59:10,182][85176] Updated weights for policy 0, policy_version 72062 (0.0009) +[2023-10-11 17:59:11,063][84230] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 148668416. Throughput: 0: 1669.6, 1: 1680.4. Samples: 37169776. Policy #0 lag: (min: 29.0, avg: 29.1, max: 34.0) +[2023-10-11 17:59:11,064][84230] Avg episode reward: [(0, '41.510'), (1, '45.180')] +[2023-10-11 17:59:14,218][85175] Updated weights for policy 1, policy_version 73130 (0.0008) +[2023-10-11 17:59:14,269][85176] Updated weights for policy 0, policy_version 72072 (0.0008) +[2023-10-11 17:59:14,581][85175] Updated weights for policy 1, policy_version 73140 (0.0009) +[2023-10-11 17:59:14,641][85176] Updated weights for policy 0, policy_version 72082 (0.0009) +[2023-10-11 17:59:14,945][85175] Updated weights for policy 1, policy_version 73150 (0.0009) +[2023-10-11 17:59:15,008][85176] Updated weights for policy 0, policy_version 72092 (0.0008) +[2023-10-11 17:59:16,063][84230] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 148733952. Throughput: 0: 1661.0, 1: 1676.6. Samples: 37189140. 
Policy #0 lag: (min: 29.0, avg: 29.1, max: 34.0) +[2023-10-11 17:59:16,063][84230] Avg episode reward: [(0, '42.840'), (1, '42.620')] +[2023-10-11 17:59:18,843][85175] Updated weights for policy 1, policy_version 73160 (0.0008) +[2023-10-11 17:59:19,139][85176] Updated weights for policy 0, policy_version 72102 (0.0010) +[2023-10-11 17:59:19,202][85175] Updated weights for policy 1, policy_version 73170 (0.0008) +[2023-10-11 17:59:19,513][85176] Updated weights for policy 0, policy_version 72112 (0.0009) +[2023-10-11 17:59:19,574][85175] Updated weights for policy 1, policy_version 73180 (0.0008) +[2023-10-11 17:59:19,889][85176] Updated weights for policy 0, policy_version 72122 (0.0008) +[2023-10-11 17:59:21,063][84230] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 148799488. Throughput: 0: 1678.5, 1: 1693.0. Samples: 37200882. Policy #0 lag: (min: 31.0, avg: 31.1, max: 37.0) +[2023-10-11 17:59:21,063][84230] Avg episode reward: [(0, '42.220'), (1, '44.980')] +[2023-10-11 17:59:23,600][85175] Updated weights for policy 1, policy_version 73190 (0.0008) +[2023-10-11 17:59:23,962][85176] Updated weights for policy 0, policy_version 72132 (0.0009) +[2023-10-11 17:59:23,975][85175] Updated weights for policy 1, policy_version 73200 (0.0009) +[2023-10-11 17:59:24,327][85176] Updated weights for policy 0, policy_version 72142 (0.0008) +[2023-10-11 17:59:24,334][85175] Updated weights for policy 1, policy_version 73210 (0.0009) +[2023-10-11 17:59:24,705][85176] Updated weights for policy 0, policy_version 72152 (0.0007) +[2023-10-11 17:59:26,062][84230] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 148865024. Throughput: 0: 1664.5, 1: 1670.1. Samples: 37219802. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 37.0) +[2023-10-11 17:59:26,063][84230] Avg episode reward: [(0, '42.390'), (1, '42.640')] +[2023-10-11 17:59:28,547][85175] Updated weights for policy 1, policy_version 73220 (0.0009) +[2023-10-11 17:59:28,902][85176] Updated weights for policy 0, policy_version 72162 (0.0010) +[2023-10-11 17:59:28,910][85175] Updated weights for policy 1, policy_version 73230 (0.0008) +[2023-10-11 17:59:29,270][85176] Updated weights for policy 0, policy_version 72172 (0.0007) +[2023-10-11 17:59:29,278][85175] Updated weights for policy 1, policy_version 73240 (0.0008) +[2023-10-11 17:59:29,642][85176] Updated weights for policy 0, policy_version 72182 (0.0009) +[2023-10-11 17:59:30,017][85176] Updated weights for policy 0, policy_version 72192 (0.0010) +[2023-10-11 17:59:31,062][84230] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 148930560. Throughput: 0: 1669.6, 1: 1684.5. Samples: 37239752. Policy #0 lag: (min: 31.0, avg: 31.1, max: 37.0) +[2023-10-11 17:59:31,063][84230] Avg episode reward: [(0, '40.630'), (1, '45.810')] +[2023-10-11 17:59:33,265][85175] Updated weights for policy 1, policy_version 73250 (0.0008) +[2023-10-11 17:59:33,623][85175] Updated weights for policy 1, policy_version 73260 (0.0009) +[2023-10-11 17:59:33,812][85176] Updated weights for policy 0, policy_version 72202 (0.0007) +[2023-10-11 17:59:33,991][85175] Updated weights for policy 1, policy_version 73270 (0.0008) +[2023-10-11 17:59:34,182][85176] Updated weights for policy 0, policy_version 72212 (0.0008) +[2023-10-11 17:59:34,358][85175] Updated weights for policy 1, policy_version 73280 (0.0009) +[2023-10-11 17:59:34,558][85176] Updated weights for policy 0, policy_version 72222 (0.0009) +[2023-10-11 17:59:36,062][84230] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 148996096. Throughput: 0: 1682.2, 1: 1687.0. Samples: 37251092. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 37.0) +[2023-10-11 17:59:36,063][84230] Avg episode reward: [(0, '41.230'), (1, '42.740')] +[2023-10-11 17:59:38,295][85175] Updated weights for policy 1, policy_version 73290 (0.0008) +[2023-10-11 17:59:38,553][85176] Updated weights for policy 0, policy_version 72232 (0.0009) +[2023-10-11 17:59:38,659][85175] Updated weights for policy 1, policy_version 73300 (0.0009) +[2023-10-11 17:59:38,925][85176] Updated weights for policy 0, policy_version 72242 (0.0008) +[2023-10-11 17:59:39,029][85175] Updated weights for policy 1, policy_version 73310 (0.0009) +[2023-10-11 17:59:39,288][85176] Updated weights for policy 0, policy_version 72252 (0.0009) +[2023-10-11 17:59:41,062][84230] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 149061632. Throughput: 0: 1657.1, 1: 1671.8. Samples: 37269962. Policy #0 lag: (min: 31.0, avg: 31.1, max: 37.0) +[2023-10-11 17:59:41,063][84230] Avg episode reward: [(0, '44.530'), (1, '46.580')] +[2023-10-11 17:59:43,110][85175] Updated weights for policy 1, policy_version 73320 (0.0010) +[2023-10-11 17:59:43,473][85175] Updated weights for policy 1, policy_version 73330 (0.0009) +[2023-10-11 17:59:43,541][85176] Updated weights for policy 0, policy_version 72262 (0.0008) +[2023-10-11 17:59:43,841][85175] Updated weights for policy 1, policy_version 73340 (0.0009) +[2023-10-11 17:59:43,910][85176] Updated weights for policy 0, policy_version 72272 (0.0007) +[2023-10-11 17:59:44,288][85176] Updated weights for policy 0, policy_version 72282 (0.0008) +[2023-10-11 17:59:46,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 149127168. Throughput: 0: 1687.8, 1: 1696.0. Samples: 37290732. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 37.0) +[2023-10-11 17:59:46,064][84230] Avg episode reward: [(0, '41.470'), (1, '42.550')] +[2023-10-11 17:59:47,915][85175] Updated weights for policy 1, policy_version 73350 (0.0008) +[2023-10-11 17:59:48,291][85175] Updated weights for policy 1, policy_version 73360 (0.0008) +[2023-10-11 17:59:48,365][85176] Updated weights for policy 0, policy_version 72292 (0.0009) +[2023-10-11 17:59:48,662][85175] Updated weights for policy 1, policy_version 73370 (0.0008) +[2023-10-11 17:59:48,737][85176] Updated weights for policy 0, policy_version 72302 (0.0007) +[2023-10-11 17:59:49,116][85176] Updated weights for policy 0, policy_version 72312 (0.0008) +[2023-10-11 17:59:51,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 149192704. Throughput: 0: 1678.4, 1: 1672.8. Samples: 37301114. Policy #0 lag: (min: 31.0, avg: 31.1, max: 37.0) +[2023-10-11 17:59:51,063][84230] Avg episode reward: [(0, '42.940'), (1, '49.840')] +[2023-10-11 17:59:51,064][85000] Saving new best policy, reward=49.840! +[2023-10-11 17:59:52,726][85175] Updated weights for policy 1, policy_version 73380 (0.0008) +[2023-10-11 17:59:53,091][85175] Updated weights for policy 1, policy_version 73390 (0.0008) +[2023-10-11 17:59:53,136][85176] Updated weights for policy 0, policy_version 72322 (0.0009) +[2023-10-11 17:59:53,465][85175] Updated weights for policy 1, policy_version 73400 (0.0008) +[2023-10-11 17:59:53,505][85176] Updated weights for policy 0, policy_version 72332 (0.0008) +[2023-10-11 17:59:53,885][85176] Updated weights for policy 0, policy_version 72342 (0.0008) +[2023-10-11 17:59:54,250][85176] Updated weights for policy 0, policy_version 72352 (0.0008) +[2023-10-11 17:59:56,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 149258240. Throughput: 0: 1670.3, 1: 1684.0. Samples: 37320718. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 37.0) +[2023-10-11 17:59:56,064][84230] Avg episode reward: [(0, '39.020'), (1, '42.590')] +[2023-10-11 17:59:57,404][85175] Updated weights for policy 1, policy_version 73410 (0.0007) +[2023-10-11 17:59:57,813][85175] Updated weights for policy 1, policy_version 73420 (0.0008) +[2023-10-11 17:59:58,159][85176] Updated weights for policy 0, policy_version 72362 (0.0011) +[2023-10-11 17:59:58,178][85175] Updated weights for policy 1, policy_version 73430 (0.0008) +[2023-10-11 17:59:58,542][85176] Updated weights for policy 0, policy_version 72372 (0.0008) +[2023-10-11 17:59:58,554][85175] Updated weights for policy 1, policy_version 73440 (0.0007) +[2023-10-11 17:59:58,920][85176] Updated weights for policy 0, policy_version 72382 (0.0010) +[2023-10-11 18:00:01,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 149323776. Throughput: 0: 1684.9, 1: 1699.0. Samples: 37341416. Policy #0 lag: (min: 31.0, avg: 31.1, max: 37.0) +[2023-10-11 18:00:01,064][84230] Avg episode reward: [(0, '43.450'), (1, '46.300')] +[2023-10-11 18:00:02,546][85175] Updated weights for policy 1, policy_version 73450 (0.0009) +[2023-10-11 18:00:02,913][85175] Updated weights for policy 1, policy_version 73460 (0.0010) +[2023-10-11 18:00:03,104][85176] Updated weights for policy 0, policy_version 72392 (0.0008) +[2023-10-11 18:00:03,276][85175] Updated weights for policy 1, policy_version 73470 (0.0007) +[2023-10-11 18:00:03,485][85176] Updated weights for policy 0, policy_version 72402 (0.0008) +[2023-10-11 18:00:03,860][85176] Updated weights for policy 0, policy_version 72412 (0.0008) +[2023-10-11 18:00:06,062][84230] Fps is (10 sec: 13107.7, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 149389312. Throughput: 0: 1666.0, 1: 1671.9. Samples: 37351086. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 37.0) +[2023-10-11 18:00:06,063][84230] Avg episode reward: [(0, '41.830'), (1, '40.740')] +[2023-10-11 18:00:07,426][85175] Updated weights for policy 1, policy_version 73480 (0.0008) +[2023-10-11 18:00:07,792][85175] Updated weights for policy 1, policy_version 73490 (0.0009) +[2023-10-11 18:00:08,075][85176] Updated weights for policy 0, policy_version 72422 (0.0008) +[2023-10-11 18:00:08,155][85175] Updated weights for policy 1, policy_version 73500 (0.0008) +[2023-10-11 18:00:08,442][85176] Updated weights for policy 0, policy_version 72432 (0.0008) +[2023-10-11 18:00:08,813][85176] Updated weights for policy 0, policy_version 72442 (0.0007) +[2023-10-11 18:00:11,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 149454848. Throughput: 0: 1670.6, 1: 1697.8. Samples: 37371380. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:00:11,063][84230] Avg episode reward: [(0, '43.330'), (1, '46.090')] +[2023-10-11 18:00:12,088][85175] Updated weights for policy 1, policy_version 73510 (0.0009) +[2023-10-11 18:00:12,442][85175] Updated weights for policy 1, policy_version 73520 (0.0009) +[2023-10-11 18:00:12,817][85175] Updated weights for policy 1, policy_version 73530 (0.0010) +[2023-10-11 18:00:13,025][85176] Updated weights for policy 0, policy_version 72452 (0.0008) +[2023-10-11 18:00:13,400][85176] Updated weights for policy 0, policy_version 72462 (0.0007) +[2023-10-11 18:00:13,770][85176] Updated weights for policy 0, policy_version 72472 (0.0008) +[2023-10-11 18:00:16,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 149520384. Throughput: 0: 1681.9, 1: 1704.9. Samples: 37392158. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:00:16,064][84230] Avg episode reward: [(0, '43.670'), (1, '42.960')] +[2023-10-11 18:00:16,078][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000072480_74219520.pth... +[2023-10-11 18:00:16,078][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000073536_75300864.pth... +[2023-10-11 18:00:16,112][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000070912_72613888.pth +[2023-10-11 18:00:16,120][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000071968_73695232.pth +[2023-10-11 18:00:16,841][85175] Updated weights for policy 1, policy_version 73540 (0.0007) +[2023-10-11 18:00:17,210][85175] Updated weights for policy 1, policy_version 73550 (0.0008) +[2023-10-11 18:00:17,560][85176] Updated weights for policy 0, policy_version 72482 (0.0008) +[2023-10-11 18:00:17,576][85175] Updated weights for policy 1, policy_version 73560 (0.0007) +[2023-10-11 18:00:17,937][85176] Updated weights for policy 0, policy_version 72492 (0.0008) +[2023-10-11 18:00:18,310][85176] Updated weights for policy 0, policy_version 72502 (0.0008) +[2023-10-11 18:00:18,673][85176] Updated weights for policy 0, policy_version 72512 (0.0009) +[2023-10-11 18:00:21,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 149585920. Throughput: 0: 1659.1, 1: 1687.1. Samples: 37401668. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:00:21,064][84230] Avg episode reward: [(0, '43.250'), (1, '46.460')] +[2023-10-11 18:00:21,644][85175] Updated weights for policy 1, policy_version 73570 (0.0008) +[2023-10-11 18:00:22,013][85175] Updated weights for policy 1, policy_version 73580 (0.0007) +[2023-10-11 18:00:22,385][85175] Updated weights for policy 1, policy_version 73590 (0.0007) +[2023-10-11 18:00:22,748][85175] Updated weights for policy 1, policy_version 73600 (0.0009) +[2023-10-11 18:00:22,749][85176] Updated weights for policy 0, policy_version 72522 (0.0010) +[2023-10-11 18:00:23,119][85176] Updated weights for policy 0, policy_version 72532 (0.0010) +[2023-10-11 18:00:23,493][85176] Updated weights for policy 0, policy_version 72542 (0.0009) +[2023-10-11 18:00:26,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 149651456. Throughput: 0: 1684.0, 1: 1701.7. Samples: 37422322. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:00:26,064][84230] Avg episode reward: [(0, '43.730'), (1, '47.060')] +[2023-10-11 18:00:26,705][85175] Updated weights for policy 1, policy_version 73610 (0.0008) +[2023-10-11 18:00:27,080][85175] Updated weights for policy 1, policy_version 73620 (0.0008) +[2023-10-11 18:00:27,440][85175] Updated weights for policy 1, policy_version 73630 (0.0009) +[2023-10-11 18:00:27,617][85176] Updated weights for policy 0, policy_version 72552 (0.0009) +[2023-10-11 18:00:27,989][85176] Updated weights for policy 0, policy_version 72562 (0.0007) +[2023-10-11 18:00:28,366][85176] Updated weights for policy 0, policy_version 72572 (0.0008) +[2023-10-11 18:00:31,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 149716992. Throughput: 0: 1681.3, 1: 1709.3. Samples: 37443304. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:00:31,063][84230] Avg episode reward: [(0, '43.020'), (1, '46.400')] +[2023-10-11 18:00:31,367][85175] Updated weights for policy 1, policy_version 73640 (0.0009) +[2023-10-11 18:00:31,736][85175] Updated weights for policy 1, policy_version 73650 (0.0008) +[2023-10-11 18:00:32,104][85175] Updated weights for policy 1, policy_version 73660 (0.0008) +[2023-10-11 18:00:32,408][85176] Updated weights for policy 0, policy_version 72582 (0.0007) +[2023-10-11 18:00:32,781][85176] Updated weights for policy 0, policy_version 72592 (0.0008) +[2023-10-11 18:00:33,148][85176] Updated weights for policy 0, policy_version 72602 (0.0009) +[2023-10-11 18:00:36,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 149782528. Throughput: 0: 1662.5, 1: 1700.8. Samples: 37452466. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:00:36,063][84230] Avg episode reward: [(0, '43.340'), (1, '46.170')] +[2023-10-11 18:00:36,275][85175] Updated weights for policy 1, policy_version 73670 (0.0007) +[2023-10-11 18:00:36,642][85175] Updated weights for policy 1, policy_version 73680 (0.0008) +[2023-10-11 18:00:37,008][85175] Updated weights for policy 1, policy_version 73690 (0.0009) +[2023-10-11 18:00:37,463][85176] Updated weights for policy 0, policy_version 72612 (0.0008) +[2023-10-11 18:00:37,833][85176] Updated weights for policy 0, policy_version 72622 (0.0011) +[2023-10-11 18:00:38,202][85176] Updated weights for policy 0, policy_version 72632 (0.0010) +[2023-10-11 18:00:40,927][85175] Updated weights for policy 1, policy_version 73700 (0.0009) +[2023-10-11 18:00:41,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 149848064. Throughput: 0: 1681.6, 1: 1708.4. Samples: 37473266. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:00:41,063][84230] Avg episode reward: [(0, '43.570'), (1, '44.340')] +[2023-10-11 18:00:41,303][85175] Updated weights for policy 1, policy_version 73710 (0.0007) +[2023-10-11 18:00:41,672][85175] Updated weights for policy 1, policy_version 73720 (0.0008) +[2023-10-11 18:00:42,217][85176] Updated weights for policy 0, policy_version 72642 (0.0009) +[2023-10-11 18:00:42,617][85176] Updated weights for policy 0, policy_version 72652 (0.0009) +[2023-10-11 18:00:42,988][85176] Updated weights for policy 0, policy_version 72662 (0.0008) +[2023-10-11 18:00:43,360][85176] Updated weights for policy 0, policy_version 72672 (0.0007) +[2023-10-11 18:00:45,723][85175] Updated weights for policy 1, policy_version 73730 (0.0009) +[2023-10-11 18:00:46,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 149913600. Throughput: 0: 1682.5, 1: 1708.8. Samples: 37494022. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:00:46,063][84230] Avg episode reward: [(0, '43.830'), (1, '47.440')] +[2023-10-11 18:00:46,150][85175] Updated weights for policy 1, policy_version 73740 (0.0009) +[2023-10-11 18:00:46,515][85175] Updated weights for policy 1, policy_version 73750 (0.0007) +[2023-10-11 18:00:46,885][85175] Updated weights for policy 1, policy_version 73760 (0.0009) +[2023-10-11 18:00:47,327][85176] Updated weights for policy 0, policy_version 72682 (0.0009) +[2023-10-11 18:00:47,709][85176] Updated weights for policy 0, policy_version 72692 (0.0010) +[2023-10-11 18:00:48,087][85176] Updated weights for policy 0, policy_version 72702 (0.0009) +[2023-10-11 18:00:50,776][85175] Updated weights for policy 1, policy_version 73770 (0.0011) +[2023-10-11 18:00:51,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 149979136. Throughput: 0: 1672.8, 1: 1703.4. Samples: 37503016. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:00:51,064][84230] Avg episode reward: [(0, '42.760'), (1, '44.180')] +[2023-10-11 18:00:51,151][85175] Updated weights for policy 1, policy_version 73780 (0.0010) +[2023-10-11 18:00:51,512][85175] Updated weights for policy 1, policy_version 73790 (0.0007) +[2023-10-11 18:00:51,977][85176] Updated weights for policy 0, policy_version 72712 (0.0010) +[2023-10-11 18:00:52,349][85176] Updated weights for policy 0, policy_version 72722 (0.0010) +[2023-10-11 18:00:52,711][85176] Updated weights for policy 0, policy_version 72732 (0.0009) +[2023-10-11 18:00:55,652][85175] Updated weights for policy 1, policy_version 73800 (0.0010) +[2023-10-11 18:00:56,032][85175] Updated weights for policy 1, policy_version 73810 (0.0009) +[2023-10-11 18:00:56,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 150044672. Throughput: 0: 1679.4, 1: 1701.0. Samples: 37523500. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:00:56,063][84230] Avg episode reward: [(0, '45.260'), (1, '47.050')] +[2023-10-11 18:00:56,388][85175] Updated weights for policy 1, policy_version 73820 (0.0008) +[2023-10-11 18:00:56,916][85176] Updated weights for policy 0, policy_version 72742 (0.0008) +[2023-10-11 18:00:57,304][85176] Updated weights for policy 0, policy_version 72752 (0.0009) +[2023-10-11 18:00:57,685][85176] Updated weights for policy 0, policy_version 72762 (0.0009) +[2023-10-11 18:01:00,615][85175] Updated weights for policy 1, policy_version 73830 (0.0009) +[2023-10-11 18:01:00,988][85175] Updated weights for policy 1, policy_version 73840 (0.0010) +[2023-10-11 18:01:01,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 150110208. Throughput: 0: 1682.5, 1: 1697.7. Samples: 37544266. 
Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-11 18:01:01,063][84230] Avg episode reward: [(0, '43.800'), (1, '40.740')] +[2023-10-11 18:01:01,351][85175] Updated weights for policy 1, policy_version 73850 (0.0007) +[2023-10-11 18:01:01,822][85176] Updated weights for policy 0, policy_version 72772 (0.0009) +[2023-10-11 18:01:02,198][85176] Updated weights for policy 0, policy_version 72782 (0.0011) +[2023-10-11 18:01:02,574][85176] Updated weights for policy 0, policy_version 72792 (0.0009) +[2023-10-11 18:01:05,348][85175] Updated weights for policy 1, policy_version 73860 (0.0007) +[2023-10-11 18:01:05,714][85175] Updated weights for policy 1, policy_version 73870 (0.0007) +[2023-10-11 18:01:06,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 150175744. Throughput: 0: 1672.8, 1: 1702.7. Samples: 37553564. Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-11 18:01:06,063][84230] Avg episode reward: [(0, '44.030'), (1, '46.410')] +[2023-10-11 18:01:06,096][85175] Updated weights for policy 1, policy_version 73880 (0.0007) +[2023-10-11 18:01:06,718][85176] Updated weights for policy 0, policy_version 72802 (0.0008) +[2023-10-11 18:01:07,090][85176] Updated weights for policy 0, policy_version 72812 (0.0008) +[2023-10-11 18:01:07,461][85176] Updated weights for policy 0, policy_version 72822 (0.0008) +[2023-10-11 18:01:07,837][85176] Updated weights for policy 0, policy_version 72832 (0.0007) +[2023-10-11 18:01:10,043][85175] Updated weights for policy 1, policy_version 73890 (0.0007) +[2023-10-11 18:01:10,417][85175] Updated weights for policy 1, policy_version 73900 (0.0008) +[2023-10-11 18:01:10,787][85175] Updated weights for policy 1, policy_version 73910 (0.0007) +[2023-10-11 18:01:11,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 150241280. Throughput: 0: 1669.1, 1: 1705.1. Samples: 37574160. 
Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-11 18:01:11,063][84230] Avg episode reward: [(0, '42.400'), (1, '41.530')] +[2023-10-11 18:01:11,159][85175] Updated weights for policy 1, policy_version 73920 (0.0009) +[2023-10-11 18:01:11,910][85176] Updated weights for policy 0, policy_version 72842 (0.0008) +[2023-10-11 18:01:12,276][85176] Updated weights for policy 0, policy_version 72852 (0.0008) +[2023-10-11 18:01:12,648][85176] Updated weights for policy 0, policy_version 72862 (0.0009) +[2023-10-11 18:01:15,113][85175] Updated weights for policy 1, policy_version 73930 (0.0008) +[2023-10-11 18:01:15,494][85175] Updated weights for policy 1, policy_version 73940 (0.0008) +[2023-10-11 18:01:15,858][85175] Updated weights for policy 1, policy_version 73950 (0.0008) +[2023-10-11 18:01:16,063][84230] Fps is (10 sec: 16383.7, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 150339584. Throughput: 0: 1667.5, 1: 1689.5. Samples: 37594366. Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-11 18:01:16,063][84230] Avg episode reward: [(0, '42.270'), (1, '47.060')] +[2023-10-11 18:01:16,781][85176] Updated weights for policy 0, policy_version 72872 (0.0008) +[2023-10-11 18:01:17,146][85176] Updated weights for policy 0, policy_version 72882 (0.0007) +[2023-10-11 18:01:17,524][85176] Updated weights for policy 0, policy_version 72892 (0.0007) +[2023-10-11 18:01:19,615][85175] Updated weights for policy 1, policy_version 73960 (0.0008) +[2023-10-11 18:01:19,978][85175] Updated weights for policy 1, policy_version 73970 (0.0009) +[2023-10-11 18:01:20,338][85175] Updated weights for policy 1, policy_version 73980 (0.0009) +[2023-10-11 18:01:21,062][84230] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 150405120. Throughput: 0: 1664.4, 1: 1711.2. Samples: 37604370. 
Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-11 18:01:21,063][84230] Avg episode reward: [(0, '42.510'), (1, '40.820')] +[2023-10-11 18:01:21,613][85176] Updated weights for policy 0, policy_version 72902 (0.0007) +[2023-10-11 18:01:21,988][85176] Updated weights for policy 0, policy_version 72912 (0.0009) +[2023-10-11 18:01:22,365][85176] Updated weights for policy 0, policy_version 72922 (0.0010) +[2023-10-11 18:01:24,248][85175] Updated weights for policy 1, policy_version 73990 (0.0009) +[2023-10-11 18:01:24,606][85175] Updated weights for policy 1, policy_version 74000 (0.0008) +[2023-10-11 18:01:24,972][85175] Updated weights for policy 1, policy_version 74010 (0.0008) +[2023-10-11 18:01:26,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 150470656. Throughput: 0: 1664.3, 1: 1700.7. Samples: 37624692. Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-11 18:01:26,064][84230] Avg episode reward: [(0, '44.520'), (1, '46.700')] +[2023-10-11 18:01:26,373][85176] Updated weights for policy 0, policy_version 72932 (0.0009) +[2023-10-11 18:01:26,733][85176] Updated weights for policy 0, policy_version 72942 (0.0011) +[2023-10-11 18:01:27,104][85176] Updated weights for policy 0, policy_version 72952 (0.0011) +[2023-10-11 18:01:29,158][85175] Updated weights for policy 1, policy_version 74020 (0.0009) +[2023-10-11 18:01:29,517][85175] Updated weights for policy 1, policy_version 74030 (0.0008) +[2023-10-11 18:01:29,888][85175] Updated weights for policy 1, policy_version 74040 (0.0008) +[2023-10-11 18:01:31,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 150536192. Throughput: 0: 1664.7, 1: 1680.0. Samples: 37644534. 
Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-11 18:01:31,063][84230] Avg episode reward: [(0, '42.650'), (1, '39.800')] +[2023-10-11 18:01:31,289][85176] Updated weights for policy 0, policy_version 72962 (0.0010) +[2023-10-11 18:01:31,685][85176] Updated weights for policy 0, policy_version 72972 (0.0009) +[2023-10-11 18:01:32,073][85176] Updated weights for policy 0, policy_version 72982 (0.0007) +[2023-10-11 18:01:32,442][85176] Updated weights for policy 0, policy_version 72992 (0.0007) +[2023-10-11 18:01:33,965][85175] Updated weights for policy 1, policy_version 74050 (0.0009) +[2023-10-11 18:01:34,376][85175] Updated weights for policy 1, policy_version 74060 (0.0008) +[2023-10-11 18:01:34,749][85175] Updated weights for policy 1, policy_version 74070 (0.0007) +[2023-10-11 18:01:35,125][85175] Updated weights for policy 1, policy_version 74080 (0.0007) +[2023-10-11 18:01:36,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 150601728. Throughput: 0: 1659.4, 1: 1716.4. Samples: 37654926. Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-11 18:01:36,063][84230] Avg episode reward: [(0, '40.940'), (1, '46.910')] +[2023-10-11 18:01:36,438][85176] Updated weights for policy 0, policy_version 73002 (0.0007) +[2023-10-11 18:01:36,813][85176] Updated weights for policy 0, policy_version 73012 (0.0009) +[2023-10-11 18:01:37,182][85176] Updated weights for policy 0, policy_version 73022 (0.0009) +[2023-10-11 18:01:39,087][85175] Updated weights for policy 1, policy_version 74090 (0.0008) +[2023-10-11 18:01:39,452][85175] Updated weights for policy 1, policy_version 74100 (0.0008) +[2023-10-11 18:01:39,825][85175] Updated weights for policy 1, policy_version 74110 (0.0009) +[2023-10-11 18:01:41,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 150667264. Throughput: 0: 1670.7, 1: 1693.3. Samples: 37674878. 
Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-11 18:01:41,064][84230] Avg episode reward: [(0, '42.690'), (1, '42.590')] +[2023-10-11 18:01:41,110][85176] Updated weights for policy 0, policy_version 73032 (0.0007) +[2023-10-11 18:01:41,483][85176] Updated weights for policy 0, policy_version 73042 (0.0007) +[2023-10-11 18:01:41,862][85176] Updated weights for policy 0, policy_version 73052 (0.0009) +[2023-10-11 18:01:44,022][85175] Updated weights for policy 1, policy_version 74120 (0.0008) +[2023-10-11 18:01:44,393][85175] Updated weights for policy 1, policy_version 74130 (0.0008) +[2023-10-11 18:01:44,764][85175] Updated weights for policy 1, policy_version 74140 (0.0010) +[2023-10-11 18:01:45,909][85176] Updated weights for policy 0, policy_version 73062 (0.0007) +[2023-10-11 18:01:46,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 150732800. Throughput: 0: 1672.8, 1: 1684.2. Samples: 37695328. Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) +[2023-10-11 18:01:46,063][84230] Avg episode reward: [(0, '40.920'), (1, '45.050')] +[2023-10-11 18:01:46,279][85176] Updated weights for policy 0, policy_version 73072 (0.0007) +[2023-10-11 18:01:46,653][85176] Updated weights for policy 0, policy_version 73082 (0.0007) +[2023-10-11 18:01:48,771][85175] Updated weights for policy 1, policy_version 74150 (0.0009) +[2023-10-11 18:01:49,146][85175] Updated weights for policy 1, policy_version 74160 (0.0009) +[2023-10-11 18:01:49,518][85175] Updated weights for policy 1, policy_version 74170 (0.0008) +[2023-10-11 18:01:50,963][85176] Updated weights for policy 0, policy_version 73092 (0.0009) +[2023-10-11 18:01:51,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 150798336. Throughput: 0: 1675.4, 1: 1706.7. Samples: 37705760. 
Policy #0 lag: (min: 31.0, avg: 34.4, max: 63.0) +[2023-10-11 18:01:51,063][84230] Avg episode reward: [(0, '46.080'), (1, '41.230')] +[2023-10-11 18:01:51,334][85176] Updated weights for policy 0, policy_version 73102 (0.0011) +[2023-10-11 18:01:51,711][85176] Updated weights for policy 0, policy_version 73112 (0.0011) +[2023-10-11 18:01:53,374][85175] Updated weights for policy 1, policy_version 74180 (0.0009) +[2023-10-11 18:01:53,753][85175] Updated weights for policy 1, policy_version 74190 (0.0007) +[2023-10-11 18:01:54,121][85175] Updated weights for policy 1, policy_version 74200 (0.0007) +[2023-10-11 18:01:55,577][85176] Updated weights for policy 0, policy_version 73122 (0.0011) +[2023-10-11 18:01:55,945][85176] Updated weights for policy 0, policy_version 73132 (0.0011) +[2023-10-11 18:01:56,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 150863872. Throughput: 0: 1681.2, 1: 1679.3. Samples: 37725384. Policy #0 lag: (min: 31.0, avg: 34.4, max: 63.0) +[2023-10-11 18:01:56,064][84230] Avg episode reward: [(0, '43.600'), (1, '42.440')] +[2023-10-11 18:01:56,317][85176] Updated weights for policy 0, policy_version 73142 (0.0009) +[2023-10-11 18:01:56,694][85176] Updated weights for policy 0, policy_version 73152 (0.0007) +[2023-10-11 18:01:58,245][85175] Updated weights for policy 1, policy_version 74210 (0.0008) +[2023-10-11 18:01:58,627][85175] Updated weights for policy 1, policy_version 74220 (0.0009) +[2023-10-11 18:01:58,984][85175] Updated weights for policy 1, policy_version 74230 (0.0009) +[2023-10-11 18:01:59,352][85175] Updated weights for policy 1, policy_version 74240 (0.0008) +[2023-10-11 18:02:00,902][85176] Updated weights for policy 0, policy_version 73162 (0.0010) +[2023-10-11 18:02:01,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 150929408. Throughput: 0: 1682.7, 1: 1686.3. Samples: 37745970. 
Policy #0 lag: (min: 31.0, avg: 34.4, max: 63.0) +[2023-10-11 18:02:01,063][84230] Avg episode reward: [(0, '45.190'), (1, '40.690')] +[2023-10-11 18:02:01,268][85176] Updated weights for policy 0, policy_version 73172 (0.0012) +[2023-10-11 18:02:01,654][85176] Updated weights for policy 0, policy_version 73182 (0.0010) +[2023-10-11 18:02:03,453][85175] Updated weights for policy 1, policy_version 74250 (0.0008) +[2023-10-11 18:02:03,827][85175] Updated weights for policy 1, policy_version 74260 (0.0008) +[2023-10-11 18:02:04,188][85175] Updated weights for policy 1, policy_version 74270 (0.0007) +[2023-10-11 18:02:05,899][85176] Updated weights for policy 0, policy_version 73192 (0.0009) +[2023-10-11 18:02:06,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 150994944. Throughput: 0: 1681.9, 1: 1682.3. Samples: 37755764. Policy #0 lag: (min: 31.0, avg: 34.4, max: 63.0) +[2023-10-11 18:02:06,064][84230] Avg episode reward: [(0, '41.260'), (1, '39.900')] +[2023-10-11 18:02:06,279][85176] Updated weights for policy 0, policy_version 73202 (0.0008) +[2023-10-11 18:02:06,653][85176] Updated weights for policy 0, policy_version 73212 (0.0009) +[2023-10-11 18:02:08,109][85175] Updated weights for policy 1, policy_version 74280 (0.0009) +[2023-10-11 18:02:08,484][85175] Updated weights for policy 1, policy_version 74290 (0.0008) +[2023-10-11 18:02:08,849][85175] Updated weights for policy 1, policy_version 74300 (0.0010) +[2023-10-11 18:02:10,589][85176] Updated weights for policy 0, policy_version 73222 (0.0008) +[2023-10-11 18:02:10,962][85176] Updated weights for policy 0, policy_version 73232 (0.0008) +[2023-10-11 18:02:11,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 151060480. Throughput: 0: 1685.5, 1: 1677.4. Samples: 37776024. 
Policy #0 lag: (min: 31.0, avg: 34.4, max: 63.0) +[2023-10-11 18:02:11,064][84230] Avg episode reward: [(0, '44.520'), (1, '42.990')] +[2023-10-11 18:02:11,343][85176] Updated weights for policy 0, policy_version 73242 (0.0008) +[2023-10-11 18:02:12,690][85175] Updated weights for policy 1, policy_version 74310 (0.0007) +[2023-10-11 18:02:13,058][85175] Updated weights for policy 1, policy_version 74320 (0.0010) +[2023-10-11 18:02:13,426][85175] Updated weights for policy 1, policy_version 74330 (0.0011) +[2023-10-11 18:02:15,338][85176] Updated weights for policy 0, policy_version 73252 (0.0008) +[2023-10-11 18:02:15,719][85176] Updated weights for policy 0, policy_version 73262 (0.0007) +[2023-10-11 18:02:16,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 151126016. Throughput: 0: 1675.8, 1: 1703.2. Samples: 37796592. Policy #0 lag: (min: 31.0, avg: 34.4, max: 63.0) +[2023-10-11 18:02:16,064][84230] Avg episode reward: [(0, '41.250'), (1, '42.390')] +[2023-10-11 18:02:16,074][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000074336_76120064.pth... +[2023-10-11 18:02:16,092][85176] Updated weights for policy 0, policy_version 73272 (0.0007) +[2023-10-11 18:02:16,111][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000072768_74514432.pth +[2023-10-11 18:02:16,381][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000073280_75038720.pth... 
+[2023-10-11 18:02:16,412][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000071712_73433088.pth +[2023-10-11 18:02:17,436][85175] Updated weights for policy 1, policy_version 74340 (0.0008) +[2023-10-11 18:02:17,799][85175] Updated weights for policy 1, policy_version 74350 (0.0008) +[2023-10-11 18:02:18,173][85175] Updated weights for policy 1, policy_version 74360 (0.0007) +[2023-10-11 18:02:20,174][85176] Updated weights for policy 0, policy_version 73282 (0.0008) +[2023-10-11 18:02:20,573][85176] Updated weights for policy 0, policy_version 73292 (0.0011) +[2023-10-11 18:02:20,952][85176] Updated weights for policy 0, policy_version 73302 (0.0008) +[2023-10-11 18:02:21,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 151191552. Throughput: 0: 1689.1, 1: 1670.0. Samples: 37806088. Policy #0 lag: (min: 31.0, avg: 34.4, max: 63.0) +[2023-10-11 18:02:21,064][84230] Avg episode reward: [(0, '44.140'), (1, '44.080')] +[2023-10-11 18:02:21,314][85176] Updated weights for policy 0, policy_version 73312 (0.0007) +[2023-10-11 18:02:22,277][85175] Updated weights for policy 1, policy_version 74370 (0.0008) +[2023-10-11 18:02:22,651][85175] Updated weights for policy 1, policy_version 74380 (0.0007) +[2023-10-11 18:02:23,021][85175] Updated weights for policy 1, policy_version 74390 (0.0007) +[2023-10-11 18:02:23,387][85175] Updated weights for policy 1, policy_version 74400 (0.0008) +[2023-10-11 18:02:25,369][85176] Updated weights for policy 0, policy_version 73322 (0.0009) +[2023-10-11 18:02:25,744][85176] Updated weights for policy 0, policy_version 73332 (0.0007) +[2023-10-11 18:02:26,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 151257088. Throughput: 0: 1679.2, 1: 1692.6. Samples: 37826610. 
Policy #0 lag: (min: 31.0, avg: 34.4, max: 63.0) +[2023-10-11 18:02:26,064][84230] Avg episode reward: [(0, '42.100'), (1, '42.010')] +[2023-10-11 18:02:26,122][85176] Updated weights for policy 0, policy_version 73342 (0.0009) +[2023-10-11 18:02:27,496][85175] Updated weights for policy 1, policy_version 74410 (0.0009) +[2023-10-11 18:02:27,863][85175] Updated weights for policy 1, policy_version 74420 (0.0010) +[2023-10-11 18:02:28,230][85175] Updated weights for policy 1, policy_version 74430 (0.0007) +[2023-10-11 18:02:30,203][85176] Updated weights for policy 0, policy_version 73352 (0.0008) +[2023-10-11 18:02:30,574][85176] Updated weights for policy 0, policy_version 73362 (0.0008) +[2023-10-11 18:02:30,946][85176] Updated weights for policy 0, policy_version 73372 (0.0007) +[2023-10-11 18:02:31,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 151322624. Throughput: 0: 1659.8, 1: 1706.0. Samples: 37846792. Policy #0 lag: (min: 31.0, avg: 34.4, max: 63.0) +[2023-10-11 18:02:31,063][84230] Avg episode reward: [(0, '45.060'), (1, '43.390')] +[2023-10-11 18:02:32,173][85175] Updated weights for policy 1, policy_version 74440 (0.0007) +[2023-10-11 18:02:32,541][85175] Updated weights for policy 1, policy_version 74450 (0.0007) +[2023-10-11 18:02:32,905][85175] Updated weights for policy 1, policy_version 74460 (0.0008) +[2023-10-11 18:02:35,113][85176] Updated weights for policy 0, policy_version 73382 (0.0007) +[2023-10-11 18:02:35,489][85176] Updated weights for policy 0, policy_version 73392 (0.0008) +[2023-10-11 18:02:35,866][85176] Updated weights for policy 0, policy_version 73402 (0.0008) +[2023-10-11 18:02:36,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 151388160. Throughput: 0: 1679.0, 1: 1679.6. Samples: 37856900. 
Policy #0 lag: (min: 31.0, avg: 34.4, max: 63.0) +[2023-10-11 18:02:36,064][84230] Avg episode reward: [(0, '42.840'), (1, '43.330')] +[2023-10-11 18:02:36,951][85175] Updated weights for policy 1, policy_version 74470 (0.0007) +[2023-10-11 18:02:37,320][85175] Updated weights for policy 1, policy_version 74480 (0.0008) +[2023-10-11 18:02:37,687][85175] Updated weights for policy 1, policy_version 74490 (0.0007) +[2023-10-11 18:02:40,054][85176] Updated weights for policy 0, policy_version 73412 (0.0008) +[2023-10-11 18:02:40,429][85176] Updated weights for policy 0, policy_version 73422 (0.0008) +[2023-10-11 18:02:40,801][85176] Updated weights for policy 0, policy_version 73432 (0.0010) +[2023-10-11 18:02:41,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 151453696. Throughput: 0: 1678.6, 1: 1709.0. Samples: 37877824. Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) +[2023-10-11 18:02:41,063][84230] Avg episode reward: [(0, '45.570'), (1, '45.580')] +[2023-10-11 18:02:41,651][85175] Updated weights for policy 1, policy_version 74500 (0.0008) +[2023-10-11 18:02:42,018][85175] Updated weights for policy 1, policy_version 74510 (0.0009) +[2023-10-11 18:02:42,392][85175] Updated weights for policy 1, policy_version 74520 (0.0009) +[2023-10-11 18:02:44,883][85176] Updated weights for policy 0, policy_version 73442 (0.0009) +[2023-10-11 18:02:45,261][85176] Updated weights for policy 0, policy_version 73452 (0.0008) +[2023-10-11 18:02:45,640][85176] Updated weights for policy 0, policy_version 73462 (0.0010) +[2023-10-11 18:02:46,015][85176] Updated weights for policy 0, policy_version 73472 (0.0009) +[2023-10-11 18:02:46,063][84230] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 151552000. Throughput: 0: 1656.1, 1: 1709.5. Samples: 37897420. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) +[2023-10-11 18:02:46,064][84230] Avg episode reward: [(0, '41.590'), (1, '45.010')] +[2023-10-11 18:02:46,480][85175] Updated weights for policy 1, policy_version 74530 (0.0008) +[2023-10-11 18:02:46,855][85175] Updated weights for policy 1, policy_version 74540 (0.0008) +[2023-10-11 18:02:47,216][85175] Updated weights for policy 1, policy_version 74550 (0.0007) +[2023-10-11 18:02:47,577][85175] Updated weights for policy 1, policy_version 74560 (0.0007) +[2023-10-11 18:02:50,034][85176] Updated weights for policy 0, policy_version 73482 (0.0007) +[2023-10-11 18:02:50,411][85176] Updated weights for policy 0, policy_version 73492 (0.0010) +[2023-10-11 18:02:50,780][85176] Updated weights for policy 0, policy_version 73502 (0.0009) +[2023-10-11 18:02:51,063][84230] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 151617536. Throughput: 0: 1677.2, 1: 1694.2. Samples: 37907480. Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) +[2023-10-11 18:02:51,063][84230] Avg episode reward: [(0, '46.290'), (1, '45.520')] +[2023-10-11 18:02:51,649][85175] Updated weights for policy 1, policy_version 74570 (0.0009) +[2023-10-11 18:02:52,010][85175] Updated weights for policy 1, policy_version 74580 (0.0008) +[2023-10-11 18:02:52,376][85175] Updated weights for policy 1, policy_version 74590 (0.0007) +[2023-10-11 18:02:54,759][85176] Updated weights for policy 0, policy_version 73512 (0.0008) +[2023-10-11 18:02:55,122][85176] Updated weights for policy 0, policy_version 73522 (0.0007) +[2023-10-11 18:02:55,491][85176] Updated weights for policy 0, policy_version 73532 (0.0008) +[2023-10-11 18:02:56,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 151683072. Throughput: 0: 1674.4, 1: 1707.7. Samples: 37928214. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) +[2023-10-11 18:02:56,063][84230] Avg episode reward: [(0, '40.840'), (1, '43.570')] +[2023-10-11 18:02:56,418][85175] Updated weights for policy 1, policy_version 74600 (0.0007) +[2023-10-11 18:02:56,788][85175] Updated weights for policy 1, policy_version 74610 (0.0008) +[2023-10-11 18:02:57,155][85175] Updated weights for policy 1, policy_version 74620 (0.0009) +[2023-10-11 18:02:59,395][85176] Updated weights for policy 0, policy_version 73542 (0.0010) +[2023-10-11 18:02:59,765][85176] Updated weights for policy 0, policy_version 73552 (0.0008) +[2023-10-11 18:03:00,133][85176] Updated weights for policy 0, policy_version 73562 (0.0008) +[2023-10-11 18:03:01,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 151748608. Throughput: 0: 1657.4, 1: 1706.6. Samples: 37947972. Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) +[2023-10-11 18:03:01,064][84230] Avg episode reward: [(0, '45.460'), (1, '44.330')] +[2023-10-11 18:03:01,070][85175] Updated weights for policy 1, policy_version 74630 (0.0010) +[2023-10-11 18:03:01,444][85175] Updated weights for policy 1, policy_version 74640 (0.0011) +[2023-10-11 18:03:01,818][85175] Updated weights for policy 1, policy_version 74650 (0.0011) +[2023-10-11 18:03:04,190][85176] Updated weights for policy 0, policy_version 73572 (0.0007) +[2023-10-11 18:03:04,566][85176] Updated weights for policy 0, policy_version 73582 (0.0010) +[2023-10-11 18:03:04,932][85176] Updated weights for policy 0, policy_version 73592 (0.0008) +[2023-10-11 18:03:05,860][85175] Updated weights for policy 1, policy_version 74660 (0.0009) +[2023-10-11 18:03:06,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 151814144. Throughput: 0: 1678.1, 1: 1705.2. Samples: 37958336. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) +[2023-10-11 18:03:06,063][84230] Avg episode reward: [(0, '40.850'), (1, '44.710')] +[2023-10-11 18:03:06,221][85175] Updated weights for policy 1, policy_version 74670 (0.0010) +[2023-10-11 18:03:06,583][85175] Updated weights for policy 1, policy_version 74680 (0.0009) +[2023-10-11 18:03:09,161][85176] Updated weights for policy 0, policy_version 73602 (0.0007) +[2023-10-11 18:03:09,565][85176] Updated weights for policy 0, policy_version 73612 (0.0009) +[2023-10-11 18:03:09,951][85176] Updated weights for policy 0, policy_version 73622 (0.0009) +[2023-10-11 18:03:10,322][85176] Updated weights for policy 0, policy_version 73632 (0.0008) +[2023-10-11 18:03:10,692][85175] Updated weights for policy 1, policy_version 74690 (0.0009) +[2023-10-11 18:03:11,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 151879680. Throughput: 0: 1668.4, 1: 1702.7. Samples: 37978310. Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) +[2023-10-11 18:03:11,063][84230] Avg episode reward: [(0, '46.550'), (1, '45.830')] +[2023-10-11 18:03:11,064][85175] Updated weights for policy 1, policy_version 74700 (0.0008) +[2023-10-11 18:03:11,431][85175] Updated weights for policy 1, policy_version 74710 (0.0009) +[2023-10-11 18:03:11,803][85175] Updated weights for policy 1, policy_version 74720 (0.0008) +[2023-10-11 18:03:14,367][85176] Updated weights for policy 0, policy_version 73642 (0.0009) +[2023-10-11 18:03:14,737][85176] Updated weights for policy 0, policy_version 73652 (0.0010) +[2023-10-11 18:03:15,105][85176] Updated weights for policy 0, policy_version 73662 (0.0011) +[2023-10-11 18:03:16,032][85175] Updated weights for policy 1, policy_version 74730 (0.0008) +[2023-10-11 18:03:16,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 151945216. Throughput: 0: 1661.9, 1: 1701.1. Samples: 37998124. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) +[2023-10-11 18:03:16,064][84230] Avg episode reward: [(0, '40.140'), (1, '44.870')] +[2023-10-11 18:03:16,420][85175] Updated weights for policy 1, policy_version 74740 (0.0009) +[2023-10-11 18:03:16,781][85175] Updated weights for policy 1, policy_version 74750 (0.0010) +[2023-10-11 18:03:19,121][85176] Updated weights for policy 0, policy_version 73672 (0.0008) +[2023-10-11 18:03:19,496][85176] Updated weights for policy 0, policy_version 73682 (0.0009) +[2023-10-11 18:03:19,862][85176] Updated weights for policy 0, policy_version 73692 (0.0010) +[2023-10-11 18:03:20,620][85175] Updated weights for policy 1, policy_version 74760 (0.0008) +[2023-10-11 18:03:20,980][85175] Updated weights for policy 1, policy_version 74770 (0.0009) +[2023-10-11 18:03:21,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 152010752. Throughput: 0: 1673.2, 1: 1695.9. Samples: 38008508. Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) +[2023-10-11 18:03:21,064][84230] Avg episode reward: [(0, '43.950'), (1, '41.900')] +[2023-10-11 18:03:21,351][85175] Updated weights for policy 1, policy_version 74780 (0.0010) +[2023-10-11 18:03:24,047][85176] Updated weights for policy 0, policy_version 73702 (0.0010) +[2023-10-11 18:03:24,425][85176] Updated weights for policy 0, policy_version 73712 (0.0008) +[2023-10-11 18:03:24,792][85176] Updated weights for policy 0, policy_version 73722 (0.0008) +[2023-10-11 18:03:25,365][85175] Updated weights for policy 1, policy_version 74790 (0.0008) +[2023-10-11 18:03:25,739][85175] Updated weights for policy 1, policy_version 74800 (0.0007) +[2023-10-11 18:03:26,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 152076288. Throughput: 0: 1651.6, 1: 1693.6. Samples: 38028358. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 18:03:26,063][84230] Avg episode reward: [(0, '40.990'), (1, '45.000')] +[2023-10-11 18:03:26,104][85175] Updated weights for policy 1, policy_version 74810 (0.0009) +[2023-10-11 18:03:28,867][85176] Updated weights for policy 0, policy_version 73732 (0.0009) +[2023-10-11 18:03:29,234][85176] Updated weights for policy 0, policy_version 73742 (0.0010) +[2023-10-11 18:03:29,605][85176] Updated weights for policy 0, policy_version 73752 (0.0008) +[2023-10-11 18:03:30,124][85175] Updated weights for policy 1, policy_version 74820 (0.0009) +[2023-10-11 18:03:30,495][85175] Updated weights for policy 1, policy_version 74830 (0.0010) +[2023-10-11 18:03:30,863][85175] Updated weights for policy 1, policy_version 74840 (0.0008) +[2023-10-11 18:03:31,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 152141824. Throughput: 0: 1665.1, 1: 1689.0. Samples: 38048354. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 18:03:31,063][84230] Avg episode reward: [(0, '47.570'), (1, '43.220')] +[2023-10-11 18:03:33,746][85176] Updated weights for policy 0, policy_version 73762 (0.0008) +[2023-10-11 18:03:34,117][85176] Updated weights for policy 0, policy_version 73772 (0.0011) +[2023-10-11 18:03:34,500][85176] Updated weights for policy 0, policy_version 73782 (0.0010) +[2023-10-11 18:03:34,816][85175] Updated weights for policy 1, policy_version 74850 (0.0007) +[2023-10-11 18:03:34,866][85176] Updated weights for policy 0, policy_version 73792 (0.0009) +[2023-10-11 18:03:35,180][85175] Updated weights for policy 1, policy_version 74860 (0.0011) +[2023-10-11 18:03:35,553][85175] Updated weights for policy 1, policy_version 74870 (0.0009) +[2023-10-11 18:03:35,923][85175] Updated weights for policy 1, policy_version 74880 (0.0007) +[2023-10-11 18:03:36,063][84230] Fps is (10 sec: 16383.6, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 152240128. 
Throughput: 0: 1672.9, 1: 1701.0. Samples: 38059306. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 18:03:36,064][84230] Avg episode reward: [(0, '43.520'), (1, '45.890')] +[2023-10-11 18:03:39,051][85176] Updated weights for policy 0, policy_version 73802 (0.0008) +[2023-10-11 18:03:39,430][85176] Updated weights for policy 0, policy_version 73812 (0.0008) +[2023-10-11 18:03:39,798][85176] Updated weights for policy 0, policy_version 73822 (0.0009) +[2023-10-11 18:03:39,824][85175] Updated weights for policy 1, policy_version 74890 (0.0007) +[2023-10-11 18:03:40,189][85175] Updated weights for policy 1, policy_version 74900 (0.0009) +[2023-10-11 18:03:40,557][85175] Updated weights for policy 1, policy_version 74910 (0.0009) +[2023-10-11 18:03:41,063][84230] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 152305664. Throughput: 0: 1652.7, 1: 1703.2. Samples: 38079232. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 18:03:41,064][84230] Avg episode reward: [(0, '47.030'), (1, '43.430')] +[2023-10-11 18:03:44,037][85176] Updated weights for policy 0, policy_version 73832 (0.0007) +[2023-10-11 18:03:44,418][85176] Updated weights for policy 0, policy_version 73842 (0.0008) +[2023-10-11 18:03:44,674][85175] Updated weights for policy 1, policy_version 74920 (0.0009) +[2023-10-11 18:03:44,796][85176] Updated weights for policy 0, policy_version 73852 (0.0010) +[2023-10-11 18:03:45,036][85175] Updated weights for policy 1, policy_version 74930 (0.0008) +[2023-10-11 18:03:45,399][85175] Updated weights for policy 1, policy_version 74940 (0.0010) +[2023-10-11 18:03:46,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 152371200. Throughput: 0: 1672.0, 1: 1674.6. Samples: 38098568. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 18:03:46,064][84230] Avg episode reward: [(0, '44.540'), (1, '46.610')] +[2023-10-11 18:03:48,822][85176] Updated weights for policy 0, policy_version 73862 (0.0008) +[2023-10-11 18:03:49,185][85176] Updated weights for policy 0, policy_version 73872 (0.0009) +[2023-10-11 18:03:49,551][85176] Updated weights for policy 0, policy_version 73882 (0.0008) +[2023-10-11 18:03:49,564][85175] Updated weights for policy 1, policy_version 74950 (0.0011) +[2023-10-11 18:03:49,937][85175] Updated weights for policy 1, policy_version 74960 (0.0007) +[2023-10-11 18:03:50,316][85175] Updated weights for policy 1, policy_version 74970 (0.0007) +[2023-10-11 18:03:51,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 152436736. Throughput: 0: 1669.9, 1: 1698.4. Samples: 38109910. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 18:03:51,064][84230] Avg episode reward: [(0, '46.380'), (1, '44.100')] +[2023-10-11 18:03:53,627][85176] Updated weights for policy 0, policy_version 73892 (0.0008) +[2023-10-11 18:03:54,006][85176] Updated weights for policy 0, policy_version 73902 (0.0008) +[2023-10-11 18:03:54,312][85175] Updated weights for policy 1, policy_version 74980 (0.0007) +[2023-10-11 18:03:54,376][85176] Updated weights for policy 0, policy_version 73912 (0.0008) +[2023-10-11 18:03:54,672][85175] Updated weights for policy 1, policy_version 74990 (0.0007) +[2023-10-11 18:03:55,044][85175] Updated weights for policy 1, policy_version 75000 (0.0008) +[2023-10-11 18:03:56,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 152502272. Throughput: 0: 1656.0, 1: 1694.1. Samples: 38129064. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 18:03:56,063][84230] Avg episode reward: [(0, '44.670'), (1, '48.010')] +[2023-10-11 18:03:58,564][85176] Updated weights for policy 0, policy_version 73922 (0.0007) +[2023-10-11 18:03:58,944][85176] Updated weights for policy 0, policy_version 73932 (0.0008) +[2023-10-11 18:03:59,227][85175] Updated weights for policy 1, policy_version 75010 (0.0009) +[2023-10-11 18:03:59,313][85176] Updated weights for policy 0, policy_version 73942 (0.0008) +[2023-10-11 18:03:59,584][85175] Updated weights for policy 1, policy_version 75020 (0.0008) +[2023-10-11 18:03:59,674][85176] Updated weights for policy 0, policy_version 73952 (0.0008) +[2023-10-11 18:03:59,956][85175] Updated weights for policy 1, policy_version 75030 (0.0007) +[2023-10-11 18:04:00,325][85175] Updated weights for policy 1, policy_version 75040 (0.0009) +[2023-10-11 18:04:01,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 152567808. Throughput: 0: 1673.8, 1: 1672.4. Samples: 38148704. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 18:04:01,063][84230] Avg episode reward: [(0, '44.240'), (1, '45.080')] +[2023-10-11 18:04:03,611][85176] Updated weights for policy 0, policy_version 73962 (0.0008) +[2023-10-11 18:04:03,986][85176] Updated weights for policy 0, policy_version 73972 (0.0008) +[2023-10-11 18:04:04,359][85176] Updated weights for policy 0, policy_version 73982 (0.0007) +[2023-10-11 18:04:04,436][85175] Updated weights for policy 1, policy_version 75050 (0.0009) +[2023-10-11 18:04:04,805][85175] Updated weights for policy 1, policy_version 75060 (0.0010) +[2023-10-11 18:04:05,176][85175] Updated weights for policy 1, policy_version 75070 (0.0009) +[2023-10-11 18:04:06,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 152633344. Throughput: 0: 1661.6, 1: 1706.8. Samples: 38160084. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 18:04:06,063][84230] Avg episode reward: [(0, '41.510'), (1, '46.330')] +[2023-10-11 18:04:08,348][85176] Updated weights for policy 0, policy_version 73992 (0.0007) +[2023-10-11 18:04:08,719][85176] Updated weights for policy 0, policy_version 74002 (0.0007) +[2023-10-11 18:04:09,071][85175] Updated weights for policy 1, policy_version 75080 (0.0008) +[2023-10-11 18:04:09,089][85176] Updated weights for policy 0, policy_version 74012 (0.0008) +[2023-10-11 18:04:09,432][85175] Updated weights for policy 1, policy_version 75090 (0.0010) +[2023-10-11 18:04:09,798][85175] Updated weights for policy 1, policy_version 75100 (0.0010) +[2023-10-11 18:04:11,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 152698880. Throughput: 0: 1660.3, 1: 1688.3. Samples: 38179046. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 18:04:11,064][84230] Avg episode reward: [(0, '44.170'), (1, '44.190')] +[2023-10-11 18:04:13,175][85176] Updated weights for policy 0, policy_version 74022 (0.0008) +[2023-10-11 18:04:13,540][85176] Updated weights for policy 0, policy_version 74032 (0.0008) +[2023-10-11 18:04:13,581][85175] Updated weights for policy 1, policy_version 75110 (0.0008) +[2023-10-11 18:04:13,911][85176] Updated weights for policy 0, policy_version 74042 (0.0007) +[2023-10-11 18:04:13,954][85175] Updated weights for policy 1, policy_version 75120 (0.0009) +[2023-10-11 18:04:14,316][85175] Updated weights for policy 1, policy_version 75130 (0.0008) +[2023-10-11 18:04:16,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 152764416. Throughput: 0: 1669.2, 1: 1689.2. Samples: 38199484. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) +[2023-10-11 18:04:16,064][84230] Avg episode reward: [(0, '42.270'), (1, '46.720')] +[2023-10-11 18:04:16,078][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000075136_76939264.pth... +[2023-10-11 18:04:16,078][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000074048_75825152.pth... +[2023-10-11 18:04:16,131][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000072480_74219520.pth +[2023-10-11 18:04:16,131][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000073536_75300864.pth +[2023-10-11 18:04:18,106][85176] Updated weights for policy 0, policy_version 74052 (0.0009) +[2023-10-11 18:04:18,338][85175] Updated weights for policy 1, policy_version 75140 (0.0007) +[2023-10-11 18:04:18,475][85176] Updated weights for policy 0, policy_version 74062 (0.0007) +[2023-10-11 18:04:18,712][85175] Updated weights for policy 1, policy_version 75150 (0.0008) +[2023-10-11 18:04:18,847][85176] Updated weights for policy 0, policy_version 74072 (0.0008) +[2023-10-11 18:04:19,078][85175] Updated weights for policy 1, policy_version 75160 (0.0007) +[2023-10-11 18:04:21,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 152829952. Throughput: 0: 1654.5, 1: 1697.2. Samples: 38210134. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) +[2023-10-11 18:04:21,063][84230] Avg episode reward: [(0, '43.530'), (1, '44.570')] +[2023-10-11 18:04:22,979][85175] Updated weights for policy 1, policy_version 75170 (0.0009) +[2023-10-11 18:04:22,997][85176] Updated weights for policy 0, policy_version 74082 (0.0008) +[2023-10-11 18:04:23,339][85175] Updated weights for policy 1, policy_version 75180 (0.0007) +[2023-10-11 18:04:23,364][85176] Updated weights for policy 0, policy_version 74092 (0.0007) +[2023-10-11 18:04:23,704][85175] Updated weights for policy 1, policy_version 75190 (0.0010) +[2023-10-11 18:04:23,733][85176] Updated weights for policy 0, policy_version 74102 (0.0007) +[2023-10-11 18:04:24,069][85175] Updated weights for policy 1, policy_version 75200 (0.0007) +[2023-10-11 18:04:24,101][85176] Updated weights for policy 0, policy_version 74112 (0.0008) +[2023-10-11 18:04:26,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 152895488. Throughput: 0: 1660.5, 1: 1678.3. Samples: 38229478. Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) +[2023-10-11 18:04:26,064][84230] Avg episode reward: [(0, '44.570'), (1, '48.140')] +[2023-10-11 18:04:28,171][85175] Updated weights for policy 1, policy_version 75210 (0.0007) +[2023-10-11 18:04:28,198][85176] Updated weights for policy 0, policy_version 74122 (0.0009) +[2023-10-11 18:04:28,542][85175] Updated weights for policy 1, policy_version 75220 (0.0009) +[2023-10-11 18:04:28,572][85176] Updated weights for policy 0, policy_version 74132 (0.0007) +[2023-10-11 18:04:28,902][85175] Updated weights for policy 1, policy_version 75230 (0.0009) +[2023-10-11 18:04:28,944][85176] Updated weights for policy 0, policy_version 74142 (0.0008) +[2023-10-11 18:04:31,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 152961024. Throughput: 0: 1672.0, 1: 1702.4. Samples: 38250414. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) +[2023-10-11 18:04:31,063][84230] Avg episode reward: [(0, '44.640'), (1, '43.700')] +[2023-10-11 18:04:33,051][85175] Updated weights for policy 1, policy_version 75240 (0.0008) +[2023-10-11 18:04:33,090][85176] Updated weights for policy 0, policy_version 74152 (0.0010) +[2023-10-11 18:04:33,426][85175] Updated weights for policy 1, policy_version 75250 (0.0008) +[2023-10-11 18:04:33,469][85176] Updated weights for policy 0, policy_version 74162 (0.0008) +[2023-10-11 18:04:33,797][85175] Updated weights for policy 1, policy_version 75260 (0.0009) +[2023-10-11 18:04:33,825][85176] Updated weights for policy 0, policy_version 74172 (0.0009) +[2023-10-11 18:04:36,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 153026560. Throughput: 0: 1650.0, 1: 1690.4. Samples: 38260226. Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) +[2023-10-11 18:04:36,064][84230] Avg episode reward: [(0, '43.330'), (1, '45.610')] +[2023-10-11 18:04:37,900][85175] Updated weights for policy 1, policy_version 75270 (0.0007) +[2023-10-11 18:04:37,930][85176] Updated weights for policy 0, policy_version 74182 (0.0009) +[2023-10-11 18:04:38,270][85175] Updated weights for policy 1, policy_version 75280 (0.0008) +[2023-10-11 18:04:38,303][85176] Updated weights for policy 0, policy_version 74192 (0.0009) +[2023-10-11 18:04:38,645][85175] Updated weights for policy 1, policy_version 75290 (0.0008) +[2023-10-11 18:04:38,682][85176] Updated weights for policy 0, policy_version 74202 (0.0008) +[2023-10-11 18:04:41,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 153092096. Throughput: 0: 1665.3, 1: 1685.8. Samples: 38279864. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) +[2023-10-11 18:04:41,064][84230] Avg episode reward: [(0, '43.110'), (1, '42.710')] +[2023-10-11 18:04:42,549][85175] Updated weights for policy 1, policy_version 75300 (0.0010) +[2023-10-11 18:04:42,844][85176] Updated weights for policy 0, policy_version 74212 (0.0009) +[2023-10-11 18:04:42,912][85175] Updated weights for policy 1, policy_version 75310 (0.0009) +[2023-10-11 18:04:43,213][85176] Updated weights for policy 0, policy_version 74222 (0.0008) +[2023-10-11 18:04:43,275][85175] Updated weights for policy 1, policy_version 75320 (0.0008) +[2023-10-11 18:04:43,589][85176] Updated weights for policy 0, policy_version 74232 (0.0010) +[2023-10-11 18:04:46,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 153157632. Throughput: 0: 1666.5, 1: 1708.3. Samples: 38300568. Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) +[2023-10-11 18:04:46,064][84230] Avg episode reward: [(0, '43.110'), (1, '47.690')] +[2023-10-11 18:04:47,317][85175] Updated weights for policy 1, policy_version 75330 (0.0007) +[2023-10-11 18:04:47,579][85176] Updated weights for policy 0, policy_version 74242 (0.0008) +[2023-10-11 18:04:47,682][85175] Updated weights for policy 1, policy_version 75340 (0.0009) +[2023-10-11 18:04:47,957][85176] Updated weights for policy 0, policy_version 74252 (0.0008) +[2023-10-11 18:04:48,044][85175] Updated weights for policy 1, policy_version 75350 (0.0008) +[2023-10-11 18:04:48,330][85176] Updated weights for policy 0, policy_version 74262 (0.0007) +[2023-10-11 18:04:48,418][85175] Updated weights for policy 1, policy_version 75360 (0.0008) +[2023-10-11 18:04:48,699][85176] Updated weights for policy 0, policy_version 74272 (0.0007) +[2023-10-11 18:04:51,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 153223168. Throughput: 0: 1647.9, 1: 1674.3. Samples: 38309582. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) +[2023-10-11 18:04:51,064][84230] Avg episode reward: [(0, '45.060'), (1, '44.400')] +[2023-10-11 18:04:52,539][85175] Updated weights for policy 1, policy_version 75370 (0.0009) +[2023-10-11 18:04:52,898][85175] Updated weights for policy 1, policy_version 75380 (0.0008) +[2023-10-11 18:04:52,909][85176] Updated weights for policy 0, policy_version 74282 (0.0008) +[2023-10-11 18:04:53,268][85175] Updated weights for policy 1, policy_version 75390 (0.0007) +[2023-10-11 18:04:53,286][85176] Updated weights for policy 0, policy_version 74292 (0.0009) +[2023-10-11 18:04:53,653][85176] Updated weights for policy 0, policy_version 74302 (0.0007) +[2023-10-11 18:04:56,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 153288704. Throughput: 0: 1662.7, 1: 1693.7. Samples: 38330084. Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) +[2023-10-11 18:04:56,064][84230] Avg episode reward: [(0, '45.360'), (1, '45.440')] +[2023-10-11 18:04:57,461][85175] Updated weights for policy 1, policy_version 75400 (0.0009) +[2023-10-11 18:04:57,803][85176] Updated weights for policy 0, policy_version 74312 (0.0008) +[2023-10-11 18:04:57,824][85175] Updated weights for policy 1, policy_version 75410 (0.0008) +[2023-10-11 18:04:58,166][85176] Updated weights for policy 0, policy_version 74322 (0.0010) +[2023-10-11 18:04:58,200][85175] Updated weights for policy 1, policy_version 75420 (0.0009) +[2023-10-11 18:04:58,548][85176] Updated weights for policy 0, policy_version 74332 (0.0009) +[2023-10-11 18:05:01,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 153354240. Throughput: 0: 1661.6, 1: 1698.9. Samples: 38350706. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 32.0) +[2023-10-11 18:05:01,064][84230] Avg episode reward: [(0, '45.750'), (1, '42.260')] +[2023-10-11 18:05:02,205][85175] Updated weights for policy 1, policy_version 75430 (0.0007) +[2023-10-11 18:05:02,573][85175] Updated weights for policy 1, policy_version 75440 (0.0009) +[2023-10-11 18:05:02,760][85176] Updated weights for policy 0, policy_version 74342 (0.0009) +[2023-10-11 18:05:02,946][85175] Updated weights for policy 1, policy_version 75450 (0.0007) +[2023-10-11 18:05:03,137][85176] Updated weights for policy 0, policy_version 74352 (0.0009) +[2023-10-11 18:05:03,511][85176] Updated weights for policy 0, policy_version 74362 (0.0008) +[2023-10-11 18:05:06,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 153419776. Throughput: 0: 1655.2, 1: 1678.8. Samples: 38360162. Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) +[2023-10-11 18:05:06,063][84230] Avg episode reward: [(0, '45.300'), (1, '43.870')] +[2023-10-11 18:05:06,825][85175] Updated weights for policy 1, policy_version 75460 (0.0007) +[2023-10-11 18:05:07,194][85175] Updated weights for policy 1, policy_version 75470 (0.0009) +[2023-10-11 18:05:07,406][85176] Updated weights for policy 0, policy_version 74372 (0.0009) +[2023-10-11 18:05:07,561][85175] Updated weights for policy 1, policy_version 75480 (0.0008) +[2023-10-11 18:05:07,772][85176] Updated weights for policy 0, policy_version 74382 (0.0008) +[2023-10-11 18:05:08,137][85176] Updated weights for policy 0, policy_version 74392 (0.0009) +[2023-10-11 18:05:11,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 153485312. Throughput: 0: 1665.0, 1: 1701.1. Samples: 38380954. 
Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) +[2023-10-11 18:05:11,064][84230] Avg episode reward: [(0, '44.310'), (1, '44.300')] +[2023-10-11 18:05:11,518][85175] Updated weights for policy 1, policy_version 75490 (0.0007) +[2023-10-11 18:05:11,887][85175] Updated weights for policy 1, policy_version 75500 (0.0010) +[2023-10-11 18:05:12,270][85175] Updated weights for policy 1, policy_version 75510 (0.0009) +[2023-10-11 18:05:12,377][85176] Updated weights for policy 0, policy_version 74402 (0.0010) +[2023-10-11 18:05:12,643][85175] Updated weights for policy 1, policy_version 75520 (0.0008) +[2023-10-11 18:05:12,753][85176] Updated weights for policy 0, policy_version 74412 (0.0008) +[2023-10-11 18:05:13,123][85176] Updated weights for policy 0, policy_version 74422 (0.0009) +[2023-10-11 18:05:13,490][85176] Updated weights for policy 0, policy_version 74432 (0.0011) +[2023-10-11 18:05:16,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 153550848. Throughput: 0: 1659.9, 1: 1706.9. Samples: 38401920. Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) +[2023-10-11 18:05:16,063][84230] Avg episode reward: [(0, '43.410'), (1, '44.230')] +[2023-10-11 18:05:16,534][85175] Updated weights for policy 1, policy_version 75530 (0.0008) +[2023-10-11 18:05:16,912][85175] Updated weights for policy 1, policy_version 75540 (0.0008) +[2023-10-11 18:05:17,279][85175] Updated weights for policy 1, policy_version 75550 (0.0008) +[2023-10-11 18:05:17,452][85176] Updated weights for policy 0, policy_version 74442 (0.0009) +[2023-10-11 18:05:17,828][85176] Updated weights for policy 0, policy_version 74452 (0.0009) +[2023-10-11 18:05:18,196][85176] Updated weights for policy 0, policy_version 74462 (0.0009) +[2023-10-11 18:05:21,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 153616384. Throughput: 0: 1654.4, 1: 1697.7. Samples: 38411072. 
Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) +[2023-10-11 18:05:21,063][84230] Avg episode reward: [(0, '42.630'), (1, '44.760')] +[2023-10-11 18:05:21,346][85175] Updated weights for policy 1, policy_version 75560 (0.0010) +[2023-10-11 18:05:21,717][85175] Updated weights for policy 1, policy_version 75570 (0.0010) +[2023-10-11 18:05:22,085][85175] Updated weights for policy 1, policy_version 75580 (0.0008) +[2023-10-11 18:05:22,212][85176] Updated weights for policy 0, policy_version 74472 (0.0007) +[2023-10-11 18:05:22,575][85176] Updated weights for policy 0, policy_version 74482 (0.0010) +[2023-10-11 18:05:22,956][85176] Updated weights for policy 0, policy_version 74492 (0.0011) +[2023-10-11 18:05:26,042][85175] Updated weights for policy 1, policy_version 75590 (0.0008) +[2023-10-11 18:05:26,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 153681920. Throughput: 0: 1665.3, 1: 1713.2. Samples: 38431898. Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) +[2023-10-11 18:05:26,064][84230] Avg episode reward: [(0, '42.970'), (1, '43.320')] +[2023-10-11 18:05:26,411][85175] Updated weights for policy 1, policy_version 75600 (0.0007) +[2023-10-11 18:05:26,785][85175] Updated weights for policy 1, policy_version 75610 (0.0008) +[2023-10-11 18:05:27,018][85176] Updated weights for policy 0, policy_version 74502 (0.0010) +[2023-10-11 18:05:27,394][85176] Updated weights for policy 0, policy_version 74512 (0.0008) +[2023-10-11 18:05:27,770][85176] Updated weights for policy 0, policy_version 74522 (0.0007) +[2023-10-11 18:05:30,824][85175] Updated weights for policy 1, policy_version 75620 (0.0008) +[2023-10-11 18:05:31,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 153747456. Throughput: 0: 1667.2, 1: 1709.8. Samples: 38452530. 
Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) +[2023-10-11 18:05:31,064][84230] Avg episode reward: [(0, '43.960'), (1, '43.960')] +[2023-10-11 18:05:31,197][85175] Updated weights for policy 1, policy_version 75630 (0.0008) +[2023-10-11 18:05:31,569][85175] Updated weights for policy 1, policy_version 75640 (0.0008) +[2023-10-11 18:05:31,936][85176] Updated weights for policy 0, policy_version 74532 (0.0007) +[2023-10-11 18:05:32,317][85176] Updated weights for policy 0, policy_version 74542 (0.0007) +[2023-10-11 18:05:32,688][85176] Updated weights for policy 0, policy_version 74552 (0.0008) +[2023-10-11 18:05:35,630][85175] Updated weights for policy 1, policy_version 75650 (0.0007) +[2023-10-11 18:05:35,996][85175] Updated weights for policy 1, policy_version 75660 (0.0009) +[2023-10-11 18:05:36,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 153812992. Throughput: 0: 1670.6, 1: 1712.1. Samples: 38461802. Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) +[2023-10-11 18:05:36,063][84230] Avg episode reward: [(0, '42.530'), (1, '43.380')] +[2023-10-11 18:05:36,368][85175] Updated weights for policy 1, policy_version 75670 (0.0007) +[2023-10-11 18:05:36,741][85175] Updated weights for policy 1, policy_version 75680 (0.0009) +[2023-10-11 18:05:36,782][85176] Updated weights for policy 0, policy_version 74562 (0.0010) +[2023-10-11 18:05:37,185][85176] Updated weights for policy 0, policy_version 74572 (0.0010) +[2023-10-11 18:05:37,564][85176] Updated weights for policy 0, policy_version 74582 (0.0009) +[2023-10-11 18:05:37,934][85176] Updated weights for policy 0, policy_version 74592 (0.0009) +[2023-10-11 18:05:40,667][85175] Updated weights for policy 1, policy_version 75690 (0.0010) +[2023-10-11 18:05:41,043][85175] Updated weights for policy 1, policy_version 75700 (0.0010) +[2023-10-11 18:05:41,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 153878528. 
Throughput: 0: 1671.6, 1: 1713.7. Samples: 38482420. Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) +[2023-10-11 18:05:41,064][84230] Avg episode reward: [(0, '42.710'), (1, '44.230')] +[2023-10-11 18:05:41,414][85175] Updated weights for policy 1, policy_version 75710 (0.0011) +[2023-10-11 18:05:42,045][85176] Updated weights for policy 0, policy_version 74602 (0.0007) +[2023-10-11 18:05:42,409][85176] Updated weights for policy 0, policy_version 74612 (0.0009) +[2023-10-11 18:05:42,785][85176] Updated weights for policy 0, policy_version 74622 (0.0007) +[2023-10-11 18:05:45,334][85175] Updated weights for policy 1, policy_version 75720 (0.0007) +[2023-10-11 18:05:45,703][85175] Updated weights for policy 1, policy_version 75730 (0.0007) +[2023-10-11 18:05:46,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 153944064. Throughput: 0: 1671.7, 1: 1705.4. Samples: 38502674. Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) +[2023-10-11 18:05:46,063][84230] Avg episode reward: [(0, '43.740'), (1, '44.530')] +[2023-10-11 18:05:46,069][85175] Updated weights for policy 1, policy_version 75740 (0.0011) +[2023-10-11 18:05:46,827][85176] Updated weights for policy 0, policy_version 74632 (0.0008) +[2023-10-11 18:05:47,209][85176] Updated weights for policy 0, policy_version 74642 (0.0009) +[2023-10-11 18:05:47,579][85176] Updated weights for policy 0, policy_version 74652 (0.0011) +[2023-10-11 18:05:50,114][85175] Updated weights for policy 1, policy_version 75750 (0.0010) +[2023-10-11 18:05:50,483][85175] Updated weights for policy 1, policy_version 75760 (0.0007) +[2023-10-11 18:05:50,858][85175] Updated weights for policy 1, policy_version 75770 (0.0009) +[2023-10-11 18:05:51,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 154009600. Throughput: 0: 1663.0, 1: 1716.8. Samples: 38512252. 
Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) +[2023-10-11 18:05:51,063][84230] Avg episode reward: [(0, '42.280'), (1, '46.290')] +[2023-10-11 18:05:51,733][85176] Updated weights for policy 0, policy_version 74662 (0.0008) +[2023-10-11 18:05:52,106][85176] Updated weights for policy 0, policy_version 74672 (0.0010) +[2023-10-11 18:05:52,486][85176] Updated weights for policy 0, policy_version 74682 (0.0007) +[2023-10-11 18:05:54,999][85175] Updated weights for policy 1, policy_version 75780 (0.0009) +[2023-10-11 18:05:55,372][85175] Updated weights for policy 1, policy_version 75790 (0.0008) +[2023-10-11 18:05:55,741][85175] Updated weights for policy 1, policy_version 75800 (0.0007) +[2023-10-11 18:05:56,063][84230] Fps is (10 sec: 16383.8, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 154107904. Throughput: 0: 1667.3, 1: 1706.4. Samples: 38532772. Policy #0 lag: (min: 14.0, avg: 14.8, max: 33.0) +[2023-10-11 18:05:56,063][84230] Avg episode reward: [(0, '43.200'), (1, '45.100')] +[2023-10-11 18:05:56,537][85176] Updated weights for policy 0, policy_version 74692 (0.0008) +[2023-10-11 18:05:56,907][85176] Updated weights for policy 0, policy_version 74702 (0.0010) +[2023-10-11 18:05:57,287][85176] Updated weights for policy 0, policy_version 74712 (0.0007) +[2023-10-11 18:05:59,822][85175] Updated weights for policy 1, policy_version 75810 (0.0007) +[2023-10-11 18:06:00,194][85175] Updated weights for policy 1, policy_version 75820 (0.0008) +[2023-10-11 18:06:00,558][85175] Updated weights for policy 1, policy_version 75830 (0.0008) +[2023-10-11 18:06:00,925][85175] Updated weights for policy 1, policy_version 75840 (0.0010) +[2023-10-11 18:06:01,062][84230] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 154173440. Throughput: 0: 1669.3, 1: 1684.1. Samples: 38552824. 
Policy #0 lag: (min: 14.0, avg: 14.8, max: 33.0) +[2023-10-11 18:06:01,063][84230] Avg episode reward: [(0, '43.140'), (1, '47.870')] +[2023-10-11 18:06:01,400][85176] Updated weights for policy 0, policy_version 74722 (0.0009) +[2023-10-11 18:06:01,772][85176] Updated weights for policy 0, policy_version 74732 (0.0008) +[2023-10-11 18:06:02,146][85176] Updated weights for policy 0, policy_version 74742 (0.0007) +[2023-10-11 18:06:02,508][85176] Updated weights for policy 0, policy_version 74752 (0.0007) +[2023-10-11 18:06:04,967][85175] Updated weights for policy 1, policy_version 75850 (0.0007) +[2023-10-11 18:06:05,332][85175] Updated weights for policy 1, policy_version 75860 (0.0007) +[2023-10-11 18:06:05,694][85175] Updated weights for policy 1, policy_version 75870 (0.0007) +[2023-10-11 18:06:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 154238976. Throughput: 0: 1670.6, 1: 1700.0. Samples: 38562746. Policy #0 lag: (min: 14.0, avg: 14.8, max: 33.0) +[2023-10-11 18:06:06,063][84230] Avg episode reward: [(0, '47.210'), (1, '46.700')] +[2023-10-11 18:06:06,638][85176] Updated weights for policy 0, policy_version 74762 (0.0008) +[2023-10-11 18:06:07,004][85176] Updated weights for policy 0, policy_version 74772 (0.0007) +[2023-10-11 18:06:07,369][85176] Updated weights for policy 0, policy_version 74782 (0.0009) +[2023-10-11 18:06:09,598][85175] Updated weights for policy 1, policy_version 75880 (0.0007) +[2023-10-11 18:06:09,961][85175] Updated weights for policy 1, policy_version 75890 (0.0007) +[2023-10-11 18:06:10,326][85175] Updated weights for policy 1, policy_version 75900 (0.0009) +[2023-10-11 18:06:11,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 154304512. Throughput: 0: 1669.1, 1: 1699.6. Samples: 38583486. 
Policy #0 lag: (min: 14.0, avg: 14.8, max: 33.0) +[2023-10-11 18:06:11,063][84230] Avg episode reward: [(0, '42.290'), (1, '46.420')] +[2023-10-11 18:06:11,559][85176] Updated weights for policy 0, policy_version 74792 (0.0008) +[2023-10-11 18:06:11,933][85176] Updated weights for policy 0, policy_version 74802 (0.0007) +[2023-10-11 18:06:12,316][85176] Updated weights for policy 0, policy_version 74812 (0.0009) +[2023-10-11 18:06:14,415][85175] Updated weights for policy 1, policy_version 75910 (0.0008) +[2023-10-11 18:06:14,789][85175] Updated weights for policy 1, policy_version 75920 (0.0008) +[2023-10-11 18:06:15,158][85175] Updated weights for policy 1, policy_version 75930 (0.0008) +[2023-10-11 18:06:16,063][84230] Fps is (10 sec: 13106.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 154370048. Throughput: 0: 1673.7, 1: 1677.1. Samples: 38603318. Policy #0 lag: (min: 14.0, avg: 14.8, max: 33.0) +[2023-10-11 18:06:16,064][84230] Avg episode reward: [(0, '44.930'), (1, '47.350')] +[2023-10-11 18:06:16,075][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000075936_77758464.pth... +[2023-10-11 18:06:16,109][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000074336_76120064.pth +[2023-10-11 18:06:16,226][85176] Updated weights for policy 0, policy_version 74822 (0.0009) +[2023-10-11 18:06:16,604][85176] Updated weights for policy 0, policy_version 74832 (0.0009) +[2023-10-11 18:06:16,974][85176] Updated weights for policy 0, policy_version 74842 (0.0008) +[2023-10-11 18:06:17,194][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000074848_76644352.pth... 
+[2023-10-11 18:06:17,235][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000073280_75038720.pth +[2023-10-11 18:06:19,230][85175] Updated weights for policy 1, policy_version 75940 (0.0008) +[2023-10-11 18:06:19,604][85175] Updated weights for policy 1, policy_version 75950 (0.0007) +[2023-10-11 18:06:19,965][85175] Updated weights for policy 1, policy_version 75960 (0.0007) +[2023-10-11 18:06:21,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 154435584. Throughput: 0: 1671.7, 1: 1705.8. Samples: 38613792. Policy #0 lag: (min: 14.0, avg: 14.8, max: 33.0) +[2023-10-11 18:06:21,063][84230] Avg episode reward: [(0, '41.410'), (1, '44.190')] +[2023-10-11 18:06:21,124][85176] Updated weights for policy 0, policy_version 74852 (0.0008) +[2023-10-11 18:06:21,495][85176] Updated weights for policy 0, policy_version 74862 (0.0008) +[2023-10-11 18:06:21,873][85176] Updated weights for policy 0, policy_version 74872 (0.0009) +[2023-10-11 18:06:23,951][85175] Updated weights for policy 1, policy_version 75970 (0.0008) +[2023-10-11 18:06:24,323][85175] Updated weights for policy 1, policy_version 75980 (0.0008) +[2023-10-11 18:06:24,697][85175] Updated weights for policy 1, policy_version 75990 (0.0007) +[2023-10-11 18:06:25,068][85175] Updated weights for policy 1, policy_version 76000 (0.0007) +[2023-10-11 18:06:25,910][85176] Updated weights for policy 0, policy_version 74882 (0.0007) +[2023-10-11 18:06:26,063][84230] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 154501120. Throughput: 0: 1676.9, 1: 1691.1. Samples: 38633982. 
Policy #0 lag: (min: 14.0, avg: 14.8, max: 33.0) +[2023-10-11 18:06:26,063][84230] Avg episode reward: [(0, '46.260'), (1, '47.930')] +[2023-10-11 18:06:26,282][85176] Updated weights for policy 0, policy_version 74892 (0.0010) +[2023-10-11 18:06:26,658][85176] Updated weights for policy 0, policy_version 74902 (0.0011) +[2023-10-11 18:06:27,036][85176] Updated weights for policy 0, policy_version 74912 (0.0009) +[2023-10-11 18:06:29,164][85175] Updated weights for policy 1, policy_version 76010 (0.0009) +[2023-10-11 18:06:29,533][85175] Updated weights for policy 1, policy_version 76020 (0.0009) +[2023-10-11 18:06:29,907][85175] Updated weights for policy 1, policy_version 76030 (0.0009) +[2023-10-11 18:06:31,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 154566656. Throughput: 0: 1679.9, 1: 1687.2. Samples: 38654196. Policy #0 lag: (min: 14.0, avg: 14.8, max: 33.0) +[2023-10-11 18:06:31,063][84230] Avg episode reward: [(0, '42.750'), (1, '44.820')] +[2023-10-11 18:06:31,082][85176] Updated weights for policy 0, policy_version 74922 (0.0008) +[2023-10-11 18:06:31,460][85176] Updated weights for policy 0, policy_version 74932 (0.0009) +[2023-10-11 18:06:31,835][85176] Updated weights for policy 0, policy_version 74942 (0.0009) +[2023-10-11 18:06:33,873][85175] Updated weights for policy 1, policy_version 76040 (0.0008) +[2023-10-11 18:06:34,249][85175] Updated weights for policy 1, policy_version 76050 (0.0008) +[2023-10-11 18:06:34,611][85175] Updated weights for policy 1, policy_version 76060 (0.0008) +[2023-10-11 18:06:35,998][85176] Updated weights for policy 0, policy_version 74952 (0.0008) +[2023-10-11 18:06:36,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 154632192. Throughput: 0: 1682.1, 1: 1702.5. Samples: 38664560. 
Policy #0 lag: (min: 14.0, avg: 14.8, max: 33.0) +[2023-10-11 18:06:36,064][84230] Avg episode reward: [(0, '45.040'), (1, '46.310')] +[2023-10-11 18:06:36,360][85176] Updated weights for policy 0, policy_version 74962 (0.0008) +[2023-10-11 18:06:36,731][85176] Updated weights for policy 0, policy_version 74972 (0.0009) +[2023-10-11 18:06:38,647][85175] Updated weights for policy 1, policy_version 76070 (0.0008) +[2023-10-11 18:06:39,017][85175] Updated weights for policy 1, policy_version 76080 (0.0008) +[2023-10-11 18:06:39,379][85175] Updated weights for policy 1, policy_version 76090 (0.0007) +[2023-10-11 18:06:40,914][85176] Updated weights for policy 0, policy_version 74982 (0.0008) +[2023-10-11 18:06:41,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 154697728. Throughput: 0: 1678.2, 1: 1683.6. Samples: 38684050. Policy #0 lag: (min: 14.0, avg: 14.8, max: 33.0) +[2023-10-11 18:06:41,063][84230] Avg episode reward: [(0, '41.430'), (1, '43.130')] +[2023-10-11 18:06:41,274][85176] Updated weights for policy 0, policy_version 74992 (0.0008) +[2023-10-11 18:06:41,648][85176] Updated weights for policy 0, policy_version 75002 (0.0007) +[2023-10-11 18:06:43,350][85175] Updated weights for policy 1, policy_version 76100 (0.0008) +[2023-10-11 18:06:43,720][85175] Updated weights for policy 1, policy_version 76110 (0.0008) +[2023-10-11 18:06:44,080][85175] Updated weights for policy 1, policy_version 76120 (0.0009) +[2023-10-11 18:06:45,823][85176] Updated weights for policy 0, policy_version 75012 (0.0007) +[2023-10-11 18:06:46,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 154763264. Throughput: 0: 1676.0, 1: 1702.4. Samples: 38704850. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:06:46,063][84230] Avg episode reward: [(0, '45.530'), (1, '45.690')] +[2023-10-11 18:06:46,189][85176] Updated weights for policy 0, policy_version 75022 (0.0008) +[2023-10-11 18:06:46,573][85176] Updated weights for policy 0, policy_version 75032 (0.0007) +[2023-10-11 18:06:47,996][85175] Updated weights for policy 1, policy_version 76130 (0.0010) +[2023-10-11 18:06:48,375][85175] Updated weights for policy 1, policy_version 76140 (0.0007) +[2023-10-11 18:06:48,741][85175] Updated weights for policy 1, policy_version 76150 (0.0009) +[2023-10-11 18:06:49,108][85175] Updated weights for policy 1, policy_version 76160 (0.0008) +[2023-10-11 18:06:50,568][85176] Updated weights for policy 0, policy_version 75042 (0.0009) +[2023-10-11 18:06:50,943][85176] Updated weights for policy 0, policy_version 75052 (0.0009) +[2023-10-11 18:06:51,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 154828800. Throughput: 0: 1675.7, 1: 1700.7. Samples: 38714686. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:06:51,064][84230] Avg episode reward: [(0, '43.170'), (1, '41.540')] +[2023-10-11 18:06:51,319][85176] Updated weights for policy 0, policy_version 75062 (0.0008) +[2023-10-11 18:06:51,692][85176] Updated weights for policy 0, policy_version 75072 (0.0007) +[2023-10-11 18:06:53,005][85175] Updated weights for policy 1, policy_version 76170 (0.0011) +[2023-10-11 18:06:53,368][85175] Updated weights for policy 1, policy_version 76180 (0.0010) +[2023-10-11 18:06:53,745][85175] Updated weights for policy 1, policy_version 76190 (0.0010) +[2023-10-11 18:06:55,633][85176] Updated weights for policy 0, policy_version 75082 (0.0010) +[2023-10-11 18:06:56,001][85176] Updated weights for policy 0, policy_version 75092 (0.0008) +[2023-10-11 18:06:56,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 154894336. 
Throughput: 0: 1680.6, 1: 1685.9. Samples: 38734978. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:06:56,063][84230] Avg episode reward: [(0, '44.540'), (1, '47.430')] +[2023-10-11 18:06:56,377][85176] Updated weights for policy 0, policy_version 75102 (0.0007) +[2023-10-11 18:06:57,786][85175] Updated weights for policy 1, policy_version 76200 (0.0008) +[2023-10-11 18:06:58,157][85175] Updated weights for policy 1, policy_version 76210 (0.0007) +[2023-10-11 18:06:58,522][85175] Updated weights for policy 1, policy_version 76220 (0.0007) +[2023-10-11 18:07:00,308][85176] Updated weights for policy 0, policy_version 75112 (0.0007) +[2023-10-11 18:07:00,671][85176] Updated weights for policy 0, policy_version 75122 (0.0009) +[2023-10-11 18:07:01,035][85176] Updated weights for policy 0, policy_version 75132 (0.0010) +[2023-10-11 18:07:01,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 154959872. Throughput: 0: 1664.8, 1: 1717.0. Samples: 38755496. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:07:01,064][84230] Avg episode reward: [(0, '44.270'), (1, '44.340')] +[2023-10-11 18:07:02,407][85175] Updated weights for policy 1, policy_version 76230 (0.0009) +[2023-10-11 18:07:02,784][85175] Updated weights for policy 1, policy_version 76240 (0.0008) +[2023-10-11 18:07:03,158][85175] Updated weights for policy 1, policy_version 76250 (0.0008) +[2023-10-11 18:07:05,157][85176] Updated weights for policy 0, policy_version 75142 (0.0008) +[2023-10-11 18:07:05,530][85176] Updated weights for policy 0, policy_version 75152 (0.0009) +[2023-10-11 18:07:05,906][85176] Updated weights for policy 0, policy_version 75162 (0.0009) +[2023-10-11 18:07:06,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 155025408. Throughput: 0: 1675.2, 1: 1688.6. Samples: 38765164. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:07:06,064][84230] Avg episode reward: [(0, '46.230'), (1, '48.100')] +[2023-10-11 18:07:06,962][85175] Updated weights for policy 1, policy_version 76260 (0.0008) +[2023-10-11 18:07:07,329][85175] Updated weights for policy 1, policy_version 76270 (0.0008) +[2023-10-11 18:07:07,699][85175] Updated weights for policy 1, policy_version 76280 (0.0010) +[2023-10-11 18:07:09,979][85176] Updated weights for policy 0, policy_version 75172 (0.0008) +[2023-10-11 18:07:10,344][85176] Updated weights for policy 0, policy_version 75182 (0.0009) +[2023-10-11 18:07:10,722][85176] Updated weights for policy 0, policy_version 75192 (0.0010) +[2023-10-11 18:07:11,063][84230] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 155123712. Throughput: 0: 1675.7, 1: 1701.6. Samples: 38785964. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:07:11,064][84230] Avg episode reward: [(0, '42.960'), (1, '42.610')] +[2023-10-11 18:07:11,752][85175] Updated weights for policy 1, policy_version 76290 (0.0008) +[2023-10-11 18:07:12,118][85175] Updated weights for policy 1, policy_version 76300 (0.0007) +[2023-10-11 18:07:12,501][85175] Updated weights for policy 1, policy_version 76310 (0.0009) +[2023-10-11 18:07:12,876][85175] Updated weights for policy 1, policy_version 76320 (0.0008) +[2023-10-11 18:07:14,840][85176] Updated weights for policy 0, policy_version 75202 (0.0008) +[2023-10-11 18:07:15,234][85176] Updated weights for policy 0, policy_version 75212 (0.0007) +[2023-10-11 18:07:15,599][85176] Updated weights for policy 0, policy_version 75222 (0.0008) +[2023-10-11 18:07:15,972][85176] Updated weights for policy 0, policy_version 75232 (0.0007) +[2023-10-11 18:07:16,062][84230] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 155189248. Throughput: 0: 1654.0, 1: 1725.1. Samples: 38806254. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:07:16,063][84230] Avg episode reward: [(0, '43.640'), (1, '47.630')] +[2023-10-11 18:07:16,662][85175] Updated weights for policy 1, policy_version 76330 (0.0010) +[2023-10-11 18:07:17,028][85175] Updated weights for policy 1, policy_version 76340 (0.0009) +[2023-10-11 18:07:17,400][85175] Updated weights for policy 1, policy_version 76350 (0.0010) +[2023-10-11 18:07:20,120][85176] Updated weights for policy 0, policy_version 75242 (0.0008) +[2023-10-11 18:07:20,498][85176] Updated weights for policy 0, policy_version 75252 (0.0010) +[2023-10-11 18:07:20,868][85176] Updated weights for policy 0, policy_version 75262 (0.0008) +[2023-10-11 18:07:21,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 155254784. Throughput: 0: 1672.8, 1: 1697.1. Samples: 38816208. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:07:21,063][84230] Avg episode reward: [(0, '43.070'), (1, '44.100')] +[2023-10-11 18:07:21,436][85175] Updated weights for policy 1, policy_version 76360 (0.0009) +[2023-10-11 18:07:21,802][85175] Updated weights for policy 1, policy_version 76370 (0.0009) +[2023-10-11 18:07:22,165][85175] Updated weights for policy 1, policy_version 76380 (0.0009) +[2023-10-11 18:07:24,980][85176] Updated weights for policy 0, policy_version 75272 (0.0008) +[2023-10-11 18:07:25,354][85176] Updated weights for policy 0, policy_version 75282 (0.0007) +[2023-10-11 18:07:25,731][85176] Updated weights for policy 0, policy_version 75292 (0.0007) +[2023-10-11 18:07:26,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 155320320. Throughput: 0: 1677.9, 1: 1721.8. Samples: 38837040. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:07:26,064][84230] Avg episode reward: [(0, '43.300'), (1, '47.520')] +[2023-10-11 18:07:26,093][85175] Updated weights for policy 1, policy_version 76390 (0.0009) +[2023-10-11 18:07:26,462][85175] Updated weights for policy 1, policy_version 76400 (0.0007) +[2023-10-11 18:07:26,828][85175] Updated weights for policy 1, policy_version 76410 (0.0009) +[2023-10-11 18:07:29,893][85176] Updated weights for policy 0, policy_version 75302 (0.0007) +[2023-10-11 18:07:30,269][85176] Updated weights for policy 0, policy_version 75312 (0.0009) +[2023-10-11 18:07:30,647][85176] Updated weights for policy 0, policy_version 75322 (0.0011) +[2023-10-11 18:07:30,956][85175] Updated weights for policy 1, policy_version 76420 (0.0008) +[2023-10-11 18:07:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 155385856. Throughput: 0: 1658.7, 1: 1723.0. Samples: 38857024. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-11 18:07:31,063][84230] Avg episode reward: [(0, '44.390'), (1, '44.520')] +[2023-10-11 18:07:31,332][85175] Updated weights for policy 1, policy_version 76430 (0.0008) +[2023-10-11 18:07:31,704][85175] Updated weights for policy 1, policy_version 76440 (0.0007) +[2023-10-11 18:07:34,660][85176] Updated weights for policy 0, policy_version 75332 (0.0009) +[2023-10-11 18:07:35,032][85176] Updated weights for policy 0, policy_version 75342 (0.0010) +[2023-10-11 18:07:35,397][85176] Updated weights for policy 0, policy_version 75352 (0.0009) +[2023-10-11 18:07:35,920][85175] Updated weights for policy 1, policy_version 76450 (0.0009) +[2023-10-11 18:07:36,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 155451392. Throughput: 0: 1680.1, 1: 1708.8. Samples: 38867186. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-11 18:07:36,064][84230] Avg episode reward: [(0, '44.920'), (1, '46.500')] +[2023-10-11 18:07:36,285][85175] Updated weights for policy 1, policy_version 76460 (0.0011) +[2023-10-11 18:07:36,661][85175] Updated weights for policy 1, policy_version 76470 (0.0009) +[2023-10-11 18:07:37,029][85175] Updated weights for policy 1, policy_version 76480 (0.0008) +[2023-10-11 18:07:39,340][85176] Updated weights for policy 0, policy_version 75362 (0.0008) +[2023-10-11 18:07:39,718][85176] Updated weights for policy 0, policy_version 75372 (0.0007) +[2023-10-11 18:07:40,091][85176] Updated weights for policy 0, policy_version 75382 (0.0009) +[2023-10-11 18:07:40,464][85176] Updated weights for policy 0, policy_version 75392 (0.0011) +[2023-10-11 18:07:41,016][85175] Updated weights for policy 1, policy_version 76490 (0.0007) +[2023-10-11 18:07:41,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 155516928. Throughput: 0: 1671.5, 1: 1722.8. Samples: 38887722. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-11 18:07:41,064][84230] Avg episode reward: [(0, '42.810'), (1, '44.710')] +[2023-10-11 18:07:41,390][85175] Updated weights for policy 1, policy_version 76500 (0.0007) +[2023-10-11 18:07:41,752][85175] Updated weights for policy 1, policy_version 76510 (0.0009) +[2023-10-11 18:07:44,530][85176] Updated weights for policy 0, policy_version 75402 (0.0010) +[2023-10-11 18:07:44,902][85176] Updated weights for policy 0, policy_version 75412 (0.0007) +[2023-10-11 18:07:45,269][85176] Updated weights for policy 0, policy_version 75422 (0.0010) +[2023-10-11 18:07:45,721][85175] Updated weights for policy 1, policy_version 76520 (0.0008) +[2023-10-11 18:07:46,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 155582464. Throughput: 0: 1661.2, 1: 1721.9. Samples: 38907732. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-11 18:07:46,063][84230] Avg episode reward: [(0, '45.840'), (1, '42.990')] +[2023-10-11 18:07:46,082][85175] Updated weights for policy 1, policy_version 76530 (0.0007) +[2023-10-11 18:07:46,454][85175] Updated weights for policy 1, policy_version 76540 (0.0007) +[2023-10-11 18:07:49,367][85176] Updated weights for policy 0, policy_version 75432 (0.0009) +[2023-10-11 18:07:49,729][85176] Updated weights for policy 0, policy_version 75442 (0.0008) +[2023-10-11 18:07:50,116][85176] Updated weights for policy 0, policy_version 75452 (0.0007) +[2023-10-11 18:07:50,414][85175] Updated weights for policy 1, policy_version 76550 (0.0007) +[2023-10-11 18:07:50,780][85175] Updated weights for policy 1, policy_version 76560 (0.0008) +[2023-10-11 18:07:51,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 155648000. Throughput: 0: 1677.3, 1: 1722.1. Samples: 38918140. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-11 18:07:51,064][84230] Avg episode reward: [(0, '44.840'), (1, '45.410')] +[2023-10-11 18:07:51,159][85175] Updated weights for policy 1, policy_version 76570 (0.0007) +[2023-10-11 18:07:54,213][85176] Updated weights for policy 0, policy_version 75462 (0.0007) +[2023-10-11 18:07:54,590][85176] Updated weights for policy 0, policy_version 75472 (0.0007) +[2023-10-11 18:07:54,966][85176] Updated weights for policy 0, policy_version 75482 (0.0009) +[2023-10-11 18:07:55,013][85175] Updated weights for policy 1, policy_version 76580 (0.0009) +[2023-10-11 18:07:55,384][85175] Updated weights for policy 1, policy_version 76590 (0.0010) +[2023-10-11 18:07:55,746][85175] Updated weights for policy 1, policy_version 76600 (0.0010) +[2023-10-11 18:07:56,062][84230] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 155746304. Throughput: 0: 1666.7, 1: 1725.1. Samples: 38938592. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-11 18:07:56,063][84230] Avg episode reward: [(0, '44.590'), (1, '44.700')] +[2023-10-11 18:07:59,002][85176] Updated weights for policy 0, policy_version 75492 (0.0010) +[2023-10-11 18:07:59,379][85176] Updated weights for policy 0, policy_version 75502 (0.0011) +[2023-10-11 18:07:59,755][85176] Updated weights for policy 0, policy_version 75512 (0.0009) +[2023-10-11 18:07:59,787][85175] Updated weights for policy 1, policy_version 76610 (0.0009) +[2023-10-11 18:08:00,148][85175] Updated weights for policy 1, policy_version 76620 (0.0008) +[2023-10-11 18:08:00,512][85175] Updated weights for policy 1, policy_version 76630 (0.0009) +[2023-10-11 18:08:00,889][85175] Updated weights for policy 1, policy_version 76640 (0.0010) +[2023-10-11 18:08:01,063][84230] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 155811840. Throughput: 0: 1672.2, 1: 1699.7. Samples: 38957992. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-11 18:08:01,064][84230] Avg episode reward: [(0, '45.140'), (1, '46.970')] +[2023-10-11 18:08:03,896][85176] Updated weights for policy 0, policy_version 75522 (0.0008) +[2023-10-11 18:08:04,302][85176] Updated weights for policy 0, policy_version 75532 (0.0008) +[2023-10-11 18:08:04,679][85176] Updated weights for policy 0, policy_version 75542 (0.0007) +[2023-10-11 18:08:04,877][85175] Updated weights for policy 1, policy_version 76650 (0.0007) +[2023-10-11 18:08:05,047][85176] Updated weights for policy 0, policy_version 75552 (0.0007) +[2023-10-11 18:08:05,236][85175] Updated weights for policy 1, policy_version 76660 (0.0009) +[2023-10-11 18:08:05,600][85175] Updated weights for policy 1, policy_version 76670 (0.0009) +[2023-10-11 18:08:06,063][84230] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 155877376. Throughput: 0: 1680.5, 1: 1720.5. Samples: 38969254. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-11 18:08:06,063][84230] Avg episode reward: [(0, '44.610'), (1, '44.640')] +[2023-10-11 18:08:08,965][85176] Updated weights for policy 0, policy_version 75562 (0.0007) +[2023-10-11 18:08:09,346][85176] Updated weights for policy 0, policy_version 75572 (0.0008) +[2023-10-11 18:08:09,638][85175] Updated weights for policy 1, policy_version 76680 (0.0010) +[2023-10-11 18:08:09,719][85176] Updated weights for policy 0, policy_version 75582 (0.0008) +[2023-10-11 18:08:10,012][85175] Updated weights for policy 1, policy_version 76690 (0.0008) +[2023-10-11 18:08:10,385][85175] Updated weights for policy 1, policy_version 76700 (0.0009) +[2023-10-11 18:08:11,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 155942912. Throughput: 0: 1660.3, 1: 1711.6. Samples: 38988774. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-11 18:08:11,064][84230] Avg episode reward: [(0, '47.400'), (1, '46.530')] +[2023-10-11 18:08:14,112][85176] Updated weights for policy 0, policy_version 75592 (0.0010) +[2023-10-11 18:08:14,287][85175] Updated weights for policy 1, policy_version 76710 (0.0008) +[2023-10-11 18:08:14,487][85176] Updated weights for policy 0, policy_version 75602 (0.0008) +[2023-10-11 18:08:14,652][85175] Updated weights for policy 1, policy_version 76720 (0.0007) +[2023-10-11 18:08:14,854][85176] Updated weights for policy 0, policy_version 75612 (0.0010) +[2023-10-11 18:08:15,017][85175] Updated weights for policy 1, policy_version 76730 (0.0007) +[2023-10-11 18:08:16,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 156008448. Throughput: 0: 1673.9, 1: 1689.9. Samples: 39008396. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) +[2023-10-11 18:08:16,063][84230] Avg episode reward: [(0, '44.550'), (1, '45.710')] +[2023-10-11 18:08:16,072][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000075616_77430784.pth... +[2023-10-11 18:08:16,073][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000076736_78577664.pth... +[2023-10-11 18:08:16,103][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000074048_75825152.pth +[2023-10-11 18:08:16,112][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000075136_76939264.pth +[2023-10-11 18:08:18,805][85176] Updated weights for policy 0, policy_version 75622 (0.0009) +[2023-10-11 18:08:19,045][85175] Updated weights for policy 1, policy_version 76740 (0.0008) +[2023-10-11 18:08:19,180][85176] Updated weights for policy 0, policy_version 75632 (0.0007) +[2023-10-11 18:08:19,413][85175] Updated weights for policy 1, policy_version 76750 (0.0007) +[2023-10-11 18:08:19,554][85176] Updated weights for policy 0, policy_version 75642 (0.0010) +[2023-10-11 18:08:19,775][85175] Updated weights for policy 1, policy_version 76760 (0.0009) +[2023-10-11 18:08:21,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 156073984. Throughput: 0: 1680.5, 1: 1721.8. Samples: 39020290. 
Policy #0 lag: (min: 19.0, avg: 25.4, max: 51.0) +[2023-10-11 18:08:21,063][84230] Avg episode reward: [(0, '44.560'), (1, '45.990')] +[2023-10-11 18:08:23,566][85176] Updated weights for policy 0, policy_version 75652 (0.0009) +[2023-10-11 18:08:23,830][85175] Updated weights for policy 1, policy_version 76770 (0.0008) +[2023-10-11 18:08:23,947][85176] Updated weights for policy 0, policy_version 75662 (0.0007) +[2023-10-11 18:08:24,192][85175] Updated weights for policy 1, policy_version 76780 (0.0007) +[2023-10-11 18:08:24,319][85176] Updated weights for policy 0, policy_version 75672 (0.0007) +[2023-10-11 18:08:24,557][85175] Updated weights for policy 1, policy_version 76790 (0.0007) +[2023-10-11 18:08:24,926][85175] Updated weights for policy 1, policy_version 76800 (0.0007) +[2023-10-11 18:08:26,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 156139520. Throughput: 0: 1656.1, 1: 1706.2. Samples: 39039026. Policy #0 lag: (min: 19.0, avg: 25.4, max: 51.0) +[2023-10-11 18:08:26,063][84230] Avg episode reward: [(0, '43.750'), (1, '46.490')] +[2023-10-11 18:08:28,554][85176] Updated weights for policy 0, policy_version 75682 (0.0008) +[2023-10-11 18:08:28,710][85175] Updated weights for policy 1, policy_version 76810 (0.0007) +[2023-10-11 18:08:28,925][85176] Updated weights for policy 0, policy_version 75692 (0.0010) +[2023-10-11 18:08:29,083][85175] Updated weights for policy 1, policy_version 76820 (0.0009) +[2023-10-11 18:08:29,300][85176] Updated weights for policy 0, policy_version 75702 (0.0008) +[2023-10-11 18:08:29,444][85175] Updated weights for policy 1, policy_version 76830 (0.0008) +[2023-10-11 18:08:29,663][85176] Updated weights for policy 0, policy_version 75712 (0.0008) +[2023-10-11 18:08:31,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 156205056. Throughput: 0: 1672.6, 1: 1697.5. Samples: 39059386. 
Policy #0 lag: (min: 19.0, avg: 25.4, max: 51.0) +[2023-10-11 18:08:31,063][84230] Avg episode reward: [(0, '44.770'), (1, '45.870')] +[2023-10-11 18:08:33,443][85175] Updated weights for policy 1, policy_version 76840 (0.0008) +[2023-10-11 18:08:33,780][85176] Updated weights for policy 0, policy_version 75722 (0.0007) +[2023-10-11 18:08:33,814][85175] Updated weights for policy 1, policy_version 76850 (0.0009) +[2023-10-11 18:08:34,155][85176] Updated weights for policy 0, policy_version 75732 (0.0009) +[2023-10-11 18:08:34,176][85175] Updated weights for policy 1, policy_version 76860 (0.0008) +[2023-10-11 18:08:34,528][85176] Updated weights for policy 0, policy_version 75742 (0.0007) +[2023-10-11 18:08:36,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 156270592. Throughput: 0: 1670.7, 1: 1719.8. Samples: 39070712. Policy #0 lag: (min: 19.0, avg: 25.4, max: 51.0) +[2023-10-11 18:08:36,063][84230] Avg episode reward: [(0, '43.100'), (1, '46.230')] +[2023-10-11 18:08:38,119][85175] Updated weights for policy 1, policy_version 76870 (0.0008) +[2023-10-11 18:08:38,490][85176] Updated weights for policy 0, policy_version 75752 (0.0008) +[2023-10-11 18:08:38,500][85175] Updated weights for policy 1, policy_version 76880 (0.0008) +[2023-10-11 18:08:38,862][85175] Updated weights for policy 1, policy_version 76890 (0.0009) +[2023-10-11 18:08:38,866][85176] Updated weights for policy 0, policy_version 75762 (0.0008) +[2023-10-11 18:08:39,234][85176] Updated weights for policy 0, policy_version 75772 (0.0007) +[2023-10-11 18:08:41,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 156336128. Throughput: 0: 1656.6, 1: 1693.8. Samples: 39089360. 
Policy #0 lag: (min: 19.0, avg: 25.4, max: 51.0) +[2023-10-11 18:08:41,064][84230] Avg episode reward: [(0, '42.610'), (1, '46.060')] +[2023-10-11 18:08:42,853][85175] Updated weights for policy 1, policy_version 76900 (0.0009) +[2023-10-11 18:08:43,218][85175] Updated weights for policy 1, policy_version 76910 (0.0007) +[2023-10-11 18:08:43,226][85176] Updated weights for policy 0, policy_version 75782 (0.0007) +[2023-10-11 18:08:43,585][85175] Updated weights for policy 1, policy_version 76920 (0.0007) +[2023-10-11 18:08:43,595][85176] Updated weights for policy 0, policy_version 75792 (0.0009) +[2023-10-11 18:08:43,979][85176] Updated weights for policy 0, policy_version 75802 (0.0008) +[2023-10-11 18:08:46,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 156401664. Throughput: 0: 1671.7, 1: 1715.8. Samples: 39110426. Policy #0 lag: (min: 19.0, avg: 25.4, max: 51.0) +[2023-10-11 18:08:46,063][84230] Avg episode reward: [(0, '41.590'), (1, '44.910')] +[2023-10-11 18:08:47,673][85175] Updated weights for policy 1, policy_version 76930 (0.0007) +[2023-10-11 18:08:47,893][85176] Updated weights for policy 0, policy_version 75812 (0.0009) +[2023-10-11 18:08:48,046][85175] Updated weights for policy 1, policy_version 76940 (0.0008) +[2023-10-11 18:08:48,269][85176] Updated weights for policy 0, policy_version 75822 (0.0009) +[2023-10-11 18:08:48,419][85175] Updated weights for policy 1, policy_version 76950 (0.0008) +[2023-10-11 18:08:48,645][85176] Updated weights for policy 0, policy_version 75832 (0.0009) +[2023-10-11 18:08:48,779][85175] Updated weights for policy 1, policy_version 76960 (0.0009) +[2023-10-11 18:08:51,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 156467200. Throughput: 0: 1656.8, 1: 1700.7. Samples: 39120340. 
Policy #0 lag: (min: 19.0, avg: 25.4, max: 51.0) +[2023-10-11 18:08:51,063][84230] Avg episode reward: [(0, '42.060'), (1, '44.820')] +[2023-10-11 18:08:52,764][85176] Updated weights for policy 0, policy_version 75842 (0.0009) +[2023-10-11 18:08:52,802][85175] Updated weights for policy 1, policy_version 76970 (0.0009) +[2023-10-11 18:08:53,141][85176] Updated weights for policy 0, policy_version 75852 (0.0007) +[2023-10-11 18:08:53,163][85175] Updated weights for policy 1, policy_version 76980 (0.0007) +[2023-10-11 18:08:53,521][85176] Updated weights for policy 0, policy_version 75862 (0.0008) +[2023-10-11 18:08:53,531][85175] Updated weights for policy 1, policy_version 76990 (0.0010) +[2023-10-11 18:08:53,892][85176] Updated weights for policy 0, policy_version 75872 (0.0008) +[2023-10-11 18:08:56,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 156532736. Throughput: 0: 1663.6, 1: 1697.4. Samples: 39140018. Policy #0 lag: (min: 19.0, avg: 25.4, max: 51.0) +[2023-10-11 18:08:56,063][84230] Avg episode reward: [(0, '42.460'), (1, '43.290')] +[2023-10-11 18:08:57,710][85175] Updated weights for policy 1, policy_version 77000 (0.0010) +[2023-10-11 18:08:57,993][85176] Updated weights for policy 0, policy_version 75882 (0.0007) +[2023-10-11 18:08:58,082][85175] Updated weights for policy 1, policy_version 77010 (0.0007) +[2023-10-11 18:08:58,363][85176] Updated weights for policy 0, policy_version 75892 (0.0007) +[2023-10-11 18:08:58,451][85175] Updated weights for policy 1, policy_version 77020 (0.0007) +[2023-10-11 18:08:58,736][85176] Updated weights for policy 0, policy_version 75902 (0.0007) +[2023-10-11 18:09:01,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 156598272. Throughput: 0: 1670.7, 1: 1712.8. Samples: 39160652. 
Policy #0 lag: (min: 19.0, avg: 25.4, max: 51.0) +[2023-10-11 18:09:01,064][84230] Avg episode reward: [(0, '42.800'), (1, '44.160')] +[2023-10-11 18:09:02,380][85175] Updated weights for policy 1, policy_version 77030 (0.0009) +[2023-10-11 18:09:02,749][85175] Updated weights for policy 1, policy_version 77040 (0.0008) +[2023-10-11 18:09:02,842][85176] Updated weights for policy 0, policy_version 75912 (0.0007) +[2023-10-11 18:09:03,120][85175] Updated weights for policy 1, policy_version 77050 (0.0009) +[2023-10-11 18:09:03,220][85176] Updated weights for policy 0, policy_version 75922 (0.0007) +[2023-10-11 18:09:03,599][85176] Updated weights for policy 0, policy_version 75932 (0.0009) +[2023-10-11 18:09:06,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 156663808. Throughput: 0: 1647.6, 1: 1677.3. Samples: 39169908. Policy #0 lag: (min: 19.0, avg: 25.4, max: 51.0) +[2023-10-11 18:09:06,063][84230] Avg episode reward: [(0, '41.670'), (1, '44.370')] +[2023-10-11 18:09:07,070][85175] Updated weights for policy 1, policy_version 77060 (0.0007) +[2023-10-11 18:09:07,437][85175] Updated weights for policy 1, policy_version 77070 (0.0007) +[2023-10-11 18:09:07,794][85175] Updated weights for policy 1, policy_version 77080 (0.0007) +[2023-10-11 18:09:07,826][85176] Updated weights for policy 0, policy_version 75942 (0.0007) +[2023-10-11 18:09:08,205][85176] Updated weights for policy 0, policy_version 75952 (0.0007) +[2023-10-11 18:09:08,594][85176] Updated weights for policy 0, policy_version 75962 (0.0008) +[2023-10-11 18:09:11,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 156729344. Throughput: 0: 1670.2, 1: 1695.4. Samples: 39190476. 
Policy #0 lag: (min: 24.0, avg: 46.0, max: 56.0) +[2023-10-11 18:09:11,064][84230] Avg episode reward: [(0, '40.430'), (1, '46.910')] +[2023-10-11 18:09:11,774][85175] Updated weights for policy 1, policy_version 77090 (0.0008) +[2023-10-11 18:09:12,139][85175] Updated weights for policy 1, policy_version 77100 (0.0010) +[2023-10-11 18:09:12,504][85175] Updated weights for policy 1, policy_version 77110 (0.0011) +[2023-10-11 18:09:12,696][85176] Updated weights for policy 0, policy_version 75972 (0.0008) +[2023-10-11 18:09:12,878][85175] Updated weights for policy 1, policy_version 77120 (0.0009) +[2023-10-11 18:09:13,067][85176] Updated weights for policy 0, policy_version 75982 (0.0009) +[2023-10-11 18:09:13,451][85176] Updated weights for policy 0, policy_version 75992 (0.0009) +[2023-10-11 18:09:16,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 156794880. Throughput: 0: 1674.3, 1: 1701.6. Samples: 39211304. Policy #0 lag: (min: 24.0, avg: 46.0, max: 56.0) +[2023-10-11 18:09:16,064][84230] Avg episode reward: [(0, '41.000'), (1, '46.950')] +[2023-10-11 18:09:16,953][85175] Updated weights for policy 1, policy_version 77130 (0.0010) +[2023-10-11 18:09:17,315][85175] Updated weights for policy 1, policy_version 77140 (0.0008) +[2023-10-11 18:09:17,583][85176] Updated weights for policy 0, policy_version 76002 (0.0010) +[2023-10-11 18:09:17,681][85175] Updated weights for policy 1, policy_version 77150 (0.0008) +[2023-10-11 18:09:17,959][85176] Updated weights for policy 0, policy_version 76012 (0.0008) +[2023-10-11 18:09:18,341][85176] Updated weights for policy 0, policy_version 76022 (0.0007) +[2023-10-11 18:09:18,722][85176] Updated weights for policy 0, policy_version 76032 (0.0007) +[2023-10-11 18:09:21,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 156860416. Throughput: 0: 1652.4, 1: 1679.9. Samples: 39220666. 
Policy #0 lag: (min: 24.0, avg: 46.0, max: 56.0) +[2023-10-11 18:09:21,064][84230] Avg episode reward: [(0, '43.610'), (1, '46.850')] +[2023-10-11 18:09:21,803][85175] Updated weights for policy 1, policy_version 77160 (0.0007) +[2023-10-11 18:09:22,170][85175] Updated weights for policy 1, policy_version 77170 (0.0007) +[2023-10-11 18:09:22,541][85175] Updated weights for policy 1, policy_version 77180 (0.0008) +[2023-10-11 18:09:22,752][85176] Updated weights for policy 0, policy_version 76042 (0.0010) +[2023-10-11 18:09:23,131][85176] Updated weights for policy 0, policy_version 76052 (0.0008) +[2023-10-11 18:09:23,506][85176] Updated weights for policy 0, policy_version 76062 (0.0008) +[2023-10-11 18:09:26,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 156925952. Throughput: 0: 1675.1, 1: 1704.0. Samples: 39241418. Policy #0 lag: (min: 24.0, avg: 46.0, max: 56.0) +[2023-10-11 18:09:26,064][84230] Avg episode reward: [(0, '43.160'), (1, '44.650')] +[2023-10-11 18:09:26,431][85175] Updated weights for policy 1, policy_version 77190 (0.0008) +[2023-10-11 18:09:26,807][85175] Updated weights for policy 1, policy_version 77200 (0.0011) +[2023-10-11 18:09:27,179][85175] Updated weights for policy 1, policy_version 77210 (0.0011) +[2023-10-11 18:09:27,568][85176] Updated weights for policy 0, policy_version 76072 (0.0009) +[2023-10-11 18:09:27,944][85176] Updated weights for policy 0, policy_version 76082 (0.0009) +[2023-10-11 18:09:28,318][85176] Updated weights for policy 0, policy_version 76092 (0.0007) +[2023-10-11 18:09:31,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 156991488. Throughput: 0: 1671.2, 1: 1702.3. Samples: 39262236. 
Policy #0 lag: (min: 24.0, avg: 46.0, max: 56.0) +[2023-10-11 18:09:31,063][84230] Avg episode reward: [(0, '44.950'), (1, '45.450')] +[2023-10-11 18:09:31,172][85175] Updated weights for policy 1, policy_version 77220 (0.0009) +[2023-10-11 18:09:31,542][85175] Updated weights for policy 1, policy_version 77230 (0.0008) +[2023-10-11 18:09:31,900][85175] Updated weights for policy 1, policy_version 77240 (0.0010) +[2023-10-11 18:09:32,304][85176] Updated weights for policy 0, policy_version 76102 (0.0008) +[2023-10-11 18:09:32,678][85176] Updated weights for policy 0, policy_version 76112 (0.0011) +[2023-10-11 18:09:33,051][85176] Updated weights for policy 0, policy_version 76122 (0.0009) +[2023-10-11 18:09:35,885][85175] Updated weights for policy 1, policy_version 77250 (0.0009) +[2023-10-11 18:09:36,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 157057024. Throughput: 0: 1661.2, 1: 1699.1. Samples: 39271552. Policy #0 lag: (min: 24.0, avg: 46.0, max: 56.0) +[2023-10-11 18:09:36,063][84230] Avg episode reward: [(0, '41.620'), (1, '44.980')] +[2023-10-11 18:09:36,250][85175] Updated weights for policy 1, policy_version 77260 (0.0008) +[2023-10-11 18:09:36,612][85175] Updated weights for policy 1, policy_version 77270 (0.0008) +[2023-10-11 18:09:36,981][85175] Updated weights for policy 1, policy_version 77280 (0.0010) +[2023-10-11 18:09:37,072][85176] Updated weights for policy 0, policy_version 76132 (0.0010) +[2023-10-11 18:09:37,442][85176] Updated weights for policy 0, policy_version 76142 (0.0007) +[2023-10-11 18:09:37,811][85176] Updated weights for policy 0, policy_version 76152 (0.0007) +[2023-10-11 18:09:40,798][85175] Updated weights for policy 1, policy_version 77290 (0.0008) +[2023-10-11 18:09:41,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 157122560. Throughput: 0: 1672.8, 1: 1715.2. Samples: 39292480. 
Policy #0 lag: (min: 24.0, avg: 46.0, max: 56.0) +[2023-10-11 18:09:41,063][84230] Avg episode reward: [(0, '44.820'), (1, '45.860')] +[2023-10-11 18:09:41,170][85175] Updated weights for policy 1, policy_version 77300 (0.0007) +[2023-10-11 18:09:41,551][85175] Updated weights for policy 1, policy_version 77310 (0.0007) +[2023-10-11 18:09:42,085][85176] Updated weights for policy 0, policy_version 76162 (0.0009) +[2023-10-11 18:09:42,496][85176] Updated weights for policy 0, policy_version 76172 (0.0008) +[2023-10-11 18:09:42,884][85176] Updated weights for policy 0, policy_version 76182 (0.0009) +[2023-10-11 18:09:43,263][85176] Updated weights for policy 0, policy_version 76192 (0.0009) +[2023-10-11 18:09:45,597][85175] Updated weights for policy 1, policy_version 77320 (0.0008) +[2023-10-11 18:09:45,971][85175] Updated weights for policy 1, policy_version 77330 (0.0007) +[2023-10-11 18:09:46,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 157188096. Throughput: 0: 1664.2, 1: 1716.4. Samples: 39312778. Policy #0 lag: (min: 24.0, avg: 46.0, max: 56.0) +[2023-10-11 18:09:46,063][84230] Avg episode reward: [(0, '44.040'), (1, '43.570')] +[2023-10-11 18:09:46,349][85175] Updated weights for policy 1, policy_version 77340 (0.0007) +[2023-10-11 18:09:47,501][85176] Updated weights for policy 0, policy_version 76202 (0.0009) +[2023-10-11 18:09:47,868][85176] Updated weights for policy 0, policy_version 76212 (0.0009) +[2023-10-11 18:09:48,249][85176] Updated weights for policy 0, policy_version 76222 (0.0008) +[2023-10-11 18:09:50,384][85175] Updated weights for policy 1, policy_version 77350 (0.0008) +[2023-10-11 18:09:50,756][85175] Updated weights for policy 1, policy_version 77360 (0.0007) +[2023-10-11 18:09:51,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.5). Total num frames: 157253632. Throughput: 0: 1656.6, 1: 1722.6. Samples: 39321970. 
Policy #0 lag: (min: 24.0, avg: 46.0, max: 56.0) +[2023-10-11 18:09:51,063][84230] Avg episode reward: [(0, '41.140'), (1, '43.370')] +[2023-10-11 18:09:51,121][85175] Updated weights for policy 1, policy_version 77370 (0.0007) +[2023-10-11 18:09:52,345][85176] Updated weights for policy 0, policy_version 76232 (0.0009) +[2023-10-11 18:09:52,713][85176] Updated weights for policy 0, policy_version 76242 (0.0008) +[2023-10-11 18:09:53,084][85176] Updated weights for policy 0, policy_version 76252 (0.0008) +[2023-10-11 18:09:55,200][85175] Updated weights for policy 1, policy_version 77380 (0.0008) +[2023-10-11 18:09:55,564][85175] Updated weights for policy 1, policy_version 77390 (0.0008) +[2023-10-11 18:09:55,943][85175] Updated weights for policy 1, policy_version 77400 (0.0007) +[2023-10-11 18:09:56,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 157319168. Throughput: 0: 1664.4, 1: 1717.6. Samples: 39342664. Policy #0 lag: (min: 24.0, avg: 46.0, max: 56.0) +[2023-10-11 18:09:56,064][84230] Avg episode reward: [(0, '41.820'), (1, '42.660')] +[2023-10-11 18:09:57,041][85176] Updated weights for policy 0, policy_version 76262 (0.0007) +[2023-10-11 18:09:57,413][85176] Updated weights for policy 0, policy_version 76272 (0.0008) +[2023-10-11 18:09:57,789][85176] Updated weights for policy 0, policy_version 76282 (0.0007) +[2023-10-11 18:09:59,934][85175] Updated weights for policy 1, policy_version 77410 (0.0009) +[2023-10-11 18:10:00,301][85175] Updated weights for policy 1, policy_version 77420 (0.0011) +[2023-10-11 18:10:00,665][85175] Updated weights for policy 1, policy_version 77430 (0.0010) +[2023-10-11 18:10:01,036][85175] Updated weights for policy 1, policy_version 77440 (0.0009) +[2023-10-11 18:10:01,063][84230] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 157417472. Throughput: 0: 1669.5, 1: 1699.7. Samples: 39362916. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:10:01,064][84230] Avg episode reward: [(0, '39.950'), (1, '42.780')] +[2023-10-11 18:10:01,937][85176] Updated weights for policy 0, policy_version 76292 (0.0008) +[2023-10-11 18:10:02,300][85176] Updated weights for policy 0, policy_version 76302 (0.0011) +[2023-10-11 18:10:02,666][85176] Updated weights for policy 0, policy_version 76312 (0.0010) +[2023-10-11 18:10:05,069][85175] Updated weights for policy 1, policy_version 77450 (0.0007) +[2023-10-11 18:10:05,437][85175] Updated weights for policy 1, policy_version 77460 (0.0009) +[2023-10-11 18:10:05,805][85175] Updated weights for policy 1, policy_version 77470 (0.0009) +[2023-10-11 18:10:06,062][84230] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 157483008. Throughput: 0: 1666.4, 1: 1713.4. Samples: 39372754. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:10:06,063][84230] Avg episode reward: [(0, '42.790'), (1, '44.730')] +[2023-10-11 18:10:06,866][85176] Updated weights for policy 0, policy_version 76322 (0.0009) +[2023-10-11 18:10:07,249][85176] Updated weights for policy 0, policy_version 76332 (0.0007) +[2023-10-11 18:10:07,625][85176] Updated weights for policy 0, policy_version 76342 (0.0007) +[2023-10-11 18:10:08,008][85176] Updated weights for policy 0, policy_version 76352 (0.0007) +[2023-10-11 18:10:09,861][85175] Updated weights for policy 1, policy_version 77480 (0.0008) +[2023-10-11 18:10:10,233][85175] Updated weights for policy 1, policy_version 77490 (0.0010) +[2023-10-11 18:10:10,602][85175] Updated weights for policy 1, policy_version 77500 (0.0009) +[2023-10-11 18:10:11,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 157548544. Throughput: 0: 1667.8, 1: 1710.2. Samples: 39393428. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:10:11,064][84230] Avg episode reward: [(0, '41.250'), (1, '43.240')] +[2023-10-11 18:10:11,978][85176] Updated weights for policy 0, policy_version 76362 (0.0008) +[2023-10-11 18:10:12,350][85176] Updated weights for policy 0, policy_version 76372 (0.0007) +[2023-10-11 18:10:12,723][85176] Updated weights for policy 0, policy_version 76382 (0.0007) +[2023-10-11 18:10:14,541][85175] Updated weights for policy 1, policy_version 77510 (0.0009) +[2023-10-11 18:10:14,913][85175] Updated weights for policy 1, policy_version 77520 (0.0007) +[2023-10-11 18:10:15,280][85175] Updated weights for policy 1, policy_version 77530 (0.0007) +[2023-10-11 18:10:16,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 157614080. Throughput: 0: 1674.3, 1: 1679.8. Samples: 39413172. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:10:16,064][84230] Avg episode reward: [(0, '43.200'), (1, '43.610')] +[2023-10-11 18:10:16,076][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000077536_79396864.pth... +[2023-10-11 18:10:16,077][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000076384_78217216.pth... 
+[2023-10-11 18:10:16,107][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000075936_77758464.pth +[2023-10-11 18:10:16,113][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000074848_76644352.pth +[2023-10-11 18:10:16,711][85176] Updated weights for policy 0, policy_version 76392 (0.0007) +[2023-10-11 18:10:17,073][85176] Updated weights for policy 0, policy_version 76402 (0.0009) +[2023-10-11 18:10:17,442][85176] Updated weights for policy 0, policy_version 76412 (0.0008) +[2023-10-11 18:10:19,443][85175] Updated weights for policy 1, policy_version 77540 (0.0008) +[2023-10-11 18:10:19,809][85175] Updated weights for policy 1, policy_version 77550 (0.0008) +[2023-10-11 18:10:20,180][85175] Updated weights for policy 1, policy_version 77560 (0.0008) +[2023-10-11 18:10:21,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 157679616. Throughput: 0: 1672.9, 1: 1702.9. Samples: 39423462. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:10:21,063][84230] Avg episode reward: [(0, '41.060'), (1, '43.240')] +[2023-10-11 18:10:21,414][85176] Updated weights for policy 0, policy_version 76422 (0.0009) +[2023-10-11 18:10:21,790][85176] Updated weights for policy 0, policy_version 76432 (0.0009) +[2023-10-11 18:10:22,168][85176] Updated weights for policy 0, policy_version 76442 (0.0009) +[2023-10-11 18:10:24,161][85175] Updated weights for policy 1, policy_version 77570 (0.0009) +[2023-10-11 18:10:24,529][85175] Updated weights for policy 1, policy_version 77580 (0.0010) +[2023-10-11 18:10:24,899][85175] Updated weights for policy 1, policy_version 77590 (0.0008) +[2023-10-11 18:10:25,259][85175] Updated weights for policy 1, policy_version 77600 (0.0009) +[2023-10-11 18:10:26,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 157745152. Throughput: 0: 1677.6, 1: 1686.9. Samples: 39443882. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:10:26,064][84230] Avg episode reward: [(0, '43.520'), (1, '46.440')] +[2023-10-11 18:10:26,160][85176] Updated weights for policy 0, policy_version 76452 (0.0008) +[2023-10-11 18:10:26,540][85176] Updated weights for policy 0, policy_version 76462 (0.0008) +[2023-10-11 18:10:26,913][85176] Updated weights for policy 0, policy_version 76472 (0.0007) +[2023-10-11 18:10:29,342][85175] Updated weights for policy 1, policy_version 77610 (0.0009) +[2023-10-11 18:10:29,715][85175] Updated weights for policy 1, policy_version 77620 (0.0007) +[2023-10-11 18:10:30,076][85175] Updated weights for policy 1, policy_version 77630 (0.0009) +[2023-10-11 18:10:30,844][85176] Updated weights for policy 0, policy_version 76482 (0.0007) +[2023-10-11 18:10:31,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 157810688. Throughput: 0: 1692.2, 1: 1667.6. Samples: 39463970. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:10:31,064][84230] Avg episode reward: [(0, '39.270'), (1, '43.270')] +[2023-10-11 18:10:31,259][85176] Updated weights for policy 0, policy_version 76492 (0.0007) +[2023-10-11 18:10:31,630][85176] Updated weights for policy 0, policy_version 76502 (0.0007) +[2023-10-11 18:10:31,995][85176] Updated weights for policy 0, policy_version 76512 (0.0009) +[2023-10-11 18:10:34,086][85175] Updated weights for policy 1, policy_version 77640 (0.0008) +[2023-10-11 18:10:34,472][85175] Updated weights for policy 1, policy_version 77650 (0.0008) +[2023-10-11 18:10:34,839][85175] Updated weights for policy 1, policy_version 77660 (0.0007) +[2023-10-11 18:10:36,012][85176] Updated weights for policy 0, policy_version 76522 (0.0008) +[2023-10-11 18:10:36,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 157876224. Throughput: 0: 1690.8, 1: 1695.6. Samples: 39474358. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:10:36,063][84230] Avg episode reward: [(0, '43.450'), (1, '46.240')] +[2023-10-11 18:10:36,383][85176] Updated weights for policy 0, policy_version 76532 (0.0008) +[2023-10-11 18:10:36,755][85176] Updated weights for policy 0, policy_version 76542 (0.0009) +[2023-10-11 18:10:39,003][85175] Updated weights for policy 1, policy_version 77670 (0.0008) +[2023-10-11 18:10:39,362][85175] Updated weights for policy 1, policy_version 77680 (0.0008) +[2023-10-11 18:10:39,736][85175] Updated weights for policy 1, policy_version 77690 (0.0007) +[2023-10-11 18:10:40,978][85176] Updated weights for policy 0, policy_version 76552 (0.0011) +[2023-10-11 18:10:41,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 157941760. Throughput: 0: 1691.7, 1: 1676.5. Samples: 39494232. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:10:41,063][84230] Avg episode reward: [(0, '41.840'), (1, '43.170')] +[2023-10-11 18:10:41,350][85176] Updated weights for policy 0, policy_version 76562 (0.0011) +[2023-10-11 18:10:41,719][85176] Updated weights for policy 0, policy_version 76572 (0.0011) +[2023-10-11 18:10:43,744][85175] Updated weights for policy 1, policy_version 77700 (0.0009) +[2023-10-11 18:10:44,118][85175] Updated weights for policy 1, policy_version 77710 (0.0010) +[2023-10-11 18:10:44,488][85175] Updated weights for policy 1, policy_version 77720 (0.0010) +[2023-10-11 18:10:45,841][85176] Updated weights for policy 0, policy_version 76582 (0.0009) +[2023-10-11 18:10:46,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 158007296. Throughput: 0: 1688.5, 1: 1682.3. Samples: 39514602. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:10:46,063][84230] Avg episode reward: [(0, '39.690'), (1, '43.660')] +[2023-10-11 18:10:46,209][85176] Updated weights for policy 0, policy_version 76592 (0.0010) +[2023-10-11 18:10:46,574][85176] Updated weights for policy 0, policy_version 76602 (0.0009) +[2023-10-11 18:10:48,488][85175] Updated weights for policy 1, policy_version 77730 (0.0010) +[2023-10-11 18:10:48,850][85175] Updated weights for policy 1, policy_version 77740 (0.0008) +[2023-10-11 18:10:49,226][85175] Updated weights for policy 1, policy_version 77750 (0.0009) +[2023-10-11 18:10:49,585][85175] Updated weights for policy 1, policy_version 77760 (0.0010) +[2023-10-11 18:10:50,798][85176] Updated weights for policy 0, policy_version 76612 (0.0008) +[2023-10-11 18:10:51,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 158072832. Throughput: 0: 1687.6, 1: 1698.0. Samples: 39525106. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 18:10:51,063][84230] Avg episode reward: [(0, '42.940'), (1, '40.820')] +[2023-10-11 18:10:51,178][85176] Updated weights for policy 0, policy_version 76622 (0.0009) +[2023-10-11 18:10:51,561][85176] Updated weights for policy 0, policy_version 76632 (0.0009) +[2023-10-11 18:10:53,552][85175] Updated weights for policy 1, policy_version 77770 (0.0007) +[2023-10-11 18:10:53,920][85175] Updated weights for policy 1, policy_version 77780 (0.0008) +[2023-10-11 18:10:54,291][85175] Updated weights for policy 1, policy_version 77790 (0.0010) +[2023-10-11 18:10:55,674][85176] Updated weights for policy 0, policy_version 76642 (0.0008) +[2023-10-11 18:10:56,044][85176] Updated weights for policy 0, policy_version 76652 (0.0010) +[2023-10-11 18:10:56,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 158138368. Throughput: 0: 1688.3, 1: 1673.4. Samples: 39544702. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 18:10:56,064][84230] Avg episode reward: [(0, '41.280'), (1, '44.270')] +[2023-10-11 18:10:56,418][85176] Updated weights for policy 0, policy_version 76662 (0.0009) +[2023-10-11 18:10:56,791][85176] Updated weights for policy 0, policy_version 76672 (0.0009) +[2023-10-11 18:10:58,285][85175] Updated weights for policy 1, policy_version 77800 (0.0008) +[2023-10-11 18:10:58,659][85175] Updated weights for policy 1, policy_version 77810 (0.0009) +[2023-10-11 18:10:59,024][85175] Updated weights for policy 1, policy_version 77820 (0.0008) +[2023-10-11 18:11:00,695][85176] Updated weights for policy 0, policy_version 76682 (0.0008) +[2023-10-11 18:11:01,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 158203904. Throughput: 0: 1682.9, 1: 1699.0. Samples: 39565358. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 18:11:01,063][84230] Avg episode reward: [(0, '44.510'), (1, '44.420')] +[2023-10-11 18:11:01,071][85176] Updated weights for policy 0, policy_version 76692 (0.0009) +[2023-10-11 18:11:01,437][85176] Updated weights for policy 0, policy_version 76702 (0.0010) +[2023-10-11 18:11:03,209][85175] Updated weights for policy 1, policy_version 77830 (0.0008) +[2023-10-11 18:11:03,579][85175] Updated weights for policy 1, policy_version 77840 (0.0009) +[2023-10-11 18:11:03,949][85175] Updated weights for policy 1, policy_version 77850 (0.0010) +[2023-10-11 18:11:05,468][85176] Updated weights for policy 0, policy_version 76712 (0.0010) +[2023-10-11 18:11:05,844][85176] Updated weights for policy 0, policy_version 76722 (0.0009) +[2023-10-11 18:11:06,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 158269440. Throughput: 0: 1688.8, 1: 1690.5. Samples: 39575532. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 18:11:06,064][84230] Avg episode reward: [(0, '42.260'), (1, '47.340')] +[2023-10-11 18:11:06,227][85176] Updated weights for policy 0, policy_version 76732 (0.0011) +[2023-10-11 18:11:07,796][85175] Updated weights for policy 1, policy_version 77860 (0.0010) +[2023-10-11 18:11:08,164][85175] Updated weights for policy 1, policy_version 77870 (0.0009) +[2023-10-11 18:11:08,537][85175] Updated weights for policy 1, policy_version 77880 (0.0012) +[2023-10-11 18:11:10,335][85176] Updated weights for policy 0, policy_version 76742 (0.0009) +[2023-10-11 18:11:10,703][85176] Updated weights for policy 0, policy_version 76752 (0.0007) +[2023-10-11 18:11:11,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.5). Total num frames: 158334976. Throughput: 0: 1685.1, 1: 1684.5. Samples: 39595514. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 18:11:11,063][84230] Avg episode reward: [(0, '43.490'), (1, '45.450')] +[2023-10-11 18:11:11,082][85176] Updated weights for policy 0, policy_version 76762 (0.0007) +[2023-10-11 18:11:12,557][85175] Updated weights for policy 1, policy_version 77890 (0.0010) +[2023-10-11 18:11:12,922][85175] Updated weights for policy 1, policy_version 77900 (0.0009) +[2023-10-11 18:11:13,294][85175] Updated weights for policy 1, policy_version 77910 (0.0008) +[2023-10-11 18:11:13,658][85175] Updated weights for policy 1, policy_version 77920 (0.0009) +[2023-10-11 18:11:15,074][85176] Updated weights for policy 0, policy_version 76772 (0.0008) +[2023-10-11 18:11:15,448][85176] Updated weights for policy 0, policy_version 76782 (0.0009) +[2023-10-11 18:11:15,833][85176] Updated weights for policy 0, policy_version 76792 (0.0007) +[2023-10-11 18:11:16,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 158400512. Throughput: 0: 1664.5, 1: 1707.3. Samples: 39615700. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 18:11:16,063][84230] Avg episode reward: [(0, '41.620'), (1, '42.500')] +[2023-10-11 18:11:17,578][85175] Updated weights for policy 1, policy_version 77930 (0.0010) +[2023-10-11 18:11:17,947][85175] Updated weights for policy 1, policy_version 77940 (0.0008) +[2023-10-11 18:11:18,316][85175] Updated weights for policy 1, policy_version 77950 (0.0007) +[2023-10-11 18:11:19,859][85176] Updated weights for policy 0, policy_version 76802 (0.0007) +[2023-10-11 18:11:20,278][85176] Updated weights for policy 0, policy_version 76812 (0.0011) +[2023-10-11 18:11:20,643][85176] Updated weights for policy 0, policy_version 76822 (0.0010) +[2023-10-11 18:11:21,018][85176] Updated weights for policy 0, policy_version 76832 (0.0008) +[2023-10-11 18:11:21,062][84230] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 158498816. Throughput: 0: 1684.0, 1: 1677.1. Samples: 39625604. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 18:11:21,063][84230] Avg episode reward: [(0, '43.800'), (1, '42.980')] +[2023-10-11 18:11:22,235][85175] Updated weights for policy 1, policy_version 77960 (0.0008) +[2023-10-11 18:11:22,595][85175] Updated weights for policy 1, policy_version 77970 (0.0008) +[2023-10-11 18:11:22,955][85175] Updated weights for policy 1, policy_version 77980 (0.0009) +[2023-10-11 18:11:25,122][85176] Updated weights for policy 0, policy_version 76842 (0.0007) +[2023-10-11 18:11:25,498][85176] Updated weights for policy 0, policy_version 76852 (0.0008) +[2023-10-11 18:11:25,861][85176] Updated weights for policy 0, policy_version 76862 (0.0009) +[2023-10-11 18:11:26,063][84230] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 158564352. Throughput: 0: 1679.0, 1: 1700.0. Samples: 39646288. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 18:11:26,063][84230] Avg episode reward: [(0, '43.480'), (1, '45.270')] +[2023-10-11 18:11:27,104][85175] Updated weights for policy 1, policy_version 77990 (0.0009) +[2023-10-11 18:11:27,495][85175] Updated weights for policy 1, policy_version 78000 (0.0008) +[2023-10-11 18:11:27,858][85175] Updated weights for policy 1, policy_version 78010 (0.0007) +[2023-10-11 18:11:29,956][85176] Updated weights for policy 0, policy_version 76872 (0.0010) +[2023-10-11 18:11:30,340][85176] Updated weights for policy 0, policy_version 76882 (0.0008) +[2023-10-11 18:11:30,708][85176] Updated weights for policy 0, policy_version 76892 (0.0009) +[2023-10-11 18:11:31,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 158629888. Throughput: 0: 1660.0, 1: 1706.2. Samples: 39666084. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 18:11:31,063][84230] Avg episode reward: [(0, '43.370'), (1, '44.670')] +[2023-10-11 18:11:31,942][85175] Updated weights for policy 1, policy_version 78020 (0.0007) +[2023-10-11 18:11:32,310][85175] Updated weights for policy 1, policy_version 78030 (0.0007) +[2023-10-11 18:11:32,680][85175] Updated weights for policy 1, policy_version 78040 (0.0007) +[2023-10-11 18:11:34,709][85176] Updated weights for policy 0, policy_version 76902 (0.0008) +[2023-10-11 18:11:35,082][85176] Updated weights for policy 0, policy_version 76912 (0.0009) +[2023-10-11 18:11:35,450][85176] Updated weights for policy 0, policy_version 76922 (0.0007) +[2023-10-11 18:11:36,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 158695424. Throughput: 0: 1685.1, 1: 1676.4. Samples: 39676372. 
Policy #0 lag: (min: 31.0, avg: 46.6, max: 63.0) +[2023-10-11 18:11:36,063][84230] Avg episode reward: [(0, '42.750'), (1, '44.530')] +[2023-10-11 18:11:36,648][85175] Updated weights for policy 1, policy_version 78050 (0.0009) +[2023-10-11 18:11:37,014][85175] Updated weights for policy 1, policy_version 78060 (0.0009) +[2023-10-11 18:11:37,388][85175] Updated weights for policy 1, policy_version 78070 (0.0007) +[2023-10-11 18:11:37,747][85175] Updated weights for policy 1, policy_version 78080 (0.0009) +[2023-10-11 18:11:39,602][85176] Updated weights for policy 0, policy_version 76932 (0.0008) +[2023-10-11 18:11:39,978][85176] Updated weights for policy 0, policy_version 76942 (0.0008) +[2023-10-11 18:11:40,353][85176] Updated weights for policy 0, policy_version 76952 (0.0011) +[2023-10-11 18:11:41,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 158760960. Throughput: 0: 1680.0, 1: 1703.9. Samples: 39696976. Policy #0 lag: (min: 31.0, avg: 46.6, max: 63.0) +[2023-10-11 18:11:41,063][84230] Avg episode reward: [(0, '43.430'), (1, '44.190')] +[2023-10-11 18:11:41,882][85175] Updated weights for policy 1, policy_version 78090 (0.0008) +[2023-10-11 18:11:42,251][85175] Updated weights for policy 1, policy_version 78100 (0.0009) +[2023-10-11 18:11:42,614][85175] Updated weights for policy 1, policy_version 78110 (0.0009) +[2023-10-11 18:11:44,262][85176] Updated weights for policy 0, policy_version 76962 (0.0010) +[2023-10-11 18:11:44,642][85176] Updated weights for policy 0, policy_version 76972 (0.0008) +[2023-10-11 18:11:45,013][85176] Updated weights for policy 0, policy_version 76982 (0.0011) +[2023-10-11 18:11:45,389][85176] Updated weights for policy 0, policy_version 76992 (0.0009) +[2023-10-11 18:11:46,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 158826496. Throughput: 0: 1654.4, 1: 1714.4. Samples: 39716956. 
Policy #0 lag: (min: 31.0, avg: 46.6, max: 63.0) +[2023-10-11 18:11:46,063][84230] Avg episode reward: [(0, '43.210'), (1, '43.980')] +[2023-10-11 18:11:46,486][85175] Updated weights for policy 1, policy_version 78120 (0.0008) +[2023-10-11 18:11:46,862][85175] Updated weights for policy 1, policy_version 78130 (0.0008) +[2023-10-11 18:11:47,229][85175] Updated weights for policy 1, policy_version 78140 (0.0009) +[2023-10-11 18:11:49,449][85176] Updated weights for policy 0, policy_version 77002 (0.0008) +[2023-10-11 18:11:49,811][85176] Updated weights for policy 0, policy_version 77012 (0.0008) +[2023-10-11 18:11:50,184][85176] Updated weights for policy 0, policy_version 77022 (0.0008) +[2023-10-11 18:11:51,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 158892032. Throughput: 0: 1678.2, 1: 1693.3. Samples: 39727248. Policy #0 lag: (min: 31.0, avg: 46.6, max: 63.0) +[2023-10-11 18:11:51,063][84230] Avg episode reward: [(0, '43.960'), (1, '44.630')] +[2023-10-11 18:11:51,232][85175] Updated weights for policy 1, policy_version 78150 (0.0009) +[2023-10-11 18:11:51,596][85175] Updated weights for policy 1, policy_version 78160 (0.0007) +[2023-10-11 18:11:51,969][85175] Updated weights for policy 1, policy_version 78170 (0.0009) +[2023-10-11 18:11:54,322][85176] Updated weights for policy 0, policy_version 77032 (0.0009) +[2023-10-11 18:11:54,703][85176] Updated weights for policy 0, policy_version 77042 (0.0008) +[2023-10-11 18:11:55,072][85176] Updated weights for policy 0, policy_version 77052 (0.0009) +[2023-10-11 18:11:55,956][85175] Updated weights for policy 1, policy_version 78180 (0.0009) +[2023-10-11 18:11:56,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 158957568. Throughput: 0: 1662.6, 1: 1711.1. Samples: 39747330. 
Policy #0 lag: (min: 31.0, avg: 46.6, max: 63.0) +[2023-10-11 18:11:56,063][84230] Avg episode reward: [(0, '43.070'), (1, '43.670')] +[2023-10-11 18:11:56,331][85175] Updated weights for policy 1, policy_version 78190 (0.0007) +[2023-10-11 18:11:56,694][85175] Updated weights for policy 1, policy_version 78200 (0.0008) +[2023-10-11 18:11:58,990][85176] Updated weights for policy 0, policy_version 77062 (0.0010) +[2023-10-11 18:11:59,354][85176] Updated weights for policy 0, policy_version 77072 (0.0011) +[2023-10-11 18:11:59,720][85176] Updated weights for policy 0, policy_version 77082 (0.0009) +[2023-10-11 18:12:00,566][85175] Updated weights for policy 1, policy_version 78210 (0.0008) +[2023-10-11 18:12:00,934][85175] Updated weights for policy 1, policy_version 78220 (0.0010) +[2023-10-11 18:12:01,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 159023104. Throughput: 0: 1659.6, 1: 1713.1. Samples: 39767468. Policy #0 lag: (min: 31.0, avg: 46.6, max: 63.0) +[2023-10-11 18:12:01,063][84230] Avg episode reward: [(0, '43.210'), (1, '43.670')] +[2023-10-11 18:12:01,312][85175] Updated weights for policy 1, policy_version 78230 (0.0008) +[2023-10-11 18:12:01,683][85175] Updated weights for policy 1, policy_version 78240 (0.0008) +[2023-10-11 18:12:04,035][85176] Updated weights for policy 0, policy_version 77092 (0.0008) +[2023-10-11 18:12:04,405][85176] Updated weights for policy 0, policy_version 77102 (0.0010) +[2023-10-11 18:12:04,773][85176] Updated weights for policy 0, policy_version 77112 (0.0008) +[2023-10-11 18:12:05,712][85175] Updated weights for policy 1, policy_version 78250 (0.0009) +[2023-10-11 18:12:06,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 159088640. Throughput: 0: 1673.6, 1: 1714.6. Samples: 39778074. 
Policy #0 lag: (min: 31.0, avg: 46.6, max: 63.0) +[2023-10-11 18:12:06,063][84230] Avg episode reward: [(0, '43.560'), (1, '45.230')] +[2023-10-11 18:12:06,082][85175] Updated weights for policy 1, policy_version 78260 (0.0009) +[2023-10-11 18:12:06,445][85175] Updated weights for policy 1, policy_version 78270 (0.0009) +[2023-10-11 18:12:08,990][85176] Updated weights for policy 0, policy_version 77122 (0.0007) +[2023-10-11 18:12:09,373][85176] Updated weights for policy 0, policy_version 77132 (0.0009) +[2023-10-11 18:12:09,751][85176] Updated weights for policy 0, policy_version 77142 (0.0010) +[2023-10-11 18:12:10,118][85176] Updated weights for policy 0, policy_version 77152 (0.0008) +[2023-10-11 18:12:10,516][85175] Updated weights for policy 1, policy_version 78280 (0.0008) +[2023-10-11 18:12:10,892][85175] Updated weights for policy 1, policy_version 78290 (0.0008) +[2023-10-11 18:12:11,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 159154176. Throughput: 0: 1660.9, 1: 1709.6. Samples: 39797962. Policy #0 lag: (min: 31.0, avg: 46.6, max: 63.0) +[2023-10-11 18:12:11,063][84230] Avg episode reward: [(0, '41.790'), (1, '44.520')] +[2023-10-11 18:12:11,256][85175] Updated weights for policy 1, policy_version 78300 (0.0009) +[2023-10-11 18:12:14,216][85176] Updated weights for policy 0, policy_version 77162 (0.0009) +[2023-10-11 18:12:14,587][85176] Updated weights for policy 0, policy_version 77172 (0.0007) +[2023-10-11 18:12:14,965][85176] Updated weights for policy 0, policy_version 77182 (0.0009) +[2023-10-11 18:12:15,346][85175] Updated weights for policy 1, policy_version 78310 (0.0007) +[2023-10-11 18:12:15,726][85175] Updated weights for policy 1, policy_version 78320 (0.0007) +[2023-10-11 18:12:16,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 159219712. Throughput: 0: 1666.2, 1: 1704.3. Samples: 39817760. 
Policy #0 lag: (min: 31.0, avg: 46.6, max: 63.0) +[2023-10-11 18:12:16,064][84230] Avg episode reward: [(0, '43.030'), (1, '43.190')] +[2023-10-11 18:12:16,076][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000077184_79036416.pth... +[2023-10-11 18:12:16,101][85175] Updated weights for policy 1, policy_version 78330 (0.0009) +[2023-10-11 18:12:16,108][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000075616_77430784.pth +[2023-10-11 18:12:16,322][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000078336_80216064.pth... +[2023-10-11 18:12:16,363][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000076736_78577664.pth +[2023-10-11 18:12:19,173][85176] Updated weights for policy 0, policy_version 77192 (0.0008) +[2023-10-11 18:12:19,540][85176] Updated weights for policy 0, policy_version 77202 (0.0010) +[2023-10-11 18:12:19,903][85176] Updated weights for policy 0, policy_version 77212 (0.0010) +[2023-10-11 18:12:20,129][85175] Updated weights for policy 1, policy_version 78340 (0.0008) +[2023-10-11 18:12:20,501][85175] Updated weights for policy 1, policy_version 78350 (0.0007) +[2023-10-11 18:12:20,862][85175] Updated weights for policy 1, policy_version 78360 (0.0008) +[2023-10-11 18:12:21,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 159285248. Throughput: 0: 1665.8, 1: 1709.5. Samples: 39828260. 
Policy #0 lag: (min: 31.0, avg: 46.6, max: 63.0) +[2023-10-11 18:12:21,063][84230] Avg episode reward: [(0, '41.040'), (1, '43.270')] +[2023-10-11 18:12:23,952][85176] Updated weights for policy 0, policy_version 77222 (0.0009) +[2023-10-11 18:12:24,326][85176] Updated weights for policy 0, policy_version 77232 (0.0009) +[2023-10-11 18:12:24,709][85176] Updated weights for policy 0, policy_version 77242 (0.0008) +[2023-10-11 18:12:24,888][85175] Updated weights for policy 1, policy_version 78370 (0.0010) +[2023-10-11 18:12:25,267][85175] Updated weights for policy 1, policy_version 78380 (0.0007) +[2023-10-11 18:12:25,633][85175] Updated weights for policy 1, policy_version 78390 (0.0007) +[2023-10-11 18:12:25,997][85175] Updated weights for policy 1, policy_version 78400 (0.0008) +[2023-10-11 18:12:26,063][84230] Fps is (10 sec: 16384.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 159383552. Throughput: 0: 1651.0, 1: 1706.7. Samples: 39848074. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:12:26,064][84230] Avg episode reward: [(0, '44.360'), (1, '44.640')] +[2023-10-11 18:12:28,709][85176] Updated weights for policy 0, policy_version 77252 (0.0008) +[2023-10-11 18:12:29,086][85176] Updated weights for policy 0, policy_version 77262 (0.0007) +[2023-10-11 18:12:29,457][85176] Updated weights for policy 0, policy_version 77272 (0.0008) +[2023-10-11 18:12:29,856][85175] Updated weights for policy 1, policy_version 78410 (0.0007) +[2023-10-11 18:12:30,234][85175] Updated weights for policy 1, policy_version 78420 (0.0011) +[2023-10-11 18:12:30,612][85175] Updated weights for policy 1, policy_version 78430 (0.0011) +[2023-10-11 18:12:31,063][84230] Fps is (10 sec: 16383.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 159449088. Throughput: 0: 1667.9, 1: 1678.3. Samples: 39867534. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:12:31,064][84230] Avg episode reward: [(0, '42.260'), (1, '44.200')] +[2023-10-11 18:12:33,524][85176] Updated weights for policy 0, policy_version 77282 (0.0007) +[2023-10-11 18:12:33,891][85176] Updated weights for policy 0, policy_version 77292 (0.0007) +[2023-10-11 18:12:34,262][85176] Updated weights for policy 0, policy_version 77302 (0.0008) +[2023-10-11 18:12:34,635][85176] Updated weights for policy 0, policy_version 77312 (0.0007) +[2023-10-11 18:12:34,733][85175] Updated weights for policy 1, policy_version 78440 (0.0009) +[2023-10-11 18:12:35,096][85175] Updated weights for policy 1, policy_version 78450 (0.0010) +[2023-10-11 18:12:35,459][85175] Updated weights for policy 1, policy_version 78460 (0.0007) +[2023-10-11 18:12:36,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 159514624. Throughput: 0: 1665.6, 1: 1705.3. Samples: 39878936. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:12:36,063][84230] Avg episode reward: [(0, '41.970'), (1, '43.880')] +[2023-10-11 18:12:38,512][85176] Updated weights for policy 0, policy_version 77322 (0.0008) +[2023-10-11 18:12:38,899][85176] Updated weights for policy 0, policy_version 77332 (0.0010) +[2023-10-11 18:12:39,263][85176] Updated weights for policy 0, policy_version 77342 (0.0010) +[2023-10-11 18:12:39,357][85175] Updated weights for policy 1, policy_version 78470 (0.0009) +[2023-10-11 18:12:39,727][85175] Updated weights for policy 1, policy_version 78480 (0.0010) +[2023-10-11 18:12:40,095][85175] Updated weights for policy 1, policy_version 78490 (0.0008) +[2023-10-11 18:12:41,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 159580160. Throughput: 0: 1663.8, 1: 1697.8. Samples: 39898602. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:12:41,063][84230] Avg episode reward: [(0, '43.140'), (1, '45.080')] +[2023-10-11 18:12:43,370][85176] Updated weights for policy 0, policy_version 77352 (0.0008) +[2023-10-11 18:12:43,749][85176] Updated weights for policy 0, policy_version 77362 (0.0007) +[2023-10-11 18:12:44,118][85176] Updated weights for policy 0, policy_version 77372 (0.0008) +[2023-10-11 18:12:44,142][85175] Updated weights for policy 1, policy_version 78500 (0.0010) +[2023-10-11 18:12:44,500][85175] Updated weights for policy 1, policy_version 78510 (0.0008) +[2023-10-11 18:12:44,867][85175] Updated weights for policy 1, policy_version 78520 (0.0010) +[2023-10-11 18:12:46,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 159645696. Throughput: 0: 1681.5, 1: 1676.4. Samples: 39918576. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:12:46,064][84230] Avg episode reward: [(0, '44.290'), (1, '45.230')] +[2023-10-11 18:12:48,128][85176] Updated weights for policy 0, policy_version 77382 (0.0010) +[2023-10-11 18:12:48,508][85176] Updated weights for policy 0, policy_version 77392 (0.0010) +[2023-10-11 18:12:48,859][85175] Updated weights for policy 1, policy_version 78530 (0.0009) +[2023-10-11 18:12:48,872][85176] Updated weights for policy 0, policy_version 77402 (0.0009) +[2023-10-11 18:12:49,218][85175] Updated weights for policy 1, policy_version 78540 (0.0009) +[2023-10-11 18:12:49,576][85175] Updated weights for policy 1, policy_version 78550 (0.0007) +[2023-10-11 18:12:49,947][85175] Updated weights for policy 1, policy_version 78560 (0.0008) +[2023-10-11 18:12:51,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 159711232. Throughput: 0: 1665.0, 1: 1705.8. Samples: 39929758. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:12:51,063][84230] Avg episode reward: [(0, '45.550'), (1, '46.420')] +[2023-10-11 18:12:52,920][85176] Updated weights for policy 0, policy_version 77412 (0.0008) +[2023-10-11 18:12:53,303][85176] Updated weights for policy 0, policy_version 77422 (0.0007) +[2023-10-11 18:12:53,666][85176] Updated weights for policy 0, policy_version 77432 (0.0009) +[2023-10-11 18:12:54,167][85175] Updated weights for policy 1, policy_version 78570 (0.0009) +[2023-10-11 18:12:54,549][85175] Updated weights for policy 1, policy_version 78580 (0.0008) +[2023-10-11 18:12:54,911][85175] Updated weights for policy 1, policy_version 78590 (0.0010) +[2023-10-11 18:12:56,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 159776768. Throughput: 0: 1672.7, 1: 1689.9. Samples: 39949280. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:12:56,063][84230] Avg episode reward: [(0, '44.020'), (1, '46.020')] +[2023-10-11 18:12:57,709][85176] Updated weights for policy 0, policy_version 77442 (0.0009) +[2023-10-11 18:12:58,090][85176] Updated weights for policy 0, policy_version 77452 (0.0008) +[2023-10-11 18:12:58,466][85176] Updated weights for policy 0, policy_version 77462 (0.0007) +[2023-10-11 18:12:58,837][85176] Updated weights for policy 0, policy_version 77472 (0.0008) +[2023-10-11 18:12:58,875][85175] Updated weights for policy 1, policy_version 78600 (0.0008) +[2023-10-11 18:12:59,247][85175] Updated weights for policy 1, policy_version 78610 (0.0007) +[2023-10-11 18:12:59,621][85175] Updated weights for policy 1, policy_version 78620 (0.0008) +[2023-10-11 18:13:01,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 159842304. Throughput: 0: 1684.8, 1: 1684.4. Samples: 39969376. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:13:01,064][84230] Avg episode reward: [(0, '46.750'), (1, '41.760')] +[2023-10-11 18:13:02,903][85176] Updated weights for policy 0, policy_version 77482 (0.0007) +[2023-10-11 18:13:03,275][85176] Updated weights for policy 0, policy_version 77492 (0.0008) +[2023-10-11 18:13:03,647][85176] Updated weights for policy 0, policy_version 77502 (0.0008) +[2023-10-11 18:13:03,719][85175] Updated weights for policy 1, policy_version 78630 (0.0007) +[2023-10-11 18:13:04,090][85175] Updated weights for policy 1, policy_version 78640 (0.0009) +[2023-10-11 18:13:04,457][85175] Updated weights for policy 1, policy_version 78650 (0.0007) +[2023-10-11 18:13:06,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 159907840. Throughput: 0: 1665.0, 1: 1708.7. Samples: 39980076. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:13:06,064][84230] Avg episode reward: [(0, '42.000'), (1, '44.850')] +[2023-10-11 18:13:07,707][85176] Updated weights for policy 0, policy_version 77512 (0.0010) +[2023-10-11 18:13:08,085][85176] Updated weights for policy 0, policy_version 77522 (0.0008) +[2023-10-11 18:13:08,421][85175] Updated weights for policy 1, policy_version 78660 (0.0009) +[2023-10-11 18:13:08,454][85176] Updated weights for policy 0, policy_version 77532 (0.0009) +[2023-10-11 18:13:08,794][85175] Updated weights for policy 1, policy_version 78670 (0.0008) +[2023-10-11 18:13:09,160][85175] Updated weights for policy 1, policy_version 78680 (0.0010) +[2023-10-11 18:13:11,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 159973376. Throughput: 0: 1685.2, 1: 1681.3. Samples: 39999570. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:13:11,063][84230] Avg episode reward: [(0, '46.000'), (1, '44.560')] +[2023-10-11 18:13:12,371][85176] Updated weights for policy 0, policy_version 77542 (0.0010) +[2023-10-11 18:13:12,747][85176] Updated weights for policy 0, policy_version 77552 (0.0011) +[2023-10-11 18:13:13,119][85176] Updated weights for policy 0, policy_version 77562 (0.0008) +[2023-10-11 18:13:13,195][85175] Updated weights for policy 1, policy_version 78690 (0.0010) +[2023-10-11 18:13:13,570][85175] Updated weights for policy 1, policy_version 78700 (0.0009) +[2023-10-11 18:13:13,936][85175] Updated weights for policy 1, policy_version 78710 (0.0009) +[2023-10-11 18:13:14,305][85175] Updated weights for policy 1, policy_version 78720 (0.0010) +[2023-10-11 18:13:16,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 160038912. Throughput: 0: 1696.9, 1: 1704.0. Samples: 40020574. Policy #0 lag: (min: 4.0, avg: 5.8, max: 32.0) +[2023-10-11 18:13:16,063][84230] Avg episode reward: [(0, '41.240'), (1, '48.160')] +[2023-10-11 18:13:17,273][85176] Updated weights for policy 0, policy_version 77572 (0.0007) +[2023-10-11 18:13:17,638][85176] Updated weights for policy 0, policy_version 77582 (0.0008) +[2023-10-11 18:13:18,014][85176] Updated weights for policy 0, policy_version 77592 (0.0008) +[2023-10-11 18:13:18,207][85175] Updated weights for policy 1, policy_version 78730 (0.0008) +[2023-10-11 18:13:18,578][85175] Updated weights for policy 1, policy_version 78740 (0.0009) +[2023-10-11 18:13:18,940][85175] Updated weights for policy 1, policy_version 78750 (0.0010) +[2023-10-11 18:13:21,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 160104448. Throughput: 0: 1668.0, 1: 1697.3. Samples: 40030374. 
Policy #0 lag: (min: 4.0, avg: 5.8, max: 32.0) +[2023-10-11 18:13:21,063][84230] Avg episode reward: [(0, '45.640'), (1, '46.530')] +[2023-10-11 18:13:22,130][85176] Updated weights for policy 0, policy_version 77602 (0.0009) +[2023-10-11 18:13:22,503][85176] Updated weights for policy 0, policy_version 77612 (0.0007) +[2023-10-11 18:13:22,877][85176] Updated weights for policy 0, policy_version 77622 (0.0009) +[2023-10-11 18:13:23,013][85175] Updated weights for policy 1, policy_version 78760 (0.0007) +[2023-10-11 18:13:23,258][85176] Updated weights for policy 0, policy_version 77632 (0.0008) +[2023-10-11 18:13:23,379][85175] Updated weights for policy 1, policy_version 78770 (0.0009) +[2023-10-11 18:13:23,754][85175] Updated weights for policy 1, policy_version 78780 (0.0007) +[2023-10-11 18:13:26,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 160169984. Throughput: 0: 1682.0, 1: 1689.6. Samples: 40050326. Policy #0 lag: (min: 4.0, avg: 5.8, max: 32.0) +[2023-10-11 18:13:26,063][84230] Avg episode reward: [(0, '39.020'), (1, '49.070')] +[2023-10-11 18:13:27,528][85176] Updated weights for policy 0, policy_version 77642 (0.0010) +[2023-10-11 18:13:27,657][85175] Updated weights for policy 1, policy_version 78790 (0.0008) +[2023-10-11 18:13:27,899][85176] Updated weights for policy 0, policy_version 77652 (0.0008) +[2023-10-11 18:13:28,022][85175] Updated weights for policy 1, policy_version 78800 (0.0008) +[2023-10-11 18:13:28,268][85176] Updated weights for policy 0, policy_version 77662 (0.0009) +[2023-10-11 18:13:28,384][85175] Updated weights for policy 1, policy_version 78810 (0.0009) +[2023-10-11 18:13:31,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 160235520. Throughput: 0: 1677.8, 1: 1713.9. Samples: 40071202. 
Policy #0 lag: (min: 4.0, avg: 5.8, max: 32.0) +[2023-10-11 18:13:31,063][84230] Avg episode reward: [(0, '44.270'), (1, '45.040')] +[2023-10-11 18:13:32,327][85176] Updated weights for policy 0, policy_version 77672 (0.0008) +[2023-10-11 18:13:32,427][85175] Updated weights for policy 1, policy_version 78820 (0.0009) +[2023-10-11 18:13:32,699][85176] Updated weights for policy 0, policy_version 77682 (0.0009) +[2023-10-11 18:13:32,791][85175] Updated weights for policy 1, policy_version 78830 (0.0008) +[2023-10-11 18:13:33,072][85176] Updated weights for policy 0, policy_version 77692 (0.0007) +[2023-10-11 18:13:33,163][85175] Updated weights for policy 1, policy_version 78840 (0.0007) +[2023-10-11 18:13:36,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 160301056. Throughput: 0: 1664.0, 1: 1681.8. Samples: 40080322. Policy #0 lag: (min: 4.0, avg: 5.8, max: 32.0) +[2023-10-11 18:13:36,063][84230] Avg episode reward: [(0, '40.380'), (1, '47.110')] +[2023-10-11 18:13:37,126][85176] Updated weights for policy 0, policy_version 77702 (0.0007) +[2023-10-11 18:13:37,136][85175] Updated weights for policy 1, policy_version 78850 (0.0007) +[2023-10-11 18:13:37,500][85176] Updated weights for policy 0, policy_version 77712 (0.0009) +[2023-10-11 18:13:37,505][85175] Updated weights for policy 1, policy_version 78860 (0.0007) +[2023-10-11 18:13:37,865][85175] Updated weights for policy 1, policy_version 78870 (0.0008) +[2023-10-11 18:13:37,877][85176] Updated weights for policy 0, policy_version 77722 (0.0010) +[2023-10-11 18:13:38,232][85175] Updated weights for policy 1, policy_version 78880 (0.0009) +[2023-10-11 18:13:41,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 160366592. Throughput: 0: 1677.3, 1: 1700.6. Samples: 40101286. 
Policy #0 lag: (min: 4.0, avg: 5.8, max: 32.0) +[2023-10-11 18:13:41,063][84230] Avg episode reward: [(0, '46.340'), (1, '45.360')] +[2023-10-11 18:13:42,010][85176] Updated weights for policy 0, policy_version 77732 (0.0008) +[2023-10-11 18:13:42,257][85175] Updated weights for policy 1, policy_version 78890 (0.0008) +[2023-10-11 18:13:42,389][85176] Updated weights for policy 0, policy_version 77742 (0.0007) +[2023-10-11 18:13:42,621][85175] Updated weights for policy 1, policy_version 78900 (0.0008) +[2023-10-11 18:13:42,755][85176] Updated weights for policy 0, policy_version 77752 (0.0008) +[2023-10-11 18:13:42,973][85175] Updated weights for policy 1, policy_version 78910 (0.0008) +[2023-10-11 18:13:46,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 160432128. Throughput: 0: 1677.9, 1: 1714.1. Samples: 40122016. Policy #0 lag: (min: 4.0, avg: 5.8, max: 32.0) +[2023-10-11 18:13:46,063][84230] Avg episode reward: [(0, '41.430'), (1, '45.690')] +[2023-10-11 18:13:46,922][85176] Updated weights for policy 0, policy_version 77762 (0.0009) +[2023-10-11 18:13:47,113][85175] Updated weights for policy 1, policy_version 78920 (0.0008) +[2023-10-11 18:13:47,327][85176] Updated weights for policy 0, policy_version 77772 (0.0007) +[2023-10-11 18:13:47,475][85175] Updated weights for policy 1, policy_version 78930 (0.0009) +[2023-10-11 18:13:47,699][85176] Updated weights for policy 0, policy_version 77782 (0.0008) +[2023-10-11 18:13:47,839][85175] Updated weights for policy 1, policy_version 78940 (0.0009) +[2023-10-11 18:13:48,073][85176] Updated weights for policy 0, policy_version 77792 (0.0007) +[2023-10-11 18:13:51,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 160497664. Throughput: 0: 1669.6, 1: 1684.6. Samples: 40131014. 
Policy #0 lag: (min: 4.0, avg: 5.8, max: 32.0) +[2023-10-11 18:13:51,063][84230] Avg episode reward: [(0, '47.050'), (1, '45.030')] +[2023-10-11 18:13:51,957][85176] Updated weights for policy 0, policy_version 77802 (0.0009) +[2023-10-11 18:13:52,015][85175] Updated weights for policy 1, policy_version 78950 (0.0009) +[2023-10-11 18:13:52,331][85176] Updated weights for policy 0, policy_version 77812 (0.0008) +[2023-10-11 18:13:52,385][85175] Updated weights for policy 1, policy_version 78960 (0.0008) +[2023-10-11 18:13:52,706][85176] Updated weights for policy 0, policy_version 77822 (0.0009) +[2023-10-11 18:13:52,752][85175] Updated weights for policy 1, policy_version 78970 (0.0008) +[2023-10-11 18:13:56,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 160563200. Throughput: 0: 1668.6, 1: 1708.1. Samples: 40151524. Policy #0 lag: (min: 4.0, avg: 5.8, max: 32.0) +[2023-10-11 18:13:56,063][84230] Avg episode reward: [(0, '43.000'), (1, '44.930')] +[2023-10-11 18:13:56,669][85175] Updated weights for policy 1, policy_version 78980 (0.0009) +[2023-10-11 18:13:56,752][85176] Updated weights for policy 0, policy_version 77832 (0.0010) +[2023-10-11 18:13:57,039][85175] Updated weights for policy 1, policy_version 78990 (0.0008) +[2023-10-11 18:13:57,129][85176] Updated weights for policy 0, policy_version 77842 (0.0009) +[2023-10-11 18:13:57,402][85175] Updated weights for policy 1, policy_version 79000 (0.0009) +[2023-10-11 18:13:57,503][85176] Updated weights for policy 0, policy_version 77852 (0.0008) +[2023-10-11 18:14:01,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 160628736. Throughput: 0: 1673.4, 1: 1706.0. Samples: 40172646. 
Policy #0 lag: (min: 4.0, avg: 5.8, max: 32.0) +[2023-10-11 18:14:01,064][84230] Avg episode reward: [(0, '46.650'), (1, '46.220')] +[2023-10-11 18:14:01,434][85176] Updated weights for policy 0, policy_version 77862 (0.0008) +[2023-10-11 18:14:01,521][85175] Updated weights for policy 1, policy_version 79010 (0.0008) +[2023-10-11 18:14:01,816][85176] Updated weights for policy 0, policy_version 77872 (0.0008) +[2023-10-11 18:14:01,890][85175] Updated weights for policy 1, policy_version 79020 (0.0007) +[2023-10-11 18:14:02,182][85176] Updated weights for policy 0, policy_version 77882 (0.0009) +[2023-10-11 18:14:02,257][85175] Updated weights for policy 1, policy_version 79030 (0.0007) +[2023-10-11 18:14:02,615][85175] Updated weights for policy 1, policy_version 79040 (0.0008) +[2023-10-11 18:14:06,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 160694272. Throughput: 0: 1675.9, 1: 1688.1. Samples: 40181752. Policy #0 lag: (min: 9.0, avg: 14.1, max: 41.0) +[2023-10-11 18:14:06,063][84230] Avg episode reward: [(0, '42.830'), (1, '44.680')] +[2023-10-11 18:14:06,139][85176] Updated weights for policy 0, policy_version 77892 (0.0008) +[2023-10-11 18:14:06,517][85176] Updated weights for policy 0, policy_version 77902 (0.0008) +[2023-10-11 18:14:06,607][85175] Updated weights for policy 1, policy_version 79050 (0.0007) +[2023-10-11 18:14:06,888][85176] Updated weights for policy 0, policy_version 77912 (0.0010) +[2023-10-11 18:14:06,986][85175] Updated weights for policy 1, policy_version 79060 (0.0007) +[2023-10-11 18:14:07,344][85175] Updated weights for policy 1, policy_version 79070 (0.0007) +[2023-10-11 18:14:10,969][85176] Updated weights for policy 0, policy_version 77922 (0.0009) +[2023-10-11 18:14:11,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 160759808. Throughput: 0: 1682.6, 1: 1703.5. Samples: 40202700. 
Policy #0 lag: (min: 9.0, avg: 14.1, max: 41.0) +[2023-10-11 18:14:11,063][84230] Avg episode reward: [(0, '47.020'), (1, '46.670')] +[2023-10-11 18:14:11,349][85176] Updated weights for policy 0, policy_version 77932 (0.0007) +[2023-10-11 18:14:11,435][85175] Updated weights for policy 1, policy_version 79080 (0.0008) +[2023-10-11 18:14:11,723][85176] Updated weights for policy 0, policy_version 77942 (0.0007) +[2023-10-11 18:14:11,807][85175] Updated weights for policy 1, policy_version 79090 (0.0009) +[2023-10-11 18:14:12,103][85176] Updated weights for policy 0, policy_version 77952 (0.0009) +[2023-10-11 18:14:12,185][85175] Updated weights for policy 1, policy_version 79100 (0.0009) +[2023-10-11 18:14:16,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 160825344. Throughput: 0: 1683.2, 1: 1693.7. Samples: 40223164. Policy #0 lag: (min: 9.0, avg: 14.1, max: 41.0) +[2023-10-11 18:14:16,064][84230] Avg episode reward: [(0, '41.300'), (1, '43.190')] +[2023-10-11 18:14:16,273][85175] Updated weights for policy 1, policy_version 79110 (0.0009) +[2023-10-11 18:14:16,342][85176] Updated weights for policy 0, policy_version 77962 (0.0009) +[2023-10-11 18:14:16,647][85175] Updated weights for policy 1, policy_version 79120 (0.0008) +[2023-10-11 18:14:16,713][85176] Updated weights for policy 0, policy_version 77972 (0.0008) +[2023-10-11 18:14:17,005][85175] Updated weights for policy 1, policy_version 79130 (0.0008) +[2023-10-11 18:14:17,080][85176] Updated weights for policy 0, policy_version 77982 (0.0008) +[2023-10-11 18:14:17,154][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000077984_79855616.pth... 
+[2023-10-11 18:14:17,192][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000076384_78217216.pth +[2023-10-11 18:14:17,196][84801] Saving a milestone ./train_atari/atari_frostbite_APPO/checkpoint_p0/milestones/checkpoint_000077984_79855616.pth +[2023-10-11 18:14:17,221][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000079136_81035264.pth... +[2023-10-11 18:14:17,250][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000077536_79396864.pth +[2023-10-11 18:14:17,254][85000] Saving a milestone ./train_atari/atari_frostbite_APPO/checkpoint_p1/milestones/checkpoint_000079136_81035264.pth +[2023-10-11 18:14:21,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 160890880. Throughput: 0: 1680.8, 1: 1692.9. Samples: 40232140. Policy #0 lag: (min: 9.0, avg: 14.1, max: 41.0) +[2023-10-11 18:14:21,063][84230] Avg episode reward: [(0, '46.380'), (1, '47.260')] +[2023-10-11 18:14:21,069][85175] Updated weights for policy 1, policy_version 79140 (0.0008) +[2023-10-11 18:14:21,330][85176] Updated weights for policy 0, policy_version 77992 (0.0008) +[2023-10-11 18:14:21,434][85175] Updated weights for policy 1, policy_version 79150 (0.0008) +[2023-10-11 18:14:21,693][85176] Updated weights for policy 0, policy_version 78002 (0.0008) +[2023-10-11 18:14:21,802][85175] Updated weights for policy 1, policy_version 79160 (0.0008) +[2023-10-11 18:14:22,061][85176] Updated weights for policy 0, policy_version 78012 (0.0007) +[2023-10-11 18:14:25,895][85175] Updated weights for policy 1, policy_version 79170 (0.0007) +[2023-10-11 18:14:26,063][84230] Fps is (10 sec: 13107.7, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 160956416. Throughput: 0: 1671.8, 1: 1690.2. Samples: 40252578. 
Policy #0 lag: (min: 9.0, avg: 14.1, max: 41.0) +[2023-10-11 18:14:26,063][84230] Avg episode reward: [(0, '40.820'), (1, '40.270')] +[2023-10-11 18:14:26,154][85176] Updated weights for policy 0, policy_version 78022 (0.0008) +[2023-10-11 18:14:26,259][85175] Updated weights for policy 1, policy_version 79180 (0.0007) +[2023-10-11 18:14:26,518][85176] Updated weights for policy 0, policy_version 78032 (0.0007) +[2023-10-11 18:14:26,624][85175] Updated weights for policy 1, policy_version 79190 (0.0007) +[2023-10-11 18:14:26,890][85176] Updated weights for policy 0, policy_version 78042 (0.0008) +[2023-10-11 18:14:26,994][85175] Updated weights for policy 1, policy_version 79200 (0.0008) +[2023-10-11 18:14:30,822][85176] Updated weights for policy 0, policy_version 78052 (0.0009) +[2023-10-11 18:14:31,053][85175] Updated weights for policy 1, policy_version 79210 (0.0008) +[2023-10-11 18:14:31,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 161021952. Throughput: 0: 1679.5, 1: 1688.1. Samples: 40273558. Policy #0 lag: (min: 9.0, avg: 14.1, max: 41.0) +[2023-10-11 18:14:31,064][84230] Avg episode reward: [(0, '46.490'), (1, '47.050')] +[2023-10-11 18:14:31,195][85176] Updated weights for policy 0, policy_version 78062 (0.0009) +[2023-10-11 18:14:31,420][85175] Updated weights for policy 1, policy_version 79220 (0.0008) +[2023-10-11 18:14:31,560][85176] Updated weights for policy 0, policy_version 78072 (0.0008) +[2023-10-11 18:14:31,791][85175] Updated weights for policy 1, policy_version 79230 (0.0007) +[2023-10-11 18:14:35,787][85176] Updated weights for policy 0, policy_version 78082 (0.0008) +[2023-10-11 18:14:35,842][85175] Updated weights for policy 1, policy_version 79240 (0.0008) +[2023-10-11 18:14:36,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 161087488. Throughput: 0: 1678.6, 1: 1686.9. Samples: 40282462. 
Policy #0 lag: (min: 9.0, avg: 14.1, max: 41.0) +[2023-10-11 18:14:36,063][84230] Avg episode reward: [(0, '38.850'), (1, '41.990')] +[2023-10-11 18:14:36,191][85176] Updated weights for policy 0, policy_version 78092 (0.0007) +[2023-10-11 18:14:36,209][85175] Updated weights for policy 1, policy_version 79250 (0.0009) +[2023-10-11 18:14:36,564][85175] Updated weights for policy 1, policy_version 79260 (0.0009) +[2023-10-11 18:14:36,568][85176] Updated weights for policy 0, policy_version 78102 (0.0007) +[2023-10-11 18:14:36,940][85176] Updated weights for policy 0, policy_version 78112 (0.0009) +[2023-10-11 18:14:40,597][85175] Updated weights for policy 1, policy_version 79270 (0.0007) +[2023-10-11 18:14:40,988][85175] Updated weights for policy 1, policy_version 79280 (0.0009) +[2023-10-11 18:14:41,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 161153024. Throughput: 0: 1673.9, 1: 1695.1. Samples: 40303128. Policy #0 lag: (min: 9.0, avg: 14.1, max: 41.0) +[2023-10-11 18:14:41,064][84230] Avg episode reward: [(0, '45.280'), (1, '47.930')] +[2023-10-11 18:14:41,159][85176] Updated weights for policy 0, policy_version 78122 (0.0007) +[2023-10-11 18:14:41,356][85175] Updated weights for policy 1, policy_version 79290 (0.0008) +[2023-10-11 18:14:41,528][85176] Updated weights for policy 0, policy_version 78132 (0.0008) +[2023-10-11 18:14:41,897][85176] Updated weights for policy 0, policy_version 78142 (0.0007) +[2023-10-11 18:14:45,393][85175] Updated weights for policy 1, policy_version 79300 (0.0008) +[2023-10-11 18:14:45,768][85175] Updated weights for policy 1, policy_version 79310 (0.0009) +[2023-10-11 18:14:46,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 161218560. Throughput: 0: 1662.9, 1: 1684.5. Samples: 40323282. 
Policy #0 lag: (min: 9.0, avg: 14.1, max: 41.0) +[2023-10-11 18:14:46,063][84230] Avg episode reward: [(0, '40.700'), (1, '42.410')] +[2023-10-11 18:14:46,065][85176] Updated weights for policy 0, policy_version 78152 (0.0007) +[2023-10-11 18:14:46,128][85175] Updated weights for policy 1, policy_version 79320 (0.0007) +[2023-10-11 18:14:46,440][85176] Updated weights for policy 0, policy_version 78162 (0.0007) +[2023-10-11 18:14:46,814][85176] Updated weights for policy 0, policy_version 78172 (0.0009) +[2023-10-11 18:14:50,085][85175] Updated weights for policy 1, policy_version 79330 (0.0008) +[2023-10-11 18:14:50,461][85175] Updated weights for policy 1, policy_version 79340 (0.0011) +[2023-10-11 18:14:50,833][85175] Updated weights for policy 1, policy_version 79350 (0.0009) +[2023-10-11 18:14:50,900][85176] Updated weights for policy 0, policy_version 78182 (0.0008) +[2023-10-11 18:14:51,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 161284096. Throughput: 0: 1659.9, 1: 1695.8. Samples: 40332760. Policy #0 lag: (min: 9.0, avg: 14.1, max: 41.0) +[2023-10-11 18:14:51,064][84230] Avg episode reward: [(0, '45.660'), (1, '45.600')] +[2023-10-11 18:14:51,194][85175] Updated weights for policy 1, policy_version 79360 (0.0009) +[2023-10-11 18:14:51,275][85176] Updated weights for policy 0, policy_version 78192 (0.0010) +[2023-10-11 18:14:51,648][85176] Updated weights for policy 0, policy_version 78202 (0.0008) +[2023-10-11 18:14:55,286][85175] Updated weights for policy 1, policy_version 79370 (0.0009) +[2023-10-11 18:14:55,653][85175] Updated weights for policy 1, policy_version 79380 (0.0007) +[2023-10-11 18:14:55,762][85176] Updated weights for policy 0, policy_version 78212 (0.0007) +[2023-10-11 18:14:56,014][85175] Updated weights for policy 1, policy_version 79390 (0.0009) +[2023-10-11 18:14:56,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 161349632. 
Throughput: 0: 1655.9, 1: 1693.0. Samples: 40353400. Policy #0 lag: (min: 11.0, avg: 20.0, max: 43.0) +[2023-10-11 18:14:56,063][84230] Avg episode reward: [(0, '41.300'), (1, '43.410')] +[2023-10-11 18:14:56,147][85176] Updated weights for policy 0, policy_version 78222 (0.0009) +[2023-10-11 18:14:56,512][85176] Updated weights for policy 0, policy_version 78232 (0.0010) +[2023-10-11 18:14:59,921][85175] Updated weights for policy 1, policy_version 79400 (0.0008) +[2023-10-11 18:15:00,300][85175] Updated weights for policy 1, policy_version 79410 (0.0009) +[2023-10-11 18:15:00,658][85176] Updated weights for policy 0, policy_version 78242 (0.0008) +[2023-10-11 18:15:00,670][85175] Updated weights for policy 1, policy_version 79420 (0.0007) +[2023-10-11 18:15:01,033][85176] Updated weights for policy 0, policy_version 78252 (0.0008) +[2023-10-11 18:15:01,063][84230] Fps is (10 sec: 16384.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 161447936. Throughput: 0: 1662.5, 1: 1681.8. Samples: 40373654. Policy #0 lag: (min: 11.0, avg: 20.0, max: 43.0) +[2023-10-11 18:15:01,064][84230] Avg episode reward: [(0, '44.980'), (1, '42.620')] +[2023-10-11 18:15:01,408][85176] Updated weights for policy 0, policy_version 78262 (0.0009) +[2023-10-11 18:15:01,781][85176] Updated weights for policy 0, policy_version 78272 (0.0007) +[2023-10-11 18:15:04,592][85175] Updated weights for policy 1, policy_version 79430 (0.0008) +[2023-10-11 18:15:04,958][85175] Updated weights for policy 1, policy_version 79440 (0.0008) +[2023-10-11 18:15:05,328][85175] Updated weights for policy 1, policy_version 79450 (0.0009) +[2023-10-11 18:15:05,969][85176] Updated weights for policy 0, policy_version 78282 (0.0008) +[2023-10-11 18:15:06,063][84230] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 161513472. Throughput: 0: 1662.6, 1: 1706.5. Samples: 40383750. 
Policy #0 lag: (min: 11.0, avg: 20.0, max: 43.0) +[2023-10-11 18:15:06,064][84230] Avg episode reward: [(0, '42.330'), (1, '43.740')] +[2023-10-11 18:15:06,345][85176] Updated weights for policy 0, policy_version 78292 (0.0007) +[2023-10-11 18:15:06,710][85176] Updated weights for policy 0, policy_version 78302 (0.0008) +[2023-10-11 18:15:09,423][85175] Updated weights for policy 1, policy_version 79460 (0.0009) +[2023-10-11 18:15:09,793][85175] Updated weights for policy 1, policy_version 79470 (0.0009) +[2023-10-11 18:15:10,166][85175] Updated weights for policy 1, policy_version 79480 (0.0007) +[2023-10-11 18:15:10,812][85176] Updated weights for policy 0, policy_version 78312 (0.0008) +[2023-10-11 18:15:11,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 161579008. Throughput: 0: 1668.9, 1: 1699.7. Samples: 40404166. Policy #0 lag: (min: 11.0, avg: 20.0, max: 43.0) +[2023-10-11 18:15:11,064][84230] Avg episode reward: [(0, '44.760'), (1, '42.440')] +[2023-10-11 18:15:11,173][85176] Updated weights for policy 0, policy_version 78322 (0.0009) +[2023-10-11 18:15:11,549][85176] Updated weights for policy 0, policy_version 78332 (0.0010) +[2023-10-11 18:15:14,140][85175] Updated weights for policy 1, policy_version 79490 (0.0011) +[2023-10-11 18:15:14,512][85175] Updated weights for policy 1, policy_version 79500 (0.0009) +[2023-10-11 18:15:14,877][85175] Updated weights for policy 1, policy_version 79510 (0.0009) +[2023-10-11 18:15:15,236][85175] Updated weights for policy 1, policy_version 79520 (0.0008) +[2023-10-11 18:15:15,530][85176] Updated weights for policy 0, policy_version 78342 (0.0008) +[2023-10-11 18:15:15,910][85176] Updated weights for policy 0, policy_version 78352 (0.0008) +[2023-10-11 18:15:16,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 161644544. Throughput: 0: 1658.4, 1: 1682.5. Samples: 40423900. 
Policy #0 lag: (min: 11.0, avg: 20.0, max: 43.0) +[2023-10-11 18:15:16,064][84230] Avg episode reward: [(0, '42.790'), (1, '47.020')] +[2023-10-11 18:15:16,287][85176] Updated weights for policy 0, policy_version 78362 (0.0009) +[2023-10-11 18:15:19,116][85175] Updated weights for policy 1, policy_version 79530 (0.0007) +[2023-10-11 18:15:19,485][85175] Updated weights for policy 1, policy_version 79540 (0.0007) +[2023-10-11 18:15:19,849][85175] Updated weights for policy 1, policy_version 79550 (0.0008) +[2023-10-11 18:15:20,418][85176] Updated weights for policy 0, policy_version 78372 (0.0009) +[2023-10-11 18:15:20,809][85176] Updated weights for policy 0, policy_version 78382 (0.0008) +[2023-10-11 18:15:21,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 161710080. Throughput: 0: 1668.8, 1: 1717.4. Samples: 40434842. Policy #0 lag: (min: 11.0, avg: 20.0, max: 43.0) +[2023-10-11 18:15:21,063][84230] Avg episode reward: [(0, '43.300'), (1, '43.760')] +[2023-10-11 18:15:21,183][85176] Updated weights for policy 0, policy_version 78392 (0.0009) +[2023-10-11 18:15:23,918][85175] Updated weights for policy 1, policy_version 79560 (0.0008) +[2023-10-11 18:15:24,283][85175] Updated weights for policy 1, policy_version 79570 (0.0009) +[2023-10-11 18:15:24,661][85175] Updated weights for policy 1, policy_version 79580 (0.0008) +[2023-10-11 18:15:25,152][85176] Updated weights for policy 0, policy_version 78402 (0.0011) +[2023-10-11 18:15:25,534][85176] Updated weights for policy 0, policy_version 78412 (0.0008) +[2023-10-11 18:15:25,897][85176] Updated weights for policy 0, policy_version 78422 (0.0009) +[2023-10-11 18:15:26,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 161775616. Throughput: 0: 1674.9, 1: 1691.8. Samples: 40454630. 
Policy #0 lag: (min: 11.0, avg: 20.0, max: 43.0) +[2023-10-11 18:15:26,063][84230] Avg episode reward: [(0, '42.530'), (1, '48.600')] +[2023-10-11 18:15:26,273][85176] Updated weights for policy 0, policy_version 78432 (0.0009) +[2023-10-11 18:15:28,698][85175] Updated weights for policy 1, policy_version 79590 (0.0010) +[2023-10-11 18:15:29,083][85175] Updated weights for policy 1, policy_version 79600 (0.0011) +[2023-10-11 18:15:29,459][85175] Updated weights for policy 1, policy_version 79610 (0.0010) +[2023-10-11 18:15:30,341][85176] Updated weights for policy 0, policy_version 78442 (0.0011) +[2023-10-11 18:15:30,713][85176] Updated weights for policy 0, policy_version 78452 (0.0010) +[2023-10-11 18:15:31,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 161841152. Throughput: 0: 1664.7, 1: 1694.8. Samples: 40474460. Policy #0 lag: (min: 11.0, avg: 20.0, max: 43.0) +[2023-10-11 18:15:31,063][84230] Avg episode reward: [(0, '42.800'), (1, '43.840')] +[2023-10-11 18:15:31,094][85176] Updated weights for policy 0, policy_version 78462 (0.0010) +[2023-10-11 18:15:33,418][85175] Updated weights for policy 1, policy_version 79620 (0.0010) +[2023-10-11 18:15:33,787][85175] Updated weights for policy 1, policy_version 79630 (0.0010) +[2023-10-11 18:15:34,152][85175] Updated weights for policy 1, policy_version 79640 (0.0009) +[2023-10-11 18:15:35,040][85176] Updated weights for policy 0, policy_version 78472 (0.0008) +[2023-10-11 18:15:35,427][85176] Updated weights for policy 0, policy_version 78482 (0.0011) +[2023-10-11 18:15:35,799][85176] Updated weights for policy 0, policy_version 78492 (0.0010) +[2023-10-11 18:15:36,063][84230] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 161939456. Throughput: 0: 1680.6, 1: 1708.3. Samples: 40485260. 
Policy #0 lag: (min: 11.0, avg: 20.0, max: 43.0) +[2023-10-11 18:15:36,063][84230] Avg episode reward: [(0, '42.770'), (1, '49.140')] +[2023-10-11 18:15:37,979][85175] Updated weights for policy 1, policy_version 79650 (0.0009) +[2023-10-11 18:15:38,348][85175] Updated weights for policy 1, policy_version 79660 (0.0008) +[2023-10-11 18:15:38,712][85175] Updated weights for policy 1, policy_version 79670 (0.0009) +[2023-10-11 18:15:39,080][85175] Updated weights for policy 1, policy_version 79680 (0.0008) +[2023-10-11 18:15:39,734][85176] Updated weights for policy 0, policy_version 78502 (0.0009) +[2023-10-11 18:15:40,106][85176] Updated weights for policy 0, policy_version 78512 (0.0008) +[2023-10-11 18:15:40,478][85176] Updated weights for policy 0, policy_version 78522 (0.0009) +[2023-10-11 18:15:41,063][84230] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 162004992. Throughput: 0: 1681.4, 1: 1688.6. Samples: 40505048. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) +[2023-10-11 18:15:41,063][84230] Avg episode reward: [(0, '43.600'), (1, '42.220')] +[2023-10-11 18:15:43,000][85175] Updated weights for policy 1, policy_version 79690 (0.0007) +[2023-10-11 18:15:43,365][85175] Updated weights for policy 1, policy_version 79700 (0.0007) +[2023-10-11 18:15:43,732][85175] Updated weights for policy 1, policy_version 79710 (0.0009) +[2023-10-11 18:15:44,451][85176] Updated weights for policy 0, policy_version 78532 (0.0009) +[2023-10-11 18:15:44,820][85176] Updated weights for policy 0, policy_version 78542 (0.0010) +[2023-10-11 18:15:45,187][85176] Updated weights for policy 0, policy_version 78552 (0.0010) +[2023-10-11 18:15:46,063][84230] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 162070528. Throughput: 0: 1654.7, 1: 1709.3. Samples: 40525030. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) +[2023-10-11 18:15:46,063][84230] Avg episode reward: [(0, '45.160'), (1, '47.040')] +[2023-10-11 18:15:47,937][85175] Updated weights for policy 1, policy_version 79720 (0.0010) +[2023-10-11 18:15:48,307][85175] Updated weights for policy 1, policy_version 79730 (0.0008) +[2023-10-11 18:15:48,668][85175] Updated weights for policy 1, policy_version 79740 (0.0007) +[2023-10-11 18:15:49,268][85176] Updated weights for policy 0, policy_version 78562 (0.0009) +[2023-10-11 18:15:49,654][85176] Updated weights for policy 0, policy_version 78572 (0.0010) +[2023-10-11 18:15:50,027][85176] Updated weights for policy 0, policy_version 78582 (0.0007) +[2023-10-11 18:15:50,389][85176] Updated weights for policy 0, policy_version 78592 (0.0007) +[2023-10-11 18:15:51,062][84230] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 162136064. Throughput: 0: 1687.5, 1: 1692.7. Samples: 40535856. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) +[2023-10-11 18:15:51,063][84230] Avg episode reward: [(0, '46.220'), (1, '41.580')] +[2023-10-11 18:15:52,503][85175] Updated weights for policy 1, policy_version 79750 (0.0009) +[2023-10-11 18:15:52,870][85175] Updated weights for policy 1, policy_version 79760 (0.0007) +[2023-10-11 18:15:53,248][85175] Updated weights for policy 1, policy_version 79770 (0.0008) +[2023-10-11 18:15:54,552][85176] Updated weights for policy 0, policy_version 78602 (0.0008) +[2023-10-11 18:15:54,928][85176] Updated weights for policy 0, policy_version 78612 (0.0007) +[2023-10-11 18:15:55,300][85176] Updated weights for policy 0, policy_version 78622 (0.0009) +[2023-10-11 18:15:56,062][84230] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 162201600. Throughput: 0: 1674.4, 1: 1700.6. Samples: 40556040. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) +[2023-10-11 18:15:56,063][84230] Avg episode reward: [(0, '45.750'), (1, '47.060')] +[2023-10-11 18:15:57,217][85175] Updated weights for policy 1, policy_version 79780 (0.0007) +[2023-10-11 18:15:57,590][85175] Updated weights for policy 1, policy_version 79790 (0.0008) +[2023-10-11 18:15:57,960][85175] Updated weights for policy 1, policy_version 79800 (0.0008) +[2023-10-11 18:15:59,383][85176] Updated weights for policy 0, policy_version 78632 (0.0009) +[2023-10-11 18:15:59,748][85176] Updated weights for policy 0, policy_version 78642 (0.0009) +[2023-10-11 18:16:00,120][85176] Updated weights for policy 0, policy_version 78652 (0.0009) +[2023-10-11 18:16:01,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 162267136. Throughput: 0: 1659.3, 1: 1724.3. Samples: 40576162. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) +[2023-10-11 18:16:01,063][84230] Avg episode reward: [(0, '45.110'), (1, '42.410')] +[2023-10-11 18:16:02,084][85175] Updated weights for policy 1, policy_version 79810 (0.0007) +[2023-10-11 18:16:02,458][85175] Updated weights for policy 1, policy_version 79820 (0.0007) +[2023-10-11 18:16:02,820][85175] Updated weights for policy 1, policy_version 79830 (0.0009) +[2023-10-11 18:16:03,186][85175] Updated weights for policy 1, policy_version 79840 (0.0007) +[2023-10-11 18:16:04,206][85176] Updated weights for policy 0, policy_version 78662 (0.0008) +[2023-10-11 18:16:04,573][85176] Updated weights for policy 0, policy_version 78672 (0.0008) +[2023-10-11 18:16:04,938][85176] Updated weights for policy 0, policy_version 78682 (0.0008) +[2023-10-11 18:16:06,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 162332672. Throughput: 0: 1681.2, 1: 1686.2. Samples: 40586374. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) +[2023-10-11 18:16:06,063][84230] Avg episode reward: [(0, '43.330'), (1, '46.880')] +[2023-10-11 18:16:07,103][85175] Updated weights for policy 1, policy_version 79850 (0.0008) +[2023-10-11 18:16:07,469][85175] Updated weights for policy 1, policy_version 79860 (0.0010) +[2023-10-11 18:16:07,847][85175] Updated weights for policy 1, policy_version 79870 (0.0011) +[2023-10-11 18:16:09,114][85176] Updated weights for policy 0, policy_version 78692 (0.0008) +[2023-10-11 18:16:09,504][85176] Updated weights for policy 0, policy_version 78702 (0.0010) +[2023-10-11 18:16:09,874][85176] Updated weights for policy 0, policy_version 78712 (0.0009) +[2023-10-11 18:16:11,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 162398208. Throughput: 0: 1666.7, 1: 1713.8. Samples: 40606752. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) +[2023-10-11 18:16:11,064][84230] Avg episode reward: [(0, '43.190'), (1, '43.980')] +[2023-10-11 18:16:11,889][85175] Updated weights for policy 1, policy_version 79880 (0.0008) +[2023-10-11 18:16:12,261][85175] Updated weights for policy 1, policy_version 79890 (0.0007) +[2023-10-11 18:16:12,634][85175] Updated weights for policy 1, policy_version 79900 (0.0009) +[2023-10-11 18:16:13,905][85176] Updated weights for policy 0, policy_version 78722 (0.0011) +[2023-10-11 18:16:14,281][85176] Updated weights for policy 0, policy_version 78732 (0.0009) +[2023-10-11 18:16:14,646][85176] Updated weights for policy 0, policy_version 78742 (0.0008) +[2023-10-11 18:16:15,021][85176] Updated weights for policy 0, policy_version 78752 (0.0007) +[2023-10-11 18:16:16,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 162463744. Throughput: 0: 1665.9, 1: 1720.4. Samples: 40626848. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) +[2023-10-11 18:16:16,064][84230] Avg episode reward: [(0, '44.580'), (1, '44.910')] +[2023-10-11 18:16:16,075][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000079904_81821696.pth... +[2023-10-11 18:16:16,075][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000078752_80642048.pth... +[2023-10-11 18:16:16,115][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000077184_79036416.pth +[2023-10-11 18:16:16,116][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000078336_80216064.pth +[2023-10-11 18:16:16,726][85175] Updated weights for policy 1, policy_version 79910 (0.0009) +[2023-10-11 18:16:17,115][85175] Updated weights for policy 1, policy_version 79920 (0.0009) +[2023-10-11 18:16:17,479][85175] Updated weights for policy 1, policy_version 79930 (0.0008) +[2023-10-11 18:16:19,124][85176] Updated weights for policy 0, policy_version 78762 (0.0009) +[2023-10-11 18:16:19,504][85176] Updated weights for policy 0, policy_version 78772 (0.0008) +[2023-10-11 18:16:19,878][85176] Updated weights for policy 0, policy_version 78782 (0.0010) +[2023-10-11 18:16:21,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 162529280. Throughput: 0: 1682.3, 1: 1693.5. Samples: 40637170. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) +[2023-10-11 18:16:21,063][84230] Avg episode reward: [(0, '42.950'), (1, '43.120')] +[2023-10-11 18:16:21,618][85175] Updated weights for policy 1, policy_version 79940 (0.0010) +[2023-10-11 18:16:21,987][85175] Updated weights for policy 1, policy_version 79950 (0.0007) +[2023-10-11 18:16:22,361][85175] Updated weights for policy 1, policy_version 79960 (0.0007) +[2023-10-11 18:16:23,904][85176] Updated weights for policy 0, policy_version 78792 (0.0009) +[2023-10-11 18:16:24,279][85176] Updated weights for policy 0, policy_version 78802 (0.0009) +[2023-10-11 18:16:24,662][85176] Updated weights for policy 0, policy_version 78812 (0.0007) +[2023-10-11 18:16:26,063][84230] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 162594816. Throughput: 0: 1660.3, 1: 1717.9. Samples: 40657068. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) +[2023-10-11 18:16:26,063][84230] Avg episode reward: [(0, '46.750'), (1, '45.220')] +[2023-10-11 18:16:26,157][85175] Updated weights for policy 1, policy_version 79970 (0.0008) +[2023-10-11 18:16:26,530][85175] Updated weights for policy 1, policy_version 79980 (0.0008) +[2023-10-11 18:16:26,890][85175] Updated weights for policy 1, policy_version 79990 (0.0008) +[2023-10-11 18:16:27,261][85175] Updated weights for policy 1, policy_version 80000 (0.0007) +[2023-10-11 18:16:28,601][85176] Updated weights for policy 0, policy_version 78822 (0.0008) +[2023-10-11 18:16:28,978][85176] Updated weights for policy 0, policy_version 78832 (0.0008) +[2023-10-11 18:16:29,352][85176] Updated weights for policy 0, policy_version 78842 (0.0008) +[2023-10-11 18:16:31,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 162660352. Throughput: 0: 1679.5, 1: 1712.9. Samples: 40677688. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) +[2023-10-11 18:16:31,063][84230] Avg episode reward: [(0, '43.930'), (1, '44.430')] +[2023-10-11 18:16:31,313][85175] Updated weights for policy 1, policy_version 80010 (0.0008) +[2023-10-11 18:16:31,674][85175] Updated weights for policy 1, policy_version 80020 (0.0008) +[2023-10-11 18:16:32,045][85175] Updated weights for policy 1, policy_version 80030 (0.0008) +[2023-10-11 18:16:33,570][85176] Updated weights for policy 0, policy_version 78852 (0.0009) +[2023-10-11 18:16:33,940][85176] Updated weights for policy 0, policy_version 78862 (0.0008) +[2023-10-11 18:16:34,310][85176] Updated weights for policy 0, policy_version 78872 (0.0008) +[2023-10-11 18:16:35,994][85175] Updated weights for policy 1, policy_version 80040 (0.0008) +[2023-10-11 18:16:36,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 162725888. Throughput: 0: 1674.2, 1: 1704.8. Samples: 40687914. Policy #0 lag: (min: 22.0, avg: 35.8, max: 54.0) +[2023-10-11 18:16:36,063][84230] Avg episode reward: [(0, '45.510'), (1, '42.030')] +[2023-10-11 18:16:36,363][85175] Updated weights for policy 1, policy_version 80050 (0.0008) +[2023-10-11 18:16:36,735][85175] Updated weights for policy 1, policy_version 80060 (0.0008) +[2023-10-11 18:16:38,421][85176] Updated weights for policy 0, policy_version 78882 (0.0010) +[2023-10-11 18:16:38,794][85176] Updated weights for policy 0, policy_version 78892 (0.0009) +[2023-10-11 18:16:39,164][85176] Updated weights for policy 0, policy_version 78902 (0.0010) +[2023-10-11 18:16:39,532][85176] Updated weights for policy 0, policy_version 78912 (0.0007) +[2023-10-11 18:16:40,605][85175] Updated weights for policy 1, policy_version 80070 (0.0008) +[2023-10-11 18:16:40,974][85175] Updated weights for policy 1, policy_version 80080 (0.0007) +[2023-10-11 18:16:41,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 162791424. 
Throughput: 0: 1663.6, 1: 1708.3. Samples: 40707774. Policy #0 lag: (min: 22.0, avg: 35.8, max: 54.0) +[2023-10-11 18:16:41,063][84230] Avg episode reward: [(0, '42.430'), (1, '43.270')] +[2023-10-11 18:16:41,342][85175] Updated weights for policy 1, policy_version 80090 (0.0007) +[2023-10-11 18:16:43,647][85176] Updated weights for policy 0, policy_version 78922 (0.0010) +[2023-10-11 18:16:44,012][85176] Updated weights for policy 0, policy_version 78932 (0.0009) +[2023-10-11 18:16:44,389][85176] Updated weights for policy 0, policy_version 78942 (0.0011) +[2023-10-11 18:16:45,511][85175] Updated weights for policy 1, policy_version 80100 (0.0008) +[2023-10-11 18:16:45,888][85175] Updated weights for policy 1, policy_version 80110 (0.0008) +[2023-10-11 18:16:46,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 162856960. Throughput: 0: 1680.5, 1: 1698.9. Samples: 40728236. Policy #0 lag: (min: 22.0, avg: 35.8, max: 54.0) +[2023-10-11 18:16:46,063][84230] Avg episode reward: [(0, '45.220'), (1, '43.130')] +[2023-10-11 18:16:46,258][85175] Updated weights for policy 1, policy_version 80120 (0.0007) +[2023-10-11 18:16:48,323][85176] Updated weights for policy 0, policy_version 78952 (0.0009) +[2023-10-11 18:16:48,704][85176] Updated weights for policy 0, policy_version 78962 (0.0008) +[2023-10-11 18:16:49,071][85176] Updated weights for policy 0, policy_version 78972 (0.0009) +[2023-10-11 18:16:50,447][85175] Updated weights for policy 1, policy_version 80130 (0.0008) +[2023-10-11 18:16:50,811][85175] Updated weights for policy 1, policy_version 80140 (0.0009) +[2023-10-11 18:16:51,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 162922496. Throughput: 0: 1670.6, 1: 1703.7. Samples: 40738220. 
Policy #0 lag: (min: 22.0, avg: 35.8, max: 54.0) +[2023-10-11 18:16:51,064][84230] Avg episode reward: [(0, '44.850'), (1, '47.470')] +[2023-10-11 18:16:51,170][85175] Updated weights for policy 1, policy_version 80150 (0.0009) +[2023-10-11 18:16:51,539][85175] Updated weights for policy 1, policy_version 80160 (0.0010) +[2023-10-11 18:16:53,204][85176] Updated weights for policy 0, policy_version 78982 (0.0010) +[2023-10-11 18:16:53,571][85176] Updated weights for policy 0, policy_version 78992 (0.0008) +[2023-10-11 18:16:53,949][85176] Updated weights for policy 0, policy_version 79002 (0.0008) +[2023-10-11 18:16:55,630][85175] Updated weights for policy 1, policy_version 80170 (0.0009) +[2023-10-11 18:16:55,994][85175] Updated weights for policy 1, policy_version 80180 (0.0007) +[2023-10-11 18:16:56,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 162988032. Throughput: 0: 1670.4, 1: 1696.4. Samples: 40758258. Policy #0 lag: (min: 22.0, avg: 35.8, max: 54.0) +[2023-10-11 18:16:56,064][84230] Avg episode reward: [(0, '45.420'), (1, '45.920')] +[2023-10-11 18:16:56,361][85175] Updated weights for policy 1, policy_version 80190 (0.0007) +[2023-10-11 18:16:57,978][85176] Updated weights for policy 0, policy_version 79012 (0.0009) +[2023-10-11 18:16:58,350][85176] Updated weights for policy 0, policy_version 79022 (0.0008) +[2023-10-11 18:16:58,709][85176] Updated weights for policy 0, policy_version 79032 (0.0008) +[2023-10-11 18:17:00,297][85175] Updated weights for policy 1, policy_version 80200 (0.0011) +[2023-10-11 18:17:00,664][85175] Updated weights for policy 1, policy_version 80210 (0.0009) +[2023-10-11 18:17:01,032][85175] Updated weights for policy 1, policy_version 80220 (0.0007) +[2023-10-11 18:17:01,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 163053568. Throughput: 0: 1686.1, 1: 1681.9. Samples: 40778408. 
Policy #0 lag: (min: 22.0, avg: 35.8, max: 54.0) +[2023-10-11 18:17:01,064][84230] Avg episode reward: [(0, '44.530'), (1, '49.560')] +[2023-10-11 18:17:02,827][85176] Updated weights for policy 0, policy_version 79042 (0.0007) +[2023-10-11 18:17:03,204][85176] Updated weights for policy 0, policy_version 79052 (0.0008) +[2023-10-11 18:17:03,578][85176] Updated weights for policy 0, policy_version 79062 (0.0007) +[2023-10-11 18:17:03,956][85176] Updated weights for policy 0, policy_version 79072 (0.0007) +[2023-10-11 18:17:05,106][85175] Updated weights for policy 1, policy_version 80230 (0.0007) +[2023-10-11 18:17:05,495][85175] Updated weights for policy 1, policy_version 80240 (0.0010) +[2023-10-11 18:17:05,863][85175] Updated weights for policy 1, policy_version 80250 (0.0008) +[2023-10-11 18:17:06,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 163119104. Throughput: 0: 1666.4, 1: 1698.4. Samples: 40788588. Policy #0 lag: (min: 22.0, avg: 35.8, max: 54.0) +[2023-10-11 18:17:06,064][84230] Avg episode reward: [(0, '45.720'), (1, '42.750')] +[2023-10-11 18:17:08,024][85176] Updated weights for policy 0, policy_version 79082 (0.0009) +[2023-10-11 18:17:08,406][85176] Updated weights for policy 0, policy_version 79092 (0.0008) +[2023-10-11 18:17:08,778][85176] Updated weights for policy 0, policy_version 79102 (0.0008) +[2023-10-11 18:17:09,945][85175] Updated weights for policy 1, policy_version 80260 (0.0010) +[2023-10-11 18:17:10,309][85175] Updated weights for policy 1, policy_version 80270 (0.0008) +[2023-10-11 18:17:10,685][85175] Updated weights for policy 1, policy_version 80280 (0.0009) +[2023-10-11 18:17:11,062][84230] Fps is (10 sec: 16384.4, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 163217408. Throughput: 0: 1678.9, 1: 1689.4. Samples: 40808642. 
Policy #0 lag: (min: 22.0, avg: 35.8, max: 54.0) +[2023-10-11 18:17:11,063][84230] Avg episode reward: [(0, '44.020'), (1, '46.690')] +[2023-10-11 18:17:12,649][85176] Updated weights for policy 0, policy_version 79112 (0.0009) +[2023-10-11 18:17:13,018][85176] Updated weights for policy 0, policy_version 79122 (0.0007) +[2023-10-11 18:17:13,398][85176] Updated weights for policy 0, policy_version 79132 (0.0009) +[2023-10-11 18:17:14,689][85175] Updated weights for policy 1, policy_version 80290 (0.0009) +[2023-10-11 18:17:15,051][85175] Updated weights for policy 1, policy_version 80300 (0.0010) +[2023-10-11 18:17:15,419][85175] Updated weights for policy 1, policy_version 80310 (0.0009) +[2023-10-11 18:17:15,782][85175] Updated weights for policy 1, policy_version 80320 (0.0008) +[2023-10-11 18:17:16,062][84230] Fps is (10 sec: 16384.4, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 163282944. Throughput: 0: 1689.0, 1: 1668.9. Samples: 40828796. Policy #0 lag: (min: 22.0, avg: 35.8, max: 54.0) +[2023-10-11 18:17:16,063][84230] Avg episode reward: [(0, '43.850'), (1, '42.310')] +[2023-10-11 18:17:17,377][85176] Updated weights for policy 0, policy_version 79142 (0.0007) +[2023-10-11 18:17:17,741][85176] Updated weights for policy 0, policy_version 79152 (0.0008) +[2023-10-11 18:17:18,105][85176] Updated weights for policy 0, policy_version 79162 (0.0008) +[2023-10-11 18:17:19,676][85175] Updated weights for policy 1, policy_version 80330 (0.0008) +[2023-10-11 18:17:20,037][85175] Updated weights for policy 1, policy_version 80340 (0.0010) +[2023-10-11 18:17:20,402][85175] Updated weights for policy 1, policy_version 80350 (0.0008) +[2023-10-11 18:17:21,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 163348480. Throughput: 0: 1661.7, 1: 1691.5. Samples: 40838808. 
Policy #0 lag: (min: 22.0, avg: 35.8, max: 54.0) +[2023-10-11 18:17:21,063][84230] Avg episode reward: [(0, '42.850'), (1, '46.520')] +[2023-10-11 18:17:22,258][85176] Updated weights for policy 0, policy_version 79172 (0.0010) +[2023-10-11 18:17:22,630][85176] Updated weights for policy 0, policy_version 79182 (0.0010) +[2023-10-11 18:17:23,011][85176] Updated weights for policy 0, policy_version 79192 (0.0008) +[2023-10-11 18:17:24,546][85175] Updated weights for policy 1, policy_version 80360 (0.0007) +[2023-10-11 18:17:24,918][85175] Updated weights for policy 1, policy_version 80370 (0.0007) +[2023-10-11 18:17:25,284][85175] Updated weights for policy 1, policy_version 80380 (0.0008) +[2023-10-11 18:17:26,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 163414016. Throughput: 0: 1683.0, 1: 1681.1. Samples: 40859158. Policy #0 lag: (min: 22.0, avg: 35.8, max: 54.0) +[2023-10-11 18:17:26,064][84230] Avg episode reward: [(0, '44.290'), (1, '44.740')] +[2023-10-11 18:17:27,150][85176] Updated weights for policy 0, policy_version 79202 (0.0008) +[2023-10-11 18:17:27,528][85176] Updated weights for policy 0, policy_version 79212 (0.0009) +[2023-10-11 18:17:27,889][85176] Updated weights for policy 0, policy_version 79222 (0.0009) +[2023-10-11 18:17:28,276][85176] Updated weights for policy 0, policy_version 79232 (0.0009) +[2023-10-11 18:17:29,352][85175] Updated weights for policy 1, policy_version 80390 (0.0009) +[2023-10-11 18:17:29,716][85175] Updated weights for policy 1, policy_version 80400 (0.0007) +[2023-10-11 18:17:30,085][85175] Updated weights for policy 1, policy_version 80410 (0.0007) +[2023-10-11 18:17:31,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 163479552. Throughput: 0: 1685.5, 1: 1667.0. Samples: 40879098. 
Policy #0 lag: (min: 1.0, avg: 8.8, max: 33.0) +[2023-10-11 18:17:31,064][84230] Avg episode reward: [(0, '44.220'), (1, '46.960')] +[2023-10-11 18:17:32,435][85176] Updated weights for policy 0, policy_version 79242 (0.0007) +[2023-10-11 18:17:32,804][85176] Updated weights for policy 0, policy_version 79252 (0.0010) +[2023-10-11 18:17:33,174][85176] Updated weights for policy 0, policy_version 79262 (0.0007) +[2023-10-11 18:17:33,959][85175] Updated weights for policy 1, policy_version 80420 (0.0009) +[2023-10-11 18:17:34,328][85175] Updated weights for policy 1, policy_version 80430 (0.0007) +[2023-10-11 18:17:34,697][85175] Updated weights for policy 1, policy_version 80440 (0.0008) +[2023-10-11 18:17:36,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 163545088. Throughput: 0: 1666.8, 1: 1699.7. Samples: 40889710. Policy #0 lag: (min: 1.0, avg: 8.8, max: 33.0) +[2023-10-11 18:17:36,063][84230] Avg episode reward: [(0, '45.410'), (1, '44.260')] +[2023-10-11 18:17:37,251][85176] Updated weights for policy 0, policy_version 79272 (0.0007) +[2023-10-11 18:17:37,633][85176] Updated weights for policy 0, policy_version 79282 (0.0007) +[2023-10-11 18:17:37,993][85176] Updated weights for policy 0, policy_version 79292 (0.0008) +[2023-10-11 18:17:38,815][85175] Updated weights for policy 1, policy_version 80450 (0.0008) +[2023-10-11 18:17:39,181][85175] Updated weights for policy 1, policy_version 80460 (0.0010) +[2023-10-11 18:17:39,553][85175] Updated weights for policy 1, policy_version 80470 (0.0008) +[2023-10-11 18:17:39,920][85175] Updated weights for policy 1, policy_version 80480 (0.0008) +[2023-10-11 18:17:41,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 163610624. Throughput: 0: 1683.5, 1: 1682.1. Samples: 40909710. 
Policy #0 lag: (min: 1.0, avg: 8.8, max: 33.0) +[2023-10-11 18:17:41,063][84230] Avg episode reward: [(0, '45.060'), (1, '45.100')] +[2023-10-11 18:17:41,870][85176] Updated weights for policy 0, policy_version 79302 (0.0009) +[2023-10-11 18:17:42,244][85176] Updated weights for policy 0, policy_version 79312 (0.0009) +[2023-10-11 18:17:42,614][85176] Updated weights for policy 0, policy_version 79322 (0.0010) +[2023-10-11 18:17:43,965][85175] Updated weights for policy 1, policy_version 80490 (0.0010) +[2023-10-11 18:17:44,322][85175] Updated weights for policy 1, policy_version 80500 (0.0008) +[2023-10-11 18:17:44,692][85175] Updated weights for policy 1, policy_version 80510 (0.0007) +[2023-10-11 18:17:46,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 163676160. Throughput: 0: 1685.4, 1: 1684.7. Samples: 40930060. Policy #0 lag: (min: 1.0, avg: 8.8, max: 33.0) +[2023-10-11 18:17:46,063][84230] Avg episode reward: [(0, '45.380'), (1, '44.660')] +[2023-10-11 18:17:46,759][85176] Updated weights for policy 0, policy_version 79332 (0.0009) +[2023-10-11 18:17:47,148][85176] Updated weights for policy 0, policy_version 79342 (0.0010) +[2023-10-11 18:17:47,523][85176] Updated weights for policy 0, policy_version 79352 (0.0007) +[2023-10-11 18:17:48,737][85175] Updated weights for policy 1, policy_version 80520 (0.0010) +[2023-10-11 18:17:49,104][85175] Updated weights for policy 1, policy_version 80530 (0.0007) +[2023-10-11 18:17:49,471][85175] Updated weights for policy 1, policy_version 80540 (0.0008) +[2023-10-11 18:17:51,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 163741696. Throughput: 0: 1670.7, 1: 1698.8. Samples: 40940214. 
Policy #0 lag: (min: 1.0, avg: 8.8, max: 33.0) +[2023-10-11 18:17:51,063][84230] Avg episode reward: [(0, '44.170'), (1, '43.750')] +[2023-10-11 18:17:51,663][85176] Updated weights for policy 0, policy_version 79362 (0.0008) +[2023-10-11 18:17:52,041][85176] Updated weights for policy 0, policy_version 79372 (0.0009) +[2023-10-11 18:17:52,414][85176] Updated weights for policy 0, policy_version 79382 (0.0008) +[2023-10-11 18:17:52,786][85176] Updated weights for policy 0, policy_version 79392 (0.0008) +[2023-10-11 18:17:53,539][85175] Updated weights for policy 1, policy_version 80550 (0.0010) +[2023-10-11 18:17:53,909][85175] Updated weights for policy 1, policy_version 80560 (0.0009) +[2023-10-11 18:17:54,278][85175] Updated weights for policy 1, policy_version 80570 (0.0010) +[2023-10-11 18:17:56,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 163807232. Throughput: 0: 1682.1, 1: 1681.5. Samples: 40960002. Policy #0 lag: (min: 1.0, avg: 8.8, max: 33.0) +[2023-10-11 18:17:56,064][84230] Avg episode reward: [(0, '43.400'), (1, '42.440')] +[2023-10-11 18:17:56,721][85176] Updated weights for policy 0, policy_version 79402 (0.0007) +[2023-10-11 18:17:57,098][85176] Updated weights for policy 0, policy_version 79412 (0.0008) +[2023-10-11 18:17:57,470][85176] Updated weights for policy 0, policy_version 79422 (0.0008) +[2023-10-11 18:17:58,465][85175] Updated weights for policy 1, policy_version 80580 (0.0009) +[2023-10-11 18:17:58,856][85175] Updated weights for policy 1, policy_version 80590 (0.0009) +[2023-10-11 18:17:59,219][85175] Updated weights for policy 1, policy_version 80600 (0.0009) +[2023-10-11 18:18:01,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 163872768. Throughput: 0: 1680.0, 1: 1698.5. Samples: 40980830. 
Policy #0 lag: (min: 1.0, avg: 8.8, max: 33.0) +[2023-10-11 18:18:01,063][84230] Avg episode reward: [(0, '46.790'), (1, '43.510')] +[2023-10-11 18:18:01,487][85176] Updated weights for policy 0, policy_version 79432 (0.0008) +[2023-10-11 18:18:01,861][85176] Updated weights for policy 0, policy_version 79442 (0.0010) +[2023-10-11 18:18:02,232][85176] Updated weights for policy 0, policy_version 79452 (0.0009) +[2023-10-11 18:18:02,914][85175] Updated weights for policy 1, policy_version 80610 (0.0007) +[2023-10-11 18:18:03,280][85175] Updated weights for policy 1, policy_version 80620 (0.0010) +[2023-10-11 18:18:03,639][85175] Updated weights for policy 1, policy_version 80630 (0.0009) +[2023-10-11 18:18:04,009][85175] Updated weights for policy 1, policy_version 80640 (0.0008) +[2023-10-11 18:18:06,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 163938304. Throughput: 0: 1679.7, 1: 1693.4. Samples: 40990596. Policy #0 lag: (min: 1.0, avg: 8.8, max: 33.0) +[2023-10-11 18:18:06,063][84230] Avg episode reward: [(0, '44.120'), (1, '47.200')] +[2023-10-11 18:18:06,317][85176] Updated weights for policy 0, policy_version 79462 (0.0007) +[2023-10-11 18:18:06,690][85176] Updated weights for policy 0, policy_version 79472 (0.0007) +[2023-10-11 18:18:07,057][85176] Updated weights for policy 0, policy_version 79482 (0.0007) +[2023-10-11 18:18:07,805][85175] Updated weights for policy 1, policy_version 80650 (0.0011) +[2023-10-11 18:18:08,172][85175] Updated weights for policy 1, policy_version 80660 (0.0009) +[2023-10-11 18:18:08,538][85175] Updated weights for policy 1, policy_version 80670 (0.0009) +[2023-10-11 18:18:11,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 164003840. Throughput: 0: 1678.8, 1: 1690.9. Samples: 41010796. 
Policy #0 lag: (min: 1.0, avg: 8.8, max: 33.0) +[2023-10-11 18:18:11,063][84230] Avg episode reward: [(0, '46.950'), (1, '44.160')] +[2023-10-11 18:18:11,094][85176] Updated weights for policy 0, policy_version 79492 (0.0007) +[2023-10-11 18:18:11,465][85176] Updated weights for policy 0, policy_version 79502 (0.0007) +[2023-10-11 18:18:11,846][85176] Updated weights for policy 0, policy_version 79512 (0.0008) +[2023-10-11 18:18:12,733][85175] Updated weights for policy 1, policy_version 80680 (0.0008) +[2023-10-11 18:18:13,105][85175] Updated weights for policy 1, policy_version 80690 (0.0009) +[2023-10-11 18:18:13,470][85175] Updated weights for policy 1, policy_version 80700 (0.0009) +[2023-10-11 18:18:15,966][85176] Updated weights for policy 0, policy_version 79522 (0.0008) +[2023-10-11 18:18:16,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 164069376. Throughput: 0: 1684.9, 1: 1711.7. Samples: 41031946. Policy #0 lag: (min: 1.0, avg: 8.8, max: 33.0) +[2023-10-11 18:18:16,064][84230] Avg episode reward: [(0, '45.330'), (1, '46.220')] +[2023-10-11 18:18:16,074][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000080704_82640896.pth... +[2023-10-11 18:18:16,107][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000079136_81035264.pth +[2023-10-11 18:18:16,338][85176] Updated weights for policy 0, policy_version 79532 (0.0009) +[2023-10-11 18:18:16,712][85176] Updated weights for policy 0, policy_version 79542 (0.0008) +[2023-10-11 18:18:17,081][85176] Updated weights for policy 0, policy_version 79552 (0.0008) +[2023-10-11 18:18:17,082][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000079552_81461248.pth... 
+[2023-10-11 18:18:17,122][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000077984_79855616.pth +[2023-10-11 18:18:17,454][85175] Updated weights for policy 1, policy_version 80710 (0.0009) +[2023-10-11 18:18:17,813][85175] Updated weights for policy 1, policy_version 80720 (0.0009) +[2023-10-11 18:18:18,184][85175] Updated weights for policy 1, policy_version 80730 (0.0007) +[2023-10-11 18:18:21,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 164134912. Throughput: 0: 1684.6, 1: 1682.1. Samples: 41041212. Policy #0 lag: (min: 8.0, avg: 29.5, max: 40.0) +[2023-10-11 18:18:21,063][84230] Avg episode reward: [(0, '46.630'), (1, '42.200')] +[2023-10-11 18:18:21,201][85176] Updated weights for policy 0, policy_version 79562 (0.0009) +[2023-10-11 18:18:21,578][85176] Updated weights for policy 0, policy_version 79572 (0.0007) +[2023-10-11 18:18:21,951][85176] Updated weights for policy 0, policy_version 79582 (0.0007) +[2023-10-11 18:18:22,076][85175] Updated weights for policy 1, policy_version 80740 (0.0008) +[2023-10-11 18:18:22,440][85175] Updated weights for policy 1, policy_version 80750 (0.0008) +[2023-10-11 18:18:22,816][85175] Updated weights for policy 1, policy_version 80760 (0.0011) +[2023-10-11 18:18:26,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 164200448. Throughput: 0: 1681.4, 1: 1699.4. Samples: 41061848. 
Policy #0 lag: (min: 8.0, avg: 29.5, max: 40.0) +[2023-10-11 18:18:26,063][84230] Avg episode reward: [(0, '44.590'), (1, '46.350')] +[2023-10-11 18:18:26,143][85176] Updated weights for policy 0, policy_version 79592 (0.0007) +[2023-10-11 18:18:26,516][85176] Updated weights for policy 0, policy_version 79602 (0.0009) +[2023-10-11 18:18:26,743][85175] Updated weights for policy 1, policy_version 80770 (0.0008) +[2023-10-11 18:18:26,897][85176] Updated weights for policy 0, policy_version 79612 (0.0009) +[2023-10-11 18:18:27,113][85175] Updated weights for policy 1, policy_version 80780 (0.0007) +[2023-10-11 18:18:27,478][85175] Updated weights for policy 1, policy_version 80790 (0.0007) +[2023-10-11 18:18:27,844][85175] Updated weights for policy 1, policy_version 80800 (0.0010) +[2023-10-11 18:18:31,026][85176] Updated weights for policy 0, policy_version 79622 (0.0007) +[2023-10-11 18:18:31,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 164265984. Throughput: 0: 1677.0, 1: 1716.4. Samples: 41082764. Policy #0 lag: (min: 8.0, avg: 29.5, max: 40.0) +[2023-10-11 18:18:31,063][84230] Avg episode reward: [(0, '43.800'), (1, '43.420')] +[2023-10-11 18:18:31,399][85176] Updated weights for policy 0, policy_version 79632 (0.0010) +[2023-10-11 18:18:31,773][85176] Updated weights for policy 0, policy_version 79642 (0.0009) +[2023-10-11 18:18:31,905][85175] Updated weights for policy 1, policy_version 80810 (0.0007) +[2023-10-11 18:18:32,272][85175] Updated weights for policy 1, policy_version 80820 (0.0009) +[2023-10-11 18:18:32,642][85175] Updated weights for policy 1, policy_version 80830 (0.0009) +[2023-10-11 18:18:35,926][85176] Updated weights for policy 0, policy_version 79652 (0.0009) +[2023-10-11 18:18:36,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 164331520. Throughput: 0: 1683.1, 1: 1691.9. Samples: 41092092. 
Policy #0 lag: (min: 8.0, avg: 29.5, max: 40.0) +[2023-10-11 18:18:36,064][84230] Avg episode reward: [(0, '42.670'), (1, '47.390')] +[2023-10-11 18:18:36,310][85176] Updated weights for policy 0, policy_version 79662 (0.0007) +[2023-10-11 18:18:36,575][85175] Updated weights for policy 1, policy_version 80840 (0.0007) +[2023-10-11 18:18:36,688][85176] Updated weights for policy 0, policy_version 79672 (0.0007) +[2023-10-11 18:18:36,948][85175] Updated weights for policy 1, policy_version 80850 (0.0007) +[2023-10-11 18:18:37,325][85175] Updated weights for policy 1, policy_version 80860 (0.0007) +[2023-10-11 18:18:40,614][85176] Updated weights for policy 0, policy_version 79682 (0.0008) +[2023-10-11 18:18:40,980][85176] Updated weights for policy 0, policy_version 79692 (0.0009) +[2023-10-11 18:18:41,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 164397056. Throughput: 0: 1677.8, 1: 1712.4. Samples: 41112560. Policy #0 lag: (min: 8.0, avg: 29.5, max: 40.0) +[2023-10-11 18:18:41,063][84230] Avg episode reward: [(0, '43.680'), (1, '42.590')] +[2023-10-11 18:18:41,336][85175] Updated weights for policy 1, policy_version 80870 (0.0007) +[2023-10-11 18:18:41,360][85176] Updated weights for policy 0, policy_version 79702 (0.0008) +[2023-10-11 18:18:41,706][85175] Updated weights for policy 1, policy_version 80880 (0.0007) +[2023-10-11 18:18:41,727][85176] Updated weights for policy 0, policy_version 79712 (0.0009) +[2023-10-11 18:18:42,075][85175] Updated weights for policy 1, policy_version 80890 (0.0008) +[2023-10-11 18:18:45,830][85176] Updated weights for policy 0, policy_version 79722 (0.0009) +[2023-10-11 18:18:46,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 164462592. Throughput: 0: 1668.4, 1: 1717.9. Samples: 41133212. 
Policy #0 lag: (min: 8.0, avg: 29.5, max: 40.0) +[2023-10-11 18:18:46,064][84230] Avg episode reward: [(0, '43.200'), (1, '47.590')] +[2023-10-11 18:18:46,214][85175] Updated weights for policy 1, policy_version 80900 (0.0007) +[2023-10-11 18:18:46,216][85176] Updated weights for policy 0, policy_version 79732 (0.0008) +[2023-10-11 18:18:46,590][85176] Updated weights for policy 0, policy_version 79742 (0.0009) +[2023-10-11 18:18:46,610][85175] Updated weights for policy 1, policy_version 80910 (0.0007) +[2023-10-11 18:18:46,978][85175] Updated weights for policy 1, policy_version 80920 (0.0009) +[2023-10-11 18:18:50,699][85176] Updated weights for policy 0, policy_version 79752 (0.0009) +[2023-10-11 18:18:50,995][85175] Updated weights for policy 1, policy_version 80930 (0.0010) +[2023-10-11 18:18:51,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 164528128. Throughput: 0: 1674.0, 1: 1696.5. Samples: 41142266. Policy #0 lag: (min: 8.0, avg: 29.5, max: 40.0) +[2023-10-11 18:18:51,063][84230] Avg episode reward: [(0, '44.330'), (1, '43.260')] +[2023-10-11 18:18:51,077][85176] Updated weights for policy 0, policy_version 79762 (0.0009) +[2023-10-11 18:18:51,359][85175] Updated weights for policy 1, policy_version 80940 (0.0008) +[2023-10-11 18:18:51,460][85176] Updated weights for policy 0, policy_version 79772 (0.0008) +[2023-10-11 18:18:51,727][85175] Updated weights for policy 1, policy_version 80950 (0.0008) +[2023-10-11 18:18:52,093][85175] Updated weights for policy 1, policy_version 80960 (0.0007) +[2023-10-11 18:18:55,584][85176] Updated weights for policy 0, policy_version 79782 (0.0008) +[2023-10-11 18:18:55,949][85176] Updated weights for policy 0, policy_version 79792 (0.0009) +[2023-10-11 18:18:56,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 164593664. Throughput: 0: 1669.2, 1: 1707.6. Samples: 41162750. 
Policy #0 lag: (min: 8.0, avg: 29.5, max: 40.0) +[2023-10-11 18:18:56,063][84230] Avg episode reward: [(0, '43.270'), (1, '47.360')] +[2023-10-11 18:18:56,119][85175] Updated weights for policy 1, policy_version 80970 (0.0008) +[2023-10-11 18:18:56,332][85176] Updated weights for policy 0, policy_version 79802 (0.0008) +[2023-10-11 18:18:56,496][85175] Updated weights for policy 1, policy_version 80980 (0.0008) +[2023-10-11 18:18:56,875][85175] Updated weights for policy 1, policy_version 80990 (0.0009) +[2023-10-11 18:19:00,216][85176] Updated weights for policy 0, policy_version 79812 (0.0008) +[2023-10-11 18:19:00,595][85176] Updated weights for policy 0, policy_version 79822 (0.0007) +[2023-10-11 18:19:00,964][85176] Updated weights for policy 0, policy_version 79832 (0.0008) +[2023-10-11 18:19:00,965][85175] Updated weights for policy 1, policy_version 81000 (0.0008) +[2023-10-11 18:19:01,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 164659200. Throughput: 0: 1659.1, 1: 1706.1. Samples: 41183378. Policy #0 lag: (min: 8.0, avg: 29.5, max: 40.0) +[2023-10-11 18:19:01,063][84230] Avg episode reward: [(0, '45.290'), (1, '42.360')] +[2023-10-11 18:19:01,322][85175] Updated weights for policy 1, policy_version 81010 (0.0007) +[2023-10-11 18:19:01,701][85175] Updated weights for policy 1, policy_version 81020 (0.0007) +[2023-10-11 18:19:04,988][85176] Updated weights for policy 0, policy_version 79842 (0.0009) +[2023-10-11 18:19:05,364][85176] Updated weights for policy 0, policy_version 79852 (0.0008) +[2023-10-11 18:19:05,649][85175] Updated weights for policy 1, policy_version 81030 (0.0008) +[2023-10-11 18:19:05,742][85176] Updated weights for policy 0, policy_version 79862 (0.0008) +[2023-10-11 18:19:06,020][85175] Updated weights for policy 1, policy_version 81040 (0.0008) +[2023-10-11 18:19:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 164724736. 
Throughput: 0: 1671.9, 1: 1704.3. Samples: 41193142. Policy #0 lag: (min: 8.0, avg: 29.5, max: 40.0) +[2023-10-11 18:19:06,063][84230] Avg episode reward: [(0, '43.320'), (1, '45.130')] +[2023-10-11 18:19:06,106][85176] Updated weights for policy 0, policy_version 79872 (0.0008) +[2023-10-11 18:19:06,386][85175] Updated weights for policy 1, policy_version 81050 (0.0009) +[2023-10-11 18:19:10,222][85176] Updated weights for policy 0, policy_version 79882 (0.0009) +[2023-10-11 18:19:10,493][85175] Updated weights for policy 1, policy_version 81060 (0.0007) +[2023-10-11 18:19:10,588][85176] Updated weights for policy 0, policy_version 79892 (0.0007) +[2023-10-11 18:19:10,861][85175] Updated weights for policy 1, policy_version 81070 (0.0007) +[2023-10-11 18:19:10,955][85176] Updated weights for policy 0, policy_version 79902 (0.0009) +[2023-10-11 18:19:11,062][84230] Fps is (10 sec: 16384.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 164823040. Throughput: 0: 1673.3, 1: 1706.8. Samples: 41213954. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-11 18:19:11,063][84230] Avg episode reward: [(0, '43.570'), (1, '41.270')] +[2023-10-11 18:19:11,227][85175] Updated weights for policy 1, policy_version 81080 (0.0008) +[2023-10-11 18:19:15,076][85176] Updated weights for policy 0, policy_version 79912 (0.0008) +[2023-10-11 18:19:15,183][85175] Updated weights for policy 1, policy_version 81090 (0.0009) +[2023-10-11 18:19:15,448][85176] Updated weights for policy 0, policy_version 79922 (0.0008) +[2023-10-11 18:19:15,553][85175] Updated weights for policy 1, policy_version 81100 (0.0007) +[2023-10-11 18:19:15,826][85176] Updated weights for policy 0, policy_version 79932 (0.0008) +[2023-10-11 18:19:15,930][85175] Updated weights for policy 1, policy_version 81110 (0.0007) +[2023-10-11 18:19:16,063][84230] Fps is (10 sec: 16383.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 164888576. Throughput: 0: 1654.4, 1: 1695.4. Samples: 41233506. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-11 18:19:16,064][84230] Avg episode reward: [(0, '42.830'), (1, '45.380')] +[2023-10-11 18:19:16,289][85175] Updated weights for policy 1, policy_version 81120 (0.0007) +[2023-10-11 18:19:20,022][85176] Updated weights for policy 0, policy_version 79942 (0.0009) +[2023-10-11 18:19:20,393][85176] Updated weights for policy 0, policy_version 79952 (0.0009) +[2023-10-11 18:19:20,401][85175] Updated weights for policy 1, policy_version 81130 (0.0008) +[2023-10-11 18:19:20,764][85175] Updated weights for policy 1, policy_version 81140 (0.0008) +[2023-10-11 18:19:20,767][85176] Updated weights for policy 0, policy_version 79962 (0.0008) +[2023-10-11 18:19:21,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 164954112. Throughput: 0: 1667.4, 1: 1704.4. Samples: 41243824. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-11 18:19:21,064][84230] Avg episode reward: [(0, '44.350'), (1, '45.270')] +[2023-10-11 18:19:21,137][85175] Updated weights for policy 1, policy_version 81150 (0.0010) +[2023-10-11 18:19:24,836][85176] Updated weights for policy 0, policy_version 79972 (0.0009) +[2023-10-11 18:19:25,180][85175] Updated weights for policy 1, policy_version 81160 (0.0009) +[2023-10-11 18:19:25,224][85176] Updated weights for policy 0, policy_version 79982 (0.0007) +[2023-10-11 18:19:25,544][85175] Updated weights for policy 1, policy_version 81170 (0.0008) +[2023-10-11 18:19:25,594][85176] Updated weights for policy 0, policy_version 79992 (0.0007) +[2023-10-11 18:19:25,916][85175] Updated weights for policy 1, policy_version 81180 (0.0007) +[2023-10-11 18:19:26,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 165019648. Throughput: 0: 1676.1, 1: 1701.2. Samples: 41264540. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-11 18:19:26,064][84230] Avg episode reward: [(0, '44.070'), (1, '45.420')] +[2023-10-11 18:19:29,747][85176] Updated weights for policy 0, policy_version 80002 (0.0007) +[2023-10-11 18:19:29,987][85175] Updated weights for policy 1, policy_version 81190 (0.0007) +[2023-10-11 18:19:30,113][85176] Updated weights for policy 0, policy_version 80012 (0.0008) +[2023-10-11 18:19:30,349][85175] Updated weights for policy 1, policy_version 81200 (0.0008) +[2023-10-11 18:19:30,496][85176] Updated weights for policy 0, policy_version 80022 (0.0009) +[2023-10-11 18:19:30,724][85175] Updated weights for policy 1, policy_version 81210 (0.0008) +[2023-10-11 18:19:30,856][85176] Updated weights for policy 0, policy_version 80032 (0.0009) +[2023-10-11 18:19:31,063][84230] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 165117952. Throughput: 0: 1654.6, 1: 1678.9. Samples: 41283220. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-11 18:19:31,063][84230] Avg episode reward: [(0, '42.950'), (1, '45.620')] +[2023-10-11 18:19:34,837][85175] Updated weights for policy 1, policy_version 81220 (0.0007) +[2023-10-11 18:19:35,025][85176] Updated weights for policy 0, policy_version 80042 (0.0007) +[2023-10-11 18:19:35,245][85175] Updated weights for policy 1, policy_version 81230 (0.0008) +[2023-10-11 18:19:35,392][85176] Updated weights for policy 0, policy_version 80052 (0.0007) +[2023-10-11 18:19:35,605][85175] Updated weights for policy 1, policy_version 81240 (0.0010) +[2023-10-11 18:19:35,764][85176] Updated weights for policy 0, policy_version 80062 (0.0008) +[2023-10-11 18:19:36,062][84230] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 165183488. Throughput: 0: 1672.5, 1: 1703.0. Samples: 41294162. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-11 18:19:36,063][84230] Avg episode reward: [(0, '44.160'), (1, '42.720')] +[2023-10-11 18:19:39,631][85175] Updated weights for policy 1, policy_version 81250 (0.0009) +[2023-10-11 18:19:39,770][85176] Updated weights for policy 0, policy_version 80072 (0.0007) +[2023-10-11 18:19:39,985][85175] Updated weights for policy 1, policy_version 81260 (0.0010) +[2023-10-11 18:19:40,141][85176] Updated weights for policy 0, policy_version 80082 (0.0007) +[2023-10-11 18:19:40,355][85175] Updated weights for policy 1, policy_version 81270 (0.0008) +[2023-10-11 18:19:40,515][85176] Updated weights for policy 0, policy_version 80092 (0.0008) +[2023-10-11 18:19:40,721][85175] Updated weights for policy 1, policy_version 81280 (0.0007) +[2023-10-11 18:19:41,063][84230] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 165249024. Throughput: 0: 1679.8, 1: 1695.3. Samples: 41314630. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-11 18:19:41,064][84230] Avg episode reward: [(0, '42.040'), (1, '45.140')] +[2023-10-11 18:19:44,579][85176] Updated weights for policy 0, policy_version 80102 (0.0007) +[2023-10-11 18:19:44,802][85175] Updated weights for policy 1, policy_version 81290 (0.0008) +[2023-10-11 18:19:44,950][85176] Updated weights for policy 0, policy_version 80112 (0.0007) +[2023-10-11 18:19:45,182][85175] Updated weights for policy 1, policy_version 81300 (0.0009) +[2023-10-11 18:19:45,325][85176] Updated weights for policy 0, policy_version 80122 (0.0007) +[2023-10-11 18:19:45,538][85175] Updated weights for policy 1, policy_version 81310 (0.0009) +[2023-10-11 18:19:46,063][84230] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 165314560. Throughput: 0: 1659.8, 1: 1669.5. Samples: 41333196. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-11 18:19:46,064][84230] Avg episode reward: [(0, '44.380'), (1, '41.720')] +[2023-10-11 18:19:49,209][85176] Updated weights for policy 0, policy_version 80132 (0.0008) +[2023-10-11 18:19:49,583][85176] Updated weights for policy 0, policy_version 80142 (0.0008) +[2023-10-11 18:19:49,665][85175] Updated weights for policy 1, policy_version 81320 (0.0009) +[2023-10-11 18:19:49,956][85176] Updated weights for policy 0, policy_version 80152 (0.0008) +[2023-10-11 18:19:50,031][85175] Updated weights for policy 1, policy_version 81330 (0.0008) +[2023-10-11 18:19:50,403][85175] Updated weights for policy 1, policy_version 81340 (0.0007) +[2023-10-11 18:19:51,062][84230] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 165380096. Throughput: 0: 1675.7, 1: 1689.3. Samples: 41344568. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-11 18:19:51,063][84230] Avg episode reward: [(0, '42.270'), (1, '47.300')] +[2023-10-11 18:19:53,955][85176] Updated weights for policy 0, policy_version 80162 (0.0008) +[2023-10-11 18:19:54,318][85176] Updated weights for policy 0, policy_version 80172 (0.0008) +[2023-10-11 18:19:54,431][85175] Updated weights for policy 1, policy_version 81350 (0.0008) +[2023-10-11 18:19:54,686][85176] Updated weights for policy 0, policy_version 80182 (0.0007) +[2023-10-11 18:19:54,792][85175] Updated weights for policy 1, policy_version 81360 (0.0007) +[2023-10-11 18:19:55,058][85176] Updated weights for policy 0, policy_version 80192 (0.0008) +[2023-10-11 18:19:55,154][85175] Updated weights for policy 1, policy_version 81370 (0.0008) +[2023-10-11 18:19:56,063][84230] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 165445632. Throughput: 0: 1659.3, 1: 1678.4. Samples: 41364152. 
Policy #0 lag: (min: 31.0, avg: 34.3, max: 63.0) +[2023-10-11 18:19:56,064][84230] Avg episode reward: [(0, '45.150'), (1, '41.460')] +[2023-10-11 18:19:59,168][85176] Updated weights for policy 0, policy_version 80202 (0.0009) +[2023-10-11 18:19:59,176][85175] Updated weights for policy 1, policy_version 81380 (0.0009) +[2023-10-11 18:19:59,545][85175] Updated weights for policy 1, policy_version 81390 (0.0007) +[2023-10-11 18:19:59,548][85176] Updated weights for policy 0, policy_version 80212 (0.0009) +[2023-10-11 18:19:59,908][85175] Updated weights for policy 1, policy_version 81400 (0.0009) +[2023-10-11 18:19:59,913][85176] Updated weights for policy 0, policy_version 80222 (0.0009) +[2023-10-11 18:20:01,063][84230] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 165511168. Throughput: 0: 1666.7, 1: 1662.0. Samples: 41383298. Policy #0 lag: (min: 31.0, avg: 34.3, max: 63.0) +[2023-10-11 18:20:01,063][84230] Avg episode reward: [(0, '42.130'), (1, '45.480')] +[2023-10-11 18:20:04,002][85176] Updated weights for policy 0, policy_version 80232 (0.0009) +[2023-10-11 18:20:04,005][85175] Updated weights for policy 1, policy_version 81410 (0.0011) +[2023-10-11 18:20:04,362][85175] Updated weights for policy 1, policy_version 81420 (0.0008) +[2023-10-11 18:20:04,374][85176] Updated weights for policy 0, policy_version 80242 (0.0007) +[2023-10-11 18:20:04,733][85175] Updated weights for policy 1, policy_version 81430 (0.0007) +[2023-10-11 18:20:04,744][85176] Updated weights for policy 0, policy_version 80252 (0.0008) +[2023-10-11 18:20:05,099][85175] Updated weights for policy 1, policy_version 81440 (0.0009) +[2023-10-11 18:20:06,063][84230] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 165576704. Throughput: 0: 1679.6, 1: 1679.1. Samples: 41394966. 
Policy #0 lag: (min: 31.0, avg: 34.3, max: 63.0) +[2023-10-11 18:20:06,064][84230] Avg episode reward: [(0, '43.430'), (1, '40.930')] +[2023-10-11 18:20:08,965][85176] Updated weights for policy 0, policy_version 80262 (0.0008) +[2023-10-11 18:20:09,204][85175] Updated weights for policy 1, policy_version 81450 (0.0008) +[2023-10-11 18:20:09,338][85176] Updated weights for policy 0, policy_version 80272 (0.0007) +[2023-10-11 18:20:09,563][85175] Updated weights for policy 1, policy_version 81460 (0.0007) +[2023-10-11 18:20:09,710][85176] Updated weights for policy 0, policy_version 80282 (0.0009) +[2023-10-11 18:20:09,931][85175] Updated weights for policy 1, policy_version 81470 (0.0007) +[2023-10-11 18:20:11,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 165642240. Throughput: 0: 1652.2, 1: 1669.8. Samples: 41414028. Policy #0 lag: (min: 31.0, avg: 34.3, max: 63.0) +[2023-10-11 18:20:11,064][84230] Avg episode reward: [(0, '43.850'), (1, '44.190')] +[2023-10-11 18:20:13,796][85175] Updated weights for policy 1, policy_version 81480 (0.0008) +[2023-10-11 18:20:14,027][85176] Updated weights for policy 0, policy_version 80292 (0.0009) +[2023-10-11 18:20:14,162][85175] Updated weights for policy 1, policy_version 81490 (0.0008) +[2023-10-11 18:20:14,421][85176] Updated weights for policy 0, policy_version 80302 (0.0010) +[2023-10-11 18:20:14,533][85175] Updated weights for policy 1, policy_version 81500 (0.0007) +[2023-10-11 18:20:14,791][85176] Updated weights for policy 0, policy_version 80312 (0.0007) +[2023-10-11 18:20:16,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 165707776. Throughput: 0: 1668.9, 1: 1674.8. Samples: 41433686. 
Policy #0 lag: (min: 31.0, avg: 34.3, max: 63.0) +[2023-10-11 18:20:16,063][84230] Avg episode reward: [(0, '43.230'), (1, '41.820')] +[2023-10-11 18:20:16,070][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000080320_82247680.pth... +[2023-10-11 18:20:16,071][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000081504_83460096.pth... +[2023-10-11 18:20:16,110][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000079904_81821696.pth +[2023-10-11 18:20:16,110][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000078752_80642048.pth +[2023-10-11 18:20:18,825][85175] Updated weights for policy 1, policy_version 81510 (0.0009) +[2023-10-11 18:20:18,906][85176] Updated weights for policy 0, policy_version 80322 (0.0007) +[2023-10-11 18:20:19,189][85175] Updated weights for policy 1, policy_version 81520 (0.0007) +[2023-10-11 18:20:19,270][85176] Updated weights for policy 0, policy_version 80332 (0.0009) +[2023-10-11 18:20:19,556][85175] Updated weights for policy 1, policy_version 81530 (0.0008) +[2023-10-11 18:20:19,645][85176] Updated weights for policy 0, policy_version 80342 (0.0008) +[2023-10-11 18:20:20,016][85176] Updated weights for policy 0, policy_version 80352 (0.0009) +[2023-10-11 18:20:21,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 165773312. Throughput: 0: 1675.5, 1: 1679.6. Samples: 41445146. 
Policy #0 lag: (min: 31.0, avg: 34.3, max: 63.0) +[2023-10-11 18:20:21,064][84230] Avg episode reward: [(0, '44.580'), (1, '45.820')] +[2023-10-11 18:20:23,771][85175] Updated weights for policy 1, policy_version 81540 (0.0008) +[2023-10-11 18:20:24,014][85176] Updated weights for policy 0, policy_version 80362 (0.0008) +[2023-10-11 18:20:24,174][85175] Updated weights for policy 1, policy_version 81550 (0.0009) +[2023-10-11 18:20:24,396][85176] Updated weights for policy 0, policy_version 80372 (0.0009) +[2023-10-11 18:20:24,537][85175] Updated weights for policy 1, policy_version 81560 (0.0007) +[2023-10-11 18:20:24,765][85176] Updated weights for policy 0, policy_version 80382 (0.0008) +[2023-10-11 18:20:26,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 165838848. Throughput: 0: 1653.6, 1: 1663.4. Samples: 41463894. Policy #0 lag: (min: 31.0, avg: 34.3, max: 63.0) +[2023-10-11 18:20:26,064][84230] Avg episode reward: [(0, '41.850'), (1, '42.210')] +[2023-10-11 18:20:28,569][85175] Updated weights for policy 1, policy_version 81570 (0.0007) +[2023-10-11 18:20:28,865][85176] Updated weights for policy 0, policy_version 80392 (0.0009) +[2023-10-11 18:20:28,936][85175] Updated weights for policy 1, policy_version 81580 (0.0009) +[2023-10-11 18:20:29,251][85176] Updated weights for policy 0, policy_version 80402 (0.0009) +[2023-10-11 18:20:29,309][85175] Updated weights for policy 1, policy_version 81590 (0.0009) +[2023-10-11 18:20:29,620][85176] Updated weights for policy 0, policy_version 80412 (0.0007) +[2023-10-11 18:20:29,674][85175] Updated weights for policy 1, policy_version 81600 (0.0007) +[2023-10-11 18:20:31,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 165904384. Throughput: 0: 1671.4, 1: 1678.6. Samples: 41483944. 
Policy #0 lag: (min: 31.0, avg: 34.3, max: 63.0) +[2023-10-11 18:20:31,064][84230] Avg episode reward: [(0, '42.960'), (1, '44.330')] +[2023-10-11 18:20:33,567][85176] Updated weights for policy 0, policy_version 80422 (0.0008) +[2023-10-11 18:20:33,717][85175] Updated weights for policy 1, policy_version 81610 (0.0007) +[2023-10-11 18:20:33,937][85176] Updated weights for policy 0, policy_version 80432 (0.0009) +[2023-10-11 18:20:34,079][85175] Updated weights for policy 1, policy_version 81620 (0.0008) +[2023-10-11 18:20:34,301][85176] Updated weights for policy 0, policy_version 80442 (0.0009) +[2023-10-11 18:20:34,448][85175] Updated weights for policy 1, policy_version 81630 (0.0008) +[2023-10-11 18:20:36,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 165969920. Throughput: 0: 1665.0, 1: 1678.6. Samples: 41495030. Policy #0 lag: (min: 31.0, avg: 34.3, max: 63.0) +[2023-10-11 18:20:36,064][84230] Avg episode reward: [(0, '39.110'), (1, '44.330')] +[2023-10-11 18:20:38,409][85176] Updated weights for policy 0, policy_version 80452 (0.0009) +[2023-10-11 18:20:38,445][85175] Updated weights for policy 1, policy_version 81640 (0.0009) +[2023-10-11 18:20:38,775][85176] Updated weights for policy 0, policy_version 80462 (0.0007) +[2023-10-11 18:20:38,807][85175] Updated weights for policy 1, policy_version 81650 (0.0008) +[2023-10-11 18:20:39,152][85176] Updated weights for policy 0, policy_version 80472 (0.0008) +[2023-10-11 18:20:39,173][85175] Updated weights for policy 1, policy_version 81660 (0.0009) +[2023-10-11 18:20:41,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 166035456. Throughput: 0: 1654.0, 1: 1663.7. Samples: 41513450. 
Policy #0 lag: (min: 31.0, avg: 34.3, max: 63.0) +[2023-10-11 18:20:41,064][84230] Avg episode reward: [(0, '43.220'), (1, '45.360')] +[2023-10-11 18:20:43,270][85176] Updated weights for policy 0, policy_version 80482 (0.0008) +[2023-10-11 18:20:43,292][85175] Updated weights for policy 1, policy_version 81670 (0.0007) +[2023-10-11 18:20:43,641][85176] Updated weights for policy 0, policy_version 80492 (0.0008) +[2023-10-11 18:20:43,660][85175] Updated weights for policy 1, policy_version 81680 (0.0007) +[2023-10-11 18:20:44,010][85176] Updated weights for policy 0, policy_version 80502 (0.0010) +[2023-10-11 18:20:44,021][85175] Updated weights for policy 1, policy_version 81690 (0.0008) +[2023-10-11 18:20:44,383][85176] Updated weights for policy 0, policy_version 80512 (0.0009) +[2023-10-11 18:20:46,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 166100992. Throughput: 0: 1670.0, 1: 1687.0. Samples: 41534364. Policy #0 lag: (min: 31.0, avg: 34.3, max: 63.0) +[2023-10-11 18:20:46,064][84230] Avg episode reward: [(0, '40.460'), (1, '44.760')] +[2023-10-11 18:20:47,925][85175] Updated weights for policy 1, policy_version 81700 (0.0009) +[2023-10-11 18:20:48,289][85175] Updated weights for policy 1, policy_version 81710 (0.0007) +[2023-10-11 18:20:48,432][85176] Updated weights for policy 0, policy_version 80522 (0.0010) +[2023-10-11 18:20:48,660][85175] Updated weights for policy 1, policy_version 81720 (0.0007) +[2023-10-11 18:20:48,807][85176] Updated weights for policy 0, policy_version 80532 (0.0008) +[2023-10-11 18:20:49,173][85176] Updated weights for policy 0, policy_version 80542 (0.0008) +[2023-10-11 18:20:51,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 166166528. Throughput: 0: 1657.7, 1: 1670.0. Samples: 41544712. 
Policy #0 lag: (min: 31.0, avg: 42.2, max: 63.0) +[2023-10-11 18:20:51,063][84230] Avg episode reward: [(0, '45.500'), (1, '44.440')] +[2023-10-11 18:20:52,511][85175] Updated weights for policy 1, policy_version 81730 (0.0007) +[2023-10-11 18:20:52,887][85175] Updated weights for policy 1, policy_version 81740 (0.0011) +[2023-10-11 18:20:53,254][85175] Updated weights for policy 1, policy_version 81750 (0.0010) +[2023-10-11 18:20:53,343][85176] Updated weights for policy 0, policy_version 80552 (0.0011) +[2023-10-11 18:20:53,620][85175] Updated weights for policy 1, policy_version 81760 (0.0009) +[2023-10-11 18:20:53,722][85176] Updated weights for policy 0, policy_version 80562 (0.0008) +[2023-10-11 18:20:54,092][85176] Updated weights for policy 0, policy_version 80572 (0.0010) +[2023-10-11 18:20:56,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 166232064. Throughput: 0: 1660.9, 1: 1681.2. Samples: 41564424. Policy #0 lag: (min: 31.0, avg: 42.2, max: 63.0) +[2023-10-11 18:20:56,063][84230] Avg episode reward: [(0, '42.630'), (1, '46.400')] +[2023-10-11 18:20:57,732][85175] Updated weights for policy 1, policy_version 81770 (0.0009) +[2023-10-11 18:20:58,062][85176] Updated weights for policy 0, policy_version 80582 (0.0008) +[2023-10-11 18:20:58,100][85175] Updated weights for policy 1, policy_version 81780 (0.0009) +[2023-10-11 18:20:58,428][85176] Updated weights for policy 0, policy_version 80592 (0.0007) +[2023-10-11 18:20:58,455][85175] Updated weights for policy 1, policy_version 81790 (0.0009) +[2023-10-11 18:20:58,809][85176] Updated weights for policy 0, policy_version 80602 (0.0008) +[2023-10-11 18:21:01,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 166297600. Throughput: 0: 1681.7, 1: 1693.1. Samples: 41585552. 
Policy #0 lag: (min: 31.0, avg: 42.2, max: 63.0) +[2023-10-11 18:21:01,064][84230] Avg episode reward: [(0, '46.150'), (1, '44.480')] +[2023-10-11 18:21:02,335][85175] Updated weights for policy 1, policy_version 81800 (0.0008) +[2023-10-11 18:21:02,708][85175] Updated weights for policy 1, policy_version 81810 (0.0009) +[2023-10-11 18:21:02,922][85176] Updated weights for policy 0, policy_version 80612 (0.0008) +[2023-10-11 18:21:03,072][85175] Updated weights for policy 1, policy_version 81820 (0.0007) +[2023-10-11 18:21:03,324][85176] Updated weights for policy 0, policy_version 80622 (0.0010) +[2023-10-11 18:21:03,697][85176] Updated weights for policy 0, policy_version 80632 (0.0008) +[2023-10-11 18:21:06,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 166363136. Throughput: 0: 1659.2, 1: 1670.3. Samples: 41594970. Policy #0 lag: (min: 31.0, avg: 42.2, max: 63.0) +[2023-10-11 18:21:06,064][84230] Avg episode reward: [(0, '41.070'), (1, '45.960')] +[2023-10-11 18:21:07,138][85175] Updated weights for policy 1, policy_version 81830 (0.0008) +[2023-10-11 18:21:07,503][85175] Updated weights for policy 1, policy_version 81840 (0.0011) +[2023-10-11 18:21:07,749][85176] Updated weights for policy 0, policy_version 80642 (0.0007) +[2023-10-11 18:21:07,880][85175] Updated weights for policy 1, policy_version 81850 (0.0010) +[2023-10-11 18:21:08,123][85176] Updated weights for policy 0, policy_version 80652 (0.0009) +[2023-10-11 18:21:08,502][85176] Updated weights for policy 0, policy_version 80662 (0.0009) +[2023-10-11 18:21:08,879][85176] Updated weights for policy 0, policy_version 80672 (0.0010) +[2023-10-11 18:21:11,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 166428672. Throughput: 0: 1666.4, 1: 1695.5. Samples: 41615180. 
Policy #0 lag: (min: 31.0, avg: 42.2, max: 63.0) +[2023-10-11 18:21:11,064][84230] Avg episode reward: [(0, '46.280'), (1, '41.880')] +[2023-10-11 18:21:11,955][85175] Updated weights for policy 1, policy_version 81860 (0.0008) +[2023-10-11 18:21:12,359][85175] Updated weights for policy 1, policy_version 81870 (0.0008) +[2023-10-11 18:21:12,729][85175] Updated weights for policy 1, policy_version 81880 (0.0007) +[2023-10-11 18:21:13,157][85176] Updated weights for policy 0, policy_version 80682 (0.0008) +[2023-10-11 18:21:13,527][85176] Updated weights for policy 0, policy_version 80692 (0.0008) +[2023-10-11 18:21:13,902][85176] Updated weights for policy 0, policy_version 80702 (0.0007) +[2023-10-11 18:21:16,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 166494208. Throughput: 0: 1667.5, 1: 1702.1. Samples: 41635574. Policy #0 lag: (min: 31.0, avg: 42.2, max: 63.0) +[2023-10-11 18:21:16,063][84230] Avg episode reward: [(0, '43.630'), (1, '47.920')] +[2023-10-11 18:21:16,873][85175] Updated weights for policy 1, policy_version 81890 (0.0007) +[2023-10-11 18:21:17,237][85175] Updated weights for policy 1, policy_version 81900 (0.0008) +[2023-10-11 18:21:17,610][85175] Updated weights for policy 1, policy_version 81910 (0.0009) +[2023-10-11 18:21:17,972][85175] Updated weights for policy 1, policy_version 81920 (0.0009) +[2023-10-11 18:21:17,979][85176] Updated weights for policy 0, policy_version 80712 (0.0010) +[2023-10-11 18:21:18,347][85176] Updated weights for policy 0, policy_version 80722 (0.0007) +[2023-10-11 18:21:18,730][85176] Updated weights for policy 0, policy_version 80732 (0.0008) +[2023-10-11 18:21:21,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 166559744. Throughput: 0: 1657.4, 1: 1676.3. Samples: 41645046. 
Policy #0 lag: (min: 31.0, avg: 42.2, max: 63.0) +[2023-10-11 18:21:21,063][84230] Avg episode reward: [(0, '46.020'), (1, '43.840')] +[2023-10-11 18:21:22,124][85175] Updated weights for policy 1, policy_version 81930 (0.0007) +[2023-10-11 18:21:22,485][85175] Updated weights for policy 1, policy_version 81940 (0.0007) +[2023-10-11 18:21:22,796][85176] Updated weights for policy 0, policy_version 80742 (0.0009) +[2023-10-11 18:21:22,857][85175] Updated weights for policy 1, policy_version 81950 (0.0007) +[2023-10-11 18:21:23,162][85176] Updated weights for policy 0, policy_version 80752 (0.0007) +[2023-10-11 18:21:23,533][85176] Updated weights for policy 0, policy_version 80762 (0.0008) +[2023-10-11 18:21:26,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 166625280. Throughput: 0: 1679.3, 1: 1700.0. Samples: 41665520. Policy #0 lag: (min: 31.0, avg: 42.2, max: 63.0) +[2023-10-11 18:21:26,063][84230] Avg episode reward: [(0, '42.480'), (1, '47.450')] +[2023-10-11 18:21:26,863][85175] Updated weights for policy 1, policy_version 81960 (0.0009) +[2023-10-11 18:21:27,227][85175] Updated weights for policy 1, policy_version 81970 (0.0007) +[2023-10-11 18:21:27,454][85176] Updated weights for policy 0, policy_version 80772 (0.0007) +[2023-10-11 18:21:27,598][85175] Updated weights for policy 1, policy_version 81980 (0.0008) +[2023-10-11 18:21:27,823][85176] Updated weights for policy 0, policy_version 80782 (0.0007) +[2023-10-11 18:21:28,195][85176] Updated weights for policy 0, policy_version 80792 (0.0008) +[2023-10-11 18:21:31,063][84230] Fps is (10 sec: 13106.7, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 166690816. Throughput: 0: 1678.7, 1: 1704.4. Samples: 41686604. 
Policy #0 lag: (min: 31.0, avg: 42.2, max: 63.0) +[2023-10-11 18:21:31,064][84230] Avg episode reward: [(0, '46.880'), (1, '42.850')] +[2023-10-11 18:21:31,460][85175] Updated weights for policy 1, policy_version 81990 (0.0011) +[2023-10-11 18:21:31,827][85175] Updated weights for policy 1, policy_version 82000 (0.0009) +[2023-10-11 18:21:32,176][85176] Updated weights for policy 0, policy_version 80802 (0.0010) +[2023-10-11 18:21:32,189][85175] Updated weights for policy 1, policy_version 82010 (0.0009) +[2023-10-11 18:21:32,540][85176] Updated weights for policy 0, policy_version 80812 (0.0008) +[2023-10-11 18:21:32,917][85176] Updated weights for policy 0, policy_version 80822 (0.0009) +[2023-10-11 18:21:33,285][85176] Updated weights for policy 0, policy_version 80832 (0.0008) +[2023-10-11 18:21:36,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 166756352. Throughput: 0: 1663.3, 1: 1696.5. Samples: 41695906. Policy #0 lag: (min: 31.0, avg: 42.2, max: 63.0) +[2023-10-11 18:21:36,063][84230] Avg episode reward: [(0, '41.360'), (1, '49.010')] +[2023-10-11 18:21:36,159][85175] Updated weights for policy 1, policy_version 82020 (0.0008) +[2023-10-11 18:21:36,533][85175] Updated weights for policy 1, policy_version 82030 (0.0007) +[2023-10-11 18:21:36,896][85175] Updated weights for policy 1, policy_version 82040 (0.0009) +[2023-10-11 18:21:37,292][85176] Updated weights for policy 0, policy_version 80842 (0.0009) +[2023-10-11 18:21:37,679][85176] Updated weights for policy 0, policy_version 80852 (0.0010) +[2023-10-11 18:21:38,049][85176] Updated weights for policy 0, policy_version 80862 (0.0012) +[2023-10-11 18:21:40,967][85175] Updated weights for policy 1, policy_version 82050 (0.0007) +[2023-10-11 18:21:41,062][84230] Fps is (10 sec: 13107.7, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 166821888. Throughput: 0: 1682.4, 1: 1700.0. Samples: 41716630. 
Policy #0 lag: (min: 31.0, avg: 42.2, max: 63.0) +[2023-10-11 18:21:41,063][84230] Avg episode reward: [(0, '45.430'), (1, '43.400')] +[2023-10-11 18:21:41,332][85175] Updated weights for policy 1, policy_version 82060 (0.0008) +[2023-10-11 18:21:41,705][85175] Updated weights for policy 1, policy_version 82070 (0.0008) +[2023-10-11 18:21:42,072][85175] Updated weights for policy 1, policy_version 82080 (0.0007) +[2023-10-11 18:21:42,210][85176] Updated weights for policy 0, policy_version 80872 (0.0008) +[2023-10-11 18:21:42,595][85176] Updated weights for policy 0, policy_version 80882 (0.0010) +[2023-10-11 18:21:42,971][85176] Updated weights for policy 0, policy_version 80892 (0.0009) +[2023-10-11 18:21:46,059][85175] Updated weights for policy 1, policy_version 82090 (0.0007) +[2023-10-11 18:21:46,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 166887424. Throughput: 0: 1673.7, 1: 1705.4. Samples: 41737610. Policy #0 lag: (min: 14.0, avg: 17.9, max: 46.0) +[2023-10-11 18:21:46,063][84230] Avg episode reward: [(0, '40.650'), (1, '48.130')] +[2023-10-11 18:21:46,424][85175] Updated weights for policy 1, policy_version 82100 (0.0007) +[2023-10-11 18:21:46,789][85175] Updated weights for policy 1, policy_version 82110 (0.0008) +[2023-10-11 18:21:47,022][85176] Updated weights for policy 0, policy_version 80902 (0.0007) +[2023-10-11 18:21:47,395][85176] Updated weights for policy 0, policy_version 80912 (0.0008) +[2023-10-11 18:21:47,774][85176] Updated weights for policy 0, policy_version 80922 (0.0010) +[2023-10-11 18:21:50,810][85175] Updated weights for policy 1, policy_version 82120 (0.0007) +[2023-10-11 18:21:51,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 166952960. Throughput: 0: 1670.4, 1: 1703.5. Samples: 41746796. 
Policy #0 lag: (min: 14.0, avg: 17.9, max: 46.0) +[2023-10-11 18:21:51,063][84230] Avg episode reward: [(0, '45.580'), (1, '43.330')] +[2023-10-11 18:21:51,179][85175] Updated weights for policy 1, policy_version 82130 (0.0008) +[2023-10-11 18:21:51,544][85175] Updated weights for policy 1, policy_version 82140 (0.0009) +[2023-10-11 18:21:52,011][85176] Updated weights for policy 0, policy_version 80932 (0.0010) +[2023-10-11 18:21:52,389][85176] Updated weights for policy 0, policy_version 80942 (0.0009) +[2023-10-11 18:21:52,770][85176] Updated weights for policy 0, policy_version 80952 (0.0011) +[2023-10-11 18:21:55,654][85175] Updated weights for policy 1, policy_version 82150 (0.0008) +[2023-10-11 18:21:56,020][85175] Updated weights for policy 1, policy_version 82160 (0.0008) +[2023-10-11 18:21:56,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 167018496. Throughput: 0: 1682.6, 1: 1699.3. Samples: 41767364. Policy #0 lag: (min: 14.0, avg: 17.9, max: 46.0) +[2023-10-11 18:21:56,063][84230] Avg episode reward: [(0, '44.700'), (1, '46.350')] +[2023-10-11 18:21:56,386][85175] Updated weights for policy 1, policy_version 82170 (0.0009) +[2023-10-11 18:21:56,916][85176] Updated weights for policy 0, policy_version 80962 (0.0009) +[2023-10-11 18:21:57,292][85176] Updated weights for policy 0, policy_version 80972 (0.0007) +[2023-10-11 18:21:57,663][85176] Updated weights for policy 0, policy_version 80982 (0.0008) +[2023-10-11 18:21:58,027][85176] Updated weights for policy 0, policy_version 80992 (0.0009) +[2023-10-11 18:22:00,344][85175] Updated weights for policy 1, policy_version 82180 (0.0009) +[2023-10-11 18:22:00,744][85175] Updated weights for policy 1, policy_version 82190 (0.0009) +[2023-10-11 18:22:01,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 167084032. Throughput: 0: 1689.5, 1: 1698.1. Samples: 41788016. 
Policy #0 lag: (min: 14.0, avg: 17.9, max: 46.0) +[2023-10-11 18:22:01,063][84230] Avg episode reward: [(0, '46.000'), (1, '44.290')] +[2023-10-11 18:22:01,116][85175] Updated weights for policy 1, policy_version 82200 (0.0008) +[2023-10-11 18:22:01,987][85176] Updated weights for policy 0, policy_version 81002 (0.0009) +[2023-10-11 18:22:02,359][85176] Updated weights for policy 0, policy_version 81012 (0.0007) +[2023-10-11 18:22:02,735][85176] Updated weights for policy 0, policy_version 81022 (0.0009) +[2023-10-11 18:22:05,007][85175] Updated weights for policy 1, policy_version 82210 (0.0010) +[2023-10-11 18:22:05,373][85175] Updated weights for policy 1, policy_version 82220 (0.0007) +[2023-10-11 18:22:05,742][85175] Updated weights for policy 1, policy_version 82230 (0.0007) +[2023-10-11 18:22:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 167149568. Throughput: 0: 1674.5, 1: 1712.8. Samples: 41797478. Policy #0 lag: (min: 14.0, avg: 17.9, max: 46.0) +[2023-10-11 18:22:06,063][84230] Avg episode reward: [(0, '43.700'), (1, '42.720')] +[2023-10-11 18:22:06,107][85175] Updated weights for policy 1, policy_version 82240 (0.0007) +[2023-10-11 18:22:06,945][85176] Updated weights for policy 0, policy_version 81032 (0.0008) +[2023-10-11 18:22:07,315][85176] Updated weights for policy 0, policy_version 81042 (0.0008) +[2023-10-11 18:22:07,688][85176] Updated weights for policy 0, policy_version 81052 (0.0008) +[2023-10-11 18:22:09,925][85175] Updated weights for policy 1, policy_version 82250 (0.0011) +[2023-10-11 18:22:10,290][85175] Updated weights for policy 1, policy_version 82260 (0.0010) +[2023-10-11 18:22:10,665][85175] Updated weights for policy 1, policy_version 82270 (0.0008) +[2023-10-11 18:22:11,062][84230] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 167247872. Throughput: 0: 1679.7, 1: 1720.2. Samples: 41818518. 
Policy #0 lag: (min: 14.0, avg: 17.9, max: 46.0) +[2023-10-11 18:22:11,063][84230] Avg episode reward: [(0, '44.370'), (1, '44.710')] +[2023-10-11 18:22:11,684][85176] Updated weights for policy 0, policy_version 81062 (0.0010) +[2023-10-11 18:22:12,068][85176] Updated weights for policy 0, policy_version 81072 (0.0010) +[2023-10-11 18:22:12,437][85176] Updated weights for policy 0, policy_version 81082 (0.0010) +[2023-10-11 18:22:14,656][85175] Updated weights for policy 1, policy_version 82280 (0.0007) +[2023-10-11 18:22:15,022][85175] Updated weights for policy 1, policy_version 82290 (0.0007) +[2023-10-11 18:22:15,395][85175] Updated weights for policy 1, policy_version 82300 (0.0008) +[2023-10-11 18:22:16,063][84230] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 167313408. Throughput: 0: 1682.1, 1: 1689.3. Samples: 41838318. Policy #0 lag: (min: 14.0, avg: 17.9, max: 46.0) +[2023-10-11 18:22:16,063][84230] Avg episode reward: [(0, '42.860'), (1, '41.000')] +[2023-10-11 18:22:16,073][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000082304_84279296.pth... +[2023-10-11 18:22:16,105][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000080704_82640896.pth +[2023-10-11 18:22:16,278][85176] Updated weights for policy 0, policy_version 81092 (0.0007) +[2023-10-11 18:22:16,648][85176] Updated weights for policy 0, policy_version 81102 (0.0010) +[2023-10-11 18:22:17,019][85176] Updated weights for policy 0, policy_version 81112 (0.0008) +[2023-10-11 18:22:17,314][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000081120_83066880.pth... 
+[2023-10-11 18:22:17,357][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000079552_81461248.pth +[2023-10-11 18:22:19,397][85175] Updated weights for policy 1, policy_version 82310 (0.0007) +[2023-10-11 18:22:19,768][85175] Updated weights for policy 1, policy_version 82320 (0.0009) +[2023-10-11 18:22:20,130][85175] Updated weights for policy 1, policy_version 82330 (0.0007) +[2023-10-11 18:22:21,009][85176] Updated weights for policy 0, policy_version 81122 (0.0007) +[2023-10-11 18:22:21,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 167378944. Throughput: 0: 1681.7, 1: 1715.6. Samples: 41848788. Policy #0 lag: (min: 14.0, avg: 17.9, max: 46.0) +[2023-10-11 18:22:21,063][84230] Avg episode reward: [(0, '42.140'), (1, '45.590')] +[2023-10-11 18:22:21,378][85176] Updated weights for policy 0, policy_version 81132 (0.0009) +[2023-10-11 18:22:21,759][85176] Updated weights for policy 0, policy_version 81142 (0.0009) +[2023-10-11 18:22:22,137][85176] Updated weights for policy 0, policy_version 81152 (0.0009) +[2023-10-11 18:22:24,103][85175] Updated weights for policy 1, policy_version 82340 (0.0009) +[2023-10-11 18:22:24,468][85175] Updated weights for policy 1, policy_version 82350 (0.0009) +[2023-10-11 18:22:24,843][85175] Updated weights for policy 1, policy_version 82360 (0.0009) +[2023-10-11 18:22:26,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 167444480. Throughput: 0: 1684.0, 1: 1702.7. Samples: 41869036. 
Policy #0 lag: (min: 14.0, avg: 17.9, max: 46.0) +[2023-10-11 18:22:26,064][84230] Avg episode reward: [(0, '41.270'), (1, '41.050')] +[2023-10-11 18:22:26,197][85176] Updated weights for policy 0, policy_version 81162 (0.0009) +[2023-10-11 18:22:26,562][85176] Updated weights for policy 0, policy_version 81172 (0.0009) +[2023-10-11 18:22:26,932][85176] Updated weights for policy 0, policy_version 81182 (0.0007) +[2023-10-11 18:22:28,728][85175] Updated weights for policy 1, policy_version 82370 (0.0009) +[2023-10-11 18:22:29,097][85175] Updated weights for policy 1, policy_version 82380 (0.0010) +[2023-10-11 18:22:29,460][85175] Updated weights for policy 1, policy_version 82390 (0.0011) +[2023-10-11 18:22:29,825][85175] Updated weights for policy 1, policy_version 82400 (0.0011) +[2023-10-11 18:22:30,923][85176] Updated weights for policy 0, policy_version 81192 (0.0007) +[2023-10-11 18:22:31,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 167510016. Throughput: 0: 1687.9, 1: 1687.6. Samples: 41889508. Policy #0 lag: (min: 14.0, avg: 17.9, max: 46.0) +[2023-10-11 18:22:31,063][84230] Avg episode reward: [(0, '44.890'), (1, '46.520')] +[2023-10-11 18:22:31,299][85176] Updated weights for policy 0, policy_version 81202 (0.0007) +[2023-10-11 18:22:31,676][85176] Updated weights for policy 0, policy_version 81212 (0.0009) +[2023-10-11 18:22:33,877][85175] Updated weights for policy 1, policy_version 82410 (0.0009) +[2023-10-11 18:22:34,245][85175] Updated weights for policy 1, policy_version 82420 (0.0011) +[2023-10-11 18:22:34,614][85175] Updated weights for policy 1, policy_version 82430 (0.0009) +[2023-10-11 18:22:35,703][85176] Updated weights for policy 0, policy_version 81222 (0.0009) +[2023-10-11 18:22:36,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 167575552. Throughput: 0: 1686.4, 1: 1718.0. Samples: 41899996. 
Policy #0 lag: (min: 14.0, avg: 17.9, max: 46.0) +[2023-10-11 18:22:36,063][84230] Avg episode reward: [(0, '43.050'), (1, '40.950')] +[2023-10-11 18:22:36,075][85176] Updated weights for policy 0, policy_version 81232 (0.0008) +[2023-10-11 18:22:36,452][85176] Updated weights for policy 0, policy_version 81242 (0.0007) +[2023-10-11 18:22:38,732][85175] Updated weights for policy 1, policy_version 82440 (0.0011) +[2023-10-11 18:22:39,103][85175] Updated weights for policy 1, policy_version 82450 (0.0009) +[2023-10-11 18:22:39,473][85175] Updated weights for policy 1, policy_version 82460 (0.0010) +[2023-10-11 18:22:40,578][85176] Updated weights for policy 0, policy_version 81252 (0.0008) +[2023-10-11 18:22:40,973][85176] Updated weights for policy 0, policy_version 81262 (0.0009) +[2023-10-11 18:22:41,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 167641088. Throughput: 0: 1695.2, 1: 1697.1. Samples: 41920016. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 18:22:41,063][84230] Avg episode reward: [(0, '42.880'), (1, '43.230')] +[2023-10-11 18:22:41,340][85176] Updated weights for policy 0, policy_version 81272 (0.0009) +[2023-10-11 18:22:43,288][85175] Updated weights for policy 1, policy_version 82470 (0.0008) +[2023-10-11 18:22:43,659][85175] Updated weights for policy 1, policy_version 82480 (0.0008) +[2023-10-11 18:22:44,028][85175] Updated weights for policy 1, policy_version 82490 (0.0009) +[2023-10-11 18:22:45,532][85176] Updated weights for policy 0, policy_version 81282 (0.0011) +[2023-10-11 18:22:45,906][85176] Updated weights for policy 0, policy_version 81292 (0.0009) +[2023-10-11 18:22:46,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 167706624. Throughput: 0: 1680.2, 1: 1712.3. Samples: 41940680. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 18:22:46,064][84230] Avg episode reward: [(0, '42.810'), (1, '40.460')] +[2023-10-11 18:22:46,278][85176] Updated weights for policy 0, policy_version 81302 (0.0008) +[2023-10-11 18:22:46,650][85176] Updated weights for policy 0, policy_version 81312 (0.0009) +[2023-10-11 18:22:47,946][85175] Updated weights for policy 1, policy_version 82500 (0.0007) +[2023-10-11 18:22:48,322][85175] Updated weights for policy 1, policy_version 82510 (0.0009) +[2023-10-11 18:22:48,695][85175] Updated weights for policy 1, policy_version 82520 (0.0007) +[2023-10-11 18:22:50,752][85176] Updated weights for policy 0, policy_version 81322 (0.0009) +[2023-10-11 18:22:51,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 167772160. Throughput: 0: 1688.2, 1: 1712.8. Samples: 41950524. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 18:22:51,063][84230] Avg episode reward: [(0, '44.220'), (1, '47.150')] +[2023-10-11 18:22:51,133][85176] Updated weights for policy 0, policy_version 81332 (0.0008) +[2023-10-11 18:22:51,502][85176] Updated weights for policy 0, policy_version 81342 (0.0009) +[2023-10-11 18:22:52,681][85175] Updated weights for policy 1, policy_version 82530 (0.0007) +[2023-10-11 18:22:53,059][85175] Updated weights for policy 1, policy_version 82540 (0.0008) +[2023-10-11 18:22:53,426][85175] Updated weights for policy 1, policy_version 82550 (0.0010) +[2023-10-11 18:22:53,798][85175] Updated weights for policy 1, policy_version 82560 (0.0010) +[2023-10-11 18:22:55,472][85176] Updated weights for policy 0, policy_version 81352 (0.0008) +[2023-10-11 18:22:55,839][85176] Updated weights for policy 0, policy_version 81362 (0.0008) +[2023-10-11 18:22:56,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 167837696. Throughput: 0: 1690.1, 1: 1694.7. Samples: 41970834. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 18:22:56,063][84230] Avg episode reward: [(0, '43.780'), (1, '41.790')] +[2023-10-11 18:22:56,215][85176] Updated weights for policy 0, policy_version 81372 (0.0009) +[2023-10-11 18:22:57,773][85175] Updated weights for policy 1, policy_version 82570 (0.0008) +[2023-10-11 18:22:58,140][85175] Updated weights for policy 1, policy_version 82580 (0.0007) +[2023-10-11 18:22:58,499][85175] Updated weights for policy 1, policy_version 82590 (0.0009) +[2023-10-11 18:23:00,311][85176] Updated weights for policy 0, policy_version 81382 (0.0008) +[2023-10-11 18:23:00,677][85176] Updated weights for policy 0, policy_version 81392 (0.0009) +[2023-10-11 18:23:01,044][85176] Updated weights for policy 0, policy_version 81402 (0.0010) +[2023-10-11 18:23:01,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 167903232. Throughput: 0: 1671.7, 1: 1722.5. Samples: 41991058. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 18:23:01,063][84230] Avg episode reward: [(0, '46.700'), (1, '48.160')] +[2023-10-11 18:23:02,612][85175] Updated weights for policy 1, policy_version 82600 (0.0009) +[2023-10-11 18:23:02,977][85175] Updated weights for policy 1, policy_version 82610 (0.0008) +[2023-10-11 18:23:03,337][85175] Updated weights for policy 1, policy_version 82620 (0.0009) +[2023-10-11 18:23:05,115][85176] Updated weights for policy 0, policy_version 81412 (0.0009) +[2023-10-11 18:23:05,481][85176] Updated weights for policy 0, policy_version 81422 (0.0007) +[2023-10-11 18:23:05,859][85176] Updated weights for policy 0, policy_version 81432 (0.0007) +[2023-10-11 18:23:06,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 167968768. Throughput: 0: 1682.7, 1: 1693.8. Samples: 42000732. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 18:23:06,064][84230] Avg episode reward: [(0, '44.770'), (1, '43.040')] +[2023-10-11 18:23:07,364][85175] Updated weights for policy 1, policy_version 82630 (0.0008) +[2023-10-11 18:23:07,727][85175] Updated weights for policy 1, policy_version 82640 (0.0007) +[2023-10-11 18:23:08,087][85175] Updated weights for policy 1, policy_version 82650 (0.0008) +[2023-10-11 18:23:09,975][85176] Updated weights for policy 0, policy_version 81442 (0.0007) +[2023-10-11 18:23:10,343][85176] Updated weights for policy 0, policy_version 81452 (0.0009) +[2023-10-11 18:23:10,711][85176] Updated weights for policy 0, policy_version 81462 (0.0009) +[2023-10-11 18:23:11,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 168034304. Throughput: 0: 1682.6, 1: 1707.9. Samples: 42021608. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 18:23:11,063][84230] Avg episode reward: [(0, '43.520'), (1, '48.050')] +[2023-10-11 18:23:11,097][85176] Updated weights for policy 0, policy_version 81472 (0.0009) +[2023-10-11 18:23:12,085][85175] Updated weights for policy 1, policy_version 82660 (0.0008) +[2023-10-11 18:23:12,448][85175] Updated weights for policy 1, policy_version 82670 (0.0008) +[2023-10-11 18:23:12,827][85175] Updated weights for policy 1, policy_version 82680 (0.0008) +[2023-10-11 18:23:15,330][85176] Updated weights for policy 0, policy_version 81482 (0.0010) +[2023-10-11 18:23:15,705][85176] Updated weights for policy 0, policy_version 81492 (0.0008) +[2023-10-11 18:23:16,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 168099840. Throughput: 0: 1659.9, 1: 1726.1. Samples: 42041876. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 18:23:16,063][84230] Avg episode reward: [(0, '40.560'), (1, '44.120')] +[2023-10-11 18:23:16,075][85176] Updated weights for policy 0, policy_version 81502 (0.0009) +[2023-10-11 18:23:16,653][85175] Updated weights for policy 1, policy_version 82690 (0.0008) +[2023-10-11 18:23:17,014][85175] Updated weights for policy 1, policy_version 82700 (0.0008) +[2023-10-11 18:23:17,388][85175] Updated weights for policy 1, policy_version 82710 (0.0009) +[2023-10-11 18:23:17,746][85175] Updated weights for policy 1, policy_version 82720 (0.0010) +[2023-10-11 18:23:20,083][85176] Updated weights for policy 0, policy_version 81512 (0.0010) +[2023-10-11 18:23:20,451][85176] Updated weights for policy 0, policy_version 81522 (0.0007) +[2023-10-11 18:23:20,827][85176] Updated weights for policy 0, policy_version 81532 (0.0007) +[2023-10-11 18:23:21,062][84230] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 168198144. Throughput: 0: 1674.7, 1: 1698.7. Samples: 42051798. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 18:23:21,063][84230] Avg episode reward: [(0, '41.200'), (1, '48.070')] +[2023-10-11 18:23:21,860][85175] Updated weights for policy 1, policy_version 82730 (0.0010) +[2023-10-11 18:23:22,221][85175] Updated weights for policy 1, policy_version 82740 (0.0008) +[2023-10-11 18:23:22,599][85175] Updated weights for policy 1, policy_version 82750 (0.0008) +[2023-10-11 18:23:24,810][85176] Updated weights for policy 0, policy_version 81542 (0.0007) +[2023-10-11 18:23:25,174][85176] Updated weights for policy 0, policy_version 81552 (0.0008) +[2023-10-11 18:23:25,554][85176] Updated weights for policy 0, policy_version 81562 (0.0008) +[2023-10-11 18:23:26,062][84230] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 168263680. Throughput: 0: 1668.0, 1: 1717.2. Samples: 42072350. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 18:23:26,063][84230] Avg episode reward: [(0, '42.300'), (1, '43.240')] +[2023-10-11 18:23:26,616][85175] Updated weights for policy 1, policy_version 82760 (0.0010) +[2023-10-11 18:23:26,975][85175] Updated weights for policy 1, policy_version 82770 (0.0008) +[2023-10-11 18:23:27,353][85175] Updated weights for policy 1, policy_version 82780 (0.0008) +[2023-10-11 18:23:29,621][85176] Updated weights for policy 0, policy_version 81572 (0.0007) +[2023-10-11 18:23:29,996][85176] Updated weights for policy 0, policy_version 81582 (0.0009) +[2023-10-11 18:23:30,375][85176] Updated weights for policy 0, policy_version 81592 (0.0007) +[2023-10-11 18:23:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 168329216. Throughput: 0: 1659.0, 1: 1708.6. Samples: 42092222. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:23:31,063][84230] Avg episode reward: [(0, '42.450'), (1, '46.030')] +[2023-10-11 18:23:31,390][85175] Updated weights for policy 1, policy_version 82790 (0.0008) +[2023-10-11 18:23:31,762][85175] Updated weights for policy 1, policy_version 82800 (0.0009) +[2023-10-11 18:23:32,135][85175] Updated weights for policy 1, policy_version 82810 (0.0009) +[2023-10-11 18:23:34,450][85176] Updated weights for policy 0, policy_version 81602 (0.0010) +[2023-10-11 18:23:34,828][85176] Updated weights for policy 0, policy_version 81612 (0.0008) +[2023-10-11 18:23:35,204][85176] Updated weights for policy 0, policy_version 81622 (0.0007) +[2023-10-11 18:23:35,570][85176] Updated weights for policy 0, policy_version 81632 (0.0007) +[2023-10-11 18:23:36,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 168394752. Throughput: 0: 1680.0, 1: 1695.5. Samples: 42102424. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:23:36,063][84230] Avg episode reward: [(0, '44.920'), (1, '45.320')] +[2023-10-11 18:23:36,260][85175] Updated weights for policy 1, policy_version 82820 (0.0008) +[2023-10-11 18:23:36,669][85175] Updated weights for policy 1, policy_version 82830 (0.0010) +[2023-10-11 18:23:37,040][85175] Updated weights for policy 1, policy_version 82840 (0.0009) +[2023-10-11 18:23:39,477][85176] Updated weights for policy 0, policy_version 81642 (0.0009) +[2023-10-11 18:23:39,856][85176] Updated weights for policy 0, policy_version 81652 (0.0009) +[2023-10-11 18:23:40,236][85176] Updated weights for policy 0, policy_version 81662 (0.0008) +[2023-10-11 18:23:41,049][85175] Updated weights for policy 1, policy_version 82850 (0.0008) +[2023-10-11 18:23:41,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 168460288. Throughput: 0: 1666.8, 1: 1708.9. Samples: 42122742. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:23:41,063][84230] Avg episode reward: [(0, '44.520'), (1, '46.710')] +[2023-10-11 18:23:41,418][85175] Updated weights for policy 1, policy_version 82860 (0.0011) +[2023-10-11 18:23:41,786][85175] Updated weights for policy 1, policy_version 82870 (0.0007) +[2023-10-11 18:23:42,145][85175] Updated weights for policy 1, policy_version 82880 (0.0009) +[2023-10-11 18:23:44,278][85176] Updated weights for policy 0, policy_version 81672 (0.0008) +[2023-10-11 18:23:44,656][85176] Updated weights for policy 0, policy_version 81682 (0.0007) +[2023-10-11 18:23:45,027][85176] Updated weights for policy 0, policy_version 81692 (0.0008) +[2023-10-11 18:23:46,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 168525824. Throughput: 0: 1665.2, 1: 1713.1. Samples: 42143080. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:23:46,064][84230] Avg episode reward: [(0, '44.930'), (1, '47.930')] +[2023-10-11 18:23:46,073][85175] Updated weights for policy 1, policy_version 82890 (0.0008) +[2023-10-11 18:23:46,437][85175] Updated weights for policy 1, policy_version 82900 (0.0007) +[2023-10-11 18:23:46,799][85175] Updated weights for policy 1, policy_version 82910 (0.0008) +[2023-10-11 18:23:49,031][85176] Updated weights for policy 0, policy_version 81702 (0.0007) +[2023-10-11 18:23:49,407][85176] Updated weights for policy 0, policy_version 81712 (0.0007) +[2023-10-11 18:23:49,778][85176] Updated weights for policy 0, policy_version 81722 (0.0007) +[2023-10-11 18:23:50,903][85175] Updated weights for policy 1, policy_version 82920 (0.0010) +[2023-10-11 18:23:51,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 168591360. Throughput: 0: 1683.8, 1: 1708.4. Samples: 42153378. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:23:51,063][84230] Avg episode reward: [(0, '44.820'), (1, '41.820')] +[2023-10-11 18:23:51,271][85175] Updated weights for policy 1, policy_version 82930 (0.0010) +[2023-10-11 18:23:51,641][85175] Updated weights for policy 1, policy_version 82940 (0.0008) +[2023-10-11 18:23:53,918][85176] Updated weights for policy 0, policy_version 81732 (0.0008) +[2023-10-11 18:23:54,281][85176] Updated weights for policy 0, policy_version 81742 (0.0011) +[2023-10-11 18:23:54,660][85176] Updated weights for policy 0, policy_version 81752 (0.0007) +[2023-10-11 18:23:55,560][85175] Updated weights for policy 1, policy_version 82950 (0.0008) +[2023-10-11 18:23:55,939][85175] Updated weights for policy 1, policy_version 82960 (0.0009) +[2023-10-11 18:23:56,063][84230] Fps is (10 sec: 13107.7, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 168656896. Throughput: 0: 1661.1, 1: 1708.3. Samples: 42173232. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:23:56,063][84230] Avg episode reward: [(0, '43.770'), (1, '45.440')] +[2023-10-11 18:23:56,306][85175] Updated weights for policy 1, policy_version 82970 (0.0008) +[2023-10-11 18:23:58,735][85176] Updated weights for policy 0, policy_version 81762 (0.0007) +[2023-10-11 18:23:59,107][85176] Updated weights for policy 0, policy_version 81772 (0.0009) +[2023-10-11 18:23:59,480][85176] Updated weights for policy 0, policy_version 81782 (0.0008) +[2023-10-11 18:23:59,839][85176] Updated weights for policy 0, policy_version 81792 (0.0009) +[2023-10-11 18:24:00,527][85175] Updated weights for policy 1, policy_version 82980 (0.0007) +[2023-10-11 18:24:00,890][85175] Updated weights for policy 1, policy_version 82990 (0.0009) +[2023-10-11 18:24:01,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 168722432. Throughput: 0: 1669.3, 1: 1700.7. Samples: 42193526. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:24:01,063][84230] Avg episode reward: [(0, '45.000'), (1, '41.410')] +[2023-10-11 18:24:01,258][85175] Updated weights for policy 1, policy_version 83000 (0.0008) +[2023-10-11 18:24:03,964][85176] Updated weights for policy 0, policy_version 81802 (0.0010) +[2023-10-11 18:24:04,339][85176] Updated weights for policy 0, policy_version 81812 (0.0010) +[2023-10-11 18:24:04,717][85176] Updated weights for policy 0, policy_version 81822 (0.0007) +[2023-10-11 18:24:05,278][85175] Updated weights for policy 1, policy_version 83010 (0.0009) +[2023-10-11 18:24:05,647][85175] Updated weights for policy 1, policy_version 83020 (0.0009) +[2023-10-11 18:24:06,026][85175] Updated weights for policy 1, policy_version 83030 (0.0011) +[2023-10-11 18:24:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 168787968. Throughput: 0: 1684.1, 1: 1702.6. Samples: 42204200. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:24:06,063][84230] Avg episode reward: [(0, '45.280'), (1, '45.160')] +[2023-10-11 18:24:06,398][85175] Updated weights for policy 1, policy_version 83040 (0.0010) +[2023-10-11 18:24:08,939][85176] Updated weights for policy 0, policy_version 81832 (0.0007) +[2023-10-11 18:24:09,310][85176] Updated weights for policy 0, policy_version 81842 (0.0009) +[2023-10-11 18:24:09,680][85176] Updated weights for policy 0, policy_version 81852 (0.0008) +[2023-10-11 18:24:10,252][85175] Updated weights for policy 1, policy_version 83050 (0.0009) +[2023-10-11 18:24:10,618][85175] Updated weights for policy 1, policy_version 83060 (0.0007) +[2023-10-11 18:24:10,985][85175] Updated weights for policy 1, policy_version 83070 (0.0008) +[2023-10-11 18:24:11,063][84230] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 168886272. Throughput: 0: 1661.2, 1: 1710.6. Samples: 42224082. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:24:11,064][84230] Avg episode reward: [(0, '45.260'), (1, '42.360')] +[2023-10-11 18:24:13,904][85176] Updated weights for policy 0, policy_version 81862 (0.0008) +[2023-10-11 18:24:14,275][85176] Updated weights for policy 0, policy_version 81872 (0.0008) +[2023-10-11 18:24:14,653][85176] Updated weights for policy 0, policy_version 81882 (0.0007) +[2023-10-11 18:24:15,052][85175] Updated weights for policy 1, policy_version 83080 (0.0007) +[2023-10-11 18:24:15,428][85175] Updated weights for policy 1, policy_version 83090 (0.0009) +[2023-10-11 18:24:15,797][85175] Updated weights for policy 1, policy_version 83100 (0.0009) +[2023-10-11 18:24:16,062][84230] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 168951808. Throughput: 0: 1672.9, 1: 1691.9. Samples: 42243638. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:24:16,063][84230] Avg episode reward: [(0, '46.450'), (1, '43.570')] +[2023-10-11 18:24:16,070][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000081888_83853312.pth... +[2023-10-11 18:24:16,070][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000083104_85098496.pth... +[2023-10-11 18:24:16,100][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000080320_82247680.pth +[2023-10-11 18:24:16,102][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000081504_83460096.pth +[2023-10-11 18:24:18,687][85176] Updated weights for policy 0, policy_version 81892 (0.0011) +[2023-10-11 18:24:19,081][85176] Updated weights for policy 0, policy_version 81902 (0.0009) +[2023-10-11 18:24:19,447][85176] Updated weights for policy 0, policy_version 81912 (0.0009) +[2023-10-11 18:24:19,690][85175] Updated weights for policy 1, policy_version 83110 (0.0008) +[2023-10-11 18:24:20,055][85175] Updated weights for policy 1, policy_version 83120 (0.0008) +[2023-10-11 18:24:20,434][85175] Updated weights for policy 1, policy_version 83130 (0.0009) +[2023-10-11 18:24:21,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 169017344. Throughput: 0: 1670.5, 1: 1710.9. Samples: 42254588. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:24:21,063][84230] Avg episode reward: [(0, '45.460'), (1, '43.130')] +[2023-10-11 18:24:23,559][85176] Updated weights for policy 0, policy_version 81922 (0.0008) +[2023-10-11 18:24:23,931][85176] Updated weights for policy 0, policy_version 81932 (0.0009) +[2023-10-11 18:24:24,306][85176] Updated weights for policy 0, policy_version 81942 (0.0009) +[2023-10-11 18:24:24,676][85176] Updated weights for policy 0, policy_version 81952 (0.0008) +[2023-10-11 18:24:24,692][85175] Updated weights for policy 1, policy_version 83140 (0.0009) +[2023-10-11 18:24:25,094][85175] Updated weights for policy 1, policy_version 83150 (0.0008) +[2023-10-11 18:24:25,465][85175] Updated weights for policy 1, policy_version 83160 (0.0010) +[2023-10-11 18:24:26,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 169082880. Throughput: 0: 1657.2, 1: 1703.3. Samples: 42273968. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-11 18:24:26,064][84230] Avg episode reward: [(0, '45.440'), (1, '45.980')] +[2023-10-11 18:24:28,667][85176] Updated weights for policy 0, policy_version 81962 (0.0008) +[2023-10-11 18:24:29,040][85176] Updated weights for policy 0, policy_version 81972 (0.0007) +[2023-10-11 18:24:29,408][85176] Updated weights for policy 0, policy_version 81982 (0.0008) +[2023-10-11 18:24:29,423][85175] Updated weights for policy 1, policy_version 83170 (0.0010) +[2023-10-11 18:24:29,787][85175] Updated weights for policy 1, policy_version 83180 (0.0007) +[2023-10-11 18:24:30,153][85175] Updated weights for policy 1, policy_version 83190 (0.0009) +[2023-10-11 18:24:30,517][85175] Updated weights for policy 1, policy_version 83200 (0.0009) +[2023-10-11 18:24:31,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 169148416. Throughput: 0: 1668.7, 1: 1667.7. Samples: 42293218. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-11 18:24:31,063][84230] Avg episode reward: [(0, '43.020'), (1, '44.890')] +[2023-10-11 18:24:33,754][85176] Updated weights for policy 0, policy_version 81992 (0.0010) +[2023-10-11 18:24:34,128][85176] Updated weights for policy 0, policy_version 82002 (0.0008) +[2023-10-11 18:24:34,411][85175] Updated weights for policy 1, policy_version 83210 (0.0008) +[2023-10-11 18:24:34,501][85176] Updated weights for policy 0, policy_version 82012 (0.0008) +[2023-10-11 18:24:34,788][85175] Updated weights for policy 1, policy_version 83220 (0.0008) +[2023-10-11 18:24:35,154][85175] Updated weights for policy 1, policy_version 83230 (0.0008) +[2023-10-11 18:24:36,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 169213952. Throughput: 0: 1665.2, 1: 1699.6. Samples: 42304792. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-11 18:24:36,063][84230] Avg episode reward: [(0, '45.760'), (1, '47.130')] +[2023-10-11 18:24:38,467][85176] Updated weights for policy 0, policy_version 82022 (0.0008) +[2023-10-11 18:24:38,831][85176] Updated weights for policy 0, policy_version 82032 (0.0010) +[2023-10-11 18:24:39,152][85175] Updated weights for policy 1, policy_version 83240 (0.0008) +[2023-10-11 18:24:39,207][85176] Updated weights for policy 0, policy_version 82042 (0.0010) +[2023-10-11 18:24:39,510][85175] Updated weights for policy 1, policy_version 83250 (0.0009) +[2023-10-11 18:24:39,876][85175] Updated weights for policy 1, policy_version 83260 (0.0007) +[2023-10-11 18:24:41,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 169279488. Throughput: 0: 1661.2, 1: 1683.6. Samples: 42323748. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-11 18:24:41,063][84230] Avg episode reward: [(0, '42.370'), (1, '44.500')] +[2023-10-11 18:24:43,286][85176] Updated weights for policy 0, policy_version 82052 (0.0009) +[2023-10-11 18:24:43,650][85176] Updated weights for policy 0, policy_version 82062 (0.0008) +[2023-10-11 18:24:43,874][85175] Updated weights for policy 1, policy_version 83270 (0.0009) +[2023-10-11 18:24:44,022][85176] Updated weights for policy 0, policy_version 82072 (0.0009) +[2023-10-11 18:24:44,238][85175] Updated weights for policy 1, policy_version 83280 (0.0009) +[2023-10-11 18:24:44,616][85175] Updated weights for policy 1, policy_version 83290 (0.0007) +[2023-10-11 18:24:46,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 169345024. Throughput: 0: 1671.3, 1: 1678.7. Samples: 42344278. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-11 18:24:46,063][84230] Avg episode reward: [(0, '45.200'), (1, '46.110')] +[2023-10-11 18:24:48,025][85176] Updated weights for policy 0, policy_version 82082 (0.0007) +[2023-10-11 18:24:48,391][85176] Updated weights for policy 0, policy_version 82092 (0.0007) +[2023-10-11 18:24:48,632][85175] Updated weights for policy 1, policy_version 83300 (0.0008) +[2023-10-11 18:24:48,770][85176] Updated weights for policy 0, policy_version 82102 (0.0007) +[2023-10-11 18:24:48,994][85175] Updated weights for policy 1, policy_version 83310 (0.0009) +[2023-10-11 18:24:49,129][85176] Updated weights for policy 0, policy_version 82112 (0.0007) +[2023-10-11 18:24:49,362][85175] Updated weights for policy 1, policy_version 83320 (0.0009) +[2023-10-11 18:24:51,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 169410560. Throughput: 0: 1655.5, 1: 1699.5. Samples: 42355176. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-11 18:24:51,063][84230] Avg episode reward: [(0, '41.410'), (1, '44.190')] +[2023-10-11 18:24:53,211][85176] Updated weights for policy 0, policy_version 82122 (0.0009) +[2023-10-11 18:24:53,588][85176] Updated weights for policy 0, policy_version 82132 (0.0008) +[2023-10-11 18:24:53,611][85175] Updated weights for policy 1, policy_version 83330 (0.0010) +[2023-10-11 18:24:53,954][85176] Updated weights for policy 0, policy_version 82142 (0.0007) +[2023-10-11 18:24:53,977][85175] Updated weights for policy 1, policy_version 83340 (0.0008) +[2023-10-11 18:24:54,352][85175] Updated weights for policy 1, policy_version 83350 (0.0009) +[2023-10-11 18:24:54,721][85175] Updated weights for policy 1, policy_version 83360 (0.0010) +[2023-10-11 18:24:56,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 169476096. Throughput: 0: 1664.3, 1: 1670.3. Samples: 42374138. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-11 18:24:56,063][84230] Avg episode reward: [(0, '46.710'), (1, '46.140')] +[2023-10-11 18:24:57,976][85176] Updated weights for policy 0, policy_version 82152 (0.0009) +[2023-10-11 18:24:58,342][85176] Updated weights for policy 0, policy_version 82162 (0.0009) +[2023-10-11 18:24:58,671][85175] Updated weights for policy 1, policy_version 83370 (0.0008) +[2023-10-11 18:24:58,719][85176] Updated weights for policy 0, policy_version 82172 (0.0009) +[2023-10-11 18:24:59,028][85175] Updated weights for policy 1, policy_version 83380 (0.0009) +[2023-10-11 18:24:59,396][85175] Updated weights for policy 1, policy_version 83390 (0.0009) +[2023-10-11 18:25:01,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 169541632. Throughput: 0: 1671.4, 1: 1681.7. Samples: 42394528. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-11 18:25:01,063][84230] Avg episode reward: [(0, '43.020'), (1, '44.970')] +[2023-10-11 18:25:02,743][85176] Updated weights for policy 0, policy_version 82182 (0.0009) +[2023-10-11 18:25:03,121][85176] Updated weights for policy 0, policy_version 82192 (0.0010) +[2023-10-11 18:25:03,446][85175] Updated weights for policy 1, policy_version 83400 (0.0008) +[2023-10-11 18:25:03,496][85176] Updated weights for policy 0, policy_version 82202 (0.0008) +[2023-10-11 18:25:03,807][85175] Updated weights for policy 1, policy_version 83410 (0.0009) +[2023-10-11 18:25:04,177][85175] Updated weights for policy 1, policy_version 83420 (0.0010) +[2023-10-11 18:25:06,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 169607168. Throughput: 0: 1655.4, 1: 1683.1. Samples: 42404822. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-11 18:25:06,064][84230] Avg episode reward: [(0, '45.770'), (1, '46.430')] +[2023-10-11 18:25:07,701][85176] Updated weights for policy 0, policy_version 82212 (0.0008) +[2023-10-11 18:25:08,097][85176] Updated weights for policy 0, policy_version 82222 (0.0007) +[2023-10-11 18:25:08,235][85175] Updated weights for policy 1, policy_version 83430 (0.0009) +[2023-10-11 18:25:08,466][85176] Updated weights for policy 0, policy_version 82232 (0.0009) +[2023-10-11 18:25:08,604][85175] Updated weights for policy 1, policy_version 83440 (0.0009) +[2023-10-11 18:25:08,963][85175] Updated weights for policy 1, policy_version 83450 (0.0009) +[2023-10-11 18:25:11,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 169672704. Throughput: 0: 1672.5, 1: 1665.5. Samples: 42424176. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-11 18:25:11,063][84230] Avg episode reward: [(0, '42.450'), (1, '47.030')] +[2023-10-11 18:25:12,577][85176] Updated weights for policy 0, policy_version 82242 (0.0009) +[2023-10-11 18:25:12,953][85176] Updated weights for policy 0, policy_version 82252 (0.0007) +[2023-10-11 18:25:13,120][85175] Updated weights for policy 1, policy_version 83460 (0.0008) +[2023-10-11 18:25:13,325][85176] Updated weights for policy 0, policy_version 82262 (0.0007) +[2023-10-11 18:25:13,517][85175] Updated weights for policy 1, policy_version 83470 (0.0007) +[2023-10-11 18:25:13,703][85176] Updated weights for policy 0, policy_version 82272 (0.0007) +[2023-10-11 18:25:13,895][85175] Updated weights for policy 1, policy_version 83480 (0.0009) +[2023-10-11 18:25:16,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 169738240. Throughput: 0: 1681.3, 1: 1691.5. Samples: 42444994. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-11 18:25:16,063][84230] Avg episode reward: [(0, '46.260'), (1, '45.900')] +[2023-10-11 18:25:17,767][85176] Updated weights for policy 0, policy_version 82282 (0.0009) +[2023-10-11 18:25:17,835][85175] Updated weights for policy 1, policy_version 83490 (0.0011) +[2023-10-11 18:25:18,139][85176] Updated weights for policy 0, policy_version 82292 (0.0008) +[2023-10-11 18:25:18,206][85175] Updated weights for policy 1, policy_version 83500 (0.0009) +[2023-10-11 18:25:18,513][85176] Updated weights for policy 0, policy_version 82302 (0.0008) +[2023-10-11 18:25:18,572][85175] Updated weights for policy 1, policy_version 83510 (0.0008) +[2023-10-11 18:25:18,927][85175] Updated weights for policy 1, policy_version 83520 (0.0011) +[2023-10-11 18:25:21,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 169803776. Throughput: 0: 1657.5, 1: 1675.9. Samples: 42454792. 
Policy #0 lag: (min: 6.0, avg: 13.8, max: 38.0) +[2023-10-11 18:25:21,063][84230] Avg episode reward: [(0, '41.270'), (1, '46.170')] +[2023-10-11 18:25:22,579][85176] Updated weights for policy 0, policy_version 82312 (0.0008) +[2023-10-11 18:25:22,870][85175] Updated weights for policy 1, policy_version 83530 (0.0007) +[2023-10-11 18:25:22,955][85176] Updated weights for policy 0, policy_version 82322 (0.0007) +[2023-10-11 18:25:23,230][85175] Updated weights for policy 1, policy_version 83540 (0.0007) +[2023-10-11 18:25:23,319][85176] Updated weights for policy 0, policy_version 82332 (0.0007) +[2023-10-11 18:25:23,594][85175] Updated weights for policy 1, policy_version 83550 (0.0008) +[2023-10-11 18:25:26,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 169869312. Throughput: 0: 1679.9, 1: 1682.3. Samples: 42475048. Policy #0 lag: (min: 6.0, avg: 13.8, max: 38.0) +[2023-10-11 18:25:26,064][84230] Avg episode reward: [(0, '46.460'), (1, '44.370')] +[2023-10-11 18:25:27,302][85176] Updated weights for policy 0, policy_version 82342 (0.0008) +[2023-10-11 18:25:27,666][85176] Updated weights for policy 0, policy_version 82352 (0.0009) +[2023-10-11 18:25:27,702][85175] Updated weights for policy 1, policy_version 83560 (0.0007) +[2023-10-11 18:25:28,043][85176] Updated weights for policy 0, policy_version 82362 (0.0010) +[2023-10-11 18:25:28,061][85175] Updated weights for policy 1, policy_version 83570 (0.0007) +[2023-10-11 18:25:28,426][85175] Updated weights for policy 1, policy_version 83580 (0.0008) +[2023-10-11 18:25:31,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 169934848. Throughput: 0: 1680.7, 1: 1689.2. Samples: 42495920. 
Policy #0 lag: (min: 6.0, avg: 13.8, max: 38.0) +[2023-10-11 18:25:31,063][84230] Avg episode reward: [(0, '41.740'), (1, '43.330')] +[2023-10-11 18:25:32,187][85176] Updated weights for policy 0, policy_version 82372 (0.0008) +[2023-10-11 18:25:32,448][85175] Updated weights for policy 1, policy_version 83590 (0.0009) +[2023-10-11 18:25:32,567][85176] Updated weights for policy 0, policy_version 82382 (0.0008) +[2023-10-11 18:25:32,809][85175] Updated weights for policy 1, policy_version 83600 (0.0007) +[2023-10-11 18:25:32,931][85176] Updated weights for policy 0, policy_version 82392 (0.0008) +[2023-10-11 18:25:33,176][85175] Updated weights for policy 1, policy_version 83610 (0.0008) +[2023-10-11 18:25:36,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 170000384. Throughput: 0: 1663.5, 1: 1665.1. Samples: 42504962. Policy #0 lag: (min: 6.0, avg: 13.8, max: 38.0) +[2023-10-11 18:25:36,063][84230] Avg episode reward: [(0, '48.670'), (1, '42.570')] +[2023-10-11 18:25:36,064][84801] Saving new best policy, reward=48.670! +[2023-10-11 18:25:36,996][85176] Updated weights for policy 0, policy_version 82402 (0.0007) +[2023-10-11 18:25:37,221][85175] Updated weights for policy 1, policy_version 83620 (0.0008) +[2023-10-11 18:25:37,371][85176] Updated weights for policy 0, policy_version 82412 (0.0007) +[2023-10-11 18:25:37,587][85175] Updated weights for policy 1, policy_version 83630 (0.0008) +[2023-10-11 18:25:37,739][85176] Updated weights for policy 0, policy_version 82422 (0.0007) +[2023-10-11 18:25:37,957][85175] Updated weights for policy 1, policy_version 83640 (0.0007) +[2023-10-11 18:25:38,111][85176] Updated weights for policy 0, policy_version 82432 (0.0008) +[2023-10-11 18:25:41,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 170065920. Throughput: 0: 1672.9, 1: 1692.7. Samples: 42525592. 
Policy #0 lag: (min: 6.0, avg: 13.8, max: 38.0) +[2023-10-11 18:25:41,064][84230] Avg episode reward: [(0, '43.420'), (1, '42.640')] +[2023-10-11 18:25:42,020][85175] Updated weights for policy 1, policy_version 83650 (0.0007) +[2023-10-11 18:25:42,095][85176] Updated weights for policy 0, policy_version 82442 (0.0009) +[2023-10-11 18:25:42,390][85175] Updated weights for policy 1, policy_version 83660 (0.0007) +[2023-10-11 18:25:42,478][85176] Updated weights for policy 0, policy_version 82452 (0.0009) +[2023-10-11 18:25:42,750][85175] Updated weights for policy 1, policy_version 83670 (0.0007) +[2023-10-11 18:25:42,851][85176] Updated weights for policy 0, policy_version 82462 (0.0007) +[2023-10-11 18:25:43,117][85175] Updated weights for policy 1, policy_version 83680 (0.0010) +[2023-10-11 18:25:46,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 170131456. Throughput: 0: 1675.8, 1: 1699.4. Samples: 42546412. Policy #0 lag: (min: 6.0, avg: 13.8, max: 38.0) +[2023-10-11 18:25:46,064][84230] Avg episode reward: [(0, '47.820'), (1, '42.550')] +[2023-10-11 18:25:47,000][85176] Updated weights for policy 0, policy_version 82472 (0.0007) +[2023-10-11 18:25:47,120][85175] Updated weights for policy 1, policy_version 83690 (0.0008) +[2023-10-11 18:25:47,372][85176] Updated weights for policy 0, policy_version 82482 (0.0008) +[2023-10-11 18:25:47,490][85175] Updated weights for policy 1, policy_version 83700 (0.0008) +[2023-10-11 18:25:47,745][85176] Updated weights for policy 0, policy_version 82492 (0.0007) +[2023-10-11 18:25:47,848][85175] Updated weights for policy 1, policy_version 83710 (0.0009) +[2023-10-11 18:25:51,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 170196992. Throughput: 0: 1667.5, 1: 1681.3. Samples: 42555514. 
Policy #0 lag: (min: 6.0, avg: 13.8, max: 38.0) +[2023-10-11 18:25:51,063][84230] Avg episode reward: [(0, '40.860'), (1, '42.010')] +[2023-10-11 18:25:51,894][85176] Updated weights for policy 0, policy_version 82502 (0.0009) +[2023-10-11 18:25:51,899][85175] Updated weights for policy 1, policy_version 83720 (0.0007) +[2023-10-11 18:25:52,263][85176] Updated weights for policy 0, policy_version 82512 (0.0008) +[2023-10-11 18:25:52,271][85175] Updated weights for policy 1, policy_version 83730 (0.0009) +[2023-10-11 18:25:52,644][85175] Updated weights for policy 1, policy_version 83740 (0.0009) +[2023-10-11 18:25:52,645][85176] Updated weights for policy 0, policy_version 82522 (0.0008) +[2023-10-11 18:25:56,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 170262528. Throughput: 0: 1674.2, 1: 1703.1. Samples: 42576156. Policy #0 lag: (min: 6.0, avg: 13.8, max: 38.0) +[2023-10-11 18:25:56,063][84230] Avg episode reward: [(0, '48.270'), (1, '40.780')] +[2023-10-11 18:25:56,653][85175] Updated weights for policy 1, policy_version 83750 (0.0008) +[2023-10-11 18:25:56,805][85176] Updated weights for policy 0, policy_version 82532 (0.0007) +[2023-10-11 18:25:57,020][85175] Updated weights for policy 1, policy_version 83760 (0.0007) +[2023-10-11 18:25:57,197][85176] Updated weights for policy 0, policy_version 82542 (0.0007) +[2023-10-11 18:25:57,384][85175] Updated weights for policy 1, policy_version 83770 (0.0007) +[2023-10-11 18:25:57,567][85176] Updated weights for policy 0, policy_version 82552 (0.0008) +[2023-10-11 18:26:01,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 170328064. Throughput: 0: 1669.3, 1: 1708.3. Samples: 42596986. 
Policy #0 lag: (min: 6.0, avg: 13.8, max: 38.0) +[2023-10-11 18:26:01,063][84230] Avg episode reward: [(0, '43.310'), (1, '44.150')] +[2023-10-11 18:26:01,468][85175] Updated weights for policy 1, policy_version 83780 (0.0007) +[2023-10-11 18:26:01,596][85176] Updated weights for policy 0, policy_version 82562 (0.0007) +[2023-10-11 18:26:01,871][85175] Updated weights for policy 1, policy_version 83790 (0.0007) +[2023-10-11 18:26:01,975][85176] Updated weights for policy 0, policy_version 82572 (0.0008) +[2023-10-11 18:26:02,245][85175] Updated weights for policy 1, policy_version 83800 (0.0009) +[2023-10-11 18:26:02,339][85176] Updated weights for policy 0, policy_version 82582 (0.0007) +[2023-10-11 18:26:02,712][85176] Updated weights for policy 0, policy_version 82592 (0.0008) +[2023-10-11 18:26:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 170393600. Throughput: 0: 1666.4, 1: 1693.6. Samples: 42605994. Policy #0 lag: (min: 6.0, avg: 13.8, max: 38.0) +[2023-10-11 18:26:06,063][84230] Avg episode reward: [(0, '48.270'), (1, '43.520')] +[2023-10-11 18:26:06,155][85175] Updated weights for policy 1, policy_version 83810 (0.0008) +[2023-10-11 18:26:06,526][85175] Updated weights for policy 1, policy_version 83820 (0.0007) +[2023-10-11 18:26:06,791][85176] Updated weights for policy 0, policy_version 82602 (0.0008) +[2023-10-11 18:26:06,880][85175] Updated weights for policy 1, policy_version 83830 (0.0008) +[2023-10-11 18:26:07,171][85176] Updated weights for policy 0, policy_version 82612 (0.0009) +[2023-10-11 18:26:07,250][85175] Updated weights for policy 1, policy_version 83840 (0.0008) +[2023-10-11 18:26:07,550][85176] Updated weights for policy 0, policy_version 82622 (0.0011) +[2023-10-11 18:26:11,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 170459136. Throughput: 0: 1667.2, 1: 1703.2. Samples: 42626716. 
Policy #0 lag: (min: 6.0, avg: 13.8, max: 38.0) +[2023-10-11 18:26:11,063][84230] Avg episode reward: [(0, '43.640'), (1, '45.890')] +[2023-10-11 18:26:11,342][85175] Updated weights for policy 1, policy_version 83850 (0.0007) +[2023-10-11 18:26:11,713][85175] Updated weights for policy 1, policy_version 83860 (0.0007) +[2023-10-11 18:26:11,766][85176] Updated weights for policy 0, policy_version 82632 (0.0011) +[2023-10-11 18:26:12,083][85175] Updated weights for policy 1, policy_version 83870 (0.0007) +[2023-10-11 18:26:12,140][85176] Updated weights for policy 0, policy_version 82642 (0.0009) +[2023-10-11 18:26:12,515][85176] Updated weights for policy 0, policy_version 82652 (0.0008) +[2023-10-11 18:26:16,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 170524672. Throughput: 0: 1665.3, 1: 1704.5. Samples: 42647560. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-11 18:26:16,063][84230] Avg episode reward: [(0, '46.810'), (1, '44.700')] +[2023-10-11 18:26:16,071][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000082656_84639744.pth... +[2023-10-11 18:26:16,100][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000081120_83066880.pth +[2023-10-11 18:26:16,118][85175] Updated weights for policy 1, policy_version 83880 (0.0007) +[2023-10-11 18:26:16,484][85175] Updated weights for policy 1, policy_version 83890 (0.0009) +[2023-10-11 18:26:16,600][85176] Updated weights for policy 0, policy_version 82662 (0.0008) +[2023-10-11 18:26:16,858][85175] Updated weights for policy 1, policy_version 83900 (0.0007) +[2023-10-11 18:26:16,977][85176] Updated weights for policy 0, policy_version 82672 (0.0008) +[2023-10-11 18:26:17,000][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000083904_85917696.pth... 
+[2023-10-11 18:26:17,036][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000082304_84279296.pth +[2023-10-11 18:26:17,348][85176] Updated weights for policy 0, policy_version 82682 (0.0007) +[2023-10-11 18:26:20,843][85175] Updated weights for policy 1, policy_version 83910 (0.0008) +[2023-10-11 18:26:21,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 170590208. Throughput: 0: 1668.3, 1: 1705.1. Samples: 42656768. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-11 18:26:21,064][84230] Avg episode reward: [(0, '44.540'), (1, '45.290')] +[2023-10-11 18:26:21,218][85175] Updated weights for policy 1, policy_version 83920 (0.0008) +[2023-10-11 18:26:21,454][85176] Updated weights for policy 0, policy_version 82692 (0.0008) +[2023-10-11 18:26:21,583][85175] Updated weights for policy 1, policy_version 83930 (0.0008) +[2023-10-11 18:26:21,819][85176] Updated weights for policy 0, policy_version 82702 (0.0008) +[2023-10-11 18:26:22,200][85176] Updated weights for policy 0, policy_version 82712 (0.0009) +[2023-10-11 18:26:25,621][85175] Updated weights for policy 1, policy_version 83940 (0.0011) +[2023-10-11 18:26:25,996][85175] Updated weights for policy 1, policy_version 83950 (0.0009) +[2023-10-11 18:26:26,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.5). Total num frames: 170655744. Throughput: 0: 1675.8, 1: 1703.7. Samples: 42677670. 
Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-11 18:26:26,063][84230] Avg episode reward: [(0, '48.020'), (1, '45.210')] +[2023-10-11 18:26:26,246][85176] Updated weights for policy 0, policy_version 82722 (0.0009) +[2023-10-11 18:26:26,362][85175] Updated weights for policy 1, policy_version 83960 (0.0008) +[2023-10-11 18:26:26,623][85176] Updated weights for policy 0, policy_version 82732 (0.0008) +[2023-10-11 18:26:26,993][85176] Updated weights for policy 0, policy_version 82742 (0.0007) +[2023-10-11 18:26:27,361][85176] Updated weights for policy 0, policy_version 82752 (0.0008) +[2023-10-11 18:26:30,342][85175] Updated weights for policy 1, policy_version 83970 (0.0010) +[2023-10-11 18:26:30,707][85175] Updated weights for policy 1, policy_version 83980 (0.0007) +[2023-10-11 18:26:31,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 170721280. Throughput: 0: 1676.4, 1: 1699.3. Samples: 42698316. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-11 18:26:31,063][84230] Avg episode reward: [(0, '44.900'), (1, '45.040')] +[2023-10-11 18:26:31,076][85175] Updated weights for policy 1, policy_version 83990 (0.0007) +[2023-10-11 18:26:31,356][85176] Updated weights for policy 0, policy_version 82762 (0.0007) +[2023-10-11 18:26:31,439][85175] Updated weights for policy 1, policy_version 84000 (0.0009) +[2023-10-11 18:26:31,729][85176] Updated weights for policy 0, policy_version 82772 (0.0009) +[2023-10-11 18:26:32,093][85176] Updated weights for policy 0, policy_version 82782 (0.0007) +[2023-10-11 18:26:35,554][85175] Updated weights for policy 1, policy_version 84010 (0.0007) +[2023-10-11 18:26:35,920][85175] Updated weights for policy 1, policy_version 84020 (0.0007) +[2023-10-11 18:26:36,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 170786816. Throughput: 0: 1678.4, 1: 1702.5. Samples: 42707656. 
Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-11 18:26:36,064][84230] Avg episode reward: [(0, '45.190'), (1, '46.930')] +[2023-10-11 18:26:36,175][85176] Updated weights for policy 0, policy_version 82792 (0.0008) +[2023-10-11 18:26:36,289][85175] Updated weights for policy 1, policy_version 84030 (0.0008) +[2023-10-11 18:26:36,548][85176] Updated weights for policy 0, policy_version 82802 (0.0009) +[2023-10-11 18:26:36,918][85176] Updated weights for policy 0, policy_version 82812 (0.0010) +[2023-10-11 18:26:40,310][85175] Updated weights for policy 1, policy_version 84040 (0.0009) +[2023-10-11 18:26:40,672][85175] Updated weights for policy 1, policy_version 84050 (0.0009) +[2023-10-11 18:26:40,900][85176] Updated weights for policy 0, policy_version 82822 (0.0009) +[2023-10-11 18:26:41,034][85175] Updated weights for policy 1, policy_version 84060 (0.0009) +[2023-10-11 18:26:41,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 170852352. Throughput: 0: 1674.9, 1: 1703.5. Samples: 42728184. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-11 18:26:41,063][84230] Avg episode reward: [(0, '45.310'), (1, '42.010')] +[2023-10-11 18:26:41,267][85176] Updated weights for policy 0, policy_version 82832 (0.0008) +[2023-10-11 18:26:41,642][85176] Updated weights for policy 0, policy_version 82842 (0.0007) +[2023-10-11 18:26:45,009][85175] Updated weights for policy 1, policy_version 84070 (0.0010) +[2023-10-11 18:26:45,368][85175] Updated weights for policy 1, policy_version 84080 (0.0009) +[2023-10-11 18:26:45,742][85175] Updated weights for policy 1, policy_version 84090 (0.0008) +[2023-10-11 18:26:45,818][85176] Updated weights for policy 0, policy_version 82852 (0.0009) +[2023-10-11 18:26:46,063][84230] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 170950656. Throughput: 0: 1673.6, 1: 1686.0. Samples: 42748164. 
Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-11 18:26:46,063][84230] Avg episode reward: [(0, '44.650'), (1, '48.490')] +[2023-10-11 18:26:46,205][85176] Updated weights for policy 0, policy_version 82862 (0.0009) +[2023-10-11 18:26:46,574][85176] Updated weights for policy 0, policy_version 82872 (0.0009) +[2023-10-11 18:26:49,990][85175] Updated weights for policy 1, policy_version 84100 (0.0009) +[2023-10-11 18:26:50,393][85175] Updated weights for policy 1, policy_version 84110 (0.0008) +[2023-10-11 18:26:50,669][85176] Updated weights for policy 0, policy_version 82882 (0.0008) +[2023-10-11 18:26:50,752][85175] Updated weights for policy 1, policy_version 84120 (0.0008) +[2023-10-11 18:26:51,047][85176] Updated weights for policy 0, policy_version 82892 (0.0007) +[2023-10-11 18:26:51,063][84230] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 171016192. Throughput: 0: 1671.2, 1: 1703.1. Samples: 42757840. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-11 18:26:51,063][84230] Avg episode reward: [(0, '46.170'), (1, '42.380')] +[2023-10-11 18:26:51,410][85176] Updated weights for policy 0, policy_version 82902 (0.0008) +[2023-10-11 18:26:51,788][85176] Updated weights for policy 0, policy_version 82912 (0.0008) +[2023-10-11 18:26:54,784][85175] Updated weights for policy 1, policy_version 84130 (0.0009) +[2023-10-11 18:26:55,157][85175] Updated weights for policy 1, policy_version 84140 (0.0008) +[2023-10-11 18:26:55,519][85175] Updated weights for policy 1, policy_version 84150 (0.0009) +[2023-10-11 18:26:55,815][85176] Updated weights for policy 0, policy_version 82922 (0.0008) +[2023-10-11 18:26:55,886][85175] Updated weights for policy 1, policy_version 84160 (0.0008) +[2023-10-11 18:26:56,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 171081728. Throughput: 0: 1673.2, 1: 1697.6. Samples: 42778402. 
Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-11 18:26:56,063][84230] Avg episode reward: [(0, '42.570'), (1, '46.820')] +[2023-10-11 18:26:56,186][85176] Updated weights for policy 0, policy_version 82932 (0.0008) +[2023-10-11 18:26:56,554][85176] Updated weights for policy 0, policy_version 82942 (0.0007) +[2023-10-11 18:26:59,894][85175] Updated weights for policy 1, policy_version 84170 (0.0008) +[2023-10-11 18:27:00,260][85175] Updated weights for policy 1, policy_version 84180 (0.0009) +[2023-10-11 18:27:00,585][85176] Updated weights for policy 0, policy_version 82952 (0.0008) +[2023-10-11 18:27:00,624][85175] Updated weights for policy 1, policy_version 84190 (0.0008) +[2023-10-11 18:27:00,961][85176] Updated weights for policy 0, policy_version 82962 (0.0010) +[2023-10-11 18:27:01,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 171147264. Throughput: 0: 1672.5, 1: 1674.3. Samples: 42798164. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-11 18:27:01,064][84230] Avg episode reward: [(0, '45.430'), (1, '42.240')] +[2023-10-11 18:27:01,331][85176] Updated weights for policy 0, policy_version 82972 (0.0009) +[2023-10-11 18:27:04,572][85175] Updated weights for policy 1, policy_version 84200 (0.0007) +[2023-10-11 18:27:04,936][85175] Updated weights for policy 1, policy_version 84210 (0.0007) +[2023-10-11 18:27:05,300][85175] Updated weights for policy 1, policy_version 84220 (0.0007) +[2023-10-11 18:27:05,554][85176] Updated weights for policy 0, policy_version 82982 (0.0008) +[2023-10-11 18:27:05,933][85176] Updated weights for policy 0, policy_version 82992 (0.0010) +[2023-10-11 18:27:06,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 171212800. Throughput: 0: 1678.9, 1: 1697.5. Samples: 42808708. 
Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) +[2023-10-11 18:27:06,063][84230] Avg episode reward: [(0, '44.140'), (1, '45.510')] +[2023-10-11 18:27:06,299][85176] Updated weights for policy 0, policy_version 83002 (0.0009) +[2023-10-11 18:27:09,176][85175] Updated weights for policy 1, policy_version 84230 (0.0009) +[2023-10-11 18:27:09,546][85175] Updated weights for policy 1, policy_version 84240 (0.0008) +[2023-10-11 18:27:09,918][85175] Updated weights for policy 1, policy_version 84250 (0.0007) +[2023-10-11 18:27:10,356][85176] Updated weights for policy 0, policy_version 83012 (0.0007) +[2023-10-11 18:27:10,720][85176] Updated weights for policy 0, policy_version 83022 (0.0009) +[2023-10-11 18:27:11,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 171278336. Throughput: 0: 1675.9, 1: 1683.1. Samples: 42828824. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 18:27:11,063][84230] Avg episode reward: [(0, '46.470'), (1, '42.840')] +[2023-10-11 18:27:11,086][85176] Updated weights for policy 0, policy_version 83032 (0.0009) +[2023-10-11 18:27:13,883][85175] Updated weights for policy 1, policy_version 84260 (0.0008) +[2023-10-11 18:27:14,241][85175] Updated weights for policy 1, policy_version 84270 (0.0008) +[2023-10-11 18:27:14,608][85175] Updated weights for policy 1, policy_version 84280 (0.0010) +[2023-10-11 18:27:15,157][85176] Updated weights for policy 0, policy_version 83042 (0.0008) +[2023-10-11 18:27:15,533][85176] Updated weights for policy 0, policy_version 83052 (0.0010) +[2023-10-11 18:27:15,904][85176] Updated weights for policy 0, policy_version 83062 (0.0009) +[2023-10-11 18:27:16,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 171343872. Throughput: 0: 1663.5, 1: 1676.7. Samples: 42848624. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 18:27:16,064][84230] Avg episode reward: [(0, '44.090'), (1, '42.940')] +[2023-10-11 18:27:16,277][85176] Updated weights for policy 0, policy_version 83072 (0.0010) +[2023-10-11 18:27:18,542][85175] Updated weights for policy 1, policy_version 84290 (0.0009) +[2023-10-11 18:27:18,908][85175] Updated weights for policy 1, policy_version 84300 (0.0008) +[2023-10-11 18:27:19,280][85175] Updated weights for policy 1, policy_version 84310 (0.0008) +[2023-10-11 18:27:19,649][85175] Updated weights for policy 1, policy_version 84320 (0.0009) +[2023-10-11 18:27:20,351][85176] Updated weights for policy 0, policy_version 83082 (0.0008) +[2023-10-11 18:27:20,727][85176] Updated weights for policy 0, policy_version 83092 (0.0007) +[2023-10-11 18:27:21,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 171409408. Throughput: 0: 1672.7, 1: 1704.9. Samples: 42859648. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 18:27:21,063][84230] Avg episode reward: [(0, '45.050'), (1, '46.240')] +[2023-10-11 18:27:21,090][85176] Updated weights for policy 0, policy_version 83102 (0.0007) +[2023-10-11 18:27:23,712][85175] Updated weights for policy 1, policy_version 84330 (0.0008) +[2023-10-11 18:27:24,078][85175] Updated weights for policy 1, policy_version 84340 (0.0009) +[2023-10-11 18:27:24,446][85175] Updated weights for policy 1, policy_version 84350 (0.0010) +[2023-10-11 18:27:25,057][85176] Updated weights for policy 0, policy_version 83112 (0.0008) +[2023-10-11 18:27:25,433][85176] Updated weights for policy 0, policy_version 83122 (0.0010) +[2023-10-11 18:27:25,805][85176] Updated weights for policy 0, policy_version 83132 (0.0009) +[2023-10-11 18:27:26,063][84230] Fps is (10 sec: 16384.1, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 171507712. Throughput: 0: 1683.0, 1: 1679.0. Samples: 42879472. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 18:27:26,064][84230] Avg episode reward: [(0, '42.490'), (1, '45.600')] +[2023-10-11 18:27:28,356][85175] Updated weights for policy 1, policy_version 84360 (0.0009) +[2023-10-11 18:27:28,731][85175] Updated weights for policy 1, policy_version 84370 (0.0009) +[2023-10-11 18:27:29,095][85175] Updated weights for policy 1, policy_version 84380 (0.0010) +[2023-10-11 18:27:29,921][85176] Updated weights for policy 0, policy_version 83142 (0.0008) +[2023-10-11 18:27:30,297][85176] Updated weights for policy 0, policy_version 83152 (0.0007) +[2023-10-11 18:27:30,674][85176] Updated weights for policy 0, policy_version 83162 (0.0007) +[2023-10-11 18:27:31,063][84230] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 171573248. Throughput: 0: 1663.8, 1: 1698.3. Samples: 42899458. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 18:27:31,063][84230] Avg episode reward: [(0, '43.860'), (1, '47.180')] +[2023-10-11 18:27:33,287][85175] Updated weights for policy 1, policy_version 84390 (0.0011) +[2023-10-11 18:27:33,664][85175] Updated weights for policy 1, policy_version 84400 (0.0011) +[2023-10-11 18:27:34,022][85175] Updated weights for policy 1, policy_version 84410 (0.0010) +[2023-10-11 18:27:34,659][85176] Updated weights for policy 0, policy_version 83172 (0.0009) +[2023-10-11 18:27:35,048][85176] Updated weights for policy 0, policy_version 83182 (0.0008) +[2023-10-11 18:27:35,419][85176] Updated weights for policy 0, policy_version 83192 (0.0007) +[2023-10-11 18:27:36,062][84230] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 171638784. Throughput: 0: 1687.7, 1: 1699.7. Samples: 42910272. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 18:27:36,063][84230] Avg episode reward: [(0, '40.840'), (1, '43.670')] +[2023-10-11 18:27:38,201][85175] Updated weights for policy 1, policy_version 84420 (0.0009) +[2023-10-11 18:27:38,595][85175] Updated weights for policy 1, policy_version 84430 (0.0007) +[2023-10-11 18:27:38,962][85175] Updated weights for policy 1, policy_version 84440 (0.0009) +[2023-10-11 18:27:39,456][85176] Updated weights for policy 0, policy_version 83202 (0.0007) +[2023-10-11 18:27:39,823][85176] Updated weights for policy 0, policy_version 83212 (0.0011) +[2023-10-11 18:27:40,202][85176] Updated weights for policy 0, policy_version 83222 (0.0008) +[2023-10-11 18:27:40,565][85176] Updated weights for policy 0, policy_version 83232 (0.0009) +[2023-10-11 18:27:41,062][84230] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 171704320. Throughput: 0: 1686.9, 1: 1683.8. Samples: 42930084. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 18:27:41,063][84230] Avg episode reward: [(0, '44.620'), (1, '44.470')] +[2023-10-11 18:27:42,941][85175] Updated weights for policy 1, policy_version 84450 (0.0009) +[2023-10-11 18:27:43,308][85175] Updated weights for policy 1, policy_version 84460 (0.0007) +[2023-10-11 18:27:43,670][85175] Updated weights for policy 1, policy_version 84470 (0.0007) +[2023-10-11 18:27:44,039][85175] Updated weights for policy 1, policy_version 84480 (0.0009) +[2023-10-11 18:27:44,726][85176] Updated weights for policy 0, policy_version 83242 (0.0009) +[2023-10-11 18:27:45,109][85176] Updated weights for policy 0, policy_version 83252 (0.0010) +[2023-10-11 18:27:45,483][85176] Updated weights for policy 0, policy_version 83262 (0.0009) +[2023-10-11 18:27:46,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 171769856. Throughput: 0: 1665.1, 1: 1708.5. Samples: 42949974. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 18:27:46,063][84230] Avg episode reward: [(0, '42.040'), (1, '43.090')] +[2023-10-11 18:27:48,002][85175] Updated weights for policy 1, policy_version 84490 (0.0010) +[2023-10-11 18:27:48,365][85175] Updated weights for policy 1, policy_version 84500 (0.0007) +[2023-10-11 18:27:48,736][85175] Updated weights for policy 1, policy_version 84510 (0.0010) +[2023-10-11 18:27:49,549][85176] Updated weights for policy 0, policy_version 83272 (0.0008) +[2023-10-11 18:27:49,907][85176] Updated weights for policy 0, policy_version 83282 (0.0007) +[2023-10-11 18:27:50,284][85176] Updated weights for policy 0, policy_version 83292 (0.0010) +[2023-10-11 18:27:51,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 171835392. Throughput: 0: 1685.6, 1: 1688.1. Samples: 42960524. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 18:27:51,063][84230] Avg episode reward: [(0, '43.670'), (1, '47.310')] +[2023-10-11 18:27:52,787][85175] Updated weights for policy 1, policy_version 84520 (0.0008) +[2023-10-11 18:27:53,158][85175] Updated weights for policy 1, policy_version 84530 (0.0009) +[2023-10-11 18:27:53,521][85175] Updated weights for policy 1, policy_version 84540 (0.0010) +[2023-10-11 18:27:54,334][85176] Updated weights for policy 0, policy_version 83302 (0.0008) +[2023-10-11 18:27:54,715][85176] Updated weights for policy 0, policy_version 83312 (0.0009) +[2023-10-11 18:27:55,095][85176] Updated weights for policy 0, policy_version 83322 (0.0008) +[2023-10-11 18:27:56,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 171900928. Throughput: 0: 1676.8, 1: 1696.8. Samples: 42980638. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) +[2023-10-11 18:27:56,063][84230] Avg episode reward: [(0, '40.650'), (1, '44.040')] +[2023-10-11 18:27:57,480][85175] Updated weights for policy 1, policy_version 84550 (0.0009) +[2023-10-11 18:27:57,838][85175] Updated weights for policy 1, policy_version 84560 (0.0009) +[2023-10-11 18:27:58,203][85175] Updated weights for policy 1, policy_version 84570 (0.0009) +[2023-10-11 18:27:59,017][85176] Updated weights for policy 0, policy_version 83332 (0.0009) +[2023-10-11 18:27:59,393][85176] Updated weights for policy 0, policy_version 83342 (0.0009) +[2023-10-11 18:27:59,763][85176] Updated weights for policy 0, policy_version 83352 (0.0007) +[2023-10-11 18:28:01,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 171966464. Throughput: 0: 1673.2, 1: 1711.3. Samples: 43000928. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:28:01,063][84230] Avg episode reward: [(0, '42.000'), (1, '45.860')] +[2023-10-11 18:28:02,335][85175] Updated weights for policy 1, policy_version 84580 (0.0008) +[2023-10-11 18:28:02,698][85175] Updated weights for policy 1, policy_version 84590 (0.0008) +[2023-10-11 18:28:03,080][85175] Updated weights for policy 1, policy_version 84600 (0.0008) +[2023-10-11 18:28:03,906][85176] Updated weights for policy 0, policy_version 83362 (0.0007) +[2023-10-11 18:28:04,280][85176] Updated weights for policy 0, policy_version 83372 (0.0010) +[2023-10-11 18:28:04,651][85176] Updated weights for policy 0, policy_version 83382 (0.0009) +[2023-10-11 18:28:05,012][85176] Updated weights for policy 0, policy_version 83392 (0.0010) +[2023-10-11 18:28:06,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 172032000. Throughput: 0: 1692.9, 1: 1676.1. Samples: 43011254. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:28:06,064][84230] Avg episode reward: [(0, '42.670'), (1, '41.860')] +[2023-10-11 18:28:07,095][85175] Updated weights for policy 1, policy_version 84610 (0.0010) +[2023-10-11 18:28:07,455][85175] Updated weights for policy 1, policy_version 84620 (0.0008) +[2023-10-11 18:28:07,823][85175] Updated weights for policy 1, policy_version 84630 (0.0008) +[2023-10-11 18:28:08,198][85175] Updated weights for policy 1, policy_version 84640 (0.0007) +[2023-10-11 18:28:08,968][85176] Updated weights for policy 0, policy_version 83402 (0.0007) +[2023-10-11 18:28:09,334][85176] Updated weights for policy 0, policy_version 83412 (0.0008) +[2023-10-11 18:28:09,718][85176] Updated weights for policy 0, policy_version 83422 (0.0010) +[2023-10-11 18:28:11,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 172097536. Throughput: 0: 1665.2, 1: 1705.5. Samples: 43031152. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:28:11,064][84230] Avg episode reward: [(0, '44.850'), (1, '46.570')] +[2023-10-11 18:28:12,065][85175] Updated weights for policy 1, policy_version 84650 (0.0008) +[2023-10-11 18:28:12,435][85175] Updated weights for policy 1, policy_version 84660 (0.0009) +[2023-10-11 18:28:12,808][85175] Updated weights for policy 1, policy_version 84670 (0.0007) +[2023-10-11 18:28:13,796][85176] Updated weights for policy 0, policy_version 83432 (0.0009) +[2023-10-11 18:28:14,157][85176] Updated weights for policy 0, policy_version 83442 (0.0009) +[2023-10-11 18:28:14,529][85176] Updated weights for policy 0, policy_version 83452 (0.0007) +[2023-10-11 18:28:16,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 172163072. Throughput: 0: 1681.1, 1: 1710.4. Samples: 43052076. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:28:16,064][84230] Avg episode reward: [(0, '45.700'), (1, '43.710')] +[2023-10-11 18:28:16,076][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000084672_86704128.pth... +[2023-10-11 18:28:16,076][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000083456_85458944.pth... +[2023-10-11 18:28:16,108][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000081888_83853312.pth +[2023-10-11 18:28:16,115][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000083104_85098496.pth +[2023-10-11 18:28:16,735][85175] Updated weights for policy 1, policy_version 84680 (0.0010) +[2023-10-11 18:28:17,106][85175] Updated weights for policy 1, policy_version 84690 (0.0009) +[2023-10-11 18:28:17,473][85175] Updated weights for policy 1, policy_version 84700 (0.0009) +[2023-10-11 18:28:18,621][85176] Updated weights for policy 0, policy_version 83462 (0.0007) +[2023-10-11 18:28:18,994][85176] Updated weights for policy 0, policy_version 83472 (0.0007) +[2023-10-11 18:28:19,362][85176] Updated weights for policy 0, policy_version 83482 (0.0007) +[2023-10-11 18:28:21,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 172228608. Throughput: 0: 1682.3, 1: 1696.5. Samples: 43062320. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:28:21,064][84230] Avg episode reward: [(0, '43.990'), (1, '47.940')] +[2023-10-11 18:28:21,434][85175] Updated weights for policy 1, policy_version 84710 (0.0009) +[2023-10-11 18:28:21,804][85175] Updated weights for policy 1, policy_version 84720 (0.0011) +[2023-10-11 18:28:22,183][85175] Updated weights for policy 1, policy_version 84730 (0.0008) +[2023-10-11 18:28:23,268][85176] Updated weights for policy 0, policy_version 83492 (0.0008) +[2023-10-11 18:28:23,633][85176] Updated weights for policy 0, policy_version 83502 (0.0007) +[2023-10-11 18:28:24,005][85176] Updated weights for policy 0, policy_version 83512 (0.0008) +[2023-10-11 18:28:26,062][84230] Fps is (10 sec: 13107.7, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 172294144. Throughput: 0: 1664.6, 1: 1717.1. Samples: 43082260. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:28:26,063][84230] Avg episode reward: [(0, '45.830'), (1, '43.950')] +[2023-10-11 18:28:26,269][85175] Updated weights for policy 1, policy_version 84740 (0.0008) +[2023-10-11 18:28:26,674][85175] Updated weights for policy 1, policy_version 84750 (0.0008) +[2023-10-11 18:28:27,039][85175] Updated weights for policy 1, policy_version 84760 (0.0009) +[2023-10-11 18:28:28,029][85176] Updated weights for policy 0, policy_version 83522 (0.0008) +[2023-10-11 18:28:28,411][85176] Updated weights for policy 0, policy_version 83532 (0.0008) +[2023-10-11 18:28:28,790][85176] Updated weights for policy 0, policy_version 83542 (0.0007) +[2023-10-11 18:28:29,160][85176] Updated weights for policy 0, policy_version 83552 (0.0008) +[2023-10-11 18:28:31,043][85175] Updated weights for policy 1, policy_version 84770 (0.0009) +[2023-10-11 18:28:31,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 172359680. Throughput: 0: 1690.9, 1: 1711.8. Samples: 43103094. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:28:31,063][84230] Avg episode reward: [(0, '44.210'), (1, '48.540')] +[2023-10-11 18:28:31,413][85175] Updated weights for policy 1, policy_version 84780 (0.0007) +[2023-10-11 18:28:31,788][85175] Updated weights for policy 1, policy_version 84790 (0.0007) +[2023-10-11 18:28:32,155][85175] Updated weights for policy 1, policy_version 84800 (0.0007) +[2023-10-11 18:28:33,323][85176] Updated weights for policy 0, policy_version 83562 (0.0009) +[2023-10-11 18:28:33,695][85176] Updated weights for policy 0, policy_version 83572 (0.0007) +[2023-10-11 18:28:34,068][85176] Updated weights for policy 0, policy_version 83582 (0.0008) +[2023-10-11 18:28:36,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 172425216. Throughput: 0: 1677.9, 1: 1708.8. Samples: 43112924. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:28:36,063][84230] Avg episode reward: [(0, '44.690'), (1, '45.210')] +[2023-10-11 18:28:36,076][85175] Updated weights for policy 1, policy_version 84810 (0.0007) +[2023-10-11 18:28:36,442][85175] Updated weights for policy 1, policy_version 84820 (0.0008) +[2023-10-11 18:28:36,805][85175] Updated weights for policy 1, policy_version 84830 (0.0008) +[2023-10-11 18:28:38,090][85176] Updated weights for policy 0, policy_version 83592 (0.0008) +[2023-10-11 18:28:38,472][85176] Updated weights for policy 0, policy_version 83602 (0.0009) +[2023-10-11 18:28:38,849][85176] Updated weights for policy 0, policy_version 83612 (0.0008) +[2023-10-11 18:28:40,829][85175] Updated weights for policy 1, policy_version 84840 (0.0008) +[2023-10-11 18:28:41,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 172490752. Throughput: 0: 1670.4, 1: 1713.4. Samples: 43132908. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:28:41,063][84230] Avg episode reward: [(0, '42.770'), (1, '43.710')] +[2023-10-11 18:28:41,197][85175] Updated weights for policy 1, policy_version 84850 (0.0008) +[2023-10-11 18:28:41,559][85175] Updated weights for policy 1, policy_version 84860 (0.0008) +[2023-10-11 18:28:43,140][85176] Updated weights for policy 0, policy_version 83622 (0.0008) +[2023-10-11 18:28:43,522][85176] Updated weights for policy 0, policy_version 83632 (0.0009) +[2023-10-11 18:28:43,896][85176] Updated weights for policy 0, policy_version 83642 (0.0009) +[2023-10-11 18:28:45,492][85175] Updated weights for policy 1, policy_version 84870 (0.0008) +[2023-10-11 18:28:45,852][85175] Updated weights for policy 1, policy_version 84880 (0.0007) +[2023-10-11 18:28:46,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 172556288. Throughput: 0: 1684.4, 1: 1705.4. Samples: 43153470. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:28:46,064][84230] Avg episode reward: [(0, '47.400'), (1, '42.820')] +[2023-10-11 18:28:46,225][85175] Updated weights for policy 1, policy_version 84890 (0.0007) +[2023-10-11 18:28:48,006][85176] Updated weights for policy 0, policy_version 83652 (0.0008) +[2023-10-11 18:28:48,391][85176] Updated weights for policy 0, policy_version 83662 (0.0007) +[2023-10-11 18:28:48,754][85176] Updated weights for policy 0, policy_version 83672 (0.0008) +[2023-10-11 18:28:50,313][85175] Updated weights for policy 1, policy_version 84900 (0.0008) +[2023-10-11 18:28:50,670][85175] Updated weights for policy 1, policy_version 84910 (0.0007) +[2023-10-11 18:28:51,043][85175] Updated weights for policy 1, policy_version 84920 (0.0008) +[2023-10-11 18:28:51,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 172621824. Throughput: 0: 1665.7, 1: 1715.6. Samples: 43163410. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:28:51,063][84230] Avg episode reward: [(0, '44.930'), (1, '44.020')] +[2023-10-11 18:28:52,834][85176] Updated weights for policy 0, policy_version 83682 (0.0007) +[2023-10-11 18:28:53,201][85176] Updated weights for policy 0, policy_version 83692 (0.0009) +[2023-10-11 18:28:53,575][85176] Updated weights for policy 0, policy_version 83702 (0.0009) +[2023-10-11 18:28:53,946][85176] Updated weights for policy 0, policy_version 83712 (0.0008) +[2023-10-11 18:28:55,095][85175] Updated weights for policy 1, policy_version 84930 (0.0007) +[2023-10-11 18:28:55,452][85175] Updated weights for policy 1, policy_version 84940 (0.0009) +[2023-10-11 18:28:55,826][85175] Updated weights for policy 1, policy_version 84950 (0.0010) +[2023-10-11 18:28:56,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 172687360. Throughput: 0: 1677.2, 1: 1710.0. Samples: 43183576. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:28:56,064][84230] Avg episode reward: [(0, '47.110'), (1, '44.820')] +[2023-10-11 18:28:56,197][85175] Updated weights for policy 1, policy_version 84960 (0.0009) +[2023-10-11 18:28:58,037][85176] Updated weights for policy 0, policy_version 83722 (0.0008) +[2023-10-11 18:28:58,408][85176] Updated weights for policy 0, policy_version 83732 (0.0008) +[2023-10-11 18:28:58,777][85176] Updated weights for policy 0, policy_version 83742 (0.0007) +[2023-10-11 18:29:00,187][85175] Updated weights for policy 1, policy_version 84970 (0.0009) +[2023-10-11 18:29:00,557][85175] Updated weights for policy 1, policy_version 84980 (0.0009) +[2023-10-11 18:29:00,921][85175] Updated weights for policy 1, policy_version 84990 (0.0008) +[2023-10-11 18:29:01,062][84230] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 172785664. Throughput: 0: 1682.0, 1: 1688.4. Samples: 43203742. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:29:01,063][84230] Avg episode reward: [(0, '43.050'), (1, '45.110')] +[2023-10-11 18:29:03,025][85176] Updated weights for policy 0, policy_version 83752 (0.0008) +[2023-10-11 18:29:03,395][85176] Updated weights for policy 0, policy_version 83762 (0.0009) +[2023-10-11 18:29:03,774][85176] Updated weights for policy 0, policy_version 83772 (0.0008) +[2023-10-11 18:29:04,957][85175] Updated weights for policy 1, policy_version 85000 (0.0009) +[2023-10-11 18:29:05,318][85175] Updated weights for policy 1, policy_version 85010 (0.0008) +[2023-10-11 18:29:05,693][85175] Updated weights for policy 1, policy_version 85020 (0.0008) +[2023-10-11 18:29:06,063][84230] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 172851200. Throughput: 0: 1666.7, 1: 1704.7. Samples: 43214030. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:29:06,063][84230] Avg episode reward: [(0, '45.890'), (1, '45.690')] +[2023-10-11 18:29:07,786][85176] Updated weights for policy 0, policy_version 83782 (0.0007) +[2023-10-11 18:29:08,163][85176] Updated weights for policy 0, policy_version 83792 (0.0007) +[2023-10-11 18:29:08,537][85176] Updated weights for policy 0, policy_version 83802 (0.0009) +[2023-10-11 18:29:09,691][85175] Updated weights for policy 1, policy_version 85030 (0.0009) +[2023-10-11 18:29:10,062][85175] Updated weights for policy 1, policy_version 85040 (0.0009) +[2023-10-11 18:29:10,429][85175] Updated weights for policy 1, policy_version 85050 (0.0009) +[2023-10-11 18:29:11,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 172916736. Throughput: 0: 1677.0, 1: 1704.4. Samples: 43234422. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:29:11,064][84230] Avg episode reward: [(0, '43.740'), (1, '44.160')] +[2023-10-11 18:29:12,611][85176] Updated weights for policy 0, policy_version 83812 (0.0010) +[2023-10-11 18:29:12,985][85176] Updated weights for policy 0, policy_version 83822 (0.0009) +[2023-10-11 18:29:13,351][85176] Updated weights for policy 0, policy_version 83832 (0.0008) +[2023-10-11 18:29:14,413][85175] Updated weights for policy 1, policy_version 85060 (0.0008) +[2023-10-11 18:29:14,785][85175] Updated weights for policy 1, policy_version 85070 (0.0007) +[2023-10-11 18:29:15,150][85175] Updated weights for policy 1, policy_version 85080 (0.0011) +[2023-10-11 18:29:16,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 172982272. Throughput: 0: 1674.5, 1: 1680.4. Samples: 43254066. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:29:16,063][84230] Avg episode reward: [(0, '46.150'), (1, '45.230')] +[2023-10-11 18:29:17,231][85176] Updated weights for policy 0, policy_version 83842 (0.0008) +[2023-10-11 18:29:17,595][85176] Updated weights for policy 0, policy_version 83852 (0.0010) +[2023-10-11 18:29:17,971][85176] Updated weights for policy 0, policy_version 83862 (0.0007) +[2023-10-11 18:29:18,345][85176] Updated weights for policy 0, policy_version 83872 (0.0008) +[2023-10-11 18:29:19,020][85175] Updated weights for policy 1, policy_version 85090 (0.0007) +[2023-10-11 18:29:19,394][85175] Updated weights for policy 1, policy_version 85100 (0.0010) +[2023-10-11 18:29:19,759][85175] Updated weights for policy 1, policy_version 85110 (0.0007) +[2023-10-11 18:29:20,135][85175] Updated weights for policy 1, policy_version 85120 (0.0007) +[2023-10-11 18:29:21,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 173047808. Throughput: 0: 1656.9, 1: 1712.1. Samples: 43264532. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:29:21,064][84230] Avg episode reward: [(0, '41.860'), (1, '44.270')] +[2023-10-11 18:29:22,509][85176] Updated weights for policy 0, policy_version 83882 (0.0010) +[2023-10-11 18:29:22,890][85176] Updated weights for policy 0, policy_version 83892 (0.0011) +[2023-10-11 18:29:23,274][85176] Updated weights for policy 0, policy_version 83902 (0.0011) +[2023-10-11 18:29:24,291][85175] Updated weights for policy 1, policy_version 85130 (0.0008) +[2023-10-11 18:29:24,652][85175] Updated weights for policy 1, policy_version 85140 (0.0009) +[2023-10-11 18:29:25,021][85175] Updated weights for policy 1, policy_version 85150 (0.0008) +[2023-10-11 18:29:26,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 173113344. Throughput: 0: 1673.4, 1: 1699.4. Samples: 43284686. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:29:26,063][84230] Avg episode reward: [(0, '45.530'), (1, '46.350')] +[2023-10-11 18:29:27,187][85176] Updated weights for policy 0, policy_version 83912 (0.0010) +[2023-10-11 18:29:27,560][85176] Updated weights for policy 0, policy_version 83922 (0.0010) +[2023-10-11 18:29:27,925][85176] Updated weights for policy 0, policy_version 83932 (0.0008) +[2023-10-11 18:29:28,984][85175] Updated weights for policy 1, policy_version 85160 (0.0008) +[2023-10-11 18:29:29,360][85175] Updated weights for policy 1, policy_version 85170 (0.0009) +[2023-10-11 18:29:29,727][85175] Updated weights for policy 1, policy_version 85180 (0.0007) +[2023-10-11 18:29:31,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 173178880. Throughput: 0: 1677.1, 1: 1690.3. Samples: 43305000. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:29:31,064][84230] Avg episode reward: [(0, '43.200'), (1, '44.670')] +[2023-10-11 18:29:31,982][85176] Updated weights for policy 0, policy_version 83942 (0.0007) +[2023-10-11 18:29:32,360][85176] Updated weights for policy 0, policy_version 83952 (0.0007) +[2023-10-11 18:29:32,726][85176] Updated weights for policy 0, policy_version 83962 (0.0009) +[2023-10-11 18:29:33,637][85175] Updated weights for policy 1, policy_version 85190 (0.0007) +[2023-10-11 18:29:34,000][85175] Updated weights for policy 1, policy_version 85200 (0.0008) +[2023-10-11 18:29:34,363][85175] Updated weights for policy 1, policy_version 85210 (0.0011) +[2023-10-11 18:29:36,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 173244416. Throughput: 0: 1664.3, 1: 1710.9. Samples: 43315294. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:29:36,064][84230] Avg episode reward: [(0, '45.670'), (1, '46.640')] +[2023-10-11 18:29:36,695][85176] Updated weights for policy 0, policy_version 83972 (0.0008) +[2023-10-11 18:29:37,061][85176] Updated weights for policy 0, policy_version 83982 (0.0009) +[2023-10-11 18:29:37,432][85176] Updated weights for policy 0, policy_version 83992 (0.0010) +[2023-10-11 18:29:38,392][85175] Updated weights for policy 1, policy_version 85220 (0.0008) +[2023-10-11 18:29:38,760][85175] Updated weights for policy 1, policy_version 85230 (0.0008) +[2023-10-11 18:29:39,135][85175] Updated weights for policy 1, policy_version 85240 (0.0009) +[2023-10-11 18:29:41,063][84230] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 173309952. Throughput: 0: 1676.5, 1: 1686.0. Samples: 43334892. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:29:41,063][84230] Avg episode reward: [(0, '41.060'), (1, '43.140')] +[2023-10-11 18:29:41,494][85176] Updated weights for policy 0, policy_version 84002 (0.0008) +[2023-10-11 18:29:41,875][85176] Updated weights for policy 0, policy_version 84012 (0.0009) +[2023-10-11 18:29:42,244][85176] Updated weights for policy 0, policy_version 84022 (0.0009) +[2023-10-11 18:29:42,604][85176] Updated weights for policy 0, policy_version 84032 (0.0008) +[2023-10-11 18:29:43,076][85175] Updated weights for policy 1, policy_version 85250 (0.0010) +[2023-10-11 18:29:43,455][85175] Updated weights for policy 1, policy_version 85260 (0.0009) +[2023-10-11 18:29:43,822][85175] Updated weights for policy 1, policy_version 85270 (0.0008) +[2023-10-11 18:29:44,176][85175] Updated weights for policy 1, policy_version 85280 (0.0009) +[2023-10-11 18:29:46,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 173375488. Throughput: 0: 1680.8, 1: 1703.2. Samples: 43356022. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:29:46,064][84230] Avg episode reward: [(0, '44.520'), (1, '46.640')] +[2023-10-11 18:29:46,694][85176] Updated weights for policy 0, policy_version 84042 (0.0008) +[2023-10-11 18:29:47,060][85176] Updated weights for policy 0, policy_version 84052 (0.0007) +[2023-10-11 18:29:47,423][85176] Updated weights for policy 0, policy_version 84062 (0.0007) +[2023-10-11 18:29:48,158][85175] Updated weights for policy 1, policy_version 85290 (0.0011) +[2023-10-11 18:29:48,529][85175] Updated weights for policy 1, policy_version 85300 (0.0009) +[2023-10-11 18:29:48,896][85175] Updated weights for policy 1, policy_version 85310 (0.0007) +[2023-10-11 18:29:51,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 173441024. Throughput: 0: 1673.4, 1: 1698.6. Samples: 43365768. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:29:51,063][84230] Avg episode reward: [(0, '41.450'), (1, '42.390')] +[2023-10-11 18:29:51,527][85176] Updated weights for policy 0, policy_version 84072 (0.0008) +[2023-10-11 18:29:51,897][85176] Updated weights for policy 0, policy_version 84082 (0.0009) +[2023-10-11 18:29:52,268][85176] Updated weights for policy 0, policy_version 84092 (0.0008) +[2023-10-11 18:29:52,953][85175] Updated weights for policy 1, policy_version 85320 (0.0007) +[2023-10-11 18:29:53,321][85175] Updated weights for policy 1, policy_version 85330 (0.0007) +[2023-10-11 18:29:53,680][85175] Updated weights for policy 1, policy_version 85340 (0.0010) +[2023-10-11 18:29:56,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 173506560. Throughput: 0: 1685.6, 1: 1688.3. Samples: 43386244. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:29:56,064][84230] Avg episode reward: [(0, '47.090'), (1, '46.170')] +[2023-10-11 18:29:56,301][85176] Updated weights for policy 0, policy_version 84102 (0.0008) +[2023-10-11 18:29:56,675][85176] Updated weights for policy 0, policy_version 84112 (0.0008) +[2023-10-11 18:29:57,048][85176] Updated weights for policy 0, policy_version 84122 (0.0009) +[2023-10-11 18:29:57,834][85175] Updated weights for policy 1, policy_version 85350 (0.0008) +[2023-10-11 18:29:58,204][85175] Updated weights for policy 1, policy_version 85360 (0.0008) +[2023-10-11 18:29:58,574][85175] Updated weights for policy 1, policy_version 85370 (0.0009) +[2023-10-11 18:30:01,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 173572096. Throughput: 0: 1682.3, 1: 1718.6. Samples: 43407108. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:30:01,063][84230] Avg episode reward: [(0, '43.220'), (1, '42.820')] +[2023-10-11 18:30:01,147][85176] Updated weights for policy 0, policy_version 84132 (0.0009) +[2023-10-11 18:30:01,531][85176] Updated weights for policy 0, policy_version 84142 (0.0009) +[2023-10-11 18:30:01,910][85176] Updated weights for policy 0, policy_version 84152 (0.0008) +[2023-10-11 18:30:02,346][85175] Updated weights for policy 1, policy_version 85380 (0.0008) +[2023-10-11 18:30:02,743][85175] Updated weights for policy 1, policy_version 85390 (0.0008) +[2023-10-11 18:30:03,111][85175] Updated weights for policy 1, policy_version 85400 (0.0007) +[2023-10-11 18:30:05,938][85176] Updated weights for policy 0, policy_version 84162 (0.0010) +[2023-10-11 18:30:06,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 173637632. Throughput: 0: 1685.2, 1: 1688.7. Samples: 43416356. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:30:06,063][84230] Avg episode reward: [(0, '45.180'), (1, '47.100')] +[2023-10-11 18:30:06,307][85176] Updated weights for policy 0, policy_version 84172 (0.0011) +[2023-10-11 18:30:06,680][85176] Updated weights for policy 0, policy_version 84182 (0.0010) +[2023-10-11 18:30:07,059][85176] Updated weights for policy 0, policy_version 84192 (0.0009) +[2023-10-11 18:30:07,313][85175] Updated weights for policy 1, policy_version 85410 (0.0007) +[2023-10-11 18:30:07,678][85175] Updated weights for policy 1, policy_version 85420 (0.0008) +[2023-10-11 18:30:08,046][85175] Updated weights for policy 1, policy_version 85430 (0.0009) +[2023-10-11 18:30:08,409][85175] Updated weights for policy 1, policy_version 85440 (0.0008) +[2023-10-11 18:30:11,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 173703168. Throughput: 0: 1687.0, 1: 1693.6. Samples: 43436814. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:30:11,063][84230] Avg episode reward: [(0, '43.140'), (1, '42.790')] +[2023-10-11 18:30:11,087][85176] Updated weights for policy 0, policy_version 84202 (0.0011) +[2023-10-11 18:30:11,469][85176] Updated weights for policy 0, policy_version 84212 (0.0010) +[2023-10-11 18:30:11,837][85176] Updated weights for policy 0, policy_version 84222 (0.0007) +[2023-10-11 18:30:12,490][85175] Updated weights for policy 1, policy_version 85450 (0.0009) +[2023-10-11 18:30:12,864][85175] Updated weights for policy 1, policy_version 85460 (0.0010) +[2023-10-11 18:30:13,233][85175] Updated weights for policy 1, policy_version 85470 (0.0009) +[2023-10-11 18:30:15,871][85176] Updated weights for policy 0, policy_version 84232 (0.0008) +[2023-10-11 18:30:16,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 173768704. Throughput: 0: 1689.3, 1: 1705.3. Samples: 43457754. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:30:16,063][84230] Avg episode reward: [(0, '43.200'), (1, '45.300')] +[2023-10-11 18:30:16,074][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000085472_87523328.pth... +[2023-10-11 18:30:16,113][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000083904_85917696.pth +[2023-10-11 18:30:16,235][85176] Updated weights for policy 0, policy_version 84242 (0.0008) +[2023-10-11 18:30:16,610][85176] Updated weights for policy 0, policy_version 84252 (0.0008) +[2023-10-11 18:30:16,757][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000084256_86278144.pth... 
+[2023-10-11 18:30:16,789][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000082656_84639744.pth +[2023-10-11 18:30:17,256][85175] Updated weights for policy 1, policy_version 85480 (0.0009) +[2023-10-11 18:30:17,611][85175] Updated weights for policy 1, policy_version 85490 (0.0007) +[2023-10-11 18:30:17,977][85175] Updated weights for policy 1, policy_version 85500 (0.0009) +[2023-10-11 18:30:20,798][85176] Updated weights for policy 0, policy_version 84262 (0.0010) +[2023-10-11 18:30:21,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 173834240. Throughput: 0: 1692.5, 1: 1677.8. Samples: 43466956. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:30:21,063][84230] Avg episode reward: [(0, '44.230'), (1, '41.850')] +[2023-10-11 18:30:21,172][85176] Updated weights for policy 0, policy_version 84272 (0.0008) +[2023-10-11 18:30:21,546][85176] Updated weights for policy 0, policy_version 84282 (0.0009) +[2023-10-11 18:30:22,043][85175] Updated weights for policy 1, policy_version 85510 (0.0008) +[2023-10-11 18:30:22,413][85175] Updated weights for policy 1, policy_version 85520 (0.0008) +[2023-10-11 18:30:22,784][85175] Updated weights for policy 1, policy_version 85530 (0.0007) +[2023-10-11 18:30:25,573][85176] Updated weights for policy 0, policy_version 84292 (0.0009) +[2023-10-11 18:30:25,951][85176] Updated weights for policy 0, policy_version 84302 (0.0007) +[2023-10-11 18:30:26,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 173899776. Throughput: 0: 1690.6, 1: 1710.3. Samples: 43487936. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:30:26,064][84230] Avg episode reward: [(0, '44.730'), (1, '45.620')] +[2023-10-11 18:30:26,327][85176] Updated weights for policy 0, policy_version 84312 (0.0007) +[2023-10-11 18:30:26,747][85175] Updated weights for policy 1, policy_version 85540 (0.0008) +[2023-10-11 18:30:27,112][85175] Updated weights for policy 1, policy_version 85550 (0.0007) +[2023-10-11 18:30:27,484][85175] Updated weights for policy 1, policy_version 85560 (0.0007) +[2023-10-11 18:30:30,371][85176] Updated weights for policy 0, policy_version 84322 (0.0008) +[2023-10-11 18:30:30,736][85176] Updated weights for policy 0, policy_version 84332 (0.0007) +[2023-10-11 18:30:31,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 173965312. Throughput: 0: 1681.9, 1: 1708.7. Samples: 43508598. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:30:31,064][84230] Avg episode reward: [(0, '43.470'), (1, '44.180')] +[2023-10-11 18:30:31,104][85176] Updated weights for policy 0, policy_version 84342 (0.0008) +[2023-10-11 18:30:31,409][85175] Updated weights for policy 1, policy_version 85570 (0.0009) +[2023-10-11 18:30:31,474][85176] Updated weights for policy 0, policy_version 84352 (0.0007) +[2023-10-11 18:30:31,773][85175] Updated weights for policy 1, policy_version 85580 (0.0009) +[2023-10-11 18:30:32,148][85175] Updated weights for policy 1, policy_version 85590 (0.0009) +[2023-10-11 18:30:32,516][85175] Updated weights for policy 1, policy_version 85600 (0.0008) +[2023-10-11 18:30:35,570][85176] Updated weights for policy 0, policy_version 84362 (0.0011) +[2023-10-11 18:30:35,938][85176] Updated weights for policy 0, policy_version 84372 (0.0009) +[2023-10-11 18:30:36,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 174030848. Throughput: 0: 1689.4, 1: 1693.6. Samples: 43518000. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:30:36,063][84230] Avg episode reward: [(0, '44.330'), (1, '42.950')] +[2023-10-11 18:30:36,317][85176] Updated weights for policy 0, policy_version 84382 (0.0009) +[2023-10-11 18:30:36,609][85175] Updated weights for policy 1, policy_version 85610 (0.0008) +[2023-10-11 18:30:36,971][85175] Updated weights for policy 1, policy_version 85620 (0.0009) +[2023-10-11 18:30:37,351][85175] Updated weights for policy 1, policy_version 85630 (0.0008) +[2023-10-11 18:30:40,464][85176] Updated weights for policy 0, policy_version 84392 (0.0007) +[2023-10-11 18:30:40,832][85176] Updated weights for policy 0, policy_version 84402 (0.0007) +[2023-10-11 18:30:41,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 174096384. Throughput: 0: 1683.2, 1: 1704.1. Samples: 43538672. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:30:41,063][84230] Avg episode reward: [(0, '42.480'), (1, '46.750')] +[2023-10-11 18:30:41,206][85176] Updated weights for policy 0, policy_version 84412 (0.0008) +[2023-10-11 18:30:41,441][85175] Updated weights for policy 1, policy_version 85640 (0.0009) +[2023-10-11 18:30:41,804][85175] Updated weights for policy 1, policy_version 85650 (0.0007) +[2023-10-11 18:30:42,183][85175] Updated weights for policy 1, policy_version 85660 (0.0009) +[2023-10-11 18:30:45,197][85176] Updated weights for policy 0, policy_version 84422 (0.0008) +[2023-10-11 18:30:45,569][85176] Updated weights for policy 0, policy_version 84432 (0.0009) +[2023-10-11 18:30:45,945][85176] Updated weights for policy 0, policy_version 84442 (0.0009) +[2023-10-11 18:30:46,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 174161920. Throughput: 0: 1674.4, 1: 1696.5. Samples: 43558802. 
Policy #0 lag: (min: 19.0, avg: 27.3, max: 51.0) +[2023-10-11 18:30:46,063][84230] Avg episode reward: [(0, '45.110'), (1, '43.730')] +[2023-10-11 18:30:46,121][85175] Updated weights for policy 1, policy_version 85670 (0.0008) +[2023-10-11 18:30:46,494][85175] Updated weights for policy 1, policy_version 85680 (0.0007) +[2023-10-11 18:30:46,855][85175] Updated weights for policy 1, policy_version 85690 (0.0008) +[2023-10-11 18:30:50,041][85176] Updated weights for policy 0, policy_version 84452 (0.0008) +[2023-10-11 18:30:50,427][85176] Updated weights for policy 0, policy_version 84462 (0.0010) +[2023-10-11 18:30:50,790][85176] Updated weights for policy 0, policy_version 84472 (0.0009) +[2023-10-11 18:30:50,998][85175] Updated weights for policy 1, policy_version 85700 (0.0008) +[2023-10-11 18:30:51,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 174227456. Throughput: 0: 1688.3, 1: 1693.4. Samples: 43568532. Policy #0 lag: (min: 19.0, avg: 27.3, max: 51.0) +[2023-10-11 18:30:51,064][84230] Avg episode reward: [(0, '44.250'), (1, '46.600')] +[2023-10-11 18:30:51,366][85175] Updated weights for policy 1, policy_version 85710 (0.0008) +[2023-10-11 18:30:51,737][85175] Updated weights for policy 1, policy_version 85720 (0.0007) +[2023-10-11 18:30:54,827][85176] Updated weights for policy 0, policy_version 84482 (0.0009) +[2023-10-11 18:30:55,202][85176] Updated weights for policy 0, policy_version 84492 (0.0009) +[2023-10-11 18:30:55,567][85176] Updated weights for policy 0, policy_version 84502 (0.0009) +[2023-10-11 18:30:55,691][85175] Updated weights for policy 1, policy_version 85730 (0.0009) +[2023-10-11 18:30:55,941][85176] Updated weights for policy 0, policy_version 84512 (0.0009) +[2023-10-11 18:30:56,063][84230] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 174325760. Throughput: 0: 1681.3, 1: 1700.4. Samples: 43588994. 
Policy #0 lag: (min: 19.0, avg: 27.3, max: 51.0) +[2023-10-11 18:30:56,064][84230] Avg episode reward: [(0, '44.350'), (1, '44.490')] +[2023-10-11 18:30:56,066][85175] Updated weights for policy 1, policy_version 85740 (0.0009) +[2023-10-11 18:30:56,435][85175] Updated weights for policy 1, policy_version 85750 (0.0008) +[2023-10-11 18:30:56,803][85175] Updated weights for policy 1, policy_version 85760 (0.0008) +[2023-10-11 18:31:00,218][85176] Updated weights for policy 0, policy_version 84522 (0.0008) +[2023-10-11 18:31:00,587][85176] Updated weights for policy 0, policy_version 84532 (0.0008) +[2023-10-11 18:31:00,953][85176] Updated weights for policy 0, policy_version 84542 (0.0009) +[2023-10-11 18:31:00,955][85175] Updated weights for policy 1, policy_version 85770 (0.0009) +[2023-10-11 18:31:01,062][84230] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 174391296. Throughput: 0: 1652.8, 1: 1701.8. Samples: 43608710. Policy #0 lag: (min: 19.0, avg: 27.3, max: 51.0) +[2023-10-11 18:31:01,063][84230] Avg episode reward: [(0, '48.050'), (1, '47.520')] +[2023-10-11 18:31:01,316][85175] Updated weights for policy 1, policy_version 85780 (0.0010) +[2023-10-11 18:31:01,687][85175] Updated weights for policy 1, policy_version 85790 (0.0011) +[2023-10-11 18:31:05,268][85176] Updated weights for policy 0, policy_version 84552 (0.0008) +[2023-10-11 18:31:05,640][85176] Updated weights for policy 0, policy_version 84562 (0.0009) +[2023-10-11 18:31:05,734][85175] Updated weights for policy 1, policy_version 85800 (0.0008) +[2023-10-11 18:31:06,001][85176] Updated weights for policy 0, policy_version 84572 (0.0010) +[2023-10-11 18:31:06,062][84230] Fps is (10 sec: 9830.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 174424064. Throughput: 0: 1665.6, 1: 1702.4. Samples: 43618516. 
Policy #0 lag: (min: 19.0, avg: 27.3, max: 51.0) +[2023-10-11 18:31:06,063][84230] Avg episode reward: [(0, '44.460'), (1, '42.830')] +[2023-10-11 18:31:06,096][85175] Updated weights for policy 1, policy_version 85810 (0.0008) +[2023-10-11 18:31:06,467][85175] Updated weights for policy 1, policy_version 85820 (0.0007) +[2023-10-11 18:31:10,230][85176] Updated weights for policy 0, policy_version 84582 (0.0008) +[2023-10-11 18:31:10,385][85175] Updated weights for policy 1, policy_version 85830 (0.0009) +[2023-10-11 18:31:10,607][85176] Updated weights for policy 0, policy_version 84592 (0.0009) +[2023-10-11 18:31:10,758][85175] Updated weights for policy 1, policy_version 85840 (0.0008) +[2023-10-11 18:31:10,984][85176] Updated weights for policy 0, policy_version 84602 (0.0007) +[2023-10-11 18:31:11,062][84230] Fps is (10 sec: 9830.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 174489600. Throughput: 0: 1662.7, 1: 1695.7. Samples: 43639062. Policy #0 lag: (min: 19.0, avg: 27.3, max: 51.0) +[2023-10-11 18:31:11,063][84230] Avg episode reward: [(0, '46.250'), (1, '45.540')] +[2023-10-11 18:31:11,135][85175] Updated weights for policy 1, policy_version 85850 (0.0007) +[2023-10-11 18:31:15,182][85175] Updated weights for policy 1, policy_version 85860 (0.0007) +[2023-10-11 18:31:15,249][85176] Updated weights for policy 0, policy_version 84612 (0.0007) +[2023-10-11 18:31:15,547][85175] Updated weights for policy 1, policy_version 85870 (0.0008) +[2023-10-11 18:31:15,623][85176] Updated weights for policy 0, policy_version 84622 (0.0007) +[2023-10-11 18:31:15,921][85175] Updated weights for policy 1, policy_version 85880 (0.0007) +[2023-10-11 18:31:15,996][85176] Updated weights for policy 0, policy_version 84632 (0.0007) +[2023-10-11 18:31:16,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 174555136. Throughput: 0: 1651.3, 1: 1683.0. Samples: 43658644. 
Policy #0 lag: (min: 19.0, avg: 27.3, max: 51.0) +[2023-10-11 18:31:16,063][84230] Avg episode reward: [(0, '42.360'), (1, '43.430')] +[2023-10-11 18:31:19,908][85175] Updated weights for policy 1, policy_version 85890 (0.0007) +[2023-10-11 18:31:20,036][85176] Updated weights for policy 0, policy_version 84642 (0.0007) +[2023-10-11 18:31:20,279][85175] Updated weights for policy 1, policy_version 85900 (0.0007) +[2023-10-11 18:31:20,403][85176] Updated weights for policy 0, policy_version 84652 (0.0009) +[2023-10-11 18:31:20,661][85175] Updated weights for policy 1, policy_version 85910 (0.0008) +[2023-10-11 18:31:20,773][85176] Updated weights for policy 0, policy_version 84662 (0.0008) +[2023-10-11 18:31:21,015][85175] Updated weights for policy 1, policy_version 85920 (0.0008) +[2023-10-11 18:31:21,063][84230] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 174653440. Throughput: 0: 1654.8, 1: 1695.6. Samples: 43668766. Policy #0 lag: (min: 19.0, avg: 27.3, max: 51.0) +[2023-10-11 18:31:21,063][84230] Avg episode reward: [(0, '46.010'), (1, '47.620')] +[2023-10-11 18:31:21,149][85176] Updated weights for policy 0, policy_version 84672 (0.0009) +[2023-10-11 18:31:25,051][85175] Updated weights for policy 1, policy_version 85930 (0.0008) +[2023-10-11 18:31:25,198][85176] Updated weights for policy 0, policy_version 84682 (0.0007) +[2023-10-11 18:31:25,425][85175] Updated weights for policy 1, policy_version 85940 (0.0008) +[2023-10-11 18:31:25,575][85176] Updated weights for policy 0, policy_version 84692 (0.0008) +[2023-10-11 18:31:25,792][85175] Updated weights for policy 1, policy_version 85950 (0.0009) +[2023-10-11 18:31:25,935][85176] Updated weights for policy 0, policy_version 84702 (0.0007) +[2023-10-11 18:31:26,063][84230] Fps is (10 sec: 19660.8, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 174751744. Throughput: 0: 1652.0, 1: 1696.1. Samples: 43689336. 
Policy #0 lag: (min: 19.0, avg: 27.3, max: 51.0) +[2023-10-11 18:31:26,063][84230] Avg episode reward: [(0, '43.610'), (1, '45.180')] +[2023-10-11 18:31:29,931][85175] Updated weights for policy 1, policy_version 85960 (0.0008) +[2023-10-11 18:31:30,046][85176] Updated weights for policy 0, policy_version 84712 (0.0009) +[2023-10-11 18:31:30,291][85175] Updated weights for policy 1, policy_version 85970 (0.0009) +[2023-10-11 18:31:30,411][85176] Updated weights for policy 0, policy_version 84722 (0.0009) +[2023-10-11 18:31:30,662][85175] Updated weights for policy 1, policy_version 85980 (0.0007) +[2023-10-11 18:31:30,784][85176] Updated weights for policy 0, policy_version 84732 (0.0009) +[2023-10-11 18:31:31,062][84230] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 174817280. Throughput: 0: 1644.1, 1: 1678.0. Samples: 43708294. Policy #0 lag: (min: 19.0, avg: 27.3, max: 51.0) +[2023-10-11 18:31:31,063][84230] Avg episode reward: [(0, '45.850'), (1, '47.310')] +[2023-10-11 18:31:34,851][85175] Updated weights for policy 1, policy_version 85990 (0.0007) +[2023-10-11 18:31:34,946][85176] Updated weights for policy 0, policy_version 84742 (0.0007) +[2023-10-11 18:31:35,216][85175] Updated weights for policy 1, policy_version 86000 (0.0007) +[2023-10-11 18:31:35,325][85176] Updated weights for policy 0, policy_version 84752 (0.0007) +[2023-10-11 18:31:35,586][85175] Updated weights for policy 1, policy_version 86010 (0.0009) +[2023-10-11 18:31:35,696][85176] Updated weights for policy 0, policy_version 84762 (0.0008) +[2023-10-11 18:31:36,062][84230] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 174882816. Throughput: 0: 1647.8, 1: 1693.3. Samples: 43718880. 
Policy #0 lag: (min: 31.0, avg: 31.4, max: 44.0) +[2023-10-11 18:31:36,063][84230] Avg episode reward: [(0, '42.090'), (1, '43.260')] +[2023-10-11 18:31:39,706][85175] Updated weights for policy 1, policy_version 86020 (0.0008) +[2023-10-11 18:31:39,758][85176] Updated weights for policy 0, policy_version 84772 (0.0008) +[2023-10-11 18:31:40,072][85175] Updated weights for policy 1, policy_version 86030 (0.0009) +[2023-10-11 18:31:40,119][85176] Updated weights for policy 0, policy_version 84782 (0.0008) +[2023-10-11 18:31:40,438][85175] Updated weights for policy 1, policy_version 86040 (0.0009) +[2023-10-11 18:31:40,498][85176] Updated weights for policy 0, policy_version 84792 (0.0009) +[2023-10-11 18:31:41,063][84230] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 174948352. Throughput: 0: 1648.3, 1: 1690.9. Samples: 43739256. Policy #0 lag: (min: 31.0, avg: 31.4, max: 44.0) +[2023-10-11 18:31:41,064][84230] Avg episode reward: [(0, '46.970'), (1, '43.560')] +[2023-10-11 18:31:44,454][85175] Updated weights for policy 1, policy_version 86050 (0.0009) +[2023-10-11 18:31:44,697][85176] Updated weights for policy 0, policy_version 84802 (0.0007) +[2023-10-11 18:31:44,827][85175] Updated weights for policy 1, policy_version 86060 (0.0008) +[2023-10-11 18:31:45,079][85176] Updated weights for policy 0, policy_version 84812 (0.0008) +[2023-10-11 18:31:45,186][85175] Updated weights for policy 1, policy_version 86070 (0.0007) +[2023-10-11 18:31:45,439][85176] Updated weights for policy 0, policy_version 84822 (0.0010) +[2023-10-11 18:31:45,555][85175] Updated weights for policy 1, policy_version 86080 (0.0009) +[2023-10-11 18:31:45,809][85176] Updated weights for policy 0, policy_version 84832 (0.0010) +[2023-10-11 18:31:46,063][84230] Fps is (10 sec: 13106.7, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 175013888. Throughput: 0: 1646.6, 1: 1662.5. Samples: 43757620. 
Policy #0 lag: (min: 31.0, avg: 31.4, max: 44.0) +[2023-10-11 18:31:46,064][84230] Avg episode reward: [(0, '42.960'), (1, '41.930')] +[2023-10-11 18:31:49,447][85175] Updated weights for policy 1, policy_version 86090 (0.0008) +[2023-10-11 18:31:49,817][85175] Updated weights for policy 1, policy_version 86100 (0.0007) +[2023-10-11 18:31:49,820][85176] Updated weights for policy 0, policy_version 84842 (0.0007) +[2023-10-11 18:31:50,176][85175] Updated weights for policy 1, policy_version 86110 (0.0007) +[2023-10-11 18:31:50,198][85176] Updated weights for policy 0, policy_version 84852 (0.0007) +[2023-10-11 18:31:50,566][85176] Updated weights for policy 0, policy_version 84862 (0.0009) +[2023-10-11 18:31:51,062][84230] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 175079424. Throughput: 0: 1651.2, 1: 1691.0. Samples: 43768914. Policy #0 lag: (min: 31.0, avg: 31.4, max: 44.0) +[2023-10-11 18:31:51,063][84230] Avg episode reward: [(0, '48.560'), (1, '43.980')] +[2023-10-11 18:31:54,217][85175] Updated weights for policy 1, policy_version 86120 (0.0009) +[2023-10-11 18:31:54,585][85175] Updated weights for policy 1, policy_version 86130 (0.0007) +[2023-10-11 18:31:54,620][85176] Updated weights for policy 0, policy_version 84872 (0.0009) +[2023-10-11 18:31:54,947][85175] Updated weights for policy 1, policy_version 86140 (0.0007) +[2023-10-11 18:31:55,000][85176] Updated weights for policy 0, policy_version 84882 (0.0007) +[2023-10-11 18:31:55,375][85176] Updated weights for policy 0, policy_version 84892 (0.0010) +[2023-10-11 18:31:56,062][84230] Fps is (10 sec: 13107.8, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 175144960. Throughput: 0: 1649.3, 1: 1680.3. Samples: 43788894. 
Policy #0 lag: (min: 31.0, avg: 31.4, max: 44.0) +[2023-10-11 18:31:56,063][84230] Avg episode reward: [(0, '42.890'), (1, '44.190')] +[2023-10-11 18:31:59,132][85175] Updated weights for policy 1, policy_version 86150 (0.0008) +[2023-10-11 18:31:59,501][85175] Updated weights for policy 1, policy_version 86160 (0.0009) +[2023-10-11 18:31:59,530][85176] Updated weights for policy 0, policy_version 84902 (0.0009) +[2023-10-11 18:31:59,869][85175] Updated weights for policy 1, policy_version 86170 (0.0007) +[2023-10-11 18:31:59,903][85176] Updated weights for policy 0, policy_version 84912 (0.0008) +[2023-10-11 18:32:00,280][85176] Updated weights for policy 0, policy_version 84922 (0.0008) +[2023-10-11 18:32:01,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 175210496. Throughput: 0: 1640.8, 1: 1671.0. Samples: 43807678. Policy #0 lag: (min: 31.0, avg: 31.4, max: 44.0) +[2023-10-11 18:32:01,064][84230] Avg episode reward: [(0, '48.500'), (1, '46.780')] +[2023-10-11 18:32:03,969][85175] Updated weights for policy 1, policy_version 86180 (0.0009) +[2023-10-11 18:32:04,335][85175] Updated weights for policy 1, policy_version 86190 (0.0010) +[2023-10-11 18:32:04,498][85176] Updated weights for policy 0, policy_version 84932 (0.0008) +[2023-10-11 18:32:04,692][85175] Updated weights for policy 1, policy_version 86200 (0.0007) +[2023-10-11 18:32:04,872][85176] Updated weights for policy 0, policy_version 84942 (0.0007) +[2023-10-11 18:32:05,243][85176] Updated weights for policy 0, policy_version 84952 (0.0007) +[2023-10-11 18:32:06,062][84230] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 175276032. Throughput: 0: 1657.3, 1: 1686.6. Samples: 43819240. 
Policy #0 lag: (min: 31.0, avg: 31.4, max: 44.0) +[2023-10-11 18:32:06,063][84230] Avg episode reward: [(0, '42.500'), (1, '45.430')] +[2023-10-11 18:32:08,621][85175] Updated weights for policy 1, policy_version 86210 (0.0008) +[2023-10-11 18:32:08,992][85175] Updated weights for policy 1, policy_version 86220 (0.0008) +[2023-10-11 18:32:09,289][85176] Updated weights for policy 0, policy_version 84962 (0.0007) +[2023-10-11 18:32:09,368][85175] Updated weights for policy 1, policy_version 86230 (0.0010) +[2023-10-11 18:32:09,658][85176] Updated weights for policy 0, policy_version 84972 (0.0008) +[2023-10-11 18:32:09,731][85175] Updated weights for policy 1, policy_version 86240 (0.0008) +[2023-10-11 18:32:10,032][85176] Updated weights for policy 0, policy_version 84982 (0.0010) +[2023-10-11 18:32:10,393][85176] Updated weights for policy 0, policy_version 84992 (0.0008) +[2023-10-11 18:32:11,063][84230] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 175341568. Throughput: 0: 1652.0, 1: 1665.0. Samples: 43838602. Policy #0 lag: (min: 31.0, avg: 31.4, max: 44.0) +[2023-10-11 18:32:11,064][84230] Avg episode reward: [(0, '48.010'), (1, '45.840')] +[2023-10-11 18:32:13,620][85175] Updated weights for policy 1, policy_version 86250 (0.0010) +[2023-10-11 18:32:13,992][85175] Updated weights for policy 1, policy_version 86260 (0.0008) +[2023-10-11 18:32:14,351][85175] Updated weights for policy 1, policy_version 86270 (0.0008) +[2023-10-11 18:32:14,688][85176] Updated weights for policy 0, policy_version 85002 (0.0011) +[2023-10-11 18:32:15,053][85176] Updated weights for policy 0, policy_version 85012 (0.0008) +[2023-10-11 18:32:15,432][85176] Updated weights for policy 0, policy_version 85022 (0.0007) +[2023-10-11 18:32:16,063][84230] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 175407104. Throughput: 0: 1649.8, 1: 1682.0. Samples: 43858226. 
Policy #0 lag: (min: 31.0, avg: 31.4, max: 44.0) +[2023-10-11 18:32:16,064][84230] Avg episode reward: [(0, '41.310'), (1, '47.360')] +[2023-10-11 18:32:16,075][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000086272_88342528.pth... +[2023-10-11 18:32:16,075][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000085024_87064576.pth... +[2023-10-11 18:32:16,111][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000084672_86704128.pth +[2023-10-11 18:32:16,123][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000083456_85458944.pth +[2023-10-11 18:32:18,435][85175] Updated weights for policy 1, policy_version 86280 (0.0010) +[2023-10-11 18:32:18,789][85175] Updated weights for policy 1, policy_version 86290 (0.0011) +[2023-10-11 18:32:19,155][85175] Updated weights for policy 1, policy_version 86300 (0.0010) +[2023-10-11 18:32:19,639][85176] Updated weights for policy 0, policy_version 85032 (0.0007) +[2023-10-11 18:32:20,015][85176] Updated weights for policy 0, policy_version 85042 (0.0007) +[2023-10-11 18:32:20,376][85176] Updated weights for policy 0, policy_version 85052 (0.0009) +[2023-10-11 18:32:21,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 175472640. Throughput: 0: 1661.3, 1: 1684.3. Samples: 43869432. 
Policy #0 lag: (min: 31.0, avg: 31.4, max: 44.0) +[2023-10-11 18:32:21,063][84230] Avg episode reward: [(0, '48.600'), (1, '45.700')] +[2023-10-11 18:32:23,148][85175] Updated weights for policy 1, policy_version 86310 (0.0008) +[2023-10-11 18:32:23,508][85175] Updated weights for policy 1, policy_version 86320 (0.0007) +[2023-10-11 18:32:23,877][85175] Updated weights for policy 1, policy_version 86330 (0.0009) +[2023-10-11 18:32:24,192][85176] Updated weights for policy 0, policy_version 85062 (0.0008) +[2023-10-11 18:32:24,552][85176] Updated weights for policy 0, policy_version 85072 (0.0007) +[2023-10-11 18:32:24,928][85176] Updated weights for policy 0, policy_version 85082 (0.0008) +[2023-10-11 18:32:26,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 175538176. Throughput: 0: 1659.0, 1: 1670.2. Samples: 43889072. Policy #0 lag: (min: 31.0, avg: 31.4, max: 44.0) +[2023-10-11 18:32:26,063][84230] Avg episode reward: [(0, '42.990'), (1, '45.700')] +[2023-10-11 18:32:27,919][85175] Updated weights for policy 1, policy_version 86340 (0.0010) +[2023-10-11 18:32:28,316][85175] Updated weights for policy 1, policy_version 86350 (0.0007) +[2023-10-11 18:32:28,683][85175] Updated weights for policy 1, policy_version 86360 (0.0007) +[2023-10-11 18:32:28,873][85176] Updated weights for policy 0, policy_version 85092 (0.0008) +[2023-10-11 18:32:29,261][85176] Updated weights for policy 0, policy_version 85102 (0.0008) +[2023-10-11 18:32:29,633][85176] Updated weights for policy 0, policy_version 85112 (0.0009) +[2023-10-11 18:32:31,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 175603712. Throughput: 0: 1669.6, 1: 1700.8. Samples: 43909286. 
Policy #0 lag: (min: 8.0, avg: 34.7, max: 40.0) +[2023-10-11 18:32:31,064][84230] Avg episode reward: [(0, '47.130'), (1, '44.540')] +[2023-10-11 18:32:32,788][85175] Updated weights for policy 1, policy_version 86370 (0.0008) +[2023-10-11 18:32:33,163][85175] Updated weights for policy 1, policy_version 86380 (0.0007) +[2023-10-11 18:32:33,537][85175] Updated weights for policy 1, policy_version 86390 (0.0009) +[2023-10-11 18:32:33,765][85176] Updated weights for policy 0, policy_version 85122 (0.0009) +[2023-10-11 18:32:33,903][85175] Updated weights for policy 1, policy_version 86400 (0.0009) +[2023-10-11 18:32:34,132][85176] Updated weights for policy 0, policy_version 85132 (0.0010) +[2023-10-11 18:32:34,502][85176] Updated weights for policy 0, policy_version 85142 (0.0008) +[2023-10-11 18:32:34,873][85176] Updated weights for policy 0, policy_version 85152 (0.0011) +[2023-10-11 18:32:36,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 175669248. Throughput: 0: 1679.9, 1: 1679.4. Samples: 43920082. Policy #0 lag: (min: 8.0, avg: 34.7, max: 40.0) +[2023-10-11 18:32:36,063][84230] Avg episode reward: [(0, '41.710'), (1, '45.810')] +[2023-10-11 18:32:38,080][85175] Updated weights for policy 1, policy_version 86410 (0.0007) +[2023-10-11 18:32:38,456][85175] Updated weights for policy 1, policy_version 86420 (0.0007) +[2023-10-11 18:32:38,822][85176] Updated weights for policy 0, policy_version 85162 (0.0008) +[2023-10-11 18:32:38,832][85175] Updated weights for policy 1, policy_version 86430 (0.0008) +[2023-10-11 18:32:39,208][85176] Updated weights for policy 0, policy_version 85172 (0.0007) +[2023-10-11 18:32:39,575][85176] Updated weights for policy 0, policy_version 85182 (0.0010) +[2023-10-11 18:32:41,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 175734784. Throughput: 0: 1662.2, 1: 1679.0. Samples: 43939250. 
Policy #0 lag: (min: 8.0, avg: 34.7, max: 40.0) +[2023-10-11 18:32:41,064][84230] Avg episode reward: [(0, '46.810'), (1, '44.540')] +[2023-10-11 18:32:42,635][85175] Updated weights for policy 1, policy_version 86440 (0.0010) +[2023-10-11 18:32:43,007][85175] Updated weights for policy 1, policy_version 86450 (0.0009) +[2023-10-11 18:32:43,376][85175] Updated weights for policy 1, policy_version 86460 (0.0008) +[2023-10-11 18:32:43,556][85176] Updated weights for policy 0, policy_version 85192 (0.0008) +[2023-10-11 18:32:43,925][85176] Updated weights for policy 0, policy_version 85202 (0.0009) +[2023-10-11 18:32:44,301][85176] Updated weights for policy 0, policy_version 85212 (0.0008) +[2023-10-11 18:32:46,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 175800320. Throughput: 0: 1680.2, 1: 1703.4. Samples: 43959942. Policy #0 lag: (min: 8.0, avg: 34.7, max: 40.0) +[2023-10-11 18:32:46,064][84230] Avg episode reward: [(0, '41.200'), (1, '46.840')] +[2023-10-11 18:32:47,400][85175] Updated weights for policy 1, policy_version 86470 (0.0007) +[2023-10-11 18:32:47,768][85175] Updated weights for policy 1, policy_version 86480 (0.0008) +[2023-10-11 18:32:48,140][85175] Updated weights for policy 1, policy_version 86490 (0.0009) +[2023-10-11 18:32:48,459][85176] Updated weights for policy 0, policy_version 85222 (0.0008) +[2023-10-11 18:32:48,828][85176] Updated weights for policy 0, policy_version 85232 (0.0009) +[2023-10-11 18:32:49,208][85176] Updated weights for policy 0, policy_version 85242 (0.0010) +[2023-10-11 18:32:51,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 175865856. Throughput: 0: 1674.6, 1: 1676.6. Samples: 43970046. 
Policy #0 lag: (min: 8.0, avg: 34.7, max: 40.0) +[2023-10-11 18:32:51,064][84230] Avg episode reward: [(0, '46.090'), (1, '44.220')] +[2023-10-11 18:32:52,296][85175] Updated weights for policy 1, policy_version 86500 (0.0008) +[2023-10-11 18:32:52,666][85175] Updated weights for policy 1, policy_version 86510 (0.0007) +[2023-10-11 18:32:53,023][85175] Updated weights for policy 1, policy_version 86520 (0.0007) +[2023-10-11 18:32:53,261][85176] Updated weights for policy 0, policy_version 85252 (0.0009) +[2023-10-11 18:32:53,633][85176] Updated weights for policy 0, policy_version 85262 (0.0007) +[2023-10-11 18:32:54,012][85176] Updated weights for policy 0, policy_version 85272 (0.0008) +[2023-10-11 18:32:56,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 175931392. Throughput: 0: 1659.2, 1: 1698.9. Samples: 43989716. Policy #0 lag: (min: 8.0, avg: 34.7, max: 40.0) +[2023-10-11 18:32:56,064][84230] Avg episode reward: [(0, '41.320'), (1, '44.190')] +[2023-10-11 18:32:56,937][85175] Updated weights for policy 1, policy_version 86530 (0.0007) +[2023-10-11 18:32:57,289][85175] Updated weights for policy 1, policy_version 86540 (0.0008) +[2023-10-11 18:32:57,658][85175] Updated weights for policy 1, policy_version 86550 (0.0007) +[2023-10-11 18:32:58,029][85175] Updated weights for policy 1, policy_version 86560 (0.0007) +[2023-10-11 18:32:58,080][85176] Updated weights for policy 0, policy_version 85282 (0.0008) +[2023-10-11 18:32:58,456][85176] Updated weights for policy 0, policy_version 85292 (0.0007) +[2023-10-11 18:32:58,835][85176] Updated weights for policy 0, policy_version 85302 (0.0009) +[2023-10-11 18:32:59,202][85176] Updated weights for policy 0, policy_version 85312 (0.0011) +[2023-10-11 18:33:01,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 175996928. Throughput: 0: 1680.3, 1: 1709.1. Samples: 44010748. 
Policy #0 lag: (min: 8.0, avg: 34.7, max: 40.0) +[2023-10-11 18:33:01,063][84230] Avg episode reward: [(0, '45.340'), (1, '43.580')] +[2023-10-11 18:33:02,142][85175] Updated weights for policy 1, policy_version 86570 (0.0008) +[2023-10-11 18:33:02,503][85175] Updated weights for policy 1, policy_version 86580 (0.0008) +[2023-10-11 18:33:02,877][85175] Updated weights for policy 1, policy_version 86590 (0.0008) +[2023-10-11 18:33:03,395][85176] Updated weights for policy 0, policy_version 85322 (0.0011) +[2023-10-11 18:33:03,762][85176] Updated weights for policy 0, policy_version 85332 (0.0009) +[2023-10-11 18:33:04,142][85176] Updated weights for policy 0, policy_version 85342 (0.0010) +[2023-10-11 18:33:06,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 176062464. Throughput: 0: 1671.4, 1: 1690.6. Samples: 44020720. Policy #0 lag: (min: 8.0, avg: 34.7, max: 40.0) +[2023-10-11 18:33:06,064][84230] Avg episode reward: [(0, '43.130'), (1, '45.280')] +[2023-10-11 18:33:06,851][85175] Updated weights for policy 1, policy_version 86600 (0.0007) +[2023-10-11 18:33:07,222][85175] Updated weights for policy 1, policy_version 86610 (0.0007) +[2023-10-11 18:33:07,583][85175] Updated weights for policy 1, policy_version 86620 (0.0009) +[2023-10-11 18:33:08,181][85176] Updated weights for policy 0, policy_version 85352 (0.0009) +[2023-10-11 18:33:08,548][85176] Updated weights for policy 0, policy_version 85362 (0.0007) +[2023-10-11 18:33:08,925][85176] Updated weights for policy 0, policy_version 85372 (0.0011) +[2023-10-11 18:33:11,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13440.5). Total num frames: 176128000. Throughput: 0: 1658.0, 1: 1709.7. Samples: 44040620. 
Policy #0 lag: (min: 8.0, avg: 34.7, max: 40.0) +[2023-10-11 18:33:11,063][84230] Avg episode reward: [(0, '45.440'), (1, '41.910')] +[2023-10-11 18:33:11,649][85175] Updated weights for policy 1, policy_version 86630 (0.0009) +[2023-10-11 18:33:12,010][85175] Updated weights for policy 1, policy_version 86640 (0.0011) +[2023-10-11 18:33:12,378][85175] Updated weights for policy 1, policy_version 86650 (0.0009) +[2023-10-11 18:33:12,967][85176] Updated weights for policy 0, policy_version 85382 (0.0010) +[2023-10-11 18:33:13,341][85176] Updated weights for policy 0, policy_version 85392 (0.0007) +[2023-10-11 18:33:13,706][85176] Updated weights for policy 0, policy_version 85402 (0.0007) +[2023-10-11 18:33:16,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 176193536. Throughput: 0: 1679.4, 1: 1700.3. Samples: 44061374. Policy #0 lag: (min: 8.0, avg: 34.7, max: 40.0) +[2023-10-11 18:33:16,064][84230] Avg episode reward: [(0, '45.520'), (1, '45.120')] +[2023-10-11 18:33:16,447][85175] Updated weights for policy 1, policy_version 86660 (0.0009) +[2023-10-11 18:33:16,818][85175] Updated weights for policy 1, policy_version 86670 (0.0009) +[2023-10-11 18:33:17,184][85175] Updated weights for policy 1, policy_version 86680 (0.0007) +[2023-10-11 18:33:17,788][85176] Updated weights for policy 0, policy_version 85412 (0.0009) +[2023-10-11 18:33:18,159][85176] Updated weights for policy 0, policy_version 85422 (0.0011) +[2023-10-11 18:33:18,541][85176] Updated weights for policy 0, policy_version 85432 (0.0010) +[2023-10-11 18:33:21,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 176259072. Throughput: 0: 1655.4, 1: 1692.3. Samples: 44070728. 
Policy #0 lag: (min: 8.0, avg: 34.7, max: 40.0) +[2023-10-11 18:33:21,064][84230] Avg episode reward: [(0, '45.360'), (1, '42.580')] +[2023-10-11 18:33:21,095][85175] Updated weights for policy 1, policy_version 86690 (0.0007) +[2023-10-11 18:33:21,454][85175] Updated weights for policy 1, policy_version 86700 (0.0007) +[2023-10-11 18:33:21,831][85175] Updated weights for policy 1, policy_version 86710 (0.0007) +[2023-10-11 18:33:22,202][85175] Updated weights for policy 1, policy_version 86720 (0.0007) +[2023-10-11 18:33:22,798][85176] Updated weights for policy 0, policy_version 85442 (0.0010) +[2023-10-11 18:33:23,184][85176] Updated weights for policy 0, policy_version 85452 (0.0007) +[2023-10-11 18:33:23,554][85176] Updated weights for policy 0, policy_version 85462 (0.0007) +[2023-10-11 18:33:23,941][85176] Updated weights for policy 0, policy_version 85472 (0.0010) +[2023-10-11 18:33:26,059][85175] Updated weights for policy 1, policy_version 86730 (0.0007) +[2023-10-11 18:33:26,063][84230] Fps is (10 sec: 13107.7, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 176324608. Throughput: 0: 1668.7, 1: 1706.9. Samples: 44091154. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 18:33:26,063][84230] Avg episode reward: [(0, '44.370'), (1, '47.080')] +[2023-10-11 18:33:26,427][85175] Updated weights for policy 1, policy_version 86740 (0.0007) +[2023-10-11 18:33:26,794][85175] Updated weights for policy 1, policy_version 86750 (0.0007) +[2023-10-11 18:33:27,958][85176] Updated weights for policy 0, policy_version 85482 (0.0009) +[2023-10-11 18:33:28,331][85176] Updated weights for policy 0, policy_version 85492 (0.0009) +[2023-10-11 18:33:28,702][85176] Updated weights for policy 0, policy_version 85502 (0.0007) +[2023-10-11 18:33:30,862][85175] Updated weights for policy 1, policy_version 86760 (0.0009) +[2023-10-11 18:33:31,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 176390144. 
Throughput: 0: 1675.1, 1: 1704.5. Samples: 44112024. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 18:33:31,064][84230] Avg episode reward: [(0, '43.980'), (1, '44.700')] +[2023-10-11 18:33:31,239][85175] Updated weights for policy 1, policy_version 86770 (0.0010) +[2023-10-11 18:33:31,602][85175] Updated weights for policy 1, policy_version 86780 (0.0009) +[2023-10-11 18:33:32,788][85176] Updated weights for policy 0, policy_version 85512 (0.0008) +[2023-10-11 18:33:33,151][85176] Updated weights for policy 0, policy_version 85522 (0.0009) +[2023-10-11 18:33:33,535][85176] Updated weights for policy 0, policy_version 85532 (0.0009) +[2023-10-11 18:33:35,533][85175] Updated weights for policy 1, policy_version 86790 (0.0010) +[2023-10-11 18:33:35,898][85175] Updated weights for policy 1, policy_version 86800 (0.0010) +[2023-10-11 18:33:36,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 176455680. Throughput: 0: 1659.5, 1: 1704.0. Samples: 44121404. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 18:33:36,063][84230] Avg episode reward: [(0, '45.380'), (1, '45.400')] +[2023-10-11 18:33:36,266][85175] Updated weights for policy 1, policy_version 86810 (0.0009) +[2023-10-11 18:33:37,580][85176] Updated weights for policy 0, policy_version 85542 (0.0009) +[2023-10-11 18:33:37,957][85176] Updated weights for policy 0, policy_version 85552 (0.0007) +[2023-10-11 18:33:38,329][85176] Updated weights for policy 0, policy_version 85562 (0.0010) +[2023-10-11 18:33:40,404][85175] Updated weights for policy 1, policy_version 86820 (0.0010) +[2023-10-11 18:33:40,769][85175] Updated weights for policy 1, policy_version 86830 (0.0008) +[2023-10-11 18:33:41,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 176521216. Throughput: 0: 1677.4, 1: 1702.5. Samples: 44141810. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 18:33:41,063][84230] Avg episode reward: [(0, '44.500'), (1, '45.020')] +[2023-10-11 18:33:41,139][85175] Updated weights for policy 1, policy_version 86840 (0.0008) +[2023-10-11 18:33:42,260][85176] Updated weights for policy 0, policy_version 85572 (0.0009) +[2023-10-11 18:33:42,620][85176] Updated weights for policy 0, policy_version 85582 (0.0007) +[2023-10-11 18:33:42,995][85176] Updated weights for policy 0, policy_version 85592 (0.0007) +[2023-10-11 18:33:45,415][85175] Updated weights for policy 1, policy_version 86850 (0.0008) +[2023-10-11 18:33:45,795][85175] Updated weights for policy 1, policy_version 86860 (0.0008) +[2023-10-11 18:33:46,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 176586752. Throughput: 0: 1678.0, 1: 1689.1. Samples: 44162266. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 18:33:46,063][84230] Avg episode reward: [(0, '45.740'), (1, '47.280')] +[2023-10-11 18:33:46,153][85175] Updated weights for policy 1, policy_version 86870 (0.0007) +[2023-10-11 18:33:46,524][85175] Updated weights for policy 1, policy_version 86880 (0.0007) +[2023-10-11 18:33:47,228][85176] Updated weights for policy 0, policy_version 85602 (0.0008) +[2023-10-11 18:33:47,603][85176] Updated weights for policy 0, policy_version 85612 (0.0009) +[2023-10-11 18:33:47,977][85176] Updated weights for policy 0, policy_version 85622 (0.0007) +[2023-10-11 18:33:48,343][85176] Updated weights for policy 0, policy_version 85632 (0.0007) +[2023-10-11 18:33:50,603][85175] Updated weights for policy 1, policy_version 86890 (0.0009) +[2023-10-11 18:33:50,964][85175] Updated weights for policy 1, policy_version 86900 (0.0009) +[2023-10-11 18:33:51,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 176652288. Throughput: 0: 1659.2, 1: 1695.2. Samples: 44171672. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 18:33:51,064][84230] Avg episode reward: [(0, '43.220'), (1, '44.960')] +[2023-10-11 18:33:51,338][85175] Updated weights for policy 1, policy_version 86910 (0.0007) +[2023-10-11 18:33:52,349][85176] Updated weights for policy 0, policy_version 85642 (0.0007) +[2023-10-11 18:33:52,724][85176] Updated weights for policy 0, policy_version 85652 (0.0008) +[2023-10-11 18:33:53,099][85176] Updated weights for policy 0, policy_version 85662 (0.0010) +[2023-10-11 18:33:55,221][85175] Updated weights for policy 1, policy_version 86920 (0.0008) +[2023-10-11 18:33:55,596][85175] Updated weights for policy 1, policy_version 86930 (0.0008) +[2023-10-11 18:33:55,959][85175] Updated weights for policy 1, policy_version 86940 (0.0008) +[2023-10-11 18:33:56,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.3). Total num frames: 176717824. Throughput: 0: 1683.5, 1: 1694.4. Samples: 44192626. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 18:33:56,064][84230] Avg episode reward: [(0, '44.920'), (1, '46.180')] +[2023-10-11 18:33:57,309][85176] Updated weights for policy 0, policy_version 85672 (0.0008) +[2023-10-11 18:33:57,687][85176] Updated weights for policy 0, policy_version 85682 (0.0008) +[2023-10-11 18:33:58,059][85176] Updated weights for policy 0, policy_version 85692 (0.0008) +[2023-10-11 18:33:59,952][85175] Updated weights for policy 1, policy_version 86950 (0.0009) +[2023-10-11 18:34:00,318][85175] Updated weights for policy 1, policy_version 86960 (0.0010) +[2023-10-11 18:34:00,690][85175] Updated weights for policy 1, policy_version 86970 (0.0008) +[2023-10-11 18:34:01,063][84230] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 176816128. Throughput: 0: 1679.2, 1: 1684.8. Samples: 44212752. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 18:34:01,063][84230] Avg episode reward: [(0, '43.640'), (1, '46.540')] +[2023-10-11 18:34:01,884][85176] Updated weights for policy 0, policy_version 85702 (0.0010) +[2023-10-11 18:34:02,250][85176] Updated weights for policy 0, policy_version 85712 (0.0010) +[2023-10-11 18:34:02,620][85176] Updated weights for policy 0, policy_version 85722 (0.0010) +[2023-10-11 18:34:04,769][85175] Updated weights for policy 1, policy_version 86980 (0.0008) +[2023-10-11 18:34:05,164][85175] Updated weights for policy 1, policy_version 86990 (0.0007) +[2023-10-11 18:34:05,544][85175] Updated weights for policy 1, policy_version 87000 (0.0010) +[2023-10-11 18:34:06,063][84230] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 176881664. Throughput: 0: 1670.0, 1: 1705.6. Samples: 44222634. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 18:34:06,064][84230] Avg episode reward: [(0, '46.490'), (1, '45.220')] +[2023-10-11 18:34:06,764][85176] Updated weights for policy 0, policy_version 85732 (0.0008) +[2023-10-11 18:34:07,141][85176] Updated weights for policy 0, policy_version 85742 (0.0008) +[2023-10-11 18:34:07,513][85176] Updated weights for policy 0, policy_version 85752 (0.0008) +[2023-10-11 18:34:09,474][85175] Updated weights for policy 1, policy_version 87010 (0.0011) +[2023-10-11 18:34:09,844][85175] Updated weights for policy 1, policy_version 87020 (0.0008) +[2023-10-11 18:34:10,221][85175] Updated weights for policy 1, policy_version 87030 (0.0009) +[2023-10-11 18:34:10,583][85175] Updated weights for policy 1, policy_version 87040 (0.0009) +[2023-10-11 18:34:11,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.2, 300 sec: 13440.4). Total num frames: 176947200. Throughput: 0: 1677.5, 1: 1694.9. Samples: 44242916. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 18:34:11,064][84230] Avg episode reward: [(0, '42.650'), (1, '46.260')] +[2023-10-11 18:34:11,557][85176] Updated weights for policy 0, policy_version 85762 (0.0008) +[2023-10-11 18:34:11,934][85176] Updated weights for policy 0, policy_version 85772 (0.0007) +[2023-10-11 18:34:12,304][85176] Updated weights for policy 0, policy_version 85782 (0.0009) +[2023-10-11 18:34:12,687][85176] Updated weights for policy 0, policy_version 85792 (0.0007) +[2023-10-11 18:34:14,684][85175] Updated weights for policy 1, policy_version 87050 (0.0007) +[2023-10-11 18:34:15,048][85175] Updated weights for policy 1, policy_version 87060 (0.0007) +[2023-10-11 18:34:15,420][85175] Updated weights for policy 1, policy_version 87070 (0.0007) +[2023-10-11 18:34:16,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 177012736. Throughput: 0: 1675.8, 1: 1668.1. Samples: 44262500. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 18:34:16,064][84230] Avg episode reward: [(0, '45.950'), (1, '45.540')] +[2023-10-11 18:34:16,075][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000087072_89161728.pth... +[2023-10-11 18:34:16,075][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000085792_87851008.pth... 
+[2023-10-11 18:34:16,107][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000085472_87523328.pth +[2023-10-11 18:34:16,111][85000] Saving a milestone ./train_atari/atari_frostbite_APPO/checkpoint_p1/milestones/checkpoint_000087072_89161728.pth +[2023-10-11 18:34:16,111][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000084256_86278144.pth +[2023-10-11 18:34:16,115][84801] Saving a milestone ./train_atari/atari_frostbite_APPO/checkpoint_p0/milestones/checkpoint_000085792_87851008.pth +[2023-10-11 18:34:16,942][85176] Updated weights for policy 0, policy_version 85802 (0.0010) +[2023-10-11 18:34:17,317][85176] Updated weights for policy 0, policy_version 85812 (0.0007) +[2023-10-11 18:34:17,686][85176] Updated weights for policy 0, policy_version 85822 (0.0009) +[2023-10-11 18:34:19,590][85175] Updated weights for policy 1, policy_version 87080 (0.0007) +[2023-10-11 18:34:19,953][85175] Updated weights for policy 1, policy_version 87090 (0.0010) +[2023-10-11 18:34:20,324][85175] Updated weights for policy 1, policy_version 87100 (0.0009) +[2023-10-11 18:34:21,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 177078272. Throughput: 0: 1672.2, 1: 1692.1. Samples: 44272798. 
Policy #0 lag: (min: 25.0, avg: 31.5, max: 57.0) +[2023-10-11 18:34:21,064][84230] Avg episode reward: [(0, '43.890'), (1, '48.380')] +[2023-10-11 18:34:21,635][85176] Updated weights for policy 0, policy_version 85832 (0.0007) +[2023-10-11 18:34:22,000][85176] Updated weights for policy 0, policy_version 85842 (0.0008) +[2023-10-11 18:34:22,377][85176] Updated weights for policy 0, policy_version 85852 (0.0008) +[2023-10-11 18:34:24,237][85175] Updated weights for policy 1, policy_version 87110 (0.0009) +[2023-10-11 18:34:24,610][85175] Updated weights for policy 1, policy_version 87120 (0.0008) +[2023-10-11 18:34:24,975][85175] Updated weights for policy 1, policy_version 87130 (0.0008) +[2023-10-11 18:34:26,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 177143808. Throughput: 0: 1681.4, 1: 1683.5. Samples: 44293230. Policy #0 lag: (min: 25.0, avg: 31.5, max: 57.0) +[2023-10-11 18:34:26,063][84230] Avg episode reward: [(0, '44.140'), (1, '46.680')] +[2023-10-11 18:34:26,472][85176] Updated weights for policy 0, policy_version 85862 (0.0009) +[2023-10-11 18:34:26,841][85176] Updated weights for policy 0, policy_version 85872 (0.0011) +[2023-10-11 18:34:27,207][85176] Updated weights for policy 0, policy_version 85882 (0.0009) +[2023-10-11 18:34:28,840][85175] Updated weights for policy 1, policy_version 87140 (0.0009) +[2023-10-11 18:34:29,210][85175] Updated weights for policy 1, policy_version 87150 (0.0008) +[2023-10-11 18:34:29,583][85175] Updated weights for policy 1, policy_version 87160 (0.0010) +[2023-10-11 18:34:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 177209344. Throughput: 0: 1679.2, 1: 1681.2. Samples: 44313482. 
Policy #0 lag: (min: 25.0, avg: 31.5, max: 57.0) +[2023-10-11 18:34:31,064][84230] Avg episode reward: [(0, '41.450'), (1, '46.870')] +[2023-10-11 18:34:31,436][85176] Updated weights for policy 0, policy_version 85892 (0.0011) +[2023-10-11 18:34:31,809][85176] Updated weights for policy 0, policy_version 85902 (0.0008) +[2023-10-11 18:34:32,193][85176] Updated weights for policy 0, policy_version 85912 (0.0009) +[2023-10-11 18:34:33,515][85175] Updated weights for policy 1, policy_version 87170 (0.0009) +[2023-10-11 18:34:33,887][85175] Updated weights for policy 1, policy_version 87180 (0.0009) +[2023-10-11 18:34:34,252][85175] Updated weights for policy 1, policy_version 87190 (0.0011) +[2023-10-11 18:34:34,611][85175] Updated weights for policy 1, policy_version 87200 (0.0008) +[2023-10-11 18:34:36,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 177274880. Throughput: 0: 1679.9, 1: 1704.7. Samples: 44323976. Policy #0 lag: (min: 25.0, avg: 31.5, max: 57.0) +[2023-10-11 18:34:36,063][84230] Avg episode reward: [(0, '44.030'), (1, '45.760')] +[2023-10-11 18:34:36,142][85176] Updated weights for policy 0, policy_version 85922 (0.0008) +[2023-10-11 18:34:36,505][85176] Updated weights for policy 0, policy_version 85932 (0.0011) +[2023-10-11 18:34:36,879][85176] Updated weights for policy 0, policy_version 85942 (0.0010) +[2023-10-11 18:34:37,253][85176] Updated weights for policy 0, policy_version 85952 (0.0012) +[2023-10-11 18:34:38,666][85175] Updated weights for policy 1, policy_version 87210 (0.0010) +[2023-10-11 18:34:39,041][85175] Updated weights for policy 1, policy_version 87220 (0.0007) +[2023-10-11 18:34:39,409][85175] Updated weights for policy 1, policy_version 87230 (0.0007) +[2023-10-11 18:34:41,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 177340416. Throughput: 0: 1677.4, 1: 1680.2. Samples: 44343718. 
Policy #0 lag: (min: 25.0, avg: 31.5, max: 57.0) +[2023-10-11 18:34:41,064][84230] Avg episode reward: [(0, '43.300'), (1, '47.000')] +[2023-10-11 18:34:41,510][85176] Updated weights for policy 0, policy_version 85962 (0.0008) +[2023-10-11 18:34:41,888][85176] Updated weights for policy 0, policy_version 85972 (0.0008) +[2023-10-11 18:34:42,265][85176] Updated weights for policy 0, policy_version 85982 (0.0008) +[2023-10-11 18:34:43,218][85175] Updated weights for policy 1, policy_version 87240 (0.0008) +[2023-10-11 18:34:43,583][85175] Updated weights for policy 1, policy_version 87250 (0.0007) +[2023-10-11 18:34:43,952][85175] Updated weights for policy 1, policy_version 87260 (0.0009) +[2023-10-11 18:34:46,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 177405952. Throughput: 0: 1673.9, 1: 1701.0. Samples: 44364624. Policy #0 lag: (min: 25.0, avg: 31.5, max: 57.0) +[2023-10-11 18:34:46,064][84230] Avg episode reward: [(0, '45.600'), (1, '45.200')] +[2023-10-11 18:34:46,377][85176] Updated weights for policy 0, policy_version 85992 (0.0008) +[2023-10-11 18:34:46,760][85176] Updated weights for policy 0, policy_version 86002 (0.0008) +[2023-10-11 18:34:47,136][85176] Updated weights for policy 0, policy_version 86012 (0.0009) +[2023-10-11 18:34:47,957][85175] Updated weights for policy 1, policy_version 87270 (0.0010) +[2023-10-11 18:34:48,326][85175] Updated weights for policy 1, policy_version 87280 (0.0011) +[2023-10-11 18:34:48,692][85175] Updated weights for policy 1, policy_version 87290 (0.0008) +[2023-10-11 18:34:51,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 177471488. Throughput: 0: 1672.9, 1: 1694.9. Samples: 44374186. 
Policy #0 lag: (min: 25.0, avg: 31.5, max: 57.0) +[2023-10-11 18:34:51,063][84230] Avg episode reward: [(0, '44.760'), (1, '47.320')] +[2023-10-11 18:34:51,250][85176] Updated weights for policy 0, policy_version 86022 (0.0008) +[2023-10-11 18:34:51,627][85176] Updated weights for policy 0, policy_version 86032 (0.0010) +[2023-10-11 18:34:52,000][85176] Updated weights for policy 0, policy_version 86042 (0.0008) +[2023-10-11 18:34:52,612][85175] Updated weights for policy 1, policy_version 87300 (0.0008) +[2023-10-11 18:34:52,979][85175] Updated weights for policy 1, policy_version 87310 (0.0008) +[2023-10-11 18:34:53,349][85175] Updated weights for policy 1, policy_version 87320 (0.0007) +[2023-10-11 18:34:56,044][85176] Updated weights for policy 0, policy_version 86052 (0.0011) +[2023-10-11 18:34:56,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 177537024. Throughput: 0: 1668.8, 1: 1696.0. Samples: 44394328. Policy #0 lag: (min: 25.0, avg: 31.5, max: 57.0) +[2023-10-11 18:34:56,063][84230] Avg episode reward: [(0, '45.280'), (1, '44.140')] +[2023-10-11 18:34:56,422][85176] Updated weights for policy 0, policy_version 86062 (0.0011) +[2023-10-11 18:34:56,793][85176] Updated weights for policy 0, policy_version 86072 (0.0011) +[2023-10-11 18:34:57,456][85175] Updated weights for policy 1, policy_version 87330 (0.0007) +[2023-10-11 18:34:57,873][85175] Updated weights for policy 1, policy_version 87340 (0.0007) +[2023-10-11 18:34:58,240][85175] Updated weights for policy 1, policy_version 87350 (0.0007) +[2023-10-11 18:34:58,607][85175] Updated weights for policy 1, policy_version 87360 (0.0008) +[2023-10-11 18:35:00,848][85176] Updated weights for policy 0, policy_version 86082 (0.0008) +[2023-10-11 18:35:01,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 177602560. Throughput: 0: 1671.3, 1: 1722.3. Samples: 44415212. 
Policy #0 lag: (min: 25.0, avg: 31.5, max: 57.0) +[2023-10-11 18:35:01,063][84230] Avg episode reward: [(0, '46.930'), (1, '45.610')] +[2023-10-11 18:35:01,228][85176] Updated weights for policy 0, policy_version 86092 (0.0009) +[2023-10-11 18:35:01,610][85176] Updated weights for policy 0, policy_version 86102 (0.0009) +[2023-10-11 18:35:01,985][85176] Updated weights for policy 0, policy_version 86112 (0.0009) +[2023-10-11 18:35:02,655][85175] Updated weights for policy 1, policy_version 87370 (0.0008) +[2023-10-11 18:35:03,024][85175] Updated weights for policy 1, policy_version 87380 (0.0007) +[2023-10-11 18:35:03,402][85175] Updated weights for policy 1, policy_version 87390 (0.0007) +[2023-10-11 18:35:05,939][85176] Updated weights for policy 0, policy_version 86122 (0.0007) +[2023-10-11 18:35:06,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 177668096. Throughput: 0: 1669.2, 1: 1698.6. Samples: 44424348. Policy #0 lag: (min: 25.0, avg: 31.5, max: 57.0) +[2023-10-11 18:35:06,063][84230] Avg episode reward: [(0, '43.410'), (1, '43.440')] +[2023-10-11 18:35:06,305][85176] Updated weights for policy 0, policy_version 86132 (0.0008) +[2023-10-11 18:35:06,678][85176] Updated weights for policy 0, policy_version 86142 (0.0008) +[2023-10-11 18:35:07,353][85175] Updated weights for policy 1, policy_version 87400 (0.0009) +[2023-10-11 18:35:07,728][85175] Updated weights for policy 1, policy_version 87410 (0.0011) +[2023-10-11 18:35:08,087][85175] Updated weights for policy 1, policy_version 87420 (0.0010) +[2023-10-11 18:35:10,773][85176] Updated weights for policy 0, policy_version 86152 (0.0008) +[2023-10-11 18:35:11,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 177733632. Throughput: 0: 1669.0, 1: 1712.8. Samples: 44445410. 
Policy #0 lag: (min: 25.0, avg: 31.5, max: 57.0) +[2023-10-11 18:35:11,063][84230] Avg episode reward: [(0, '46.060'), (1, '46.350')] +[2023-10-11 18:35:11,146][85176] Updated weights for policy 0, policy_version 86162 (0.0007) +[2023-10-11 18:35:11,520][85176] Updated weights for policy 0, policy_version 86172 (0.0008) +[2023-10-11 18:35:12,123][85175] Updated weights for policy 1, policy_version 87430 (0.0009) +[2023-10-11 18:35:12,482][85175] Updated weights for policy 1, policy_version 87440 (0.0010) +[2023-10-11 18:35:12,860][85175] Updated weights for policy 1, policy_version 87450 (0.0009) +[2023-10-11 18:35:15,631][85176] Updated weights for policy 0, policy_version 86182 (0.0009) +[2023-10-11 18:35:16,011][85176] Updated weights for policy 0, policy_version 86192 (0.0009) +[2023-10-11 18:35:16,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 177799168. Throughput: 0: 1668.0, 1: 1726.5. Samples: 44466236. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) +[2023-10-11 18:35:16,063][84230] Avg episode reward: [(0, '43.180'), (1, '43.620')] +[2023-10-11 18:35:16,387][85176] Updated weights for policy 0, policy_version 86202 (0.0009) +[2023-10-11 18:35:16,737][85175] Updated weights for policy 1, policy_version 87460 (0.0008) +[2023-10-11 18:35:17,103][85175] Updated weights for policy 1, policy_version 87470 (0.0009) +[2023-10-11 18:35:17,457][85175] Updated weights for policy 1, policy_version 87480 (0.0009) +[2023-10-11 18:35:20,313][85176] Updated weights for policy 0, policy_version 86212 (0.0008) +[2023-10-11 18:35:20,689][85176] Updated weights for policy 0, policy_version 86222 (0.0009) +[2023-10-11 18:35:21,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 177864704. Throughput: 0: 1669.0, 1: 1699.1. Samples: 44475538. 
Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) +[2023-10-11 18:35:21,063][84230] Avg episode reward: [(0, '46.400'), (1, '46.020')] +[2023-10-11 18:35:21,065][85176] Updated weights for policy 0, policy_version 86232 (0.0010) +[2023-10-11 18:35:21,528][85175] Updated weights for policy 1, policy_version 87490 (0.0009) +[2023-10-11 18:35:21,904][85175] Updated weights for policy 1, policy_version 87500 (0.0009) +[2023-10-11 18:35:22,276][85175] Updated weights for policy 1, policy_version 87510 (0.0008) +[2023-10-11 18:35:22,644][85175] Updated weights for policy 1, policy_version 87520 (0.0010) +[2023-10-11 18:35:25,065][85176] Updated weights for policy 0, policy_version 86242 (0.0008) +[2023-10-11 18:35:25,436][85176] Updated weights for policy 0, policy_version 86252 (0.0009) +[2023-10-11 18:35:25,818][85176] Updated weights for policy 0, policy_version 86262 (0.0009) +[2023-10-11 18:35:26,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 177930240. Throughput: 0: 1673.4, 1: 1724.0. Samples: 44496598. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) +[2023-10-11 18:35:26,063][84230] Avg episode reward: [(0, '42.810'), (1, '43.330')] +[2023-10-11 18:35:26,188][85176] Updated weights for policy 0, policy_version 86272 (0.0008) +[2023-10-11 18:35:26,742][85175] Updated weights for policy 1, policy_version 87530 (0.0008) +[2023-10-11 18:35:27,115][85175] Updated weights for policy 1, policy_version 87540 (0.0009) +[2023-10-11 18:35:27,481][85175] Updated weights for policy 1, policy_version 87550 (0.0009) +[2023-10-11 18:35:30,111][85176] Updated weights for policy 0, policy_version 86282 (0.0007) +[2023-10-11 18:35:30,483][85176] Updated weights for policy 0, policy_version 86292 (0.0010) +[2023-10-11 18:35:30,856][85176] Updated weights for policy 0, policy_version 86302 (0.0011) +[2023-10-11 18:35:31,063][84230] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 178028544. 
Throughput: 0: 1660.8, 1: 1723.8. Samples: 44516930. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) +[2023-10-11 18:35:31,063][84230] Avg episode reward: [(0, '46.130'), (1, '46.670')] +[2023-10-11 18:35:31,445][85175] Updated weights for policy 1, policy_version 87560 (0.0008) +[2023-10-11 18:35:31,812][85175] Updated weights for policy 1, policy_version 87570 (0.0010) +[2023-10-11 18:35:32,186][85175] Updated weights for policy 1, policy_version 87580 (0.0009) +[2023-10-11 18:35:35,096][85176] Updated weights for policy 0, policy_version 86312 (0.0008) +[2023-10-11 18:35:35,471][85176] Updated weights for policy 0, policy_version 86322 (0.0009) +[2023-10-11 18:35:35,849][85176] Updated weights for policy 0, policy_version 86332 (0.0007) +[2023-10-11 18:35:36,062][84230] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 178094080. Throughput: 0: 1686.3, 1: 1709.6. Samples: 44527002. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) +[2023-10-11 18:35:36,063][84230] Avg episode reward: [(0, '41.200'), (1, '43.270')] +[2023-10-11 18:35:36,186][85175] Updated weights for policy 1, policy_version 87590 (0.0007) +[2023-10-11 18:35:36,549][85175] Updated weights for policy 1, policy_version 87600 (0.0008) +[2023-10-11 18:35:36,924][85175] Updated weights for policy 1, policy_version 87610 (0.0008) +[2023-10-11 18:35:39,839][85176] Updated weights for policy 0, policy_version 86342 (0.0007) +[2023-10-11 18:35:40,221][85176] Updated weights for policy 0, policy_version 86352 (0.0009) +[2023-10-11 18:35:40,578][85176] Updated weights for policy 0, policy_version 86362 (0.0009) +[2023-10-11 18:35:40,955][85175] Updated weights for policy 1, policy_version 87620 (0.0008) +[2023-10-11 18:35:41,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 178159616. Throughput: 0: 1690.0, 1: 1720.0. Samples: 44547780. 
Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) +[2023-10-11 18:35:41,064][84230] Avg episode reward: [(0, '44.520'), (1, '45.500')] +[2023-10-11 18:35:41,316][85175] Updated weights for policy 1, policy_version 87630 (0.0008) +[2023-10-11 18:35:41,681][85175] Updated weights for policy 1, policy_version 87640 (0.0007) +[2023-10-11 18:35:44,940][85176] Updated weights for policy 0, policy_version 86372 (0.0009) +[2023-10-11 18:35:45,313][85176] Updated weights for policy 0, policy_version 86382 (0.0009) +[2023-10-11 18:35:45,682][85176] Updated weights for policy 0, policy_version 86392 (0.0008) +[2023-10-11 18:35:45,778][85175] Updated weights for policy 1, policy_version 87650 (0.0008) +[2023-10-11 18:35:46,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 178225152. Throughput: 0: 1669.8, 1: 1717.7. Samples: 44567648. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) +[2023-10-11 18:35:46,063][84230] Avg episode reward: [(0, '41.190'), (1, '45.000')] +[2023-10-11 18:35:46,198][85175] Updated weights for policy 1, policy_version 87660 (0.0007) +[2023-10-11 18:35:46,577][85175] Updated weights for policy 1, policy_version 87670 (0.0009) +[2023-10-11 18:35:46,944][85175] Updated weights for policy 1, policy_version 87680 (0.0007) +[2023-10-11 18:35:49,755][85176] Updated weights for policy 0, policy_version 86402 (0.0008) +[2023-10-11 18:35:50,133][85176] Updated weights for policy 0, policy_version 86412 (0.0008) +[2023-10-11 18:35:50,500][85176] Updated weights for policy 0, policy_version 86422 (0.0007) +[2023-10-11 18:35:50,872][85176] Updated weights for policy 0, policy_version 86432 (0.0007) +[2023-10-11 18:35:50,895][85175] Updated weights for policy 1, policy_version 87690 (0.0008) +[2023-10-11 18:35:51,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 178290688. Throughput: 0: 1690.4, 1: 1715.3. Samples: 44577604. 
Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) +[2023-10-11 18:35:51,063][84230] Avg episode reward: [(0, '44.510'), (1, '45.880')] +[2023-10-11 18:35:51,263][85175] Updated weights for policy 1, policy_version 87700 (0.0010) +[2023-10-11 18:35:51,633][85175] Updated weights for policy 1, policy_version 87710 (0.0009) +[2023-10-11 18:35:54,901][85176] Updated weights for policy 0, policy_version 86442 (0.0009) +[2023-10-11 18:35:55,267][85176] Updated weights for policy 0, policy_version 86452 (0.0007) +[2023-10-11 18:35:55,645][85176] Updated weights for policy 0, policy_version 86462 (0.0008) +[2023-10-11 18:35:55,673][85175] Updated weights for policy 1, policy_version 87720 (0.0007) +[2023-10-11 18:35:56,039][85175] Updated weights for policy 1, policy_version 87730 (0.0010) +[2023-10-11 18:35:56,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 178356224. Throughput: 0: 1688.8, 1: 1706.9. Samples: 44598218. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) +[2023-10-11 18:35:56,063][84230] Avg episode reward: [(0, '42.520'), (1, '45.410')] +[2023-10-11 18:35:56,413][85175] Updated weights for policy 1, policy_version 87740 (0.0007) +[2023-10-11 18:35:59,723][85176] Updated weights for policy 0, policy_version 86472 (0.0010) +[2023-10-11 18:36:00,093][85176] Updated weights for policy 0, policy_version 86482 (0.0010) +[2023-10-11 18:36:00,344][85175] Updated weights for policy 1, policy_version 87750 (0.0007) +[2023-10-11 18:36:00,466][85176] Updated weights for policy 0, policy_version 86492 (0.0009) +[2023-10-11 18:36:00,715][85175] Updated weights for policy 1, policy_version 87760 (0.0009) +[2023-10-11 18:36:01,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 178421760. Throughput: 0: 1664.1, 1: 1697.9. Samples: 44617526. 
Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) +[2023-10-11 18:36:01,064][84230] Avg episode reward: [(0, '45.410'), (1, '43.920')] +[2023-10-11 18:36:01,077][85175] Updated weights for policy 1, policy_version 87770 (0.0008) +[2023-10-11 18:36:04,477][85176] Updated weights for policy 0, policy_version 86502 (0.0010) +[2023-10-11 18:36:04,855][85176] Updated weights for policy 0, policy_version 86512 (0.0010) +[2023-10-11 18:36:05,146][85175] Updated weights for policy 1, policy_version 87780 (0.0008) +[2023-10-11 18:36:05,233][85176] Updated weights for policy 0, policy_version 86522 (0.0008) +[2023-10-11 18:36:05,518][85175] Updated weights for policy 1, policy_version 87790 (0.0008) +[2023-10-11 18:36:05,885][85175] Updated weights for policy 1, policy_version 87800 (0.0009) +[2023-10-11 18:36:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 178487296. Throughput: 0: 1688.0, 1: 1703.3. Samples: 44628146. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-11 18:36:06,063][84230] Avg episode reward: [(0, '43.740'), (1, '44.900')] +[2023-10-11 18:36:09,245][85176] Updated weights for policy 0, policy_version 86532 (0.0007) +[2023-10-11 18:36:09,630][85176] Updated weights for policy 0, policy_version 86542 (0.0007) +[2023-10-11 18:36:09,995][85176] Updated weights for policy 0, policy_version 86552 (0.0008) +[2023-10-11 18:36:10,033][85175] Updated weights for policy 1, policy_version 87810 (0.0011) +[2023-10-11 18:36:10,404][85175] Updated weights for policy 1, policy_version 87820 (0.0010) +[2023-10-11 18:36:10,779][85175] Updated weights for policy 1, policy_version 87830 (0.0009) +[2023-10-11 18:36:11,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 178552832. Throughput: 0: 1673.2, 1: 1701.4. Samples: 44648454. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-11 18:36:11,064][84230] Avg episode reward: [(0, '45.470'), (1, '41.290')] +[2023-10-11 18:36:11,144][85175] Updated weights for policy 1, policy_version 87840 (0.0008) +[2023-10-11 18:36:13,969][85176] Updated weights for policy 0, policy_version 86562 (0.0009) +[2023-10-11 18:36:14,338][85176] Updated weights for policy 0, policy_version 86572 (0.0008) +[2023-10-11 18:36:14,722][85176] Updated weights for policy 0, policy_version 86582 (0.0008) +[2023-10-11 18:36:15,040][85175] Updated weights for policy 1, policy_version 87850 (0.0007) +[2023-10-11 18:36:15,089][85176] Updated weights for policy 0, policy_version 86592 (0.0007) +[2023-10-11 18:36:15,406][85175] Updated weights for policy 1, policy_version 87860 (0.0008) +[2023-10-11 18:36:15,780][85175] Updated weights for policy 1, policy_version 87870 (0.0009) +[2023-10-11 18:36:16,063][84230] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 178651136. Throughput: 0: 1673.1, 1: 1679.4. Samples: 44667790. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-11 18:36:16,063][84230] Avg episode reward: [(0, '44.750'), (1, '45.510')] +[2023-10-11 18:36:16,071][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000086592_88670208.pth... +[2023-10-11 18:36:16,072][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000087872_89980928.pth... 
+[2023-10-11 18:36:16,109][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000086272_88342528.pth +[2023-10-11 18:36:16,118][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000085024_87064576.pth +[2023-10-11 18:36:19,127][85176] Updated weights for policy 0, policy_version 86602 (0.0009) +[2023-10-11 18:36:19,493][85176] Updated weights for policy 0, policy_version 86612 (0.0011) +[2023-10-11 18:36:19,847][85175] Updated weights for policy 1, policy_version 87880 (0.0010) +[2023-10-11 18:36:19,868][85176] Updated weights for policy 0, policy_version 86622 (0.0009) +[2023-10-11 18:36:20,221][85175] Updated weights for policy 1, policy_version 87890 (0.0009) +[2023-10-11 18:36:20,582][85175] Updated weights for policy 1, policy_version 87900 (0.0009) +[2023-10-11 18:36:21,062][84230] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 178716672. Throughput: 0: 1682.2, 1: 1698.1. Samples: 44679118. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-11 18:36:21,063][84230] Avg episode reward: [(0, '45.870'), (1, '41.770')] +[2023-10-11 18:36:24,107][85176] Updated weights for policy 0, policy_version 86632 (0.0009) +[2023-10-11 18:36:24,488][85175] Updated weights for policy 1, policy_version 87910 (0.0008) +[2023-10-11 18:36:24,493][85176] Updated weights for policy 0, policy_version 86642 (0.0010) +[2023-10-11 18:36:24,857][85175] Updated weights for policy 1, policy_version 87920 (0.0009) +[2023-10-11 18:36:24,859][85176] Updated weights for policy 0, policy_version 86652 (0.0007) +[2023-10-11 18:36:25,233][85175] Updated weights for policy 1, policy_version 87930 (0.0007) +[2023-10-11 18:36:26,063][84230] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 178782208. Throughput: 0: 1663.9, 1: 1694.0. Samples: 44698886. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-11 18:36:26,064][84230] Avg episode reward: [(0, '43.330'), (1, '48.530')] +[2023-10-11 18:36:28,715][85176] Updated weights for policy 0, policy_version 86662 (0.0008) +[2023-10-11 18:36:29,088][85176] Updated weights for policy 0, policy_version 86672 (0.0009) +[2023-10-11 18:36:29,274][85175] Updated weights for policy 1, policy_version 87940 (0.0009) +[2023-10-11 18:36:29,455][85176] Updated weights for policy 0, policy_version 86682 (0.0008) +[2023-10-11 18:36:29,649][85175] Updated weights for policy 1, policy_version 87950 (0.0009) +[2023-10-11 18:36:30,019][85175] Updated weights for policy 1, policy_version 87960 (0.0011) +[2023-10-11 18:36:31,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 178847744. Throughput: 0: 1673.6, 1: 1672.3. Samples: 44718212. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-11 18:36:31,063][84230] Avg episode reward: [(0, '43.770'), (1, '41.230')] +[2023-10-11 18:36:33,433][85176] Updated weights for policy 0, policy_version 86692 (0.0007) +[2023-10-11 18:36:33,815][85176] Updated weights for policy 0, policy_version 86702 (0.0009) +[2023-10-11 18:36:34,137][85175] Updated weights for policy 1, policy_version 87970 (0.0008) +[2023-10-11 18:36:34,185][85176] Updated weights for policy 0, policy_version 86712 (0.0008) +[2023-10-11 18:36:34,559][85175] Updated weights for policy 1, policy_version 87980 (0.0008) +[2023-10-11 18:36:34,944][85175] Updated weights for policy 1, policy_version 87990 (0.0009) +[2023-10-11 18:36:35,316][85175] Updated weights for policy 1, policy_version 88000 (0.0012) +[2023-10-11 18:36:36,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 178913280. Throughput: 0: 1675.0, 1: 1700.9. Samples: 44729518. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-11 18:36:36,064][84230] Avg episode reward: [(0, '43.760'), (1, '47.840')] +[2023-10-11 18:36:38,569][85176] Updated weights for policy 0, policy_version 86722 (0.0009) +[2023-10-11 18:36:38,944][85176] Updated weights for policy 0, policy_version 86732 (0.0009) +[2023-10-11 18:36:39,252][85175] Updated weights for policy 1, policy_version 88010 (0.0009) +[2023-10-11 18:36:39,316][85176] Updated weights for policy 0, policy_version 86742 (0.0008) +[2023-10-11 18:36:39,612][85175] Updated weights for policy 1, policy_version 88020 (0.0009) +[2023-10-11 18:36:39,681][85176] Updated weights for policy 0, policy_version 86752 (0.0007) +[2023-10-11 18:36:39,989][85175] Updated weights for policy 1, policy_version 88030 (0.0009) +[2023-10-11 18:36:41,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 178978816. Throughput: 0: 1658.0, 1: 1682.8. Samples: 44748556. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-11 18:36:41,063][84230] Avg episode reward: [(0, '41.860'), (1, '42.110')] +[2023-10-11 18:36:43,632][85176] Updated weights for policy 0, policy_version 86762 (0.0009) +[2023-10-11 18:36:43,772][85175] Updated weights for policy 1, policy_version 88040 (0.0008) +[2023-10-11 18:36:44,001][85176] Updated weights for policy 0, policy_version 86772 (0.0008) +[2023-10-11 18:36:44,135][85175] Updated weights for policy 1, policy_version 88050 (0.0009) +[2023-10-11 18:36:44,373][85176] Updated weights for policy 0, policy_version 86782 (0.0009) +[2023-10-11 18:36:44,508][85175] Updated weights for policy 1, policy_version 88060 (0.0008) +[2023-10-11 18:36:46,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 179044352. Throughput: 0: 1683.3, 1: 1680.7. Samples: 44768902. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-11 18:36:46,063][84230] Avg episode reward: [(0, '44.410'), (1, '47.960')] +[2023-10-11 18:36:48,450][85176] Updated weights for policy 0, policy_version 86792 (0.0008) +[2023-10-11 18:36:48,663][85175] Updated weights for policy 1, policy_version 88070 (0.0010) +[2023-10-11 18:36:48,820][85176] Updated weights for policy 0, policy_version 86802 (0.0007) +[2023-10-11 18:36:49,028][85175] Updated weights for policy 1, policy_version 88080 (0.0011) +[2023-10-11 18:36:49,190][85176] Updated weights for policy 0, policy_version 86812 (0.0009) +[2023-10-11 18:36:49,396][85175] Updated weights for policy 1, policy_version 88090 (0.0008) +[2023-10-11 18:36:51,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 179109888. Throughput: 0: 1674.2, 1: 1694.0. Samples: 44779714. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-11 18:36:51,063][84230] Avg episode reward: [(0, '42.320'), (1, '43.490')] +[2023-10-11 18:36:53,192][85175] Updated weights for policy 1, policy_version 88100 (0.0007) +[2023-10-11 18:36:53,356][85176] Updated weights for policy 0, policy_version 86822 (0.0008) +[2023-10-11 18:36:53,560][85175] Updated weights for policy 1, policy_version 88110 (0.0010) +[2023-10-11 18:36:53,722][85176] Updated weights for policy 0, policy_version 86832 (0.0008) +[2023-10-11 18:36:53,923][85175] Updated weights for policy 1, policy_version 88120 (0.0009) +[2023-10-11 18:36:54,095][85176] Updated weights for policy 0, policy_version 86842 (0.0008) +[2023-10-11 18:36:56,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 179175424. Throughput: 0: 1664.1, 1: 1674.5. Samples: 44798690. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) +[2023-10-11 18:36:56,063][84230] Avg episode reward: [(0, '47.930'), (1, '47.490')] +[2023-10-11 18:36:57,954][85175] Updated weights for policy 1, policy_version 88130 (0.0008) +[2023-10-11 18:36:58,160][85176] Updated weights for policy 0, policy_version 86852 (0.0008) +[2023-10-11 18:36:58,321][85175] Updated weights for policy 1, policy_version 88140 (0.0008) +[2023-10-11 18:36:58,532][85176] Updated weights for policy 0, policy_version 86862 (0.0008) +[2023-10-11 18:36:58,687][85175] Updated weights for policy 1, policy_version 88150 (0.0009) +[2023-10-11 18:36:58,907][85176] Updated weights for policy 0, policy_version 86872 (0.0008) +[2023-10-11 18:36:59,054][85175] Updated weights for policy 1, policy_version 88160 (0.0008) +[2023-10-11 18:37:01,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 179240960. Throughput: 0: 1679.4, 1: 1693.0. Samples: 44819546. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 18:37:01,063][84230] Avg episode reward: [(0, '43.770'), (1, '43.090')] +[2023-10-11 18:37:03,041][85176] Updated weights for policy 0, policy_version 86882 (0.0009) +[2023-10-11 18:37:03,225][85175] Updated weights for policy 1, policy_version 88170 (0.0007) +[2023-10-11 18:37:03,415][85176] Updated weights for policy 0, policy_version 86892 (0.0009) +[2023-10-11 18:37:03,591][85175] Updated weights for policy 1, policy_version 88180 (0.0010) +[2023-10-11 18:37:03,783][85176] Updated weights for policy 0, policy_version 86902 (0.0010) +[2023-10-11 18:37:03,956][85175] Updated weights for policy 1, policy_version 88190 (0.0008) +[2023-10-11 18:37:04,155][85176] Updated weights for policy 0, policy_version 86912 (0.0008) +[2023-10-11 18:37:06,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 179306496. Throughput: 0: 1661.4, 1: 1686.7. Samples: 44829784. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 18:37:06,063][84230] Avg episode reward: [(0, '46.010'), (1, '45.310')] +[2023-10-11 18:37:08,040][85176] Updated weights for policy 0, policy_version 86922 (0.0008) +[2023-10-11 18:37:08,113][85175] Updated weights for policy 1, policy_version 88200 (0.0007) +[2023-10-11 18:37:08,402][85176] Updated weights for policy 0, policy_version 86932 (0.0008) +[2023-10-11 18:37:08,473][85175] Updated weights for policy 1, policy_version 88210 (0.0008) +[2023-10-11 18:37:08,774][85176] Updated weights for policy 0, policy_version 86942 (0.0009) +[2023-10-11 18:37:08,839][85175] Updated weights for policy 1, policy_version 88220 (0.0007) +[2023-10-11 18:37:11,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 179372032. Throughput: 0: 1670.5, 1: 1670.0. Samples: 44849208. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 18:37:11,064][84230] Avg episode reward: [(0, '42.920'), (1, '43.910')] +[2023-10-11 18:37:12,809][85175] Updated weights for policy 1, policy_version 88230 (0.0010) +[2023-10-11 18:37:13,002][85176] Updated weights for policy 0, policy_version 86952 (0.0007) +[2023-10-11 18:37:13,181][85175] Updated weights for policy 1, policy_version 88240 (0.0009) +[2023-10-11 18:37:13,382][85176] Updated weights for policy 0, policy_version 86962 (0.0009) +[2023-10-11 18:37:13,556][85175] Updated weights for policy 1, policy_version 88250 (0.0007) +[2023-10-11 18:37:13,753][85176] Updated weights for policy 0, policy_version 86972 (0.0009) +[2023-10-11 18:37:16,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 179437568. Throughput: 0: 1673.4, 1: 1695.7. Samples: 44869822. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 18:37:16,063][84230] Avg episode reward: [(0, '46.000'), (1, '44.910')] +[2023-10-11 18:37:17,446][85175] Updated weights for policy 1, policy_version 88260 (0.0008) +[2023-10-11 18:37:17,817][85175] Updated weights for policy 1, policy_version 88270 (0.0009) +[2023-10-11 18:37:17,911][85176] Updated weights for policy 0, policy_version 86982 (0.0010) +[2023-10-11 18:37:18,180][85175] Updated weights for policy 1, policy_version 88280 (0.0009) +[2023-10-11 18:37:18,277][85176] Updated weights for policy 0, policy_version 86992 (0.0009) +[2023-10-11 18:37:18,657][85176] Updated weights for policy 0, policy_version 87002 (0.0007) +[2023-10-11 18:37:21,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 179503104. Throughput: 0: 1655.7, 1: 1670.1. Samples: 44879174. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 18:37:21,063][84230] Avg episode reward: [(0, '43.750'), (1, '45.350')] +[2023-10-11 18:37:22,314][85175] Updated weights for policy 1, policy_version 88290 (0.0009) +[2023-10-11 18:37:22,679][85175] Updated weights for policy 1, policy_version 88300 (0.0008) +[2023-10-11 18:37:22,797][85176] Updated weights for policy 0, policy_version 87012 (0.0008) +[2023-10-11 18:37:23,051][85175] Updated weights for policy 1, policy_version 88310 (0.0008) +[2023-10-11 18:37:23,169][85176] Updated weights for policy 0, policy_version 87022 (0.0007) +[2023-10-11 18:37:23,417][85175] Updated weights for policy 1, policy_version 88320 (0.0007) +[2023-10-11 18:37:23,537][85176] Updated weights for policy 0, policy_version 87032 (0.0009) +[2023-10-11 18:37:26,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 179568640. Throughput: 0: 1663.6, 1: 1689.9. Samples: 44899460. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 18:37:26,063][84230] Avg episode reward: [(0, '46.750'), (1, '43.460')] +[2023-10-11 18:37:27,370][85175] Updated weights for policy 1, policy_version 88330 (0.0011) +[2023-10-11 18:37:27,614][85176] Updated weights for policy 0, policy_version 87042 (0.0011) +[2023-10-11 18:37:27,753][85175] Updated weights for policy 1, policy_version 88340 (0.0009) +[2023-10-11 18:37:27,971][85176] Updated weights for policy 0, policy_version 87052 (0.0010) +[2023-10-11 18:37:28,117][85175] Updated weights for policy 1, policy_version 88350 (0.0009) +[2023-10-11 18:37:28,345][85176] Updated weights for policy 0, policy_version 87062 (0.0009) +[2023-10-11 18:37:28,717][85176] Updated weights for policy 0, policy_version 87072 (0.0009) +[2023-10-11 18:37:31,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 179634176. Throughput: 0: 1667.9, 1: 1698.1. Samples: 44920370. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 18:37:31,064][84230] Avg episode reward: [(0, '42.490'), (1, '45.260')] +[2023-10-11 18:37:32,174][85175] Updated weights for policy 1, policy_version 88360 (0.0008) +[2023-10-11 18:37:32,547][85175] Updated weights for policy 1, policy_version 88370 (0.0009) +[2023-10-11 18:37:32,909][85175] Updated weights for policy 1, policy_version 88380 (0.0008) +[2023-10-11 18:37:32,923][85176] Updated weights for policy 0, policy_version 87082 (0.0008) +[2023-10-11 18:37:33,294][85176] Updated weights for policy 0, policy_version 87092 (0.0007) +[2023-10-11 18:37:33,665][85176] Updated weights for policy 0, policy_version 87102 (0.0007) +[2023-10-11 18:37:36,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 179699712. Throughput: 0: 1658.6, 1: 1677.0. Samples: 44929816. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 18:37:36,063][84230] Avg episode reward: [(0, '45.730'), (1, '41.580')] +[2023-10-11 18:37:36,965][85175] Updated weights for policy 1, policy_version 88390 (0.0007) +[2023-10-11 18:37:37,325][85175] Updated weights for policy 1, policy_version 88400 (0.0008) +[2023-10-11 18:37:37,540][85176] Updated weights for policy 0, policy_version 87112 (0.0009) +[2023-10-11 18:37:37,698][85175] Updated weights for policy 1, policy_version 88410 (0.0009) +[2023-10-11 18:37:37,906][85176] Updated weights for policy 0, policy_version 87122 (0.0008) +[2023-10-11 18:37:38,291][85176] Updated weights for policy 0, policy_version 87132 (0.0010) +[2023-10-11 18:37:41,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 179765248. Throughput: 0: 1672.7, 1: 1705.6. Samples: 44950712. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 18:37:41,063][84230] Avg episode reward: [(0, '42.620'), (1, '45.970')] +[2023-10-11 18:37:41,592][85175] Updated weights for policy 1, policy_version 88420 (0.0009) +[2023-10-11 18:37:41,953][85175] Updated weights for policy 1, policy_version 88430 (0.0008) +[2023-10-11 18:37:42,323][85175] Updated weights for policy 1, policy_version 88440 (0.0007) +[2023-10-11 18:37:42,391][85176] Updated weights for policy 0, policy_version 87142 (0.0008) +[2023-10-11 18:37:42,765][85176] Updated weights for policy 0, policy_version 87152 (0.0009) +[2023-10-11 18:37:43,144][85176] Updated weights for policy 0, policy_version 87162 (0.0007) +[2023-10-11 18:37:46,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 179830784. Throughput: 0: 1677.5, 1: 1706.7. Samples: 44971836. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 18:37:46,064][84230] Avg episode reward: [(0, '45.170'), (1, '41.570')] +[2023-10-11 18:37:46,385][85175] Updated weights for policy 1, policy_version 88450 (0.0008) +[2023-10-11 18:37:46,756][85175] Updated weights for policy 1, policy_version 88460 (0.0010) +[2023-10-11 18:37:47,050][85176] Updated weights for policy 0, policy_version 87172 (0.0007) +[2023-10-11 18:37:47,120][85175] Updated weights for policy 1, policy_version 88470 (0.0008) +[2023-10-11 18:37:47,423][85176] Updated weights for policy 0, policy_version 87182 (0.0009) +[2023-10-11 18:37:47,482][85175] Updated weights for policy 1, policy_version 88480 (0.0007) +[2023-10-11 18:37:47,797][85176] Updated weights for policy 0, policy_version 87192 (0.0008) +[2023-10-11 18:37:51,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 179896320. Throughput: 0: 1665.9, 1: 1694.8. Samples: 44981014. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) +[2023-10-11 18:37:51,063][84230] Avg episode reward: [(0, '42.900'), (1, '48.270')] +[2023-10-11 18:37:51,551][85175] Updated weights for policy 1, policy_version 88490 (0.0008) +[2023-10-11 18:37:51,892][85176] Updated weights for policy 0, policy_version 87202 (0.0009) +[2023-10-11 18:37:51,921][85175] Updated weights for policy 1, policy_version 88500 (0.0008) +[2023-10-11 18:37:52,271][85176] Updated weights for policy 0, policy_version 87212 (0.0008) +[2023-10-11 18:37:52,290][85175] Updated weights for policy 1, policy_version 88510 (0.0008) +[2023-10-11 18:37:52,652][85176] Updated weights for policy 0, policy_version 87222 (0.0009) +[2023-10-11 18:37:53,026][85176] Updated weights for policy 0, policy_version 87232 (0.0007) +[2023-10-11 18:37:56,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 179961856. Throughput: 0: 1678.5, 1: 1717.7. Samples: 45002038. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:37:56,063][84230] Avg episode reward: [(0, '46.370'), (1, '42.980')] +[2023-10-11 18:37:56,245][85175] Updated weights for policy 1, policy_version 88520 (0.0008) +[2023-10-11 18:37:56,603][85175] Updated weights for policy 1, policy_version 88530 (0.0009) +[2023-10-11 18:37:56,973][85175] Updated weights for policy 1, policy_version 88540 (0.0009) +[2023-10-11 18:37:57,084][85176] Updated weights for policy 0, policy_version 87242 (0.0008) +[2023-10-11 18:37:57,463][85176] Updated weights for policy 0, policy_version 87252 (0.0007) +[2023-10-11 18:37:57,842][85176] Updated weights for policy 0, policy_version 87262 (0.0010) +[2023-10-11 18:38:00,920][85175] Updated weights for policy 1, policy_version 88550 (0.0009) +[2023-10-11 18:38:01,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 180027392. Throughput: 0: 1683.4, 1: 1717.6. Samples: 45022866. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:38:01,064][84230] Avg episode reward: [(0, '43.780'), (1, '48.300')] +[2023-10-11 18:38:01,298][85175] Updated weights for policy 1, policy_version 88560 (0.0011) +[2023-10-11 18:38:01,668][85175] Updated weights for policy 1, policy_version 88570 (0.0010) +[2023-10-11 18:38:02,091][85176] Updated weights for policy 0, policy_version 87272 (0.0008) +[2023-10-11 18:38:02,457][85176] Updated weights for policy 0, policy_version 87282 (0.0010) +[2023-10-11 18:38:02,833][85176] Updated weights for policy 0, policy_version 87292 (0.0007) +[2023-10-11 18:38:05,777][85175] Updated weights for policy 1, policy_version 88580 (0.0008) +[2023-10-11 18:38:06,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 180092928. Throughput: 0: 1678.8, 1: 1715.5. Samples: 45031920. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:38:06,063][84230] Avg episode reward: [(0, '46.030'), (1, '42.740')] +[2023-10-11 18:38:06,148][85175] Updated weights for policy 1, policy_version 88590 (0.0007) +[2023-10-11 18:38:06,526][85175] Updated weights for policy 1, policy_version 88600 (0.0011) +[2023-10-11 18:38:06,868][85176] Updated weights for policy 0, policy_version 87302 (0.0008) +[2023-10-11 18:38:07,251][85176] Updated weights for policy 0, policy_version 87312 (0.0008) +[2023-10-11 18:38:07,623][85176] Updated weights for policy 0, policy_version 87322 (0.0010) +[2023-10-11 18:38:10,536][85175] Updated weights for policy 1, policy_version 88610 (0.0007) +[2023-10-11 18:38:10,902][85175] Updated weights for policy 1, policy_version 88620 (0.0008) +[2023-10-11 18:38:11,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13107.3, 300 sec: 13440.5). Total num frames: 180158464. Throughput: 0: 1686.4, 1: 1717.6. Samples: 45052640. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:38:11,063][84230] Avg episode reward: [(0, '44.880'), (1, '45.010')] +[2023-10-11 18:38:11,259][85175] Updated weights for policy 1, policy_version 88630 (0.0009) +[2023-10-11 18:38:11,577][85176] Updated weights for policy 0, policy_version 87332 (0.0009) +[2023-10-11 18:38:11,635][85175] Updated weights for policy 1, policy_version 88640 (0.0008) +[2023-10-11 18:38:11,933][85176] Updated weights for policy 0, policy_version 87342 (0.0009) +[2023-10-11 18:38:12,307][85176] Updated weights for policy 0, policy_version 87352 (0.0009) +[2023-10-11 18:38:15,861][85175] Updated weights for policy 1, policy_version 88650 (0.0009) +[2023-10-11 18:38:16,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 180224000. Throughput: 0: 1687.8, 1: 1710.2. Samples: 45073278. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:38:16,064][84230] Avg episode reward: [(0, '44.730'), (1, '40.230')] +[2023-10-11 18:38:16,076][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000087360_89456640.pth... +[2023-10-11 18:38:16,112][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000085792_87851008.pth +[2023-10-11 18:38:16,246][85175] Updated weights for policy 1, policy_version 88660 (0.0009) +[2023-10-11 18:38:16,441][85176] Updated weights for policy 0, policy_version 87362 (0.0007) +[2023-10-11 18:38:16,608][85175] Updated weights for policy 1, policy_version 88670 (0.0007) +[2023-10-11 18:38:16,680][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000088672_90800128.pth... +[2023-10-11 18:38:16,724][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000087072_89161728.pth +[2023-10-11 18:38:16,811][85176] Updated weights for policy 0, policy_version 87372 (0.0008) +[2023-10-11 18:38:17,182][85176] Updated weights for policy 0, policy_version 87382 (0.0010) +[2023-10-11 18:38:17,541][85176] Updated weights for policy 0, policy_version 87392 (0.0009) +[2023-10-11 18:38:20,547][85175] Updated weights for policy 1, policy_version 88680 (0.0008) +[2023-10-11 18:38:20,914][85175] Updated weights for policy 1, policy_version 88690 (0.0010) +[2023-10-11 18:38:21,063][84230] Fps is (10 sec: 13106.7, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 180289536. Throughput: 0: 1678.3, 1: 1711.6. Samples: 45082364. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:38:21,064][84230] Avg episode reward: [(0, '45.230'), (1, '46.050')] +[2023-10-11 18:38:21,280][85175] Updated weights for policy 1, policy_version 88700 (0.0009) +[2023-10-11 18:38:21,691][85176] Updated weights for policy 0, policy_version 87402 (0.0009) +[2023-10-11 18:38:22,067][85176] Updated weights for policy 0, policy_version 87412 (0.0008) +[2023-10-11 18:38:22,434][85176] Updated weights for policy 0, policy_version 87422 (0.0009) +[2023-10-11 18:38:25,318][85175] Updated weights for policy 1, policy_version 88710 (0.0009) +[2023-10-11 18:38:25,689][85175] Updated weights for policy 1, policy_version 88720 (0.0011) +[2023-10-11 18:38:26,059][85175] Updated weights for policy 1, policy_version 88730 (0.0010) +[2023-10-11 18:38:26,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 180355072. Throughput: 0: 1678.9, 1: 1702.0. Samples: 45102850. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:38:26,063][84230] Avg episode reward: [(0, '44.270'), (1, '44.050')] +[2023-10-11 18:38:26,514][85176] Updated weights for policy 0, policy_version 87432 (0.0009) +[2023-10-11 18:38:26,893][85176] Updated weights for policy 0, policy_version 87442 (0.0007) +[2023-10-11 18:38:27,260][85176] Updated weights for policy 0, policy_version 87452 (0.0007) +[2023-10-11 18:38:30,014][85175] Updated weights for policy 1, policy_version 88740 (0.0008) +[2023-10-11 18:38:30,390][85175] Updated weights for policy 1, policy_version 88750 (0.0009) +[2023-10-11 18:38:30,749][85175] Updated weights for policy 1, policy_version 88760 (0.0007) +[2023-10-11 18:38:31,063][84230] Fps is (10 sec: 16384.5, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 180453376. Throughput: 0: 1680.6, 1: 1686.5. Samples: 45123354. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:38:31,063][84230] Avg episode reward: [(0, '47.270'), (1, '45.210')] +[2023-10-11 18:38:31,291][85176] Updated weights for policy 0, policy_version 87462 (0.0008) +[2023-10-11 18:38:31,663][85176] Updated weights for policy 0, policy_version 87472 (0.0009) +[2023-10-11 18:38:32,038][85176] Updated weights for policy 0, policy_version 87482 (0.0009) +[2023-10-11 18:38:34,792][85175] Updated weights for policy 1, policy_version 88770 (0.0008) +[2023-10-11 18:38:35,162][85175] Updated weights for policy 1, policy_version 88780 (0.0008) +[2023-10-11 18:38:35,536][85175] Updated weights for policy 1, policy_version 88790 (0.0008) +[2023-10-11 18:38:35,900][85175] Updated weights for policy 1, policy_version 88800 (0.0010) +[2023-10-11 18:38:36,062][84230] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 180518912. Throughput: 0: 1684.8, 1: 1702.0. Samples: 45133420. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:38:36,063][84230] Avg episode reward: [(0, '44.410'), (1, '41.990')] +[2023-10-11 18:38:36,095][85176] Updated weights for policy 0, policy_version 87492 (0.0009) +[2023-10-11 18:38:36,470][85176] Updated weights for policy 0, policy_version 87502 (0.0007) +[2023-10-11 18:38:36,834][85176] Updated weights for policy 0, policy_version 87512 (0.0009) +[2023-10-11 18:38:39,948][85175] Updated weights for policy 1, policy_version 88810 (0.0009) +[2023-10-11 18:38:40,314][85175] Updated weights for policy 1, policy_version 88820 (0.0008) +[2023-10-11 18:38:40,689][85175] Updated weights for policy 1, policy_version 88830 (0.0010) +[2023-10-11 18:38:40,903][85176] Updated weights for policy 0, policy_version 87522 (0.0009) +[2023-10-11 18:38:41,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 180584448. Throughput: 0: 1682.3, 1: 1695.9. Samples: 45154056. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:38:41,063][84230] Avg episode reward: [(0, '46.060'), (1, '41.960')] +[2023-10-11 18:38:41,279][85176] Updated weights for policy 0, policy_version 87532 (0.0008) +[2023-10-11 18:38:41,645][85176] Updated weights for policy 0, policy_version 87542 (0.0008) +[2023-10-11 18:38:42,019][85176] Updated weights for policy 0, policy_version 87552 (0.0007) +[2023-10-11 18:38:44,655][85175] Updated weights for policy 1, policy_version 88840 (0.0008) +[2023-10-11 18:38:45,022][85175] Updated weights for policy 1, policy_version 88850 (0.0009) +[2023-10-11 18:38:45,384][85175] Updated weights for policy 1, policy_version 88860 (0.0009) +[2023-10-11 18:38:45,887][85176] Updated weights for policy 0, policy_version 87562 (0.0008) +[2023-10-11 18:38:46,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 180649984. Throughput: 0: 1688.1, 1: 1668.2. Samples: 45173898. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:38:46,063][84230] Avg episode reward: [(0, '44.860'), (1, '45.900')] +[2023-10-11 18:38:46,262][85176] Updated weights for policy 0, policy_version 87572 (0.0009) +[2023-10-11 18:38:46,636][85176] Updated weights for policy 0, policy_version 87582 (0.0009) +[2023-10-11 18:38:49,484][85175] Updated weights for policy 1, policy_version 88870 (0.0008) +[2023-10-11 18:38:49,851][85175] Updated weights for policy 1, policy_version 88880 (0.0009) +[2023-10-11 18:38:50,230][85175] Updated weights for policy 1, policy_version 88890 (0.0010) +[2023-10-11 18:38:50,811][85176] Updated weights for policy 0, policy_version 87592 (0.0010) +[2023-10-11 18:38:51,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 180715520. Throughput: 0: 1687.9, 1: 1695.4. Samples: 45184170. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:38:51,063][84230] Avg episode reward: [(0, '43.800'), (1, '44.050')] +[2023-10-11 18:38:51,179][85176] Updated weights for policy 0, policy_version 87602 (0.0008) +[2023-10-11 18:38:51,543][85176] Updated weights for policy 0, policy_version 87612 (0.0008) +[2023-10-11 18:38:54,365][85175] Updated weights for policy 1, policy_version 88900 (0.0009) +[2023-10-11 18:38:54,740][85175] Updated weights for policy 1, policy_version 88910 (0.0009) +[2023-10-11 18:38:55,103][85175] Updated weights for policy 1, policy_version 88920 (0.0007) +[2023-10-11 18:38:55,398][85176] Updated weights for policy 0, policy_version 87622 (0.0009) +[2023-10-11 18:38:55,780][85176] Updated weights for policy 0, policy_version 87632 (0.0008) +[2023-10-11 18:38:56,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 180781056. Throughput: 0: 1690.9, 1: 1683.2. Samples: 45204478. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:38:56,064][84230] Avg episode reward: [(0, '46.620'), (1, '45.680')] +[2023-10-11 18:38:56,154][85176] Updated weights for policy 0, policy_version 87642 (0.0009) +[2023-10-11 18:38:59,153][85175] Updated weights for policy 1, policy_version 88930 (0.0009) +[2023-10-11 18:38:59,522][85175] Updated weights for policy 1, policy_version 88940 (0.0008) +[2023-10-11 18:38:59,889][85175] Updated weights for policy 1, policy_version 88950 (0.0008) +[2023-10-11 18:39:00,138][85176] Updated weights for policy 0, policy_version 87652 (0.0007) +[2023-10-11 18:39:00,248][85175] Updated weights for policy 1, policy_version 88960 (0.0007) +[2023-10-11 18:39:00,508][85176] Updated weights for policy 0, policy_version 87662 (0.0010) +[2023-10-11 18:39:00,897][85176] Updated weights for policy 0, policy_version 87672 (0.0011) +[2023-10-11 18:39:01,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 180846592. 
Throughput: 0: 1680.2, 1: 1666.7. Samples: 45223890. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:39:01,064][84230] Avg episode reward: [(0, '42.420'), (1, '42.550')] +[2023-10-11 18:39:04,193][85175] Updated weights for policy 1, policy_version 88970 (0.0008) +[2023-10-11 18:39:04,577][85175] Updated weights for policy 1, policy_version 88980 (0.0007) +[2023-10-11 18:39:04,946][85175] Updated weights for policy 1, policy_version 88990 (0.0008) +[2023-10-11 18:39:05,032][85176] Updated weights for policy 0, policy_version 87682 (0.0009) +[2023-10-11 18:39:05,410][85176] Updated weights for policy 0, policy_version 87692 (0.0009) +[2023-10-11 18:39:05,791][85176] Updated weights for policy 0, policy_version 87702 (0.0008) +[2023-10-11 18:39:06,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 180912128. Throughput: 0: 1694.0, 1: 1700.1. Samples: 45235100. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:39:06,064][84230] Avg episode reward: [(0, '44.530'), (1, '45.290')] +[2023-10-11 18:39:06,174][85176] Updated weights for policy 0, policy_version 87712 (0.0009) +[2023-10-11 18:39:08,978][85175] Updated weights for policy 1, policy_version 89000 (0.0009) +[2023-10-11 18:39:09,353][85175] Updated weights for policy 1, policy_version 89010 (0.0008) +[2023-10-11 18:39:09,715][85175] Updated weights for policy 1, policy_version 89020 (0.0009) +[2023-10-11 18:39:10,153][85176] Updated weights for policy 0, policy_version 87722 (0.0010) +[2023-10-11 18:39:10,511][85176] Updated weights for policy 0, policy_version 87732 (0.0010) +[2023-10-11 18:39:10,885][85176] Updated weights for policy 0, policy_version 87742 (0.0007) +[2023-10-11 18:39:11,063][84230] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 181010432. Throughput: 0: 1702.1, 1: 1676.1. Samples: 45254872. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:39:11,064][84230] Avg episode reward: [(0, '43.760'), (1, '43.140')] +[2023-10-11 18:39:13,802][85175] Updated weights for policy 1, policy_version 89030 (0.0007) +[2023-10-11 18:39:14,164][85175] Updated weights for policy 1, policy_version 89040 (0.0007) +[2023-10-11 18:39:14,536][85175] Updated weights for policy 1, policy_version 89050 (0.0007) +[2023-10-11 18:39:14,666][85176] Updated weights for policy 0, policy_version 87752 (0.0007) +[2023-10-11 18:39:15,035][85176] Updated weights for policy 0, policy_version 87762 (0.0007) +[2023-10-11 18:39:15,414][85176] Updated weights for policy 0, policy_version 87772 (0.0008) +[2023-10-11 18:39:16,063][84230] Fps is (10 sec: 16383.6, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 181075968. Throughput: 0: 1673.5, 1: 1678.1. Samples: 45274178. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:39:16,064][84230] Avg episode reward: [(0, '46.440'), (1, '47.430')] +[2023-10-11 18:39:18,582][85175] Updated weights for policy 1, policy_version 89060 (0.0009) +[2023-10-11 18:39:18,936][85175] Updated weights for policy 1, policy_version 89070 (0.0008) +[2023-10-11 18:39:19,299][85175] Updated weights for policy 1, policy_version 89080 (0.0007) +[2023-10-11 18:39:19,448][85176] Updated weights for policy 0, policy_version 87782 (0.0007) +[2023-10-11 18:39:19,820][85176] Updated weights for policy 0, policy_version 87792 (0.0008) +[2023-10-11 18:39:20,186][85176] Updated weights for policy 0, policy_version 87802 (0.0008) +[2023-10-11 18:39:21,062][84230] Fps is (10 sec: 13107.6, 60 sec: 14199.6, 300 sec: 13551.5). Total num frames: 181141504. Throughput: 0: 1695.0, 1: 1687.1. Samples: 45285614. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:39:21,063][84230] Avg episode reward: [(0, '44.760'), (1, '43.020')] +[2023-10-11 18:39:23,164][85175] Updated weights for policy 1, policy_version 89090 (0.0008) +[2023-10-11 18:39:23,530][85175] Updated weights for policy 1, policy_version 89100 (0.0009) +[2023-10-11 18:39:23,907][85175] Updated weights for policy 1, policy_version 89110 (0.0010) +[2023-10-11 18:39:24,255][85176] Updated weights for policy 0, policy_version 87812 (0.0008) +[2023-10-11 18:39:24,264][85175] Updated weights for policy 1, policy_version 89120 (0.0010) +[2023-10-11 18:39:24,625][85176] Updated weights for policy 0, policy_version 87822 (0.0009) +[2023-10-11 18:39:25,001][85176] Updated weights for policy 0, policy_version 87832 (0.0008) +[2023-10-11 18:39:26,062][84230] Fps is (10 sec: 13107.7, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 181207040. Throughput: 0: 1685.6, 1: 1665.1. Samples: 45304836. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:39:26,063][84230] Avg episode reward: [(0, '46.900'), (1, '45.920')] +[2023-10-11 18:39:28,103][85175] Updated weights for policy 1, policy_version 89130 (0.0008) +[2023-10-11 18:39:28,471][85175] Updated weights for policy 1, policy_version 89140 (0.0007) +[2023-10-11 18:39:28,843][85175] Updated weights for policy 1, policy_version 89150 (0.0008) +[2023-10-11 18:39:29,216][85176] Updated weights for policy 0, policy_version 87842 (0.0009) +[2023-10-11 18:39:29,582][85176] Updated weights for policy 0, policy_version 87852 (0.0008) +[2023-10-11 18:39:29,957][85176] Updated weights for policy 0, policy_version 87862 (0.0009) +[2023-10-11 18:39:30,332][85176] Updated weights for policy 0, policy_version 87872 (0.0007) +[2023-10-11 18:39:31,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 181272576. Throughput: 0: 1661.5, 1: 1696.1. Samples: 45324990. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:39:31,063][84230] Avg episode reward: [(0, '43.830'), (1, '38.870')] +[2023-10-11 18:39:32,967][85175] Updated weights for policy 1, policy_version 89160 (0.0009) +[2023-10-11 18:39:33,342][85175] Updated weights for policy 1, policy_version 89170 (0.0008) +[2023-10-11 18:39:33,706][85175] Updated weights for policy 1, policy_version 89180 (0.0009) +[2023-10-11 18:39:34,449][85176] Updated weights for policy 0, policy_version 87882 (0.0007) +[2023-10-11 18:39:34,819][85176] Updated weights for policy 0, policy_version 87892 (0.0009) +[2023-10-11 18:39:35,188][85176] Updated weights for policy 0, policy_version 87902 (0.0008) +[2023-10-11 18:39:36,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 181338112. Throughput: 0: 1693.8, 1: 1679.2. Samples: 45335952. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:39:36,063][84230] Avg episode reward: [(0, '45.910'), (1, '47.100')] +[2023-10-11 18:39:37,740][85175] Updated weights for policy 1, policy_version 89190 (0.0011) +[2023-10-11 18:39:38,110][85175] Updated weights for policy 1, policy_version 89200 (0.0008) +[2023-10-11 18:39:38,489][85175] Updated weights for policy 1, policy_version 89210 (0.0007) +[2023-10-11 18:39:39,368][85176] Updated weights for policy 0, policy_version 87912 (0.0008) +[2023-10-11 18:39:39,742][85176] Updated weights for policy 0, policy_version 87922 (0.0007) +[2023-10-11 18:39:40,111][85176] Updated weights for policy 0, policy_version 87932 (0.0007) +[2023-10-11 18:39:41,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 181403648. Throughput: 0: 1679.7, 1: 1684.5. Samples: 45355866. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:39:41,064][84230] Avg episode reward: [(0, '43.030'), (1, '43.170')] +[2023-10-11 18:39:42,580][85175] Updated weights for policy 1, policy_version 89220 (0.0008) +[2023-10-11 18:39:42,950][85175] Updated weights for policy 1, policy_version 89230 (0.0010) +[2023-10-11 18:39:43,320][85175] Updated weights for policy 1, policy_version 89240 (0.0010) +[2023-10-11 18:39:44,015][85176] Updated weights for policy 0, policy_version 87942 (0.0009) +[2023-10-11 18:39:44,382][85176] Updated weights for policy 0, policy_version 87952 (0.0008) +[2023-10-11 18:39:44,759][85176] Updated weights for policy 0, policy_version 87962 (0.0007) +[2023-10-11 18:39:46,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 181469184. Throughput: 0: 1675.4, 1: 1705.1. Samples: 45376014. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:39:46,063][84230] Avg episode reward: [(0, '45.110'), (1, '49.070')] +[2023-10-11 18:39:47,444][85175] Updated weights for policy 1, policy_version 89250 (0.0011) +[2023-10-11 18:39:47,806][85175] Updated weights for policy 1, policy_version 89260 (0.0009) +[2023-10-11 18:39:48,173][85175] Updated weights for policy 1, policy_version 89270 (0.0008) +[2023-10-11 18:39:48,535][85175] Updated weights for policy 1, policy_version 89280 (0.0009) +[2023-10-11 18:39:48,987][85176] Updated weights for policy 0, policy_version 87972 (0.0008) +[2023-10-11 18:39:49,362][85176] Updated weights for policy 0, policy_version 87982 (0.0009) +[2023-10-11 18:39:49,745][85176] Updated weights for policy 0, policy_version 87992 (0.0007) +[2023-10-11 18:39:51,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 181534720. Throughput: 0: 1690.0, 1: 1673.7. Samples: 45386468. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:39:51,063][84230] Avg episode reward: [(0, '42.600'), (1, '42.390')] +[2023-10-11 18:39:52,566][85175] Updated weights for policy 1, policy_version 89290 (0.0007) +[2023-10-11 18:39:52,932][85175] Updated weights for policy 1, policy_version 89300 (0.0008) +[2023-10-11 18:39:53,298][85175] Updated weights for policy 1, policy_version 89310 (0.0010) +[2023-10-11 18:39:53,784][85176] Updated weights for policy 0, policy_version 88002 (0.0008) +[2023-10-11 18:39:54,156][85176] Updated weights for policy 0, policy_version 88012 (0.0010) +[2023-10-11 18:39:54,527][85176] Updated weights for policy 0, policy_version 88022 (0.0009) +[2023-10-11 18:39:54,902][85176] Updated weights for policy 0, policy_version 88032 (0.0009) +[2023-10-11 18:39:56,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 181600256. Throughput: 0: 1665.5, 1: 1700.2. Samples: 45406328. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:39:56,063][84230] Avg episode reward: [(0, '44.180'), (1, '49.010')] +[2023-10-11 18:39:57,254][85175] Updated weights for policy 1, policy_version 89320 (0.0009) +[2023-10-11 18:39:57,629][85175] Updated weights for policy 1, policy_version 89330 (0.0008) +[2023-10-11 18:39:58,006][85175] Updated weights for policy 1, policy_version 89340 (0.0009) +[2023-10-11 18:39:58,900][85176] Updated weights for policy 0, policy_version 88042 (0.0010) +[2023-10-11 18:39:59,286][85176] Updated weights for policy 0, policy_version 88052 (0.0010) +[2023-10-11 18:39:59,654][85176] Updated weights for policy 0, policy_version 88062 (0.0009) +[2023-10-11 18:40:01,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 181665792. Throughput: 0: 1682.5, 1: 1717.5. Samples: 45427178. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:40:01,063][84230] Avg episode reward: [(0, '43.500'), (1, '42.630')] +[2023-10-11 18:40:01,992][85175] Updated weights for policy 1, policy_version 89350 (0.0009) +[2023-10-11 18:40:02,360][85175] Updated weights for policy 1, policy_version 89360 (0.0007) +[2023-10-11 18:40:02,725][85175] Updated weights for policy 1, policy_version 89370 (0.0009) +[2023-10-11 18:40:03,591][85176] Updated weights for policy 0, policy_version 88072 (0.0011) +[2023-10-11 18:40:03,967][85176] Updated weights for policy 0, policy_version 88082 (0.0010) +[2023-10-11 18:40:04,340][85176] Updated weights for policy 0, policy_version 88092 (0.0010) +[2023-10-11 18:40:06,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 181731328. Throughput: 0: 1679.1, 1: 1694.7. Samples: 45437434. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:40:06,063][84230] Avg episode reward: [(0, '45.500'), (1, '47.570')] +[2023-10-11 18:40:06,709][85175] Updated weights for policy 1, policy_version 89380 (0.0007) +[2023-10-11 18:40:07,081][85175] Updated weights for policy 1, policy_version 89390 (0.0010) +[2023-10-11 18:40:07,441][85175] Updated weights for policy 1, policy_version 89400 (0.0008) +[2023-10-11 18:40:08,378][85176] Updated weights for policy 0, policy_version 88102 (0.0009) +[2023-10-11 18:40:08,764][85176] Updated weights for policy 0, policy_version 88112 (0.0008) +[2023-10-11 18:40:09,140][85176] Updated weights for policy 0, policy_version 88122 (0.0008) +[2023-10-11 18:40:11,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 181796864. Throughput: 0: 1666.7, 1: 1720.8. Samples: 45457274. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:40:11,064][84230] Avg episode reward: [(0, '45.510'), (1, '42.240')] +[2023-10-11 18:40:11,172][85175] Updated weights for policy 1, policy_version 89410 (0.0007) +[2023-10-11 18:40:11,541][85175] Updated weights for policy 1, policy_version 89420 (0.0009) +[2023-10-11 18:40:11,911][85175] Updated weights for policy 1, policy_version 89430 (0.0008) +[2023-10-11 18:40:12,284][85175] Updated weights for policy 1, policy_version 89440 (0.0009) +[2023-10-11 18:40:13,244][85176] Updated weights for policy 0, policy_version 88132 (0.0009) +[2023-10-11 18:40:13,621][85176] Updated weights for policy 0, policy_version 88142 (0.0007) +[2023-10-11 18:40:13,992][85176] Updated weights for policy 0, policy_version 88152 (0.0009) +[2023-10-11 18:40:16,063][84230] Fps is (10 sec: 13106.7, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 181862400. Throughput: 0: 1689.2, 1: 1715.9. Samples: 45478222. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:40:16,064][84230] Avg episode reward: [(0, '44.610'), (1, '45.440')] +[2023-10-11 18:40:16,074][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000088160_90275840.pth... +[2023-10-11 18:40:16,109][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000086592_88670208.pth +[2023-10-11 18:40:16,450][85175] Updated weights for policy 1, policy_version 89450 (0.0008) +[2023-10-11 18:40:16,817][85175] Updated weights for policy 1, policy_version 89460 (0.0008) +[2023-10-11 18:40:17,189][85175] Updated weights for policy 1, policy_version 89470 (0.0008) +[2023-10-11 18:40:17,256][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000089472_91619328.pth... 
+[2023-10-11 18:40:17,291][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000087872_89980928.pth +[2023-10-11 18:40:18,068][85176] Updated weights for policy 0, policy_version 88162 (0.0008) +[2023-10-11 18:40:18,437][85176] Updated weights for policy 0, policy_version 88172 (0.0008) +[2023-10-11 18:40:18,804][85176] Updated weights for policy 0, policy_version 88182 (0.0008) +[2023-10-11 18:40:19,178][85176] Updated weights for policy 0, policy_version 88192 (0.0008) +[2023-10-11 18:40:21,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.1, 300 sec: 13551.5). Total num frames: 181927936. Throughput: 0: 1674.5, 1: 1705.8. Samples: 45488066. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:40:21,064][84230] Avg episode reward: [(0, '44.510'), (1, '42.590')] +[2023-10-11 18:40:21,183][85175] Updated weights for policy 1, policy_version 89480 (0.0009) +[2023-10-11 18:40:21,544][85175] Updated weights for policy 1, policy_version 89490 (0.0008) +[2023-10-11 18:40:21,915][85175] Updated weights for policy 1, policy_version 89500 (0.0009) +[2023-10-11 18:40:23,282][85176] Updated weights for policy 0, policy_version 88202 (0.0007) +[2023-10-11 18:40:23,653][85176] Updated weights for policy 0, policy_version 88212 (0.0009) +[2023-10-11 18:40:24,030][85176] Updated weights for policy 0, policy_version 88222 (0.0009) +[2023-10-11 18:40:26,062][84230] Fps is (10 sec: 13107.7, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 181993472. Throughput: 0: 1671.3, 1: 1710.5. Samples: 45508046. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:40:26,063][84230] Avg episode reward: [(0, '44.040'), (1, '43.670')] +[2023-10-11 18:40:26,133][85175] Updated weights for policy 1, policy_version 89510 (0.0008) +[2023-10-11 18:40:26,493][85175] Updated weights for policy 1, policy_version 89520 (0.0009) +[2023-10-11 18:40:26,855][85175] Updated weights for policy 1, policy_version 89530 (0.0009) +[2023-10-11 18:40:28,206][85176] Updated weights for policy 0, policy_version 88232 (0.0009) +[2023-10-11 18:40:28,582][85176] Updated weights for policy 0, policy_version 88242 (0.0009) +[2023-10-11 18:40:28,948][85176] Updated weights for policy 0, policy_version 88252 (0.0008) +[2023-10-11 18:40:30,678][85175] Updated weights for policy 1, policy_version 89540 (0.0007) +[2023-10-11 18:40:31,037][85175] Updated weights for policy 1, policy_version 89550 (0.0007) +[2023-10-11 18:40:31,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 182059008. Throughput: 0: 1679.1, 1: 1714.7. Samples: 45528736. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:40:31,064][84230] Avg episode reward: [(0, '47.710'), (1, '45.200')] +[2023-10-11 18:40:31,397][85175] Updated weights for policy 1, policy_version 89560 (0.0008) +[2023-10-11 18:40:33,091][85176] Updated weights for policy 0, policy_version 88262 (0.0008) +[2023-10-11 18:40:33,463][85176] Updated weights for policy 0, policy_version 88272 (0.0008) +[2023-10-11 18:40:33,829][85176] Updated weights for policy 0, policy_version 88282 (0.0008) +[2023-10-11 18:40:35,292][85175] Updated weights for policy 1, policy_version 89570 (0.0008) +[2023-10-11 18:40:35,656][85175] Updated weights for policy 1, policy_version 89580 (0.0010) +[2023-10-11 18:40:36,026][85175] Updated weights for policy 1, policy_version 89590 (0.0008) +[2023-10-11 18:40:36,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 182124544. 
Throughput: 0: 1665.9, 1: 1713.9. Samples: 45538558. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:40:36,063][84230] Avg episode reward: [(0, '45.580'), (1, '42.410')] +[2023-10-11 18:40:36,393][85175] Updated weights for policy 1, policy_version 89600 (0.0010) +[2023-10-11 18:40:37,998][85176] Updated weights for policy 0, policy_version 88292 (0.0008) +[2023-10-11 18:40:38,371][85176] Updated weights for policy 0, policy_version 88302 (0.0008) +[2023-10-11 18:40:38,748][85176] Updated weights for policy 0, policy_version 88312 (0.0009) +[2023-10-11 18:40:40,415][85175] Updated weights for policy 1, policy_version 89610 (0.0008) +[2023-10-11 18:40:40,790][85175] Updated weights for policy 1, policy_version 89620 (0.0010) +[2023-10-11 18:40:41,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 182190080. Throughput: 0: 1672.4, 1: 1715.4. Samples: 45558778. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:40:41,063][84230] Avg episode reward: [(0, '45.120'), (1, '44.950')] +[2023-10-11 18:40:41,165][85175] Updated weights for policy 1, policy_version 89630 (0.0010) +[2023-10-11 18:40:42,700][85176] Updated weights for policy 0, policy_version 88322 (0.0008) +[2023-10-11 18:40:43,084][85176] Updated weights for policy 0, policy_version 88332 (0.0010) +[2023-10-11 18:40:43,458][85176] Updated weights for policy 0, policy_version 88342 (0.0010) +[2023-10-11 18:40:43,824][85176] Updated weights for policy 0, policy_version 88352 (0.0007) +[2023-10-11 18:40:45,149][85175] Updated weights for policy 1, policy_version 89640 (0.0009) +[2023-10-11 18:40:45,512][85175] Updated weights for policy 1, policy_version 89650 (0.0009) +[2023-10-11 18:40:45,887][85175] Updated weights for policy 1, policy_version 89660 (0.0007) +[2023-10-11 18:40:46,063][84230] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 182288384. Throughput: 0: 1676.8, 1: 1694.2. Samples: 45578874. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:40:46,063][84230] Avg episode reward: [(0, '40.890'), (1, '42.300')] +[2023-10-11 18:40:47,699][85176] Updated weights for policy 0, policy_version 88362 (0.0008) +[2023-10-11 18:40:48,080][85176] Updated weights for policy 0, policy_version 88372 (0.0008) +[2023-10-11 18:40:48,453][85176] Updated weights for policy 0, policy_version 88382 (0.0008) +[2023-10-11 18:40:50,053][85175] Updated weights for policy 1, policy_version 89670 (0.0010) +[2023-10-11 18:40:50,418][85175] Updated weights for policy 1, policy_version 89680 (0.0011) +[2023-10-11 18:40:50,791][85175] Updated weights for policy 1, policy_version 89690 (0.0008) +[2023-10-11 18:40:51,063][84230] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 182353920. Throughput: 0: 1654.6, 1: 1708.0. Samples: 45588752. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:40:51,064][84230] Avg episode reward: [(0, '42.510'), (1, '47.030')] +[2023-10-11 18:40:52,600][85176] Updated weights for policy 0, policy_version 88392 (0.0009) +[2023-10-11 18:40:52,973][85176] Updated weights for policy 0, policy_version 88402 (0.0008) +[2023-10-11 18:40:53,338][85176] Updated weights for policy 0, policy_version 88412 (0.0011) +[2023-10-11 18:40:54,741][85175] Updated weights for policy 1, policy_version 89700 (0.0008) +[2023-10-11 18:40:55,120][85175] Updated weights for policy 1, policy_version 89710 (0.0009) +[2023-10-11 18:40:55,488][85175] Updated weights for policy 1, policy_version 89720 (0.0009) +[2023-10-11 18:40:56,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 182419456. Throughput: 0: 1673.7, 1: 1707.7. Samples: 45609436. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:40:56,063][84230] Avg episode reward: [(0, '42.360'), (1, '41.720')] +[2023-10-11 18:40:57,301][85176] Updated weights for policy 0, policy_version 88422 (0.0009) +[2023-10-11 18:40:57,671][85176] Updated weights for policy 0, policy_version 88432 (0.0008) +[2023-10-11 18:40:58,043][85176] Updated weights for policy 0, policy_version 88442 (0.0009) +[2023-10-11 18:40:59,412][85175] Updated weights for policy 1, policy_version 89730 (0.0008) +[2023-10-11 18:40:59,784][85175] Updated weights for policy 1, policy_version 89740 (0.0010) +[2023-10-11 18:41:00,156][85175] Updated weights for policy 1, policy_version 89750 (0.0009) +[2023-10-11 18:41:00,518][85175] Updated weights for policy 1, policy_version 89760 (0.0011) +[2023-10-11 18:41:01,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 182484992. Throughput: 0: 1677.6, 1: 1678.9. Samples: 45629266. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:41:01,063][84230] Avg episode reward: [(0, '45.580'), (1, '45.640')] +[2023-10-11 18:41:02,131][85176] Updated weights for policy 0, policy_version 88452 (0.0008) +[2023-10-11 18:41:02,499][85176] Updated weights for policy 0, policy_version 88462 (0.0009) +[2023-10-11 18:41:02,881][85176] Updated weights for policy 0, policy_version 88472 (0.0009) +[2023-10-11 18:41:04,553][85175] Updated weights for policy 1, policy_version 89770 (0.0008) +[2023-10-11 18:41:04,927][85175] Updated weights for policy 1, policy_version 89780 (0.0008) +[2023-10-11 18:41:05,288][85175] Updated weights for policy 1, policy_version 89790 (0.0007) +[2023-10-11 18:41:06,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 182550528. Throughput: 0: 1660.5, 1: 1710.8. Samples: 45639772. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:41:06,063][84230] Avg episode reward: [(0, '43.180'), (1, '40.190')] +[2023-10-11 18:41:06,932][85176] Updated weights for policy 0, policy_version 88482 (0.0009) +[2023-10-11 18:41:07,305][85176] Updated weights for policy 0, policy_version 88492 (0.0007) +[2023-10-11 18:41:07,676][85176] Updated weights for policy 0, policy_version 88502 (0.0008) +[2023-10-11 18:41:08,044][85176] Updated weights for policy 0, policy_version 88512 (0.0007) +[2023-10-11 18:41:09,248][85175] Updated weights for policy 1, policy_version 89800 (0.0009) +[2023-10-11 18:41:09,613][85175] Updated weights for policy 1, policy_version 89810 (0.0011) +[2023-10-11 18:41:09,982][85175] Updated weights for policy 1, policy_version 89820 (0.0008) +[2023-10-11 18:41:11,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 182616064. Throughput: 0: 1676.9, 1: 1703.6. Samples: 45660172. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:41:11,063][84230] Avg episode reward: [(0, '45.410'), (1, '47.750')] +[2023-10-11 18:41:11,986][85176] Updated weights for policy 0, policy_version 88522 (0.0008) +[2023-10-11 18:41:12,355][85176] Updated weights for policy 0, policy_version 88532 (0.0008) +[2023-10-11 18:41:12,726][85176] Updated weights for policy 0, policy_version 88542 (0.0009) +[2023-10-11 18:41:14,049][85175] Updated weights for policy 1, policy_version 89830 (0.0008) +[2023-10-11 18:41:14,408][85175] Updated weights for policy 1, policy_version 89840 (0.0009) +[2023-10-11 18:41:14,776][85175] Updated weights for policy 1, policy_version 89850 (0.0010) +[2023-10-11 18:41:16,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 182681600. Throughput: 0: 1685.6, 1: 1684.0. Samples: 45680366. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:41:16,063][84230] Avg episode reward: [(0, '42.400'), (1, '42.810')] +[2023-10-11 18:41:16,970][85176] Updated weights for policy 0, policy_version 88552 (0.0008) +[2023-10-11 18:41:17,349][85176] Updated weights for policy 0, policy_version 88562 (0.0008) +[2023-10-11 18:41:17,719][85176] Updated weights for policy 0, policy_version 88572 (0.0007) +[2023-10-11 18:41:18,779][85175] Updated weights for policy 1, policy_version 89860 (0.0010) +[2023-10-11 18:41:19,156][85175] Updated weights for policy 1, policy_version 89870 (0.0011) +[2023-10-11 18:41:19,523][85175] Updated weights for policy 1, policy_version 89880 (0.0009) +[2023-10-11 18:41:21,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 182747136. Throughput: 0: 1668.0, 1: 1715.0. Samples: 45690796. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:41:21,064][84230] Avg episode reward: [(0, '46.080'), (1, '48.760')] +[2023-10-11 18:41:21,801][85176] Updated weights for policy 0, policy_version 88582 (0.0007) +[2023-10-11 18:41:22,169][85176] Updated weights for policy 0, policy_version 88592 (0.0008) +[2023-10-11 18:41:22,538][85176] Updated weights for policy 0, policy_version 88602 (0.0009) +[2023-10-11 18:41:23,504][85175] Updated weights for policy 1, policy_version 89890 (0.0009) +[2023-10-11 18:41:23,872][85175] Updated weights for policy 1, policy_version 89900 (0.0011) +[2023-10-11 18:41:24,246][85175] Updated weights for policy 1, policy_version 89910 (0.0010) +[2023-10-11 18:41:24,601][85175] Updated weights for policy 1, policy_version 89920 (0.0008) +[2023-10-11 18:41:26,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 182812672. Throughput: 0: 1684.8, 1: 1689.3. Samples: 45710614. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:41:26,064][84230] Avg episode reward: [(0, '43.580'), (1, '41.070')] +[2023-10-11 18:41:26,688][85176] Updated weights for policy 0, policy_version 88612 (0.0009) +[2023-10-11 18:41:27,058][85176] Updated weights for policy 0, policy_version 88622 (0.0010) +[2023-10-11 18:41:27,435][85176] Updated weights for policy 0, policy_version 88632 (0.0009) +[2023-10-11 18:41:28,661][85175] Updated weights for policy 1, policy_version 89930 (0.0009) +[2023-10-11 18:41:29,030][85175] Updated weights for policy 1, policy_version 89940 (0.0009) +[2023-10-11 18:41:29,401][85175] Updated weights for policy 1, policy_version 89950 (0.0008) +[2023-10-11 18:41:31,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 182878208. Throughput: 0: 1691.8, 1: 1702.8. Samples: 45731634. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:41:31,064][84230] Avg episode reward: [(0, '46.810'), (1, '45.750')] +[2023-10-11 18:41:31,358][85176] Updated weights for policy 0, policy_version 88642 (0.0009) +[2023-10-11 18:41:31,735][85176] Updated weights for policy 0, policy_version 88652 (0.0009) +[2023-10-11 18:41:32,099][85176] Updated weights for policy 0, policy_version 88662 (0.0009) +[2023-10-11 18:41:32,474][85176] Updated weights for policy 0, policy_version 88672 (0.0010) +[2023-10-11 18:41:33,283][85175] Updated weights for policy 1, policy_version 89960 (0.0008) +[2023-10-11 18:41:33,643][85175] Updated weights for policy 1, policy_version 89970 (0.0007) +[2023-10-11 18:41:34,008][85175] Updated weights for policy 1, policy_version 89980 (0.0008) +[2023-10-11 18:41:36,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 182943744. Throughput: 0: 1690.7, 1: 1703.2. Samples: 45741476. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:41:36,063][84230] Avg episode reward: [(0, '42.230'), (1, '39.960')] +[2023-10-11 18:41:36,490][85176] Updated weights for policy 0, policy_version 88682 (0.0008) +[2023-10-11 18:41:36,865][85176] Updated weights for policy 0, policy_version 88692 (0.0008) +[2023-10-11 18:41:37,244][85176] Updated weights for policy 0, policy_version 88702 (0.0009) +[2023-10-11 18:41:37,988][85175] Updated weights for policy 1, policy_version 89990 (0.0010) +[2023-10-11 18:41:38,345][85175] Updated weights for policy 1, policy_version 90000 (0.0010) +[2023-10-11 18:41:38,713][85175] Updated weights for policy 1, policy_version 90010 (0.0008) +[2023-10-11 18:41:41,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 183009280. Throughput: 0: 1699.3, 1: 1690.6. Samples: 45761984. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:41:41,064][84230] Avg episode reward: [(0, '48.490'), (1, '46.960')] +[2023-10-11 18:41:41,454][85176] Updated weights for policy 0, policy_version 88712 (0.0009) +[2023-10-11 18:41:41,825][85176] Updated weights for policy 0, policy_version 88722 (0.0007) +[2023-10-11 18:41:42,192][85176] Updated weights for policy 0, policy_version 88732 (0.0007) +[2023-10-11 18:41:42,740][85175] Updated weights for policy 1, policy_version 90020 (0.0007) +[2023-10-11 18:41:43,112][85175] Updated weights for policy 1, policy_version 90030 (0.0008) +[2023-10-11 18:41:43,473][85175] Updated weights for policy 1, policy_version 90040 (0.0008) +[2023-10-11 18:41:46,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 183074816. Throughput: 0: 1692.2, 1: 1718.0. Samples: 45782724. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:41:46,063][84230] Avg episode reward: [(0, '41.600'), (1, '40.130')] +[2023-10-11 18:41:46,268][85176] Updated weights for policy 0, policy_version 88742 (0.0009) +[2023-10-11 18:41:46,636][85176] Updated weights for policy 0, policy_version 88752 (0.0009) +[2023-10-11 18:41:47,000][85176] Updated weights for policy 0, policy_version 88762 (0.0008) +[2023-10-11 18:41:47,649][85175] Updated weights for policy 1, policy_version 90050 (0.0008) +[2023-10-11 18:41:48,004][85175] Updated weights for policy 1, policy_version 90060 (0.0010) +[2023-10-11 18:41:48,379][85175] Updated weights for policy 1, policy_version 90070 (0.0007) +[2023-10-11 18:41:48,741][85175] Updated weights for policy 1, policy_version 90080 (0.0009) +[2023-10-11 18:41:50,973][85176] Updated weights for policy 0, policy_version 88772 (0.0009) +[2023-10-11 18:41:51,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 183140352. Throughput: 0: 1692.9, 1: 1693.7. Samples: 45792170. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:41:51,064][84230] Avg episode reward: [(0, '47.800'), (1, '46.640')] +[2023-10-11 18:41:51,335][85176] Updated weights for policy 0, policy_version 88782 (0.0009) +[2023-10-11 18:41:51,709][85176] Updated weights for policy 0, policy_version 88792 (0.0008) +[2023-10-11 18:41:52,789][85175] Updated weights for policy 1, policy_version 90090 (0.0008) +[2023-10-11 18:41:53,147][85175] Updated weights for policy 1, policy_version 90100 (0.0009) +[2023-10-11 18:41:53,519][85175] Updated weights for policy 1, policy_version 90110 (0.0008) +[2023-10-11 18:41:55,631][85176] Updated weights for policy 0, policy_version 88802 (0.0007) +[2023-10-11 18:41:56,013][85176] Updated weights for policy 0, policy_version 88812 (0.0007) +[2023-10-11 18:41:56,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 183205888. 
Throughput: 0: 1689.0, 1: 1695.0. Samples: 45812450. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:41:56,063][84230] Avg episode reward: [(0, '43.600'), (1, '43.390')] +[2023-10-11 18:41:56,386][85176] Updated weights for policy 0, policy_version 88822 (0.0008) +[2023-10-11 18:41:56,762][85176] Updated weights for policy 0, policy_version 88832 (0.0008) +[2023-10-11 18:41:57,533][85175] Updated weights for policy 1, policy_version 90120 (0.0009) +[2023-10-11 18:41:57,897][85175] Updated weights for policy 1, policy_version 90130 (0.0008) +[2023-10-11 18:41:58,264][85175] Updated weights for policy 1, policy_version 90140 (0.0007) +[2023-10-11 18:42:00,867][85176] Updated weights for policy 0, policy_version 88842 (0.0008) +[2023-10-11 18:42:01,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 183271424. Throughput: 0: 1686.2, 1: 1712.3. Samples: 45833300. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:42:01,063][84230] Avg episode reward: [(0, '47.370'), (1, '45.030')] +[2023-10-11 18:42:01,249][85176] Updated weights for policy 0, policy_version 88852 (0.0008) +[2023-10-11 18:42:01,628][85176] Updated weights for policy 0, policy_version 88862 (0.0009) +[2023-10-11 18:42:02,330][85175] Updated weights for policy 1, policy_version 90150 (0.0008) +[2023-10-11 18:42:02,709][85175] Updated weights for policy 1, policy_version 90160 (0.0009) +[2023-10-11 18:42:03,074][85175] Updated weights for policy 1, policy_version 90170 (0.0008) +[2023-10-11 18:42:05,659][85176] Updated weights for policy 0, policy_version 88872 (0.0007) +[2023-10-11 18:42:06,032][85176] Updated weights for policy 0, policy_version 88882 (0.0008) +[2023-10-11 18:42:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 183336960. Throughput: 0: 1695.0, 1: 1682.4. Samples: 45842780. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:42:06,063][84230] Avg episode reward: [(0, '43.820'), (1, '44.980')] +[2023-10-11 18:42:06,396][85176] Updated weights for policy 0, policy_version 88892 (0.0009) +[2023-10-11 18:42:06,986][85175] Updated weights for policy 1, policy_version 90180 (0.0010) +[2023-10-11 18:42:07,364][85175] Updated weights for policy 1, policy_version 90190 (0.0008) +[2023-10-11 18:42:07,727][85175] Updated weights for policy 1, policy_version 90200 (0.0009) +[2023-10-11 18:42:10,524][85176] Updated weights for policy 0, policy_version 88902 (0.0009) +[2023-10-11 18:42:10,896][85176] Updated weights for policy 0, policy_version 88912 (0.0007) +[2023-10-11 18:42:11,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 183402496. Throughput: 0: 1684.0, 1: 1711.7. Samples: 45863418. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:42:11,063][84230] Avg episode reward: [(0, '43.540'), (1, '43.430')] +[2023-10-11 18:42:11,268][85176] Updated weights for policy 0, policy_version 88922 (0.0007) +[2023-10-11 18:42:11,639][85175] Updated weights for policy 1, policy_version 90210 (0.0008) +[2023-10-11 18:42:12,010][85175] Updated weights for policy 1, policy_version 90220 (0.0007) +[2023-10-11 18:42:12,377][85175] Updated weights for policy 1, policy_version 90230 (0.0008) +[2023-10-11 18:42:12,748][85175] Updated weights for policy 1, policy_version 90240 (0.0008) +[2023-10-11 18:42:15,318][85176] Updated weights for policy 0, policy_version 88932 (0.0008) +[2023-10-11 18:42:15,690][85176] Updated weights for policy 0, policy_version 88942 (0.0009) +[2023-10-11 18:42:16,054][85176] Updated weights for policy 0, policy_version 88952 (0.0008) +[2023-10-11 18:42:16,063][84230] Fps is (10 sec: 13106.5, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 183468032. Throughput: 0: 1667.1, 1: 1714.0. Samples: 45883782. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:42:16,064][84230] Avg episode reward: [(0, '40.680'), (1, '46.620')] +[2023-10-11 18:42:16,075][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000090240_92405760.pth... +[2023-10-11 18:42:16,116][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000088672_90800128.pth +[2023-10-11 18:42:16,358][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000088960_91095040.pth... +[2023-10-11 18:42:16,391][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000087360_89456640.pth +[2023-10-11 18:42:16,729][85175] Updated weights for policy 1, policy_version 90250 (0.0010) +[2023-10-11 18:42:17,098][85175] Updated weights for policy 1, policy_version 90260 (0.0010) +[2023-10-11 18:42:17,464][85175] Updated weights for policy 1, policy_version 90270 (0.0008) +[2023-10-11 18:42:20,197][85176] Updated weights for policy 0, policy_version 88962 (0.0009) +[2023-10-11 18:42:20,559][85176] Updated weights for policy 0, policy_version 88972 (0.0009) +[2023-10-11 18:42:20,936][85176] Updated weights for policy 0, policy_version 88982 (0.0007) +[2023-10-11 18:42:21,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 183533568. Throughput: 0: 1677.6, 1: 1694.2. Samples: 45893210. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:42:21,064][84230] Avg episode reward: [(0, '42.460'), (1, '42.400')] +[2023-10-11 18:42:21,311][85176] Updated weights for policy 0, policy_version 88992 (0.0007) +[2023-10-11 18:42:21,405][85175] Updated weights for policy 1, policy_version 90280 (0.0008) +[2023-10-11 18:42:21,790][85175] Updated weights for policy 1, policy_version 90290 (0.0011) +[2023-10-11 18:42:22,159][85175] Updated weights for policy 1, policy_version 90300 (0.0010) +[2023-10-11 18:42:25,427][85176] Updated weights for policy 0, policy_version 89002 (0.0007) +[2023-10-11 18:42:25,797][85176] Updated weights for policy 0, policy_version 89012 (0.0007) +[2023-10-11 18:42:26,063][84230] Fps is (10 sec: 13107.7, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 183599104. Throughput: 0: 1674.4, 1: 1707.0. Samples: 45914150. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:42:26,063][84230] Avg episode reward: [(0, '43.840'), (1, '45.880')] +[2023-10-11 18:42:26,179][85176] Updated weights for policy 0, policy_version 89022 (0.0008) +[2023-10-11 18:42:26,207][85175] Updated weights for policy 1, policy_version 90310 (0.0009) +[2023-10-11 18:42:26,580][85175] Updated weights for policy 1, policy_version 90320 (0.0008) +[2023-10-11 18:42:26,950][85175] Updated weights for policy 1, policy_version 90330 (0.0010) +[2023-10-11 18:42:30,303][85176] Updated weights for policy 0, policy_version 89032 (0.0010) +[2023-10-11 18:42:30,684][85176] Updated weights for policy 0, policy_version 89042 (0.0007) +[2023-10-11 18:42:31,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 183664640. Throughput: 0: 1659.3, 1: 1705.4. Samples: 45934138. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:42:31,063][84230] Avg episode reward: [(0, '44.760'), (1, '42.690')] +[2023-10-11 18:42:31,065][85176] Updated weights for policy 0, policy_version 89052 (0.0009) +[2023-10-11 18:42:31,096][85175] Updated weights for policy 1, policy_version 90340 (0.0008) +[2023-10-11 18:42:31,465][85175] Updated weights for policy 1, policy_version 90350 (0.0008) +[2023-10-11 18:42:31,835][85175] Updated weights for policy 1, policy_version 90360 (0.0009) +[2023-10-11 18:42:35,175][85176] Updated weights for policy 0, policy_version 89062 (0.0008) +[2023-10-11 18:42:35,544][85176] Updated weights for policy 0, policy_version 89072 (0.0007) +[2023-10-11 18:42:35,916][85176] Updated weights for policy 0, policy_version 89082 (0.0008) +[2023-10-11 18:42:36,031][85175] Updated weights for policy 1, policy_version 90370 (0.0007) +[2023-10-11 18:42:36,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 183730176. Throughput: 0: 1672.2, 1: 1696.2. Samples: 45943748. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:42:36,063][84230] Avg episode reward: [(0, '44.240'), (1, '48.980')] +[2023-10-11 18:42:36,400][85175] Updated weights for policy 1, policy_version 90380 (0.0010) +[2023-10-11 18:42:36,768][85175] Updated weights for policy 1, policy_version 90390 (0.0008) +[2023-10-11 18:42:37,139][85175] Updated weights for policy 1, policy_version 90400 (0.0009) +[2023-10-11 18:42:40,124][85176] Updated weights for policy 0, policy_version 89092 (0.0009) +[2023-10-11 18:42:40,499][85176] Updated weights for policy 0, policy_version 89102 (0.0008) +[2023-10-11 18:42:40,881][85176] Updated weights for policy 0, policy_version 89112 (0.0010) +[2023-10-11 18:42:41,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 183795712. Throughput: 0: 1672.9, 1: 1702.6. Samples: 45964348. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:42:41,063][84230] Avg episode reward: [(0, '45.600'), (1, '44.230')] +[2023-10-11 18:42:41,265][85175] Updated weights for policy 1, policy_version 90410 (0.0008) +[2023-10-11 18:42:41,624][85175] Updated weights for policy 1, policy_version 90420 (0.0008) +[2023-10-11 18:42:41,976][85175] Updated weights for policy 1, policy_version 90430 (0.0008) +[2023-10-11 18:42:44,958][85176] Updated weights for policy 0, policy_version 89122 (0.0008) +[2023-10-11 18:42:45,332][85176] Updated weights for policy 0, policy_version 89132 (0.0011) +[2023-10-11 18:42:45,704][85176] Updated weights for policy 0, policy_version 89142 (0.0008) +[2023-10-11 18:42:45,922][85175] Updated weights for policy 1, policy_version 90440 (0.0008) +[2023-10-11 18:42:46,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 183861248. Throughput: 0: 1653.2, 1: 1703.9. Samples: 45984372. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:42:46,064][84230] Avg episode reward: [(0, '47.860'), (1, '49.340')] +[2023-10-11 18:42:46,079][85176] Updated weights for policy 0, policy_version 89152 (0.0009) +[2023-10-11 18:42:46,292][85175] Updated weights for policy 1, policy_version 90450 (0.0011) +[2023-10-11 18:42:46,653][85175] Updated weights for policy 1, policy_version 90460 (0.0008) +[2023-10-11 18:42:50,160][85176] Updated weights for policy 0, policy_version 89162 (0.0010) +[2023-10-11 18:42:50,537][85176] Updated weights for policy 0, policy_version 89172 (0.0009) +[2023-10-11 18:42:50,609][85175] Updated weights for policy 1, policy_version 90470 (0.0009) +[2023-10-11 18:42:50,906][85176] Updated weights for policy 0, policy_version 89182 (0.0008) +[2023-10-11 18:42:50,984][85175] Updated weights for policy 1, policy_version 90480 (0.0007) +[2023-10-11 18:42:51,062][84230] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 183959552. 
Throughput: 0: 1664.6, 1: 1698.0. Samples: 45994100. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:42:51,063][84230] Avg episode reward: [(0, '47.290'), (1, '43.120')] +[2023-10-11 18:42:51,350][85175] Updated weights for policy 1, policy_version 90490 (0.0007) +[2023-10-11 18:42:55,157][85176] Updated weights for policy 0, policy_version 89192 (0.0010) +[2023-10-11 18:42:55,390][85175] Updated weights for policy 1, policy_version 90500 (0.0008) +[2023-10-11 18:42:55,533][85176] Updated weights for policy 0, policy_version 89202 (0.0009) +[2023-10-11 18:42:55,745][85175] Updated weights for policy 1, policy_version 90510 (0.0008) +[2023-10-11 18:42:55,902][85176] Updated weights for policy 0, policy_version 89212 (0.0008) +[2023-10-11 18:42:56,062][84230] Fps is (10 sec: 16384.6, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 184025088. Throughput: 0: 1667.0, 1: 1697.4. Samples: 46014818. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:42:56,063][84230] Avg episode reward: [(0, '45.910'), (1, '48.660')] +[2023-10-11 18:42:56,104][85175] Updated weights for policy 1, policy_version 90520 (0.0010) +[2023-10-11 18:42:59,929][85175] Updated weights for policy 1, policy_version 90530 (0.0010) +[2023-10-11 18:42:59,955][85176] Updated weights for policy 0, policy_version 89222 (0.0008) +[2023-10-11 18:43:00,295][85175] Updated weights for policy 1, policy_version 90540 (0.0009) +[2023-10-11 18:43:00,318][85176] Updated weights for policy 0, policy_version 89232 (0.0007) +[2023-10-11 18:43:00,656][85175] Updated weights for policy 1, policy_version 90550 (0.0008) +[2023-10-11 18:43:00,700][85176] Updated weights for policy 0, policy_version 89242 (0.0009) +[2023-10-11 18:43:01,026][85175] Updated weights for policy 1, policy_version 90560 (0.0009) +[2023-10-11 18:43:01,063][84230] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 184123392. Throughput: 0: 1654.6, 1: 1681.0. Samples: 46033880. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:43:01,063][84230] Avg episode reward: [(0, '46.060'), (1, '42.570')] +[2023-10-11 18:43:04,765][85176] Updated weights for policy 0, policy_version 89252 (0.0009) +[2023-10-11 18:43:05,139][85176] Updated weights for policy 0, policy_version 89262 (0.0007) +[2023-10-11 18:43:05,226][85175] Updated weights for policy 1, policy_version 90570 (0.0007) +[2023-10-11 18:43:05,519][85176] Updated weights for policy 0, policy_version 89272 (0.0007) +[2023-10-11 18:43:05,597][85175] Updated weights for policy 1, policy_version 90580 (0.0008) +[2023-10-11 18:43:05,960][85175] Updated weights for policy 1, policy_version 90590 (0.0009) +[2023-10-11 18:43:06,063][84230] Fps is (10 sec: 16383.3, 60 sec: 14199.4, 300 sec: 13662.6). Total num frames: 184188928. Throughput: 0: 1662.9, 1: 1700.3. Samples: 46044556. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:43:06,064][84230] Avg episode reward: [(0, '45.150'), (1, '44.970')] +[2023-10-11 18:43:09,644][85176] Updated weights for policy 0, policy_version 89282 (0.0008) +[2023-10-11 18:43:10,014][85176] Updated weights for policy 0, policy_version 89292 (0.0008) +[2023-10-11 18:43:10,087][85175] Updated weights for policy 1, policy_version 90600 (0.0010) +[2023-10-11 18:43:10,391][85176] Updated weights for policy 0, policy_version 89302 (0.0009) +[2023-10-11 18:43:10,457][85175] Updated weights for policy 1, policy_version 90610 (0.0009) +[2023-10-11 18:43:10,774][85176] Updated weights for policy 0, policy_version 89312 (0.0009) +[2023-10-11 18:43:10,817][85175] Updated weights for policy 1, policy_version 90620 (0.0007) +[2023-10-11 18:43:11,063][84230] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 184254464. Throughput: 0: 1658.7, 1: 1691.4. Samples: 46064906. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:43:11,063][84230] Avg episode reward: [(0, '43.030'), (1, '42.590')] +[2023-10-11 18:43:14,917][85176] Updated weights for policy 0, policy_version 89322 (0.0008) +[2023-10-11 18:43:14,933][85175] Updated weights for policy 1, policy_version 90630 (0.0008) +[2023-10-11 18:43:15,282][85176] Updated weights for policy 0, policy_version 89332 (0.0008) +[2023-10-11 18:43:15,301][85175] Updated weights for policy 1, policy_version 90640 (0.0009) +[2023-10-11 18:43:15,656][85176] Updated weights for policy 0, policy_version 89342 (0.0008) +[2023-10-11 18:43:15,667][85175] Updated weights for policy 1, policy_version 90650 (0.0010) +[2023-10-11 18:43:16,063][84230] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 184320000. Throughput: 0: 1649.7, 1: 1671.7. Samples: 46083602. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:43:16,064][84230] Avg episode reward: [(0, '46.730'), (1, '46.120')] +[2023-10-11 18:43:19,749][85176] Updated weights for policy 0, policy_version 89352 (0.0009) +[2023-10-11 18:43:19,757][85175] Updated weights for policy 1, policy_version 90660 (0.0008) +[2023-10-11 18:43:20,118][85176] Updated weights for policy 0, policy_version 89362 (0.0007) +[2023-10-11 18:43:20,123][85175] Updated weights for policy 1, policy_version 90670 (0.0009) +[2023-10-11 18:43:20,488][85175] Updated weights for policy 1, policy_version 90680 (0.0010) +[2023-10-11 18:43:20,489][85176] Updated weights for policy 0, policy_version 89372 (0.0008) +[2023-10-11 18:43:21,063][84230] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 184385536. Throughput: 0: 1658.6, 1: 1693.6. Samples: 46094598. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:43:21,064][84230] Avg episode reward: [(0, '42.320'), (1, '45.670')] +[2023-10-11 18:43:24,579][85175] Updated weights for policy 1, policy_version 90690 (0.0009) +[2023-10-11 18:43:24,673][85176] Updated weights for policy 0, policy_version 89382 (0.0008) +[2023-10-11 18:43:24,954][85175] Updated weights for policy 1, policy_version 90700 (0.0008) +[2023-10-11 18:43:25,039][85176] Updated weights for policy 0, policy_version 89392 (0.0007) +[2023-10-11 18:43:25,319][85175] Updated weights for policy 1, policy_version 90710 (0.0009) +[2023-10-11 18:43:25,404][85176] Updated weights for policy 0, policy_version 89402 (0.0007) +[2023-10-11 18:43:25,685][85175] Updated weights for policy 1, policy_version 90720 (0.0009) +[2023-10-11 18:43:26,063][84230] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 184451072. Throughput: 0: 1655.2, 1: 1693.0. Samples: 46115016. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:43:26,064][84230] Avg episode reward: [(0, '46.800'), (1, '45.550')] +[2023-10-11 18:43:29,524][85176] Updated weights for policy 0, policy_version 89412 (0.0007) +[2023-10-11 18:43:29,748][85175] Updated weights for policy 1, policy_version 90730 (0.0007) +[2023-10-11 18:43:29,894][85176] Updated weights for policy 0, policy_version 89422 (0.0008) +[2023-10-11 18:43:30,115][85175] Updated weights for policy 1, policy_version 90740 (0.0008) +[2023-10-11 18:43:30,258][85176] Updated weights for policy 0, policy_version 89432 (0.0009) +[2023-10-11 18:43:30,482][85175] Updated weights for policy 1, policy_version 90750 (0.0008) +[2023-10-11 18:43:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 184516608. Throughput: 0: 1646.7, 1: 1663.4. Samples: 46133328. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:43:31,064][84230] Avg episode reward: [(0, '43.490'), (1, '43.240')] +[2023-10-11 18:43:34,254][85176] Updated weights for policy 0, policy_version 89442 (0.0008) +[2023-10-11 18:43:34,497][85175] Updated weights for policy 1, policy_version 90760 (0.0008) +[2023-10-11 18:43:34,630][85176] Updated weights for policy 0, policy_version 89452 (0.0008) +[2023-10-11 18:43:34,867][85175] Updated weights for policy 1, policy_version 90770 (0.0008) +[2023-10-11 18:43:34,996][85176] Updated weights for policy 0, policy_version 89462 (0.0007) +[2023-10-11 18:43:35,228][85175] Updated weights for policy 1, policy_version 90780 (0.0008) +[2023-10-11 18:43:35,355][85176] Updated weights for policy 0, policy_version 89472 (0.0008) +[2023-10-11 18:43:36,062][84230] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 184582144. Throughput: 0: 1661.5, 1: 1693.9. Samples: 46145092. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:43:36,063][84230] Avg episode reward: [(0, '47.670'), (1, '43.700')] +[2023-10-11 18:43:39,393][85175] Updated weights for policy 1, policy_version 90790 (0.0007) +[2023-10-11 18:43:39,591][85176] Updated weights for policy 0, policy_version 89482 (0.0009) +[2023-10-11 18:43:39,757][85175] Updated weights for policy 1, policy_version 90800 (0.0008) +[2023-10-11 18:43:39,962][85176] Updated weights for policy 0, policy_version 89492 (0.0007) +[2023-10-11 18:43:40,121][85175] Updated weights for policy 1, policy_version 90810 (0.0009) +[2023-10-11 18:43:40,339][85176] Updated weights for policy 0, policy_version 89502 (0.0007) +[2023-10-11 18:43:41,062][84230] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 184647680. Throughput: 0: 1654.6, 1: 1674.1. Samples: 46164608. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:43:41,063][84230] Avg episode reward: [(0, '42.100'), (1, '41.900')] +[2023-10-11 18:43:44,158][85175] Updated weights for policy 1, policy_version 90820 (0.0009) +[2023-10-11 18:43:44,456][85176] Updated weights for policy 0, policy_version 89512 (0.0007) +[2023-10-11 18:43:44,524][85175] Updated weights for policy 1, policy_version 90830 (0.0010) +[2023-10-11 18:43:44,830][85176] Updated weights for policy 0, policy_version 89522 (0.0007) +[2023-10-11 18:43:44,897][85175] Updated weights for policy 1, policy_version 90840 (0.0009) +[2023-10-11 18:43:45,203][85176] Updated weights for policy 0, policy_version 89532 (0.0009) +[2023-10-11 18:43:46,062][84230] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 184713216. Throughput: 0: 1654.7, 1: 1673.2. Samples: 46183634. Policy #0 lag: (min: 30.0, avg: 35.0, max: 62.0) +[2023-10-11 18:43:46,063][84230] Avg episode reward: [(0, '45.970'), (1, '42.080')] +[2023-10-11 18:43:48,777][85175] Updated weights for policy 1, policy_version 90850 (0.0007) +[2023-10-11 18:43:49,150][85175] Updated weights for policy 1, policy_version 90860 (0.0007) +[2023-10-11 18:43:49,375][85176] Updated weights for policy 0, policy_version 89542 (0.0008) +[2023-10-11 18:43:49,522][85175] Updated weights for policy 1, policy_version 90870 (0.0007) +[2023-10-11 18:43:49,738][85176] Updated weights for policy 0, policy_version 89552 (0.0007) +[2023-10-11 18:43:49,890][85175] Updated weights for policy 1, policy_version 90880 (0.0008) +[2023-10-11 18:43:50,108][85176] Updated weights for policy 0, policy_version 89562 (0.0008) +[2023-10-11 18:43:51,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 184778752. Throughput: 0: 1662.5, 1: 1688.0. Samples: 46195326. 
Policy #0 lag: (min: 30.0, avg: 35.0, max: 62.0) +[2023-10-11 18:43:51,064][84230] Avg episode reward: [(0, '42.240'), (1, '43.010')] +[2023-10-11 18:43:54,012][85175] Updated weights for policy 1, policy_version 90890 (0.0009) +[2023-10-11 18:43:54,254][85176] Updated weights for policy 0, policy_version 89572 (0.0008) +[2023-10-11 18:43:54,381][85175] Updated weights for policy 1, policy_version 90900 (0.0007) +[2023-10-11 18:43:54,627][85176] Updated weights for policy 0, policy_version 89582 (0.0007) +[2023-10-11 18:43:54,740][85175] Updated weights for policy 1, policy_version 90910 (0.0008) +[2023-10-11 18:43:55,012][85176] Updated weights for policy 0, policy_version 89592 (0.0007) +[2023-10-11 18:43:56,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 184844288. Throughput: 0: 1653.3, 1: 1672.5. Samples: 46214568. Policy #0 lag: (min: 30.0, avg: 35.0, max: 62.0) +[2023-10-11 18:43:56,063][84230] Avg episode reward: [(0, '46.890'), (1, '42.850')] +[2023-10-11 18:43:58,678][85175] Updated weights for policy 1, policy_version 90920 (0.0008) +[2023-10-11 18:43:58,990][85176] Updated weights for policy 0, policy_version 89602 (0.0008) +[2023-10-11 18:43:59,047][85175] Updated weights for policy 1, policy_version 90930 (0.0008) +[2023-10-11 18:43:59,366][85176] Updated weights for policy 0, policy_version 89612 (0.0009) +[2023-10-11 18:43:59,412][85175] Updated weights for policy 1, policy_version 90940 (0.0007) +[2023-10-11 18:43:59,735][85176] Updated weights for policy 0, policy_version 89622 (0.0007) +[2023-10-11 18:44:00,112][85176] Updated weights for policy 0, policy_version 89632 (0.0008) +[2023-10-11 18:44:01,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13551.5). Total num frames: 184909824. Throughput: 0: 1658.9, 1: 1688.0. Samples: 46234212. 
Policy #0 lag: (min: 30.0, avg: 35.0, max: 62.0) +[2023-10-11 18:44:01,063][84230] Avg episode reward: [(0, '39.920'), (1, '45.010')] +[2023-10-11 18:44:03,312][85175] Updated weights for policy 1, policy_version 90950 (0.0008) +[2023-10-11 18:44:03,678][85175] Updated weights for policy 1, policy_version 90960 (0.0007) +[2023-10-11 18:44:04,044][85175] Updated weights for policy 1, policy_version 90970 (0.0008) +[2023-10-11 18:44:04,048][85176] Updated weights for policy 0, policy_version 89642 (0.0009) +[2023-10-11 18:44:04,428][85176] Updated weights for policy 0, policy_version 89652 (0.0008) +[2023-10-11 18:44:04,797][85176] Updated weights for policy 0, policy_version 89662 (0.0008) +[2023-10-11 18:44:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 184975360. Throughput: 0: 1667.3, 1: 1688.4. Samples: 46245606. Policy #0 lag: (min: 30.0, avg: 35.0, max: 62.0) +[2023-10-11 18:44:06,063][84230] Avg episode reward: [(0, '44.710'), (1, '43.460')] +[2023-10-11 18:44:08,172][85175] Updated weights for policy 1, policy_version 90980 (0.0009) +[2023-10-11 18:44:08,544][85175] Updated weights for policy 1, policy_version 90990 (0.0008) +[2023-10-11 18:44:08,917][85175] Updated weights for policy 1, policy_version 91000 (0.0008) +[2023-10-11 18:44:09,020][85176] Updated weights for policy 0, policy_version 89672 (0.0010) +[2023-10-11 18:44:09,384][85176] Updated weights for policy 0, policy_version 89682 (0.0008) +[2023-10-11 18:44:09,761][85176] Updated weights for policy 0, policy_version 89692 (0.0008) +[2023-10-11 18:44:11,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.5). Total num frames: 185040896. Throughput: 0: 1649.5, 1: 1677.3. Samples: 46264720. 
Policy #0 lag: (min: 30.0, avg: 35.0, max: 62.0) +[2023-10-11 18:44:11,063][84230] Avg episode reward: [(0, '40.670'), (1, '46.500')] +[2023-10-11 18:44:13,082][85175] Updated weights for policy 1, policy_version 91010 (0.0007) +[2023-10-11 18:44:13,450][85175] Updated weights for policy 1, policy_version 91020 (0.0008) +[2023-10-11 18:44:13,809][85175] Updated weights for policy 1, policy_version 91030 (0.0007) +[2023-10-11 18:44:13,956][85176] Updated weights for policy 0, policy_version 89702 (0.0009) +[2023-10-11 18:44:14,175][85175] Updated weights for policy 1, policy_version 91040 (0.0007) +[2023-10-11 18:44:14,326][85176] Updated weights for policy 0, policy_version 89712 (0.0008) +[2023-10-11 18:44:14,692][85176] Updated weights for policy 0, policy_version 89722 (0.0008) +[2023-10-11 18:44:16,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 185106432. Throughput: 0: 1665.2, 1: 1704.6. Samples: 46284968. Policy #0 lag: (min: 30.0, avg: 35.0, max: 62.0) +[2023-10-11 18:44:16,064][84230] Avg episode reward: [(0, '47.180'), (1, '45.580')] +[2023-10-11 18:44:16,075][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000089728_91881472.pth... +[2023-10-11 18:44:16,076][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000091040_93224960.pth... 
+[2023-10-11 18:44:16,109][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000089472_91619328.pth +[2023-10-11 18:44:16,115][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000088160_90275840.pth +[2023-10-11 18:44:18,042][85175] Updated weights for policy 1, policy_version 91050 (0.0009) +[2023-10-11 18:44:18,412][85175] Updated weights for policy 1, policy_version 91060 (0.0007) +[2023-10-11 18:44:18,725][85176] Updated weights for policy 0, policy_version 89732 (0.0009) +[2023-10-11 18:44:18,773][85175] Updated weights for policy 1, policy_version 91070 (0.0008) +[2023-10-11 18:44:19,092][85176] Updated weights for policy 0, policy_version 89742 (0.0009) +[2023-10-11 18:44:19,461][85176] Updated weights for policy 0, policy_version 89752 (0.0009) +[2023-10-11 18:44:21,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 185171968. Throughput: 0: 1667.9, 1: 1682.1. Samples: 46295844. Policy #0 lag: (min: 30.0, avg: 35.0, max: 62.0) +[2023-10-11 18:44:21,063][84230] Avg episode reward: [(0, '41.830'), (1, '47.240')] +[2023-10-11 18:44:22,750][85175] Updated weights for policy 1, policy_version 91080 (0.0009) +[2023-10-11 18:44:23,117][85175] Updated weights for policy 1, policy_version 91090 (0.0008) +[2023-10-11 18:44:23,439][85176] Updated weights for policy 0, policy_version 89762 (0.0010) +[2023-10-11 18:44:23,481][85175] Updated weights for policy 1, policy_version 91100 (0.0010) +[2023-10-11 18:44:23,810][85176] Updated weights for policy 0, policy_version 89772 (0.0010) +[2023-10-11 18:44:24,177][85176] Updated weights for policy 0, policy_version 89782 (0.0009) +[2023-10-11 18:44:24,544][85176] Updated weights for policy 0, policy_version 89792 (0.0011) +[2023-10-11 18:44:26,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 185237504. Throughput: 0: 1653.3, 1: 1688.2. Samples: 46314974. 
Policy #0 lag: (min: 30.0, avg: 35.0, max: 62.0) +[2023-10-11 18:44:26,064][84230] Avg episode reward: [(0, '45.580'), (1, '44.250')] +[2023-10-11 18:44:27,568][85175] Updated weights for policy 1, policy_version 91110 (0.0008) +[2023-10-11 18:44:27,935][85175] Updated weights for policy 1, policy_version 91120 (0.0009) +[2023-10-11 18:44:28,301][85175] Updated weights for policy 1, policy_version 91130 (0.0007) +[2023-10-11 18:44:28,650][85176] Updated weights for policy 0, policy_version 89802 (0.0007) +[2023-10-11 18:44:29,023][85176] Updated weights for policy 0, policy_version 89812 (0.0009) +[2023-10-11 18:44:29,389][85176] Updated weights for policy 0, policy_version 89822 (0.0009) +[2023-10-11 18:44:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 185303040. Throughput: 0: 1671.3, 1: 1705.6. Samples: 46335596. Policy #0 lag: (min: 30.0, avg: 35.0, max: 62.0) +[2023-10-11 18:44:31,063][84230] Avg episode reward: [(0, '43.560'), (1, '46.290')] +[2023-10-11 18:44:32,396][85175] Updated weights for policy 1, policy_version 91140 (0.0010) +[2023-10-11 18:44:32,770][85175] Updated weights for policy 1, policy_version 91150 (0.0011) +[2023-10-11 18:44:33,144][85175] Updated weights for policy 1, policy_version 91160 (0.0009) +[2023-10-11 18:44:33,502][85176] Updated weights for policy 0, policy_version 89832 (0.0008) +[2023-10-11 18:44:33,879][85176] Updated weights for policy 0, policy_version 89842 (0.0009) +[2023-10-11 18:44:34,252][85176] Updated weights for policy 0, policy_version 89852 (0.0009) +[2023-10-11 18:44:36,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 185368576. Throughput: 0: 1666.0, 1: 1671.5. Samples: 46345514. 
Policy #0 lag: (min: 30.0, avg: 35.0, max: 62.0) +[2023-10-11 18:44:36,063][84230] Avg episode reward: [(0, '45.930'), (1, '44.140')] +[2023-10-11 18:44:37,189][85175] Updated weights for policy 1, policy_version 91170 (0.0009) +[2023-10-11 18:44:37,564][85175] Updated weights for policy 1, policy_version 91180 (0.0011) +[2023-10-11 18:44:37,921][85175] Updated weights for policy 1, policy_version 91190 (0.0010) +[2023-10-11 18:44:38,286][85175] Updated weights for policy 1, policy_version 91200 (0.0009) +[2023-10-11 18:44:38,393][85176] Updated weights for policy 0, policy_version 89862 (0.0009) +[2023-10-11 18:44:38,767][85176] Updated weights for policy 0, policy_version 89872 (0.0008) +[2023-10-11 18:44:39,140][85176] Updated weights for policy 0, policy_version 89882 (0.0011) +[2023-10-11 18:44:41,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 185434112. Throughput: 0: 1659.2, 1: 1693.4. Samples: 46365436. Policy #0 lag: (min: 30.0, avg: 35.0, max: 62.0) +[2023-10-11 18:44:41,063][84230] Avg episode reward: [(0, '43.780'), (1, '45.890')] +[2023-10-11 18:44:42,507][85175] Updated weights for policy 1, policy_version 91210 (0.0007) +[2023-10-11 18:44:42,881][85175] Updated weights for policy 1, policy_version 91220 (0.0007) +[2023-10-11 18:44:43,248][85175] Updated weights for policy 1, policy_version 91230 (0.0008) +[2023-10-11 18:44:43,293][85176] Updated weights for policy 0, policy_version 89892 (0.0008) +[2023-10-11 18:44:43,666][85176] Updated weights for policy 0, policy_version 89902 (0.0008) +[2023-10-11 18:44:44,040][85176] Updated weights for policy 0, policy_version 89912 (0.0010) +[2023-10-11 18:44:46,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 185499648. Throughput: 0: 1675.9, 1: 1692.8. Samples: 46385802. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-11 18:44:46,064][84230] Avg episode reward: [(0, '45.450'), (1, '42.270')] +[2023-10-11 18:44:47,305][85175] Updated weights for policy 1, policy_version 91240 (0.0007) +[2023-10-11 18:44:47,678][85175] Updated weights for policy 1, policy_version 91250 (0.0008) +[2023-10-11 18:44:48,042][85175] Updated weights for policy 1, policy_version 91260 (0.0007) +[2023-10-11 18:44:48,094][85176] Updated weights for policy 0, policy_version 89922 (0.0008) +[2023-10-11 18:44:48,481][85176] Updated weights for policy 0, policy_version 89932 (0.0010) +[2023-10-11 18:44:48,855][85176] Updated weights for policy 0, policy_version 89942 (0.0008) +[2023-10-11 18:44:49,229][85176] Updated weights for policy 0, policy_version 89952 (0.0008) +[2023-10-11 18:44:51,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 185565184. Throughput: 0: 1660.8, 1: 1673.0. Samples: 46395626. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-11 18:44:51,064][84230] Avg episode reward: [(0, '44.710'), (1, '43.910')] +[2023-10-11 18:44:51,883][85175] Updated weights for policy 1, policy_version 91270 (0.0008) +[2023-10-11 18:44:52,257][85175] Updated weights for policy 1, policy_version 91280 (0.0008) +[2023-10-11 18:44:52,621][85175] Updated weights for policy 1, policy_version 91290 (0.0010) +[2023-10-11 18:44:53,367][85176] Updated weights for policy 0, policy_version 89962 (0.0011) +[2023-10-11 18:44:53,754][85176] Updated weights for policy 0, policy_version 89972 (0.0009) +[2023-10-11 18:44:54,119][85176] Updated weights for policy 0, policy_version 89982 (0.0009) +[2023-10-11 18:44:56,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 185630720. Throughput: 0: 1664.3, 1: 1691.8. Samples: 46415748. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-11 18:44:56,064][84230] Avg episode reward: [(0, '45.920'), (1, '43.330')] +[2023-10-11 18:44:56,586][85175] Updated weights for policy 1, policy_version 91300 (0.0008) +[2023-10-11 18:44:56,956][85175] Updated weights for policy 1, policy_version 91310 (0.0009) +[2023-10-11 18:44:57,328][85175] Updated weights for policy 1, policy_version 91320 (0.0009) +[2023-10-11 18:44:58,091][85176] Updated weights for policy 0, policy_version 89992 (0.0007) +[2023-10-11 18:44:58,469][85176] Updated weights for policy 0, policy_version 90002 (0.0009) +[2023-10-11 18:44:58,846][85176] Updated weights for policy 0, policy_version 90012 (0.0008) +[2023-10-11 18:45:01,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 185696256. Throughput: 0: 1678.7, 1: 1696.4. Samples: 46436848. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-11 18:45:01,064][84230] Avg episode reward: [(0, '44.480'), (1, '43.920')] +[2023-10-11 18:45:01,363][85175] Updated weights for policy 1, policy_version 91330 (0.0008) +[2023-10-11 18:45:01,735][85175] Updated weights for policy 1, policy_version 91340 (0.0008) +[2023-10-11 18:45:02,097][85175] Updated weights for policy 1, policy_version 91350 (0.0007) +[2023-10-11 18:45:02,455][85175] Updated weights for policy 1, policy_version 91360 (0.0011) +[2023-10-11 18:45:02,858][85176] Updated weights for policy 0, policy_version 90022 (0.0008) +[2023-10-11 18:45:03,233][85176] Updated weights for policy 0, policy_version 90032 (0.0007) +[2023-10-11 18:45:03,597][85176] Updated weights for policy 0, policy_version 90042 (0.0009) +[2023-10-11 18:45:06,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 185761792. Throughput: 0: 1653.4, 1: 1689.5. Samples: 46446272. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-11 18:45:06,063][84230] Avg episode reward: [(0, '46.800'), (1, '44.790')] +[2023-10-11 18:45:06,498][85175] Updated weights for policy 1, policy_version 91370 (0.0010) +[2023-10-11 18:45:06,861][85175] Updated weights for policy 1, policy_version 91380 (0.0010) +[2023-10-11 18:45:07,237][85175] Updated weights for policy 1, policy_version 91390 (0.0010) +[2023-10-11 18:45:07,764][85176] Updated weights for policy 0, policy_version 90052 (0.0010) +[2023-10-11 18:45:08,132][85176] Updated weights for policy 0, policy_version 90062 (0.0011) +[2023-10-11 18:45:08,508][85176] Updated weights for policy 0, policy_version 90072 (0.0009) +[2023-10-11 18:45:11,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 185827328. Throughput: 0: 1671.0, 1: 1695.9. Samples: 46466482. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-11 18:45:11,064][84230] Avg episode reward: [(0, '43.070'), (1, '43.400')] +[2023-10-11 18:45:11,303][85175] Updated weights for policy 1, policy_version 91400 (0.0009) +[2023-10-11 18:45:11,672][85175] Updated weights for policy 1, policy_version 91410 (0.0009) +[2023-10-11 18:45:12,045][85175] Updated weights for policy 1, policy_version 91420 (0.0009) +[2023-10-11 18:45:12,605][85176] Updated weights for policy 0, policy_version 90082 (0.0009) +[2023-10-11 18:45:12,975][85176] Updated weights for policy 0, policy_version 90092 (0.0010) +[2023-10-11 18:45:13,351][85176] Updated weights for policy 0, policy_version 90102 (0.0007) +[2023-10-11 18:45:13,713][85176] Updated weights for policy 0, policy_version 90112 (0.0009) +[2023-10-11 18:45:16,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 185892864. Throughput: 0: 1676.6, 1: 1695.3. Samples: 46487332. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-11 18:45:16,064][84230] Avg episode reward: [(0, '46.860'), (1, '46.360')] +[2023-10-11 18:45:16,149][85175] Updated weights for policy 1, policy_version 91430 (0.0009) +[2023-10-11 18:45:16,514][85175] Updated weights for policy 1, policy_version 91440 (0.0010) +[2023-10-11 18:45:16,880][85175] Updated weights for policy 1, policy_version 91450 (0.0010) +[2023-10-11 18:45:17,699][85176] Updated weights for policy 0, policy_version 90122 (0.0007) +[2023-10-11 18:45:18,072][85176] Updated weights for policy 0, policy_version 90132 (0.0008) +[2023-10-11 18:45:18,463][85176] Updated weights for policy 0, policy_version 90142 (0.0008) +[2023-10-11 18:45:20,946][85175] Updated weights for policy 1, policy_version 91460 (0.0009) +[2023-10-11 18:45:21,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 185958400. Throughput: 0: 1656.6, 1: 1700.1. Samples: 46496564. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-11 18:45:21,063][84230] Avg episode reward: [(0, '43.800'), (1, '40.420')] +[2023-10-11 18:45:21,311][85175] Updated weights for policy 1, policy_version 91470 (0.0009) +[2023-10-11 18:45:21,682][85175] Updated weights for policy 1, policy_version 91480 (0.0009) +[2023-10-11 18:45:22,556][85176] Updated weights for policy 0, policy_version 90152 (0.0008) +[2023-10-11 18:45:22,922][85176] Updated weights for policy 0, policy_version 90162 (0.0007) +[2023-10-11 18:45:23,308][85176] Updated weights for policy 0, policy_version 90172 (0.0007) +[2023-10-11 18:45:25,652][85175] Updated weights for policy 1, policy_version 91490 (0.0008) +[2023-10-11 18:45:26,030][85175] Updated weights for policy 1, policy_version 91500 (0.0008) +[2023-10-11 18:45:26,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 186023936. Throughput: 0: 1671.6, 1: 1700.6. Samples: 46517186. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-11 18:45:26,063][84230] Avg episode reward: [(0, '45.280'), (1, '43.000')] +[2023-10-11 18:45:26,401][85175] Updated weights for policy 1, policy_version 91510 (0.0008) +[2023-10-11 18:45:26,771][85175] Updated weights for policy 1, policy_version 91520 (0.0008) +[2023-10-11 18:45:27,399][85176] Updated weights for policy 0, policy_version 90182 (0.0008) +[2023-10-11 18:45:27,765][85176] Updated weights for policy 0, policy_version 90192 (0.0010) +[2023-10-11 18:45:28,143][85176] Updated weights for policy 0, policy_version 90202 (0.0010) +[2023-10-11 18:45:30,918][85175] Updated weights for policy 1, policy_version 91530 (0.0010) +[2023-10-11 18:45:31,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 186089472. Throughput: 0: 1672.4, 1: 1705.4. Samples: 46537800. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-11 18:45:31,064][84230] Avg episode reward: [(0, '42.220'), (1, '39.320')] +[2023-10-11 18:45:31,283][85175] Updated weights for policy 1, policy_version 91540 (0.0008) +[2023-10-11 18:45:31,657][85175] Updated weights for policy 1, policy_version 91550 (0.0008) +[2023-10-11 18:45:32,174][85176] Updated weights for policy 0, policy_version 90212 (0.0008) +[2023-10-11 18:45:32,544][85176] Updated weights for policy 0, policy_version 90222 (0.0010) +[2023-10-11 18:45:32,915][85176] Updated weights for policy 0, policy_version 90232 (0.0009) +[2023-10-11 18:45:35,529][85175] Updated weights for policy 1, policy_version 91560 (0.0007) +[2023-10-11 18:45:35,889][85175] Updated weights for policy 1, policy_version 91570 (0.0007) +[2023-10-11 18:45:36,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 186155008. Throughput: 0: 1656.5, 1: 1709.2. Samples: 46547082. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-11 18:45:36,063][84230] Avg episode reward: [(0, '45.170'), (1, '45.740')] +[2023-10-11 18:45:36,266][85175] Updated weights for policy 1, policy_version 91580 (0.0008) +[2023-10-11 18:45:36,952][85176] Updated weights for policy 0, policy_version 90242 (0.0009) +[2023-10-11 18:45:37,329][85176] Updated weights for policy 0, policy_version 90252 (0.0011) +[2023-10-11 18:45:37,710][85176] Updated weights for policy 0, policy_version 90262 (0.0010) +[2023-10-11 18:45:38,089][85176] Updated weights for policy 0, policy_version 90272 (0.0007) +[2023-10-11 18:45:40,319][85175] Updated weights for policy 1, policy_version 91590 (0.0008) +[2023-10-11 18:45:40,693][85175] Updated weights for policy 1, policy_version 91600 (0.0007) +[2023-10-11 18:45:41,062][85175] Updated weights for policy 1, policy_version 91610 (0.0008) +[2023-10-11 18:45:41,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 186220544. Throughput: 0: 1677.4, 1: 1699.2. Samples: 46567694. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) +[2023-10-11 18:45:41,064][84230] Avg episode reward: [(0, '42.740'), (1, '42.250')] +[2023-10-11 18:45:42,321][85176] Updated weights for policy 0, policy_version 90282 (0.0008) +[2023-10-11 18:45:42,694][85176] Updated weights for policy 0, policy_version 90292 (0.0011) +[2023-10-11 18:45:43,063][85176] Updated weights for policy 0, policy_version 90302 (0.0011) +[2023-10-11 18:45:45,003][85175] Updated weights for policy 1, policy_version 91620 (0.0007) +[2023-10-11 18:45:45,380][85175] Updated weights for policy 1, policy_version 91630 (0.0007) +[2023-10-11 18:45:45,749][85175] Updated weights for policy 1, policy_version 91640 (0.0007) +[2023-10-11 18:45:46,063][84230] Fps is (10 sec: 16383.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 186318848. Throughput: 0: 1674.0, 1: 1683.2. Samples: 46587922. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 18:45:46,064][84230] Avg episode reward: [(0, '44.660'), (1, '46.970')] +[2023-10-11 18:45:47,045][85176] Updated weights for policy 0, policy_version 90312 (0.0009) +[2023-10-11 18:45:47,425][85176] Updated weights for policy 0, policy_version 90322 (0.0010) +[2023-10-11 18:45:47,788][85176] Updated weights for policy 0, policy_version 90332 (0.0011) +[2023-10-11 18:45:49,809][85175] Updated weights for policy 1, policy_version 91650 (0.0008) +[2023-10-11 18:45:50,180][85175] Updated weights for policy 1, policy_version 91660 (0.0009) +[2023-10-11 18:45:50,540][85175] Updated weights for policy 1, policy_version 91670 (0.0007) +[2023-10-11 18:45:50,900][85175] Updated weights for policy 1, policy_version 91680 (0.0007) +[2023-10-11 18:45:51,063][84230] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 186384384. Throughput: 0: 1667.0, 1: 1702.6. Samples: 46597902. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 18:45:51,063][84230] Avg episode reward: [(0, '42.070'), (1, '42.400')] +[2023-10-11 18:45:51,919][85176] Updated weights for policy 0, policy_version 90342 (0.0010) +[2023-10-11 18:45:52,291][85176] Updated weights for policy 0, policy_version 90352 (0.0009) +[2023-10-11 18:45:52,661][85176] Updated weights for policy 0, policy_version 90362 (0.0009) +[2023-10-11 18:45:55,035][85175] Updated weights for policy 1, policy_version 91690 (0.0010) +[2023-10-11 18:45:55,404][85175] Updated weights for policy 1, policy_version 91700 (0.0009) +[2023-10-11 18:45:55,781][85175] Updated weights for policy 1, policy_version 91710 (0.0008) +[2023-10-11 18:45:56,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 186449920. Throughput: 0: 1677.2, 1: 1702.1. Samples: 46618550. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 18:45:56,063][84230] Avg episode reward: [(0, '42.830'), (1, '44.540')] +[2023-10-11 18:45:56,813][85176] Updated weights for policy 0, policy_version 90372 (0.0009) +[2023-10-11 18:45:57,190][85176] Updated weights for policy 0, policy_version 90382 (0.0008) +[2023-10-11 18:45:57,574][85176] Updated weights for policy 0, policy_version 90392 (0.0008) +[2023-10-11 18:45:59,664][85175] Updated weights for policy 1, policy_version 91720 (0.0008) +[2023-10-11 18:46:00,029][85175] Updated weights for policy 1, policy_version 91730 (0.0009) +[2023-10-11 18:46:00,399][85175] Updated weights for policy 1, policy_version 91740 (0.0009) +[2023-10-11 18:46:01,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 186515456. Throughput: 0: 1674.8, 1: 1679.1. Samples: 46638258. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 18:46:01,064][84230] Avg episode reward: [(0, '44.810'), (1, '43.040')] +[2023-10-11 18:46:01,564][85176] Updated weights for policy 0, policy_version 90402 (0.0009) +[2023-10-11 18:46:01,935][85176] Updated weights for policy 0, policy_version 90412 (0.0008) +[2023-10-11 18:46:02,315][85176] Updated weights for policy 0, policy_version 90422 (0.0011) +[2023-10-11 18:46:02,692][85176] Updated weights for policy 0, policy_version 90432 (0.0009) +[2023-10-11 18:46:04,393][85175] Updated weights for policy 1, policy_version 91750 (0.0008) +[2023-10-11 18:46:04,754][85175] Updated weights for policy 1, policy_version 91760 (0.0008) +[2023-10-11 18:46:05,120][85175] Updated weights for policy 1, policy_version 91770 (0.0011) +[2023-10-11 18:46:06,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 186580992. Throughput: 0: 1671.6, 1: 1706.6. Samples: 46648584. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 18:46:06,063][84230] Avg episode reward: [(0, '45.000'), (1, '46.590')] +[2023-10-11 18:46:06,808][85176] Updated weights for policy 0, policy_version 90442 (0.0010) +[2023-10-11 18:46:07,180][85176] Updated weights for policy 0, policy_version 90452 (0.0009) +[2023-10-11 18:46:07,559][85176] Updated weights for policy 0, policy_version 90462 (0.0008) +[2023-10-11 18:46:09,191][85175] Updated weights for policy 1, policy_version 91780 (0.0010) +[2023-10-11 18:46:09,563][85175] Updated weights for policy 1, policy_version 91790 (0.0007) +[2023-10-11 18:46:09,923][85175] Updated weights for policy 1, policy_version 91800 (0.0009) +[2023-10-11 18:46:11,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 186646528. Throughput: 0: 1675.4, 1: 1694.0. Samples: 46668808. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 18:46:11,063][84230] Avg episode reward: [(0, '46.730'), (1, '43.190')] +[2023-10-11 18:46:11,655][85176] Updated weights for policy 0, policy_version 90472 (0.0009) +[2023-10-11 18:46:12,030][85176] Updated weights for policy 0, policy_version 90482 (0.0009) +[2023-10-11 18:46:12,397][85176] Updated weights for policy 0, policy_version 90492 (0.0008) +[2023-10-11 18:46:13,717][85175] Updated weights for policy 1, policy_version 91810 (0.0010) +[2023-10-11 18:46:14,083][85175] Updated weights for policy 1, policy_version 91820 (0.0008) +[2023-10-11 18:46:14,446][85175] Updated weights for policy 1, policy_version 91830 (0.0009) +[2023-10-11 18:46:14,815][85175] Updated weights for policy 1, policy_version 91840 (0.0010) +[2023-10-11 18:46:16,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 186712064. Throughput: 0: 1680.0, 1: 1683.5. Samples: 46689156. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 18:46:16,063][84230] Avg episode reward: [(0, '48.210'), (1, '46.840')] +[2023-10-11 18:46:16,071][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000091840_94044160.pth... +[2023-10-11 18:46:16,072][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000090496_92667904.pth... +[2023-10-11 18:46:16,108][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000088960_91095040.pth +[2023-10-11 18:46:16,111][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000090240_92405760.pth +[2023-10-11 18:46:16,446][85176] Updated weights for policy 0, policy_version 90502 (0.0008) +[2023-10-11 18:46:16,813][85176] Updated weights for policy 0, policy_version 90512 (0.0010) +[2023-10-11 18:46:17,194][85176] Updated weights for policy 0, policy_version 90522 (0.0009) +[2023-10-11 18:46:19,090][85175] Updated weights for policy 1, policy_version 91850 (0.0007) +[2023-10-11 18:46:19,456][85175] Updated weights for policy 1, policy_version 91860 (0.0007) +[2023-10-11 18:46:19,819][85175] Updated weights for policy 1, policy_version 91870 (0.0007) +[2023-10-11 18:46:21,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 186777600. Throughput: 0: 1678.6, 1: 1710.0. Samples: 46699568. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 18:46:21,064][84230] Avg episode reward: [(0, '46.720'), (1, '45.200')] +[2023-10-11 18:46:21,211][85176] Updated weights for policy 0, policy_version 90532 (0.0008) +[2023-10-11 18:46:21,586][85176] Updated weights for policy 0, policy_version 90542 (0.0007) +[2023-10-11 18:46:21,960][85176] Updated weights for policy 0, policy_version 90552 (0.0007) +[2023-10-11 18:46:23,808][85175] Updated weights for policy 1, policy_version 91880 (0.0010) +[2023-10-11 18:46:24,173][85175] Updated weights for policy 1, policy_version 91890 (0.0009) +[2023-10-11 18:46:24,547][85175] Updated weights for policy 1, policy_version 91900 (0.0009) +[2023-10-11 18:46:25,989][85176] Updated weights for policy 0, policy_version 90562 (0.0008) +[2023-10-11 18:46:26,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 186843136. Throughput: 0: 1682.5, 1: 1688.1. Samples: 46719370. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 18:46:26,063][84230] Avg episode reward: [(0, '46.110'), (1, '46.610')] +[2023-10-11 18:46:26,367][85176] Updated weights for policy 0, policy_version 90572 (0.0010) +[2023-10-11 18:46:26,735][85176] Updated weights for policy 0, policy_version 90582 (0.0007) +[2023-10-11 18:46:27,122][85176] Updated weights for policy 0, policy_version 90592 (0.0007) +[2023-10-11 18:46:28,597][85175] Updated weights for policy 1, policy_version 91910 (0.0008) +[2023-10-11 18:46:28,957][85175] Updated weights for policy 1, policy_version 91920 (0.0007) +[2023-10-11 18:46:29,320][85175] Updated weights for policy 1, policy_version 91930 (0.0009) +[2023-10-11 18:46:31,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 186908672. Throughput: 0: 1682.1, 1: 1693.9. Samples: 46739844. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 18:46:31,063][84230] Avg episode reward: [(0, '43.810'), (1, '43.710')] +[2023-10-11 18:46:31,370][85176] Updated weights for policy 0, policy_version 90602 (0.0007) +[2023-10-11 18:46:31,749][85176] Updated weights for policy 0, policy_version 90612 (0.0009) +[2023-10-11 18:46:32,134][85176] Updated weights for policy 0, policy_version 90622 (0.0008) +[2023-10-11 18:46:33,473][85175] Updated weights for policy 1, policy_version 91940 (0.0010) +[2023-10-11 18:46:33,838][85175] Updated weights for policy 1, policy_version 91950 (0.0009) +[2023-10-11 18:46:34,204][85175] Updated weights for policy 1, policy_version 91960 (0.0008) +[2023-10-11 18:46:36,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 186974208. Throughput: 0: 1681.1, 1: 1699.4. Samples: 46750022. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 18:46:36,063][84230] Avg episode reward: [(0, '46.250'), (1, '43.450')] +[2023-10-11 18:46:36,117][85176] Updated weights for policy 0, policy_version 90632 (0.0009) +[2023-10-11 18:46:36,507][85176] Updated weights for policy 0, policy_version 90642 (0.0007) +[2023-10-11 18:46:36,876][85176] Updated weights for policy 0, policy_version 90652 (0.0007) +[2023-10-11 18:46:38,301][85175] Updated weights for policy 1, policy_version 91970 (0.0007) +[2023-10-11 18:46:38,668][85175] Updated weights for policy 1, policy_version 91980 (0.0008) +[2023-10-11 18:46:39,030][85175] Updated weights for policy 1, policy_version 91990 (0.0009) +[2023-10-11 18:46:39,399][85175] Updated weights for policy 1, policy_version 92000 (0.0009) +[2023-10-11 18:46:40,762][85176] Updated weights for policy 0, policy_version 90662 (0.0008) +[2023-10-11 18:46:41,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 187039744. Throughput: 0: 1685.1, 1: 1674.3. Samples: 46769722. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) +[2023-10-11 18:46:41,063][84230] Avg episode reward: [(0, '44.150'), (1, '43.180')] +[2023-10-11 18:46:41,135][85176] Updated weights for policy 0, policy_version 90672 (0.0007) +[2023-10-11 18:46:41,507][85176] Updated weights for policy 0, policy_version 90682 (0.0007) +[2023-10-11 18:46:43,449][85175] Updated weights for policy 1, policy_version 92010 (0.0008) +[2023-10-11 18:46:43,806][85175] Updated weights for policy 1, policy_version 92020 (0.0008) +[2023-10-11 18:46:44,172][85175] Updated weights for policy 1, policy_version 92030 (0.0008) +[2023-10-11 18:46:45,473][85176] Updated weights for policy 0, policy_version 90692 (0.0010) +[2023-10-11 18:46:45,848][85176] Updated weights for policy 0, policy_version 90702 (0.0008) +[2023-10-11 18:46:46,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 187105280. Throughput: 0: 1679.7, 1: 1698.9. Samples: 46790294. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:46:46,063][84230] Avg episode reward: [(0, '47.190'), (1, '42.130')] +[2023-10-11 18:46:46,222][85176] Updated weights for policy 0, policy_version 90712 (0.0008) +[2023-10-11 18:46:48,209][85175] Updated weights for policy 1, policy_version 92040 (0.0009) +[2023-10-11 18:46:48,589][85175] Updated weights for policy 1, policy_version 92050 (0.0008) +[2023-10-11 18:46:48,948][85175] Updated weights for policy 1, policy_version 92060 (0.0010) +[2023-10-11 18:46:50,282][85176] Updated weights for policy 0, policy_version 90722 (0.0008) +[2023-10-11 18:46:50,662][85176] Updated weights for policy 0, policy_version 90732 (0.0009) +[2023-10-11 18:46:51,047][85176] Updated weights for policy 0, policy_version 90742 (0.0009) +[2023-10-11 18:46:51,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 187170816. Throughput: 0: 1687.3, 1: 1684.2. Samples: 46800304. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:46:51,063][84230] Avg episode reward: [(0, '44.920'), (1, '45.050')] +[2023-10-11 18:46:51,413][85176] Updated weights for policy 0, policy_version 90752 (0.0009) +[2023-10-11 18:46:53,075][85175] Updated weights for policy 1, policy_version 92070 (0.0010) +[2023-10-11 18:46:53,437][85175] Updated weights for policy 1, policy_version 92080 (0.0009) +[2023-10-11 18:46:53,810][85175] Updated weights for policy 1, policy_version 92090 (0.0010) +[2023-10-11 18:46:55,532][85176] Updated weights for policy 0, policy_version 90762 (0.0007) +[2023-10-11 18:46:55,902][85176] Updated weights for policy 0, policy_version 90772 (0.0008) +[2023-10-11 18:46:56,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 187236352. Throughput: 0: 1691.4, 1: 1679.6. Samples: 46820502. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:46:56,063][84230] Avg episode reward: [(0, '46.480'), (1, '44.410')] +[2023-10-11 18:46:56,278][85176] Updated weights for policy 0, policy_version 90782 (0.0009) +[2023-10-11 18:46:57,715][85175] Updated weights for policy 1, policy_version 92100 (0.0008) +[2023-10-11 18:46:58,083][85175] Updated weights for policy 1, policy_version 92110 (0.0008) +[2023-10-11 18:46:58,441][85175] Updated weights for policy 1, policy_version 92120 (0.0007) +[2023-10-11 18:47:00,302][85176] Updated weights for policy 0, policy_version 90792 (0.0008) +[2023-10-11 18:47:00,664][85176] Updated weights for policy 0, policy_version 90802 (0.0007) +[2023-10-11 18:47:01,045][85176] Updated weights for policy 0, policy_version 90812 (0.0007) +[2023-10-11 18:47:01,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 187301888. Throughput: 0: 1673.5, 1: 1692.0. Samples: 46840606. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:47:01,064][84230] Avg episode reward: [(0, '44.040'), (1, '47.230')] +[2023-10-11 18:47:02,536][85175] Updated weights for policy 1, policy_version 92130 (0.0008) +[2023-10-11 18:47:02,900][85175] Updated weights for policy 1, policy_version 92140 (0.0008) +[2023-10-11 18:47:03,269][85175] Updated weights for policy 1, policy_version 92150 (0.0009) +[2023-10-11 18:47:03,634][85175] Updated weights for policy 1, policy_version 92160 (0.0007) +[2023-10-11 18:47:05,130][85176] Updated weights for policy 0, policy_version 90822 (0.0008) +[2023-10-11 18:47:05,504][85176] Updated weights for policy 0, policy_version 90832 (0.0009) +[2023-10-11 18:47:05,875][85176] Updated weights for policy 0, policy_version 90842 (0.0010) +[2023-10-11 18:47:06,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 187367424. Throughput: 0: 1689.9, 1: 1669.2. Samples: 46850724. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:47:06,063][84230] Avg episode reward: [(0, '45.330'), (1, '45.430')] +[2023-10-11 18:47:07,721][85175] Updated weights for policy 1, policy_version 92170 (0.0010) +[2023-10-11 18:47:08,086][85175] Updated weights for policy 1, policy_version 92180 (0.0010) +[2023-10-11 18:47:08,457][85175] Updated weights for policy 1, policy_version 92190 (0.0011) +[2023-10-11 18:47:09,950][85176] Updated weights for policy 0, policy_version 90852 (0.0008) +[2023-10-11 18:47:10,319][85176] Updated weights for policy 0, policy_version 90862 (0.0010) +[2023-10-11 18:47:10,697][85176] Updated weights for policy 0, policy_version 90872 (0.0010) +[2023-10-11 18:47:11,063][84230] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 187465728. Throughput: 0: 1682.7, 1: 1688.7. Samples: 46871082. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:47:11,063][84230] Avg episode reward: [(0, '46.740'), (1, '44.030')] +[2023-10-11 18:47:12,272][85175] Updated weights for policy 1, policy_version 92200 (0.0009) +[2023-10-11 18:47:12,633][85175] Updated weights for policy 1, policy_version 92210 (0.0008) +[2023-10-11 18:47:13,002][85175] Updated weights for policy 1, policy_version 92220 (0.0009) +[2023-10-11 18:47:14,944][85176] Updated weights for policy 0, policy_version 90882 (0.0009) +[2023-10-11 18:47:15,322][85176] Updated weights for policy 0, policy_version 90892 (0.0008) +[2023-10-11 18:47:15,693][85176] Updated weights for policy 0, policy_version 90902 (0.0010) +[2023-10-11 18:47:16,062][85176] Updated weights for policy 0, policy_version 90912 (0.0008) +[2023-10-11 18:47:16,062][84230] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 187531264. Throughput: 0: 1665.2, 1: 1699.4. Samples: 46891248. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:47:16,063][84230] Avg episode reward: [(0, '44.410'), (1, '41.220')] +[2023-10-11 18:47:16,876][85175] Updated weights for policy 1, policy_version 92230 (0.0008) +[2023-10-11 18:47:17,248][85175] Updated weights for policy 1, policy_version 92240 (0.0010) +[2023-10-11 18:47:17,610][85175] Updated weights for policy 1, policy_version 92250 (0.0008) +[2023-10-11 18:47:20,045][85176] Updated weights for policy 0, policy_version 90922 (0.0009) +[2023-10-11 18:47:20,418][85176] Updated weights for policy 0, policy_version 90932 (0.0009) +[2023-10-11 18:47:20,804][85176] Updated weights for policy 0, policy_version 90942 (0.0009) +[2023-10-11 18:47:21,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 187596800. Throughput: 0: 1683.8, 1: 1680.8. Samples: 46901428. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:47:21,063][84230] Avg episode reward: [(0, '44.430'), (1, '44.190')] +[2023-10-11 18:47:21,548][85175] Updated weights for policy 1, policy_version 92260 (0.0007) +[2023-10-11 18:47:21,912][85175] Updated weights for policy 1, policy_version 92270 (0.0011) +[2023-10-11 18:47:22,289][85175] Updated weights for policy 1, policy_version 92280 (0.0010) +[2023-10-11 18:47:24,713][85176] Updated weights for policy 0, policy_version 90952 (0.0011) +[2023-10-11 18:47:25,078][85176] Updated weights for policy 0, policy_version 90962 (0.0010) +[2023-10-11 18:47:25,448][85176] Updated weights for policy 0, policy_version 90972 (0.0010) +[2023-10-11 18:47:26,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 187662336. Throughput: 0: 1678.3, 1: 1707.5. Samples: 46922082. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:47:26,064][84230] Avg episode reward: [(0, '42.570'), (1, '41.970')] +[2023-10-11 18:47:26,438][85175] Updated weights for policy 1, policy_version 92290 (0.0010) +[2023-10-11 18:47:26,802][85175] Updated weights for policy 1, policy_version 92300 (0.0007) +[2023-10-11 18:47:27,165][85175] Updated weights for policy 1, policy_version 92310 (0.0008) +[2023-10-11 18:47:27,537][85175] Updated weights for policy 1, policy_version 92320 (0.0008) +[2023-10-11 18:47:29,710][85176] Updated weights for policy 0, policy_version 90982 (0.0007) +[2023-10-11 18:47:30,079][85176] Updated weights for policy 0, policy_version 90992 (0.0008) +[2023-10-11 18:47:30,452][85176] Updated weights for policy 0, policy_version 91002 (0.0007) +[2023-10-11 18:47:31,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 187727872. Throughput: 0: 1661.9, 1: 1710.3. Samples: 46942048. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:47:31,064][84230] Avg episode reward: [(0, '45.770'), (1, '43.730')] +[2023-10-11 18:47:31,587][85175] Updated weights for policy 1, policy_version 92330 (0.0008) +[2023-10-11 18:47:31,955][85175] Updated weights for policy 1, policy_version 92340 (0.0008) +[2023-10-11 18:47:32,331][85175] Updated weights for policy 1, policy_version 92350 (0.0011) +[2023-10-11 18:47:34,550][85176] Updated weights for policy 0, policy_version 91012 (0.0009) +[2023-10-11 18:47:34,916][85176] Updated weights for policy 0, policy_version 91022 (0.0009) +[2023-10-11 18:47:35,304][85176] Updated weights for policy 0, policy_version 91032 (0.0011) +[2023-10-11 18:47:36,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 187793408. Throughput: 0: 1681.6, 1: 1692.3. Samples: 46952134. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:47:36,064][84230] Avg episode reward: [(0, '44.930'), (1, '39.950')] +[2023-10-11 18:47:36,481][85175] Updated weights for policy 1, policy_version 92360 (0.0008) +[2023-10-11 18:47:36,849][85175] Updated weights for policy 1, policy_version 92370 (0.0009) +[2023-10-11 18:47:37,217][85175] Updated weights for policy 1, policy_version 92380 (0.0011) +[2023-10-11 18:47:39,238][85176] Updated weights for policy 0, policy_version 91042 (0.0009) +[2023-10-11 18:47:39,625][85176] Updated weights for policy 0, policy_version 91052 (0.0010) +[2023-10-11 18:47:40,004][85176] Updated weights for policy 0, policy_version 91062 (0.0010) +[2023-10-11 18:47:40,377][85176] Updated weights for policy 0, policy_version 91072 (0.0011) +[2023-10-11 18:47:41,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 187858944. Throughput: 0: 1667.5, 1: 1707.4. Samples: 46972370. 
Policy #0 lag: (min: 25.0, avg: 35.7, max: 57.0) +[2023-10-11 18:47:41,063][84230] Avg episode reward: [(0, '49.420'), (1, '41.790')] +[2023-10-11 18:47:41,064][84801] Saving new best policy, reward=49.420! +[2023-10-11 18:47:41,389][85175] Updated weights for policy 1, policy_version 92390 (0.0008) +[2023-10-11 18:47:41,761][85175] Updated weights for policy 1, policy_version 92400 (0.0007) +[2023-10-11 18:47:42,135][85175] Updated weights for policy 1, policy_version 92410 (0.0007) +[2023-10-11 18:47:44,526][85176] Updated weights for policy 0, policy_version 91082 (0.0008) +[2023-10-11 18:47:44,891][85176] Updated weights for policy 0, policy_version 91092 (0.0007) +[2023-10-11 18:47:45,265][85176] Updated weights for policy 0, policy_version 91102 (0.0008) +[2023-10-11 18:47:46,026][85175] Updated weights for policy 1, policy_version 92420 (0.0009) +[2023-10-11 18:47:46,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 187924480. Throughput: 0: 1663.0, 1: 1708.2. Samples: 46992312. Policy #0 lag: (min: 25.0, avg: 35.7, max: 57.0) +[2023-10-11 18:47:46,063][84230] Avg episode reward: [(0, '44.280'), (1, '41.760')] +[2023-10-11 18:47:46,390][85175] Updated weights for policy 1, policy_version 92430 (0.0007) +[2023-10-11 18:47:46,763][85175] Updated weights for policy 1, policy_version 92440 (0.0010) +[2023-10-11 18:47:49,408][85176] Updated weights for policy 0, policy_version 91112 (0.0010) +[2023-10-11 18:47:49,777][85176] Updated weights for policy 0, policy_version 91122 (0.0010) +[2023-10-11 18:47:50,149][85176] Updated weights for policy 0, policy_version 91132 (0.0008) +[2023-10-11 18:47:50,563][85175] Updated weights for policy 1, policy_version 92450 (0.0010) +[2023-10-11 18:47:50,932][85175] Updated weights for policy 1, policy_version 92460 (0.0009) +[2023-10-11 18:47:51,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 187990016. Throughput: 0: 1679.4, 1: 1698.0. 
Samples: 47002706. Policy #0 lag: (min: 25.0, avg: 35.7, max: 57.0) +[2023-10-11 18:47:51,063][84230] Avg episode reward: [(0, '46.780'), (1, '45.210')] +[2023-10-11 18:47:51,297][85175] Updated weights for policy 1, policy_version 92470 (0.0008) +[2023-10-11 18:47:51,666][85175] Updated weights for policy 1, policy_version 92480 (0.0010) +[2023-10-11 18:47:54,096][85176] Updated weights for policy 0, policy_version 91142 (0.0008) +[2023-10-11 18:47:54,468][85176] Updated weights for policy 0, policy_version 91152 (0.0008) +[2023-10-11 18:47:54,834][85176] Updated weights for policy 0, policy_version 91162 (0.0011) +[2023-10-11 18:47:55,776][85175] Updated weights for policy 1, policy_version 92490 (0.0010) +[2023-10-11 18:47:56,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 188055552. Throughput: 0: 1666.8, 1: 1707.9. Samples: 47022940. Policy #0 lag: (min: 25.0, avg: 35.7, max: 57.0) +[2023-10-11 18:47:56,063][84230] Avg episode reward: [(0, '42.460'), (1, '42.750')] +[2023-10-11 18:47:56,137][85175] Updated weights for policy 1, policy_version 92500 (0.0008) +[2023-10-11 18:47:56,506][85175] Updated weights for policy 1, policy_version 92510 (0.0010) +[2023-10-11 18:47:58,819][85176] Updated weights for policy 0, policy_version 91172 (0.0010) +[2023-10-11 18:47:59,195][85176] Updated weights for policy 0, policy_version 91182 (0.0010) +[2023-10-11 18:47:59,563][85176] Updated weights for policy 0, policy_version 91192 (0.0009) +[2023-10-11 18:48:00,472][85175] Updated weights for policy 1, policy_version 92520 (0.0007) +[2023-10-11 18:48:00,837][85175] Updated weights for policy 1, policy_version 92530 (0.0008) +[2023-10-11 18:48:01,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13329.4). Total num frames: 188121088. Throughput: 0: 1673.0, 1: 1699.0. Samples: 47042990. 
Policy #0 lag: (min: 25.0, avg: 35.7, max: 57.0) +[2023-10-11 18:48:01,063][84230] Avg episode reward: [(0, '47.490'), (1, '45.210')] +[2023-10-11 18:48:01,209][85175] Updated weights for policy 1, policy_version 92540 (0.0007) +[2023-10-11 18:48:03,713][85176] Updated weights for policy 0, policy_version 91202 (0.0008) +[2023-10-11 18:48:04,091][85176] Updated weights for policy 0, policy_version 91212 (0.0009) +[2023-10-11 18:48:04,457][85176] Updated weights for policy 0, policy_version 91222 (0.0009) +[2023-10-11 18:48:04,830][85176] Updated weights for policy 0, policy_version 91232 (0.0008) +[2023-10-11 18:48:05,249][85175] Updated weights for policy 1, policy_version 92550 (0.0009) +[2023-10-11 18:48:05,614][85175] Updated weights for policy 1, policy_version 92560 (0.0009) +[2023-10-11 18:48:05,985][85175] Updated weights for policy 1, policy_version 92570 (0.0007) +[2023-10-11 18:48:06,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 188186624. Throughput: 0: 1683.9, 1: 1699.6. Samples: 47053684. Policy #0 lag: (min: 25.0, avg: 35.7, max: 57.0) +[2023-10-11 18:48:06,063][84230] Avg episode reward: [(0, '43.160'), (1, '42.680')] +[2023-10-11 18:48:08,925][85176] Updated weights for policy 0, policy_version 91242 (0.0007) +[2023-10-11 18:48:09,298][85176] Updated weights for policy 0, policy_version 91252 (0.0008) +[2023-10-11 18:48:09,671][85176] Updated weights for policy 0, policy_version 91262 (0.0007) +[2023-10-11 18:48:10,194][85175] Updated weights for policy 1, policy_version 92580 (0.0010) +[2023-10-11 18:48:10,561][85175] Updated weights for policy 1, policy_version 92590 (0.0007) +[2023-10-11 18:48:10,921][85175] Updated weights for policy 1, policy_version 92600 (0.0008) +[2023-10-11 18:48:11,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 188252160. Throughput: 0: 1658.5, 1: 1699.4. Samples: 47073186. 
Policy #0 lag: (min: 25.0, avg: 35.7, max: 57.0) +[2023-10-11 18:48:11,063][84230] Avg episode reward: [(0, '45.870'), (1, '45.680')] +[2023-10-11 18:48:13,943][85176] Updated weights for policy 0, policy_version 91272 (0.0007) +[2023-10-11 18:48:14,301][85176] Updated weights for policy 0, policy_version 91282 (0.0010) +[2023-10-11 18:48:14,673][85176] Updated weights for policy 0, policy_version 91292 (0.0008) +[2023-10-11 18:48:14,785][85175] Updated weights for policy 1, policy_version 92610 (0.0010) +[2023-10-11 18:48:15,156][85175] Updated weights for policy 1, policy_version 92620 (0.0009) +[2023-10-11 18:48:15,520][85175] Updated weights for policy 1, policy_version 92630 (0.0009) +[2023-10-11 18:48:15,889][85175] Updated weights for policy 1, policy_version 92640 (0.0007) +[2023-10-11 18:48:16,063][84230] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 188350464. Throughput: 0: 1677.2, 1: 1680.8. Samples: 47093158. Policy #0 lag: (min: 25.0, avg: 35.7, max: 57.0) +[2023-10-11 18:48:16,063][84230] Avg episode reward: [(0, '44.550'), (1, '43.990')] +[2023-10-11 18:48:16,071][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000091296_93487104.pth... +[2023-10-11 18:48:16,071][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000092640_94863360.pth... 
+[2023-10-11 18:48:16,101][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000089728_91881472.pth +[2023-10-11 18:48:16,110][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000091040_93224960.pth +[2023-10-11 18:48:18,594][85176] Updated weights for policy 0, policy_version 91302 (0.0007) +[2023-10-11 18:48:18,961][85176] Updated weights for policy 0, policy_version 91312 (0.0008) +[2023-10-11 18:48:19,332][85176] Updated weights for policy 0, policy_version 91322 (0.0011) +[2023-10-11 18:48:20,004][85175] Updated weights for policy 1, policy_version 92650 (0.0007) +[2023-10-11 18:48:20,368][85175] Updated weights for policy 1, policy_version 92660 (0.0011) +[2023-10-11 18:48:20,737][85175] Updated weights for policy 1, policy_version 92670 (0.0010) +[2023-10-11 18:48:21,062][84230] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 188416000. Throughput: 0: 1681.7, 1: 1706.0. Samples: 47104576. Policy #0 lag: (min: 25.0, avg: 35.7, max: 57.0) +[2023-10-11 18:48:21,063][84230] Avg episode reward: [(0, '45.890'), (1, '43.250')] +[2023-10-11 18:48:23,413][85176] Updated weights for policy 0, policy_version 91332 (0.0009) +[2023-10-11 18:48:23,781][85176] Updated weights for policy 0, policy_version 91342 (0.0007) +[2023-10-11 18:48:24,156][85176] Updated weights for policy 0, policy_version 91352 (0.0008) +[2023-10-11 18:48:24,816][85175] Updated weights for policy 1, policy_version 92680 (0.0008) +[2023-10-11 18:48:25,186][85175] Updated weights for policy 1, policy_version 92690 (0.0007) +[2023-10-11 18:48:25,564][85175] Updated weights for policy 1, policy_version 92700 (0.0008) +[2023-10-11 18:48:26,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 188481536. Throughput: 0: 1665.5, 1: 1709.1. Samples: 47124228. 
Policy #0 lag: (min: 25.0, avg: 35.7, max: 57.0) +[2023-10-11 18:48:26,064][84230] Avg episode reward: [(0, '43.720'), (1, '42.180')] +[2023-10-11 18:48:28,263][85176] Updated weights for policy 0, policy_version 91362 (0.0007) +[2023-10-11 18:48:28,661][85176] Updated weights for policy 0, policy_version 91372 (0.0007) +[2023-10-11 18:48:29,033][85176] Updated weights for policy 0, policy_version 91382 (0.0007) +[2023-10-11 18:48:29,178][85175] Updated weights for policy 1, policy_version 92710 (0.0008) +[2023-10-11 18:48:29,405][85176] Updated weights for policy 0, policy_version 91392 (0.0008) +[2023-10-11 18:48:29,542][85175] Updated weights for policy 1, policy_version 92720 (0.0008) +[2023-10-11 18:48:29,902][85175] Updated weights for policy 1, policy_version 92730 (0.0007) +[2023-10-11 18:48:31,063][84230] Fps is (10 sec: 13106.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 188547072. Throughput: 0: 1685.2, 1: 1684.3. Samples: 47143940. Policy #0 lag: (min: 25.0, avg: 35.7, max: 57.0) +[2023-10-11 18:48:31,064][84230] Avg episode reward: [(0, '42.080'), (1, '45.790')] +[2023-10-11 18:48:33,398][85176] Updated weights for policy 0, policy_version 91402 (0.0011) +[2023-10-11 18:48:33,773][85176] Updated weights for policy 0, policy_version 91412 (0.0011) +[2023-10-11 18:48:33,920][85175] Updated weights for policy 1, policy_version 92740 (0.0008) +[2023-10-11 18:48:34,152][85176] Updated weights for policy 0, policy_version 91422 (0.0008) +[2023-10-11 18:48:34,281][85175] Updated weights for policy 1, policy_version 92750 (0.0008) +[2023-10-11 18:48:34,652][85175] Updated weights for policy 1, policy_version 92760 (0.0007) +[2023-10-11 18:48:36,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 188612608. Throughput: 0: 1666.1, 1: 1720.0. Samples: 47155082. 
Policy #0 lag: (min: 25.0, avg: 35.7, max: 57.0) +[2023-10-11 18:48:36,064][84230] Avg episode reward: [(0, '43.960'), (1, '43.900')] +[2023-10-11 18:48:38,271][85176] Updated weights for policy 0, policy_version 91432 (0.0008) +[2023-10-11 18:48:38,638][85176] Updated weights for policy 0, policy_version 91442 (0.0009) +[2023-10-11 18:48:38,710][85175] Updated weights for policy 1, policy_version 92770 (0.0010) +[2023-10-11 18:48:39,008][85176] Updated weights for policy 0, policy_version 91452 (0.0010) +[2023-10-11 18:48:39,086][85175] Updated weights for policy 1, policy_version 92780 (0.0009) +[2023-10-11 18:48:39,442][85175] Updated weights for policy 1, policy_version 92790 (0.0009) +[2023-10-11 18:48:39,808][85175] Updated weights for policy 1, policy_version 92800 (0.0007) +[2023-10-11 18:48:41,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 188678144. Throughput: 0: 1660.0, 1: 1697.9. Samples: 47174048. Policy #0 lag: (min: 10.0, avg: 10.0, max: 12.0) +[2023-10-11 18:48:41,064][84230] Avg episode reward: [(0, '41.750'), (1, '44.080')] +[2023-10-11 18:48:43,107][85176] Updated weights for policy 0, policy_version 91462 (0.0008) +[2023-10-11 18:48:43,476][85176] Updated weights for policy 0, policy_version 91472 (0.0008) +[2023-10-11 18:48:43,843][85176] Updated weights for policy 0, policy_version 91482 (0.0008) +[2023-10-11 18:48:44,051][85175] Updated weights for policy 1, policy_version 92810 (0.0008) +[2023-10-11 18:48:44,422][85175] Updated weights for policy 1, policy_version 92820 (0.0010) +[2023-10-11 18:48:44,790][85175] Updated weights for policy 1, policy_version 92830 (0.0010) +[2023-10-11 18:48:46,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 188743680. Throughput: 0: 1669.5, 1: 1691.7. Samples: 47194246. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 12.0) +[2023-10-11 18:48:46,064][84230] Avg episode reward: [(0, '44.290'), (1, '41.260')] +[2023-10-11 18:48:47,999][85176] Updated weights for policy 0, policy_version 91492 (0.0009) +[2023-10-11 18:48:48,370][85176] Updated weights for policy 0, policy_version 91502 (0.0007) +[2023-10-11 18:48:48,736][85175] Updated weights for policy 1, policy_version 92840 (0.0009) +[2023-10-11 18:48:48,744][85176] Updated weights for policy 0, policy_version 91512 (0.0007) +[2023-10-11 18:48:49,101][85175] Updated weights for policy 1, policy_version 92850 (0.0009) +[2023-10-11 18:48:49,461][85175] Updated weights for policy 1, policy_version 92860 (0.0010) +[2023-10-11 18:48:51,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 188809216. Throughput: 0: 1652.0, 1: 1711.7. Samples: 47205052. Policy #0 lag: (min: 10.0, avg: 10.0, max: 12.0) +[2023-10-11 18:48:51,063][84230] Avg episode reward: [(0, '41.290'), (1, '44.530')] +[2023-10-11 18:48:52,872][85176] Updated weights for policy 0, policy_version 91522 (0.0008) +[2023-10-11 18:48:53,251][85176] Updated weights for policy 0, policy_version 91532 (0.0010) +[2023-10-11 18:48:53,440][85175] Updated weights for policy 1, policy_version 92870 (0.0009) +[2023-10-11 18:48:53,623][85176] Updated weights for policy 0, policy_version 91542 (0.0007) +[2023-10-11 18:48:53,798][85175] Updated weights for policy 1, policy_version 92880 (0.0008) +[2023-10-11 18:48:53,995][85176] Updated weights for policy 0, policy_version 91552 (0.0008) +[2023-10-11 18:48:54,162][85175] Updated weights for policy 1, policy_version 92890 (0.0009) +[2023-10-11 18:48:56,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 188874752. Throughput: 0: 1665.3, 1: 1687.1. Samples: 47224042. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 12.0) +[2023-10-11 18:48:56,064][84230] Avg episode reward: [(0, '47.400'), (1, '40.930')] +[2023-10-11 18:48:58,091][85175] Updated weights for policy 1, policy_version 92900 (0.0008) +[2023-10-11 18:48:58,131][85176] Updated weights for policy 0, policy_version 91562 (0.0009) +[2023-10-11 18:48:58,455][85175] Updated weights for policy 1, policy_version 92910 (0.0009) +[2023-10-11 18:48:58,502][85176] Updated weights for policy 0, policy_version 91572 (0.0008) +[2023-10-11 18:48:58,820][85175] Updated weights for policy 1, policy_version 92920 (0.0008) +[2023-10-11 18:48:58,874][85176] Updated weights for policy 0, policy_version 91582 (0.0008) +[2023-10-11 18:49:01,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 188940288. Throughput: 0: 1666.2, 1: 1698.3. Samples: 47244560. Policy #0 lag: (min: 10.0, avg: 10.0, max: 12.0) +[2023-10-11 18:49:01,064][84230] Avg episode reward: [(0, '42.940'), (1, '43.760')] +[2023-10-11 18:49:02,915][85175] Updated weights for policy 1, policy_version 92930 (0.0010) +[2023-10-11 18:49:03,039][85176] Updated weights for policy 0, policy_version 91592 (0.0010) +[2023-10-11 18:49:03,283][85175] Updated weights for policy 1, policy_version 92940 (0.0007) +[2023-10-11 18:49:03,409][85176] Updated weights for policy 0, policy_version 91602 (0.0007) +[2023-10-11 18:49:03,651][85175] Updated weights for policy 1, policy_version 92950 (0.0008) +[2023-10-11 18:49:03,786][85176] Updated weights for policy 0, policy_version 91612 (0.0008) +[2023-10-11 18:49:04,021][85175] Updated weights for policy 1, policy_version 92960 (0.0009) +[2023-10-11 18:49:06,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 189005824. Throughput: 0: 1650.3, 1: 1690.2. Samples: 47254896. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 12.0) +[2023-10-11 18:49:06,063][84230] Avg episode reward: [(0, '46.770'), (1, '41.090')] +[2023-10-11 18:49:07,725][85176] Updated weights for policy 0, policy_version 91622 (0.0008) +[2023-10-11 18:49:08,090][85176] Updated weights for policy 0, policy_version 91632 (0.0007) +[2023-10-11 18:49:08,161][85175] Updated weights for policy 1, policy_version 92970 (0.0008) +[2023-10-11 18:49:08,468][85176] Updated weights for policy 0, policy_version 91642 (0.0008) +[2023-10-11 18:49:08,529][85175] Updated weights for policy 1, policy_version 92980 (0.0008) +[2023-10-11 18:49:08,890][85175] Updated weights for policy 1, policy_version 92990 (0.0008) +[2023-10-11 18:49:11,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 189071360. Throughput: 0: 1661.5, 1: 1674.1. Samples: 47274330. Policy #0 lag: (min: 10.0, avg: 10.0, max: 12.0) +[2023-10-11 18:49:11,064][84230] Avg episode reward: [(0, '42.830'), (1, '45.680')] +[2023-10-11 18:49:12,780][85176] Updated weights for policy 0, policy_version 91652 (0.0009) +[2023-10-11 18:49:13,044][85175] Updated weights for policy 1, policy_version 93000 (0.0008) +[2023-10-11 18:49:13,171][85176] Updated weights for policy 0, policy_version 91662 (0.0009) +[2023-10-11 18:49:13,416][85175] Updated weights for policy 1, policy_version 93010 (0.0009) +[2023-10-11 18:49:13,530][85176] Updated weights for policy 0, policy_version 91672 (0.0007) +[2023-10-11 18:49:13,782][85175] Updated weights for policy 1, policy_version 93020 (0.0009) +[2023-10-11 18:49:16,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 189136896. Throughput: 0: 1658.7, 1: 1688.4. Samples: 47294562. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 12.0) +[2023-10-11 18:49:16,064][84230] Avg episode reward: [(0, '46.210'), (1, '42.300')] +[2023-10-11 18:49:17,453][85176] Updated weights for policy 0, policy_version 91682 (0.0007) +[2023-10-11 18:49:17,824][85176] Updated weights for policy 0, policy_version 91692 (0.0007) +[2023-10-11 18:49:17,896][85175] Updated weights for policy 1, policy_version 93030 (0.0009) +[2023-10-11 18:49:18,196][85176] Updated weights for policy 0, policy_version 91702 (0.0008) +[2023-10-11 18:49:18,254][85175] Updated weights for policy 1, policy_version 93040 (0.0007) +[2023-10-11 18:49:18,562][85176] Updated weights for policy 0, policy_version 91712 (0.0009) +[2023-10-11 18:49:18,624][85175] Updated weights for policy 1, policy_version 93050 (0.0007) +[2023-10-11 18:49:21,063][84230] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 189202432. Throughput: 0: 1652.7, 1: 1662.3. Samples: 47304256. Policy #0 lag: (min: 10.0, avg: 10.0, max: 12.0) +[2023-10-11 18:49:21,063][84230] Avg episode reward: [(0, '41.960'), (1, '46.560')] +[2023-10-11 18:49:22,656][85176] Updated weights for policy 0, policy_version 91722 (0.0009) +[2023-10-11 18:49:22,860][85175] Updated weights for policy 1, policy_version 93060 (0.0009) +[2023-10-11 18:49:23,027][85176] Updated weights for policy 0, policy_version 91732 (0.0009) +[2023-10-11 18:49:23,224][85175] Updated weights for policy 1, policy_version 93070 (0.0008) +[2023-10-11 18:49:23,401][85176] Updated weights for policy 0, policy_version 91742 (0.0008) +[2023-10-11 18:49:23,588][85175] Updated weights for policy 1, policy_version 93080 (0.0008) +[2023-10-11 18:49:26,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 189267968. Throughput: 0: 1671.0, 1: 1669.2. Samples: 47324354. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 12.0) +[2023-10-11 18:49:26,063][84230] Avg episode reward: [(0, '46.570'), (1, '41.620')] +[2023-10-11 18:49:27,543][85175] Updated weights for policy 1, policy_version 93090 (0.0008) +[2023-10-11 18:49:27,607][85176] Updated weights for policy 0, policy_version 91752 (0.0008) +[2023-10-11 18:49:27,917][85175] Updated weights for policy 1, policy_version 93100 (0.0009) +[2023-10-11 18:49:27,983][85176] Updated weights for policy 0, policy_version 91762 (0.0010) +[2023-10-11 18:49:28,282][85175] Updated weights for policy 1, policy_version 93110 (0.0008) +[2023-10-11 18:49:28,358][85176] Updated weights for policy 0, policy_version 91772 (0.0007) +[2023-10-11 18:49:28,648][85175] Updated weights for policy 1, policy_version 93120 (0.0007) +[2023-10-11 18:49:31,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 189333504. Throughput: 0: 1671.2, 1: 1681.9. Samples: 47345136. Policy #0 lag: (min: 10.0, avg: 10.0, max: 12.0) +[2023-10-11 18:49:31,064][84230] Avg episode reward: [(0, '42.890'), (1, '42.780')] +[2023-10-11 18:49:32,445][85176] Updated weights for policy 0, policy_version 91782 (0.0008) +[2023-10-11 18:49:32,691][85175] Updated weights for policy 1, policy_version 93130 (0.0009) +[2023-10-11 18:49:32,822][85176] Updated weights for policy 0, policy_version 91792 (0.0009) +[2023-10-11 18:49:33,058][85175] Updated weights for policy 1, policy_version 93140 (0.0008) +[2023-10-11 18:49:33,201][85176] Updated weights for policy 0, policy_version 91802 (0.0008) +[2023-10-11 18:49:33,426][85175] Updated weights for policy 1, policy_version 93150 (0.0009) +[2023-10-11 18:49:36,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 189399040. Throughput: 0: 1657.6, 1: 1653.6. Samples: 47354056. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 12.0) +[2023-10-11 18:49:36,063][84230] Avg episode reward: [(0, '45.990'), (1, '39.400')] +[2023-10-11 18:49:37,330][85175] Updated weights for policy 1, policy_version 93160 (0.0008) +[2023-10-11 18:49:37,424][85176] Updated weights for policy 0, policy_version 91812 (0.0009) +[2023-10-11 18:49:37,702][85175] Updated weights for policy 1, policy_version 93170 (0.0009) +[2023-10-11 18:49:37,797][85176] Updated weights for policy 0, policy_version 91822 (0.0007) +[2023-10-11 18:49:38,060][85175] Updated weights for policy 1, policy_version 93180 (0.0008) +[2023-10-11 18:49:38,166][85176] Updated weights for policy 0, policy_version 91832 (0.0007) +[2023-10-11 18:49:41,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 189464576. Throughput: 0: 1668.4, 1: 1677.1. Samples: 47374588. Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) +[2023-10-11 18:49:41,064][84230] Avg episode reward: [(0, '44.400'), (1, '41.730')] +[2023-10-11 18:49:42,151][85176] Updated weights for policy 0, policy_version 91842 (0.0009) +[2023-10-11 18:49:42,256][85175] Updated weights for policy 1, policy_version 93190 (0.0009) +[2023-10-11 18:49:42,525][85176] Updated weights for policy 0, policy_version 91852 (0.0008) +[2023-10-11 18:49:42,616][85175] Updated weights for policy 1, policy_version 93200 (0.0009) +[2023-10-11 18:49:42,888][85176] Updated weights for policy 0, policy_version 91862 (0.0009) +[2023-10-11 18:49:42,983][85175] Updated weights for policy 1, policy_version 93210 (0.0007) +[2023-10-11 18:49:43,261][85176] Updated weights for policy 0, policy_version 91872 (0.0008) +[2023-10-11 18:49:46,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 189530112. Throughput: 0: 1676.9, 1: 1672.2. Samples: 47395270. 
Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) +[2023-10-11 18:49:46,064][84230] Avg episode reward: [(0, '46.560'), (1, '41.120')] +[2023-10-11 18:49:47,141][85176] Updated weights for policy 0, policy_version 91882 (0.0009) +[2023-10-11 18:49:47,209][85175] Updated weights for policy 1, policy_version 93220 (0.0009) +[2023-10-11 18:49:47,499][85176] Updated weights for policy 0, policy_version 91892 (0.0009) +[2023-10-11 18:49:47,573][85175] Updated weights for policy 1, policy_version 93230 (0.0009) +[2023-10-11 18:49:47,881][85176] Updated weights for policy 0, policy_version 91902 (0.0007) +[2023-10-11 18:49:47,940][85175] Updated weights for policy 1, policy_version 93240 (0.0008) +[2023-10-11 18:49:51,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 189595648. Throughput: 0: 1665.5, 1: 1657.8. Samples: 47404444. Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) +[2023-10-11 18:49:51,064][84230] Avg episode reward: [(0, '45.100'), (1, '43.490')] +[2023-10-11 18:49:51,731][85176] Updated weights for policy 0, policy_version 91912 (0.0010) +[2023-10-11 18:49:51,878][85175] Updated weights for policy 1, policy_version 93250 (0.0010) +[2023-10-11 18:49:52,103][85176] Updated weights for policy 0, policy_version 91922 (0.0007) +[2023-10-11 18:49:52,247][85175] Updated weights for policy 1, policy_version 93260 (0.0008) +[2023-10-11 18:49:52,476][85176] Updated weights for policy 0, policy_version 91932 (0.0008) +[2023-10-11 18:49:52,612][85175] Updated weights for policy 1, policy_version 93270 (0.0009) +[2023-10-11 18:49:52,974][85175] Updated weights for policy 1, policy_version 93280 (0.0008) +[2023-10-11 18:49:56,063][84230] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 189661184. Throughput: 0: 1681.8, 1: 1675.3. Samples: 47425400. 
Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) +[2023-10-11 18:49:56,063][84230] Avg episode reward: [(0, '46.740'), (1, '45.180')] +[2023-10-11 18:49:56,615][85176] Updated weights for policy 0, policy_version 91942 (0.0007) +[2023-10-11 18:49:56,950][85175] Updated weights for policy 1, policy_version 93290 (0.0008) +[2023-10-11 18:49:56,992][85176] Updated weights for policy 0, policy_version 91952 (0.0010) +[2023-10-11 18:49:57,314][85175] Updated weights for policy 1, policy_version 93300 (0.0009) +[2023-10-11 18:49:57,374][85176] Updated weights for policy 0, policy_version 91962 (0.0008) +[2023-10-11 18:49:57,669][85175] Updated weights for policy 1, policy_version 93310 (0.0009) +[2023-10-11 18:50:01,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 189726720. Throughput: 0: 1680.3, 1: 1690.7. Samples: 47446254. Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) +[2023-10-11 18:50:01,063][84230] Avg episode reward: [(0, '45.110'), (1, '44.480')] +[2023-10-11 18:50:01,500][85176] Updated weights for policy 0, policy_version 91972 (0.0008) +[2023-10-11 18:50:01,750][85175] Updated weights for policy 1, policy_version 93320 (0.0008) +[2023-10-11 18:50:01,883][85176] Updated weights for policy 0, policy_version 91982 (0.0007) +[2023-10-11 18:50:02,111][85175] Updated weights for policy 1, policy_version 93330 (0.0008) +[2023-10-11 18:50:02,255][85176] Updated weights for policy 0, policy_version 91992 (0.0007) +[2023-10-11 18:50:02,476][85175] Updated weights for policy 1, policy_version 93340 (0.0009) +[2023-10-11 18:50:06,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 189792256. Throughput: 0: 1673.7, 1: 1683.1. Samples: 47455312. 
Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) +[2023-10-11 18:50:06,063][84230] Avg episode reward: [(0, '45.530'), (1, '44.230')] +[2023-10-11 18:50:06,359][85175] Updated weights for policy 1, policy_version 93350 (0.0009) +[2023-10-11 18:50:06,378][85176] Updated weights for policy 0, policy_version 92002 (0.0009) +[2023-10-11 18:50:06,733][85175] Updated weights for policy 1, policy_version 93360 (0.0007) +[2023-10-11 18:50:06,740][85176] Updated weights for policy 0, policy_version 92012 (0.0009) +[2023-10-11 18:50:07,092][85175] Updated weights for policy 1, policy_version 93370 (0.0008) +[2023-10-11 18:50:07,107][85176] Updated weights for policy 0, policy_version 92022 (0.0009) +[2023-10-11 18:50:07,477][85176] Updated weights for policy 0, policy_version 92032 (0.0008) +[2023-10-11 18:50:11,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 189857792. Throughput: 0: 1677.4, 1: 1699.9. Samples: 47476330. Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) +[2023-10-11 18:50:11,063][84230] Avg episode reward: [(0, '46.070'), (1, '41.760')] +[2023-10-11 18:50:11,167][85175] Updated weights for policy 1, policy_version 93380 (0.0009) +[2023-10-11 18:50:11,532][85175] Updated weights for policy 1, policy_version 93390 (0.0007) +[2023-10-11 18:50:11,616][85176] Updated weights for policy 0, policy_version 92042 (0.0007) +[2023-10-11 18:50:11,897][85175] Updated weights for policy 1, policy_version 93400 (0.0008) +[2023-10-11 18:50:11,995][85176] Updated weights for policy 0, policy_version 92052 (0.0007) +[2023-10-11 18:50:12,359][85176] Updated weights for policy 0, policy_version 92062 (0.0009) +[2023-10-11 18:50:15,888][85175] Updated weights for policy 1, policy_version 93410 (0.0009) +[2023-10-11 18:50:16,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 189923328. Throughput: 0: 1684.1, 1: 1699.6. Samples: 47497400. 
Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) +[2023-10-11 18:50:16,063][84230] Avg episode reward: [(0, '45.870'), (1, '44.320')] +[2023-10-11 18:50:16,259][85175] Updated weights for policy 1, policy_version 93420 (0.0008) +[2023-10-11 18:50:16,404][85176] Updated weights for policy 0, policy_version 92072 (0.0008) +[2023-10-11 18:50:16,623][85175] Updated weights for policy 1, policy_version 93430 (0.0008) +[2023-10-11 18:50:16,784][85176] Updated weights for policy 0, policy_version 92082 (0.0008) +[2023-10-11 18:50:16,982][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000093440_95682560.pth... +[2023-10-11 18:50:16,982][85175] Updated weights for policy 1, policy_version 93440 (0.0009) +[2023-10-11 18:50:17,023][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000091840_94044160.pth +[2023-10-11 18:50:17,164][85176] Updated weights for policy 0, policy_version 92092 (0.0009) +[2023-10-11 18:50:17,306][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000092096_94306304.pth... +[2023-10-11 18:50:17,343][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000090496_92667904.pth +[2023-10-11 18:50:21,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 189988864. Throughput: 0: 1686.0, 1: 1701.3. Samples: 47506484. 
Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) +[2023-10-11 18:50:21,064][84230] Avg episode reward: [(0, '45.680'), (1, '42.130')] +[2023-10-11 18:50:21,175][85176] Updated weights for policy 0, policy_version 92102 (0.0009) +[2023-10-11 18:50:21,188][85175] Updated weights for policy 1, policy_version 93450 (0.0008) +[2023-10-11 18:50:21,545][85176] Updated weights for policy 0, policy_version 92112 (0.0008) +[2023-10-11 18:50:21,551][85175] Updated weights for policy 1, policy_version 93460 (0.0007) +[2023-10-11 18:50:21,917][85175] Updated weights for policy 1, policy_version 93470 (0.0007) +[2023-10-11 18:50:21,918][85176] Updated weights for policy 0, policy_version 92122 (0.0007) +[2023-10-11 18:50:25,903][85175] Updated weights for policy 1, policy_version 93480 (0.0007) +[2023-10-11 18:50:26,010][85176] Updated weights for policy 0, policy_version 92132 (0.0007) +[2023-10-11 18:50:26,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 190054400. Throughput: 0: 1684.7, 1: 1697.8. Samples: 47526802. Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) +[2023-10-11 18:50:26,063][84230] Avg episode reward: [(0, '45.080'), (1, '44.830')] +[2023-10-11 18:50:26,270][85175] Updated weights for policy 1, policy_version 93490 (0.0007) +[2023-10-11 18:50:26,376][85176] Updated weights for policy 0, policy_version 92142 (0.0009) +[2023-10-11 18:50:26,635][85175] Updated weights for policy 1, policy_version 93500 (0.0007) +[2023-10-11 18:50:26,753][85176] Updated weights for policy 0, policy_version 92152 (0.0008) +[2023-10-11 18:50:30,729][85175] Updated weights for policy 1, policy_version 93510 (0.0009) +[2023-10-11 18:50:30,973][85176] Updated weights for policy 0, policy_version 92162 (0.0007) +[2023-10-11 18:50:31,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 190119936. Throughput: 0: 1678.3, 1: 1706.4. Samples: 47547580. 
Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) +[2023-10-11 18:50:31,063][84230] Avg episode reward: [(0, '44.510'), (1, '41.250')] +[2023-10-11 18:50:31,099][85175] Updated weights for policy 1, policy_version 93520 (0.0008) +[2023-10-11 18:50:31,339][85176] Updated weights for policy 0, policy_version 92172 (0.0007) +[2023-10-11 18:50:31,466][85175] Updated weights for policy 1, policy_version 93530 (0.0007) +[2023-10-11 18:50:31,715][85176] Updated weights for policy 0, policy_version 92182 (0.0007) +[2023-10-11 18:50:32,085][85176] Updated weights for policy 0, policy_version 92192 (0.0008) +[2023-10-11 18:50:35,429][85175] Updated weights for policy 1, policy_version 93540 (0.0007) +[2023-10-11 18:50:35,792][85175] Updated weights for policy 1, policy_version 93550 (0.0009) +[2023-10-11 18:50:36,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 190185472. Throughput: 0: 1678.1, 1: 1710.8. Samples: 47556942. Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) +[2023-10-11 18:50:36,063][84230] Avg episode reward: [(0, '45.340'), (1, '44.810')] +[2023-10-11 18:50:36,144][85176] Updated weights for policy 0, policy_version 92202 (0.0008) +[2023-10-11 18:50:36,160][85175] Updated weights for policy 1, policy_version 93560 (0.0008) +[2023-10-11 18:50:36,506][85176] Updated weights for policy 0, policy_version 92212 (0.0009) +[2023-10-11 18:50:36,892][85176] Updated weights for policy 0, policy_version 92222 (0.0009) +[2023-10-11 18:50:40,149][85175] Updated weights for policy 1, policy_version 93570 (0.0008) +[2023-10-11 18:50:40,515][85175] Updated weights for policy 1, policy_version 93580 (0.0008) +[2023-10-11 18:50:40,886][85175] Updated weights for policy 1, policy_version 93590 (0.0007) +[2023-10-11 18:50:41,002][85176] Updated weights for policy 0, policy_version 92232 (0.0008) +[2023-10-11 18:50:41,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 190251008. 
Throughput: 0: 1676.0, 1: 1708.1. Samples: 47577684. Policy #0 lag: (min: 31.0, avg: 37.8, max: 63.0) +[2023-10-11 18:50:41,063][84230] Avg episode reward: [(0, '45.490'), (1, '42.170')] +[2023-10-11 18:50:41,250][85175] Updated weights for policy 1, policy_version 93600 (0.0007) +[2023-10-11 18:50:41,372][85176] Updated weights for policy 0, policy_version 92242 (0.0007) +[2023-10-11 18:50:41,750][85176] Updated weights for policy 0, policy_version 92252 (0.0008) +[2023-10-11 18:50:45,104][85175] Updated weights for policy 1, policy_version 93610 (0.0009) +[2023-10-11 18:50:45,470][85175] Updated weights for policy 1, policy_version 93620 (0.0010) +[2023-10-11 18:50:45,805][85176] Updated weights for policy 0, policy_version 92262 (0.0008) +[2023-10-11 18:50:45,838][85175] Updated weights for policy 1, policy_version 93630 (0.0009) +[2023-10-11 18:50:46,063][84230] Fps is (10 sec: 16383.8, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 190349312. Throughput: 0: 1680.6, 1: 1691.3. Samples: 47597992. Policy #0 lag: (min: 31.0, avg: 37.8, max: 63.0) +[2023-10-11 18:50:46,064][84230] Avg episode reward: [(0, '43.700'), (1, '44.830')] +[2023-10-11 18:50:46,179][85176] Updated weights for policy 0, policy_version 92272 (0.0009) +[2023-10-11 18:50:46,557][85176] Updated weights for policy 0, policy_version 92282 (0.0008) +[2023-10-11 18:50:49,909][85175] Updated weights for policy 1, policy_version 93640 (0.0007) +[2023-10-11 18:50:50,275][85175] Updated weights for policy 1, policy_version 93650 (0.0009) +[2023-10-11 18:50:50,497][85176] Updated weights for policy 0, policy_version 92292 (0.0008) +[2023-10-11 18:50:50,633][85175] Updated weights for policy 1, policy_version 93660 (0.0008) +[2023-10-11 18:50:50,868][85176] Updated weights for policy 0, policy_version 92302 (0.0008) +[2023-10-11 18:50:51,063][84230] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 190414848. Throughput: 0: 1681.5, 1: 1708.4. Samples: 47607856. 
Policy #0 lag: (min: 31.0, avg: 37.8, max: 63.0) +[2023-10-11 18:50:51,064][84230] Avg episode reward: [(0, '43.700'), (1, '44.640')] +[2023-10-11 18:50:51,242][85176] Updated weights for policy 0, policy_version 92312 (0.0008) +[2023-10-11 18:50:54,777][85175] Updated weights for policy 1, policy_version 93670 (0.0007) +[2023-10-11 18:50:55,153][85175] Updated weights for policy 1, policy_version 93680 (0.0007) +[2023-10-11 18:50:55,483][85176] Updated weights for policy 0, policy_version 92322 (0.0008) +[2023-10-11 18:50:55,522][85175] Updated weights for policy 1, policy_version 93690 (0.0007) +[2023-10-11 18:50:55,846][85176] Updated weights for policy 0, policy_version 92332 (0.0008) +[2023-10-11 18:50:56,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 190480384. Throughput: 0: 1682.9, 1: 1702.6. Samples: 47628680. Policy #0 lag: (min: 31.0, avg: 37.8, max: 63.0) +[2023-10-11 18:50:56,064][84230] Avg episode reward: [(0, '42.200'), (1, '46.630')] +[2023-10-11 18:50:56,217][85176] Updated weights for policy 0, policy_version 92342 (0.0008) +[2023-10-11 18:50:56,589][85176] Updated weights for policy 0, policy_version 92352 (0.0007) +[2023-10-11 18:50:59,510][85175] Updated weights for policy 1, policy_version 93700 (0.0008) +[2023-10-11 18:50:59,875][85175] Updated weights for policy 1, policy_version 93710 (0.0008) +[2023-10-11 18:51:00,235][85175] Updated weights for policy 1, policy_version 93720 (0.0010) +[2023-10-11 18:51:00,717][85176] Updated weights for policy 0, policy_version 92362 (0.0008) +[2023-10-11 18:51:01,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 190545920. Throughput: 0: 1675.2, 1: 1671.5. Samples: 47648000. 
Policy #0 lag: (min: 31.0, avg: 37.8, max: 63.0) +[2023-10-11 18:51:01,063][84230] Avg episode reward: [(0, '43.960'), (1, '43.820')] +[2023-10-11 18:51:01,091][85176] Updated weights for policy 0, policy_version 92372 (0.0011) +[2023-10-11 18:51:01,476][85176] Updated weights for policy 0, policy_version 92382 (0.0010) +[2023-10-11 18:51:04,219][85175] Updated weights for policy 1, policy_version 93730 (0.0009) +[2023-10-11 18:51:04,584][85175] Updated weights for policy 1, policy_version 93740 (0.0008) +[2023-10-11 18:51:04,960][85175] Updated weights for policy 1, policy_version 93750 (0.0007) +[2023-10-11 18:51:05,321][85175] Updated weights for policy 1, policy_version 93760 (0.0008) +[2023-10-11 18:51:05,408][85176] Updated weights for policy 0, policy_version 92392 (0.0009) +[2023-10-11 18:51:05,777][85176] Updated weights for policy 0, policy_version 92402 (0.0009) +[2023-10-11 18:51:06,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 190611456. Throughput: 0: 1678.8, 1: 1706.4. Samples: 47658818. Policy #0 lag: (min: 31.0, avg: 37.8, max: 63.0) +[2023-10-11 18:51:06,063][84230] Avg episode reward: [(0, '40.120'), (1, '46.040')] +[2023-10-11 18:51:06,145][85176] Updated weights for policy 0, policy_version 92412 (0.0008) +[2023-10-11 18:51:09,344][85175] Updated weights for policy 1, policy_version 93770 (0.0010) +[2023-10-11 18:51:09,713][85175] Updated weights for policy 1, policy_version 93780 (0.0009) +[2023-10-11 18:51:10,081][85175] Updated weights for policy 1, policy_version 93790 (0.0007) +[2023-10-11 18:51:10,169][85176] Updated weights for policy 0, policy_version 92422 (0.0007) +[2023-10-11 18:51:10,539][85176] Updated weights for policy 0, policy_version 92432 (0.0010) +[2023-10-11 18:51:10,918][85176] Updated weights for policy 0, policy_version 92442 (0.0007) +[2023-10-11 18:51:11,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 190676992. 
Throughput: 0: 1685.0, 1: 1701.6. Samples: 47679198. Policy #0 lag: (min: 31.0, avg: 37.8, max: 63.0) +[2023-10-11 18:51:11,063][84230] Avg episode reward: [(0, '44.320'), (1, '45.880')] +[2023-10-11 18:51:14,140][85175] Updated weights for policy 1, policy_version 93800 (0.0009) +[2023-10-11 18:51:14,512][85175] Updated weights for policy 1, policy_version 93810 (0.0010) +[2023-10-11 18:51:14,785][85176] Updated weights for policy 0, policy_version 92452 (0.0009) +[2023-10-11 18:51:14,877][85175] Updated weights for policy 1, policy_version 93820 (0.0007) +[2023-10-11 18:51:15,155][85176] Updated weights for policy 0, policy_version 92462 (0.0008) +[2023-10-11 18:51:15,520][85176] Updated weights for policy 0, policy_version 92472 (0.0008) +[2023-10-11 18:51:16,063][84230] Fps is (10 sec: 16383.5, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 190775296. Throughput: 0: 1665.7, 1: 1683.6. Samples: 47698298. Policy #0 lag: (min: 31.0, avg: 37.8, max: 63.0) +[2023-10-11 18:51:16,064][84230] Avg episode reward: [(0, '43.540'), (1, '46.510')] +[2023-10-11 18:51:18,870][85175] Updated weights for policy 1, policy_version 93830 (0.0007) +[2023-10-11 18:51:19,233][85175] Updated weights for policy 1, policy_version 93840 (0.0008) +[2023-10-11 18:51:19,600][85175] Updated weights for policy 1, policy_version 93850 (0.0010) +[2023-10-11 18:51:19,663][85176] Updated weights for policy 0, policy_version 92482 (0.0008) +[2023-10-11 18:51:20,040][85176] Updated weights for policy 0, policy_version 92492 (0.0008) +[2023-10-11 18:51:20,402][85176] Updated weights for policy 0, policy_version 92502 (0.0009) +[2023-10-11 18:51:20,769][85176] Updated weights for policy 0, policy_version 92512 (0.0010) +[2023-10-11 18:51:21,063][84230] Fps is (10 sec: 16383.7, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 190840832. Throughput: 0: 1687.1, 1: 1713.4. Samples: 47709962. 
Policy #0 lag: (min: 31.0, avg: 37.8, max: 63.0) +[2023-10-11 18:51:21,064][84230] Avg episode reward: [(0, '46.480'), (1, '45.410')] +[2023-10-11 18:51:23,810][85175] Updated weights for policy 1, policy_version 93860 (0.0010) +[2023-10-11 18:51:24,178][85175] Updated weights for policy 1, policy_version 93870 (0.0008) +[2023-10-11 18:51:24,541][85175] Updated weights for policy 1, policy_version 93880 (0.0007) +[2023-10-11 18:51:24,959][85176] Updated weights for policy 0, policy_version 92522 (0.0009) +[2023-10-11 18:51:25,350][85176] Updated weights for policy 0, policy_version 92532 (0.0009) +[2023-10-11 18:51:25,723][85176] Updated weights for policy 0, policy_version 92542 (0.0009) +[2023-10-11 18:51:26,063][84230] Fps is (10 sec: 13107.5, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 190906368. Throughput: 0: 1684.7, 1: 1690.0. Samples: 47729548. Policy #0 lag: (min: 31.0, avg: 37.8, max: 63.0) +[2023-10-11 18:51:26,064][84230] Avg episode reward: [(0, '43.710'), (1, '45.570')] +[2023-10-11 18:51:28,400][85175] Updated weights for policy 1, policy_version 93890 (0.0007) +[2023-10-11 18:51:28,768][85175] Updated weights for policy 1, policy_version 93900 (0.0008) +[2023-10-11 18:51:29,138][85175] Updated weights for policy 1, policy_version 93910 (0.0009) +[2023-10-11 18:51:29,501][85175] Updated weights for policy 1, policy_version 93920 (0.0008) +[2023-10-11 18:51:29,793][85176] Updated weights for policy 0, policy_version 92552 (0.0007) +[2023-10-11 18:51:30,158][85176] Updated weights for policy 0, policy_version 92562 (0.0009) +[2023-10-11 18:51:30,525][85176] Updated weights for policy 0, policy_version 92572 (0.0010) +[2023-10-11 18:51:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 190971904. Throughput: 0: 1658.8, 1: 1693.7. Samples: 47748858. 
Policy #0 lag: (min: 31.0, avg: 37.8, max: 63.0) +[2023-10-11 18:51:31,064][84230] Avg episode reward: [(0, '45.670'), (1, '42.770')] +[2023-10-11 18:51:33,489][85175] Updated weights for policy 1, policy_version 93930 (0.0008) +[2023-10-11 18:51:33,852][85175] Updated weights for policy 1, policy_version 93940 (0.0008) +[2023-10-11 18:51:34,214][85175] Updated weights for policy 1, policy_version 93950 (0.0010) +[2023-10-11 18:51:34,663][85176] Updated weights for policy 0, policy_version 92582 (0.0009) +[2023-10-11 18:51:35,054][85176] Updated weights for policy 0, policy_version 92592 (0.0008) +[2023-10-11 18:51:35,421][85176] Updated weights for policy 0, policy_version 92602 (0.0007) +[2023-10-11 18:51:36,062][84230] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 191037440. Throughput: 0: 1684.6, 1: 1700.9. Samples: 47760202. Policy #0 lag: (min: 31.0, avg: 36.2, max: 63.0) +[2023-10-11 18:51:36,063][84230] Avg episode reward: [(0, '43.240'), (1, '45.160')] +[2023-10-11 18:51:38,243][85175] Updated weights for policy 1, policy_version 93960 (0.0007) +[2023-10-11 18:51:38,616][85175] Updated weights for policy 1, policy_version 93970 (0.0008) +[2023-10-11 18:51:38,984][85175] Updated weights for policy 1, policy_version 93980 (0.0007) +[2023-10-11 18:51:39,419][85176] Updated weights for policy 0, policy_version 92612 (0.0009) +[2023-10-11 18:51:39,790][85176] Updated weights for policy 0, policy_version 92622 (0.0011) +[2023-10-11 18:51:40,162][85176] Updated weights for policy 0, policy_version 92632 (0.0010) +[2023-10-11 18:51:41,063][84230] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 191102976. Throughput: 0: 1675.5, 1: 1683.2. Samples: 47779824. 
Policy #0 lag: (min: 31.0, avg: 36.2, max: 63.0) +[2023-10-11 18:51:41,064][84230] Avg episode reward: [(0, '45.140'), (1, '42.630')] +[2023-10-11 18:51:42,964][85175] Updated weights for policy 1, policy_version 93990 (0.0007) +[2023-10-11 18:51:43,327][85175] Updated weights for policy 1, policy_version 94000 (0.0007) +[2023-10-11 18:51:43,688][85175] Updated weights for policy 1, policy_version 94010 (0.0007) +[2023-10-11 18:51:44,329][85176] Updated weights for policy 0, policy_version 92642 (0.0009) +[2023-10-11 18:51:44,702][85176] Updated weights for policy 0, policy_version 92652 (0.0010) +[2023-10-11 18:51:45,077][85176] Updated weights for policy 0, policy_version 92662 (0.0009) +[2023-10-11 18:51:45,448][85176] Updated weights for policy 0, policy_version 92672 (0.0010) +[2023-10-11 18:51:46,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 191168512. Throughput: 0: 1656.2, 1: 1714.0. Samples: 47799656. Policy #0 lag: (min: 31.0, avg: 36.2, max: 63.0) +[2023-10-11 18:51:46,063][84230] Avg episode reward: [(0, '43.750'), (1, '43.620')] +[2023-10-11 18:51:47,729][85175] Updated weights for policy 1, policy_version 94020 (0.0007) +[2023-10-11 18:51:48,094][85175] Updated weights for policy 1, policy_version 94030 (0.0008) +[2023-10-11 18:51:48,458][85175] Updated weights for policy 1, policy_version 94040 (0.0008) +[2023-10-11 18:51:49,459][85176] Updated weights for policy 0, policy_version 92682 (0.0009) +[2023-10-11 18:51:49,841][85176] Updated weights for policy 0, policy_version 92692 (0.0009) +[2023-10-11 18:51:50,205][85176] Updated weights for policy 0, policy_version 92702 (0.0008) +[2023-10-11 18:51:51,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 191234048. Throughput: 0: 1684.0, 1: 1688.1. Samples: 47810564. 
Policy #0 lag: (min: 31.0, avg: 36.2, max: 63.0) +[2023-10-11 18:51:51,063][84230] Avg episode reward: [(0, '44.650'), (1, '45.360')] +[2023-10-11 18:51:52,609][85175] Updated weights for policy 1, policy_version 94050 (0.0008) +[2023-10-11 18:51:52,975][85175] Updated weights for policy 1, policy_version 94060 (0.0007) +[2023-10-11 18:51:53,338][85175] Updated weights for policy 1, policy_version 94070 (0.0010) +[2023-10-11 18:51:53,707][85175] Updated weights for policy 1, policy_version 94080 (0.0008) +[2023-10-11 18:51:54,380][85176] Updated weights for policy 0, policy_version 92712 (0.0008) +[2023-10-11 18:51:54,757][85176] Updated weights for policy 0, policy_version 92722 (0.0009) +[2023-10-11 18:51:55,126][85176] Updated weights for policy 0, policy_version 92732 (0.0008) +[2023-10-11 18:51:56,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 191299584. Throughput: 0: 1667.0, 1: 1689.8. Samples: 47830254. Policy #0 lag: (min: 31.0, avg: 36.2, max: 63.0) +[2023-10-11 18:51:56,063][84230] Avg episode reward: [(0, '42.070'), (1, '44.860')] +[2023-10-11 18:51:57,653][85175] Updated weights for policy 1, policy_version 94090 (0.0010) +[2023-10-11 18:51:58,028][85175] Updated weights for policy 1, policy_version 94100 (0.0008) +[2023-10-11 18:51:58,399][85175] Updated weights for policy 1, policy_version 94110 (0.0008) +[2023-10-11 18:51:59,180][85176] Updated weights for policy 0, policy_version 92742 (0.0010) +[2023-10-11 18:51:59,556][85176] Updated weights for policy 0, policy_version 92752 (0.0010) +[2023-10-11 18:51:59,925][85176] Updated weights for policy 0, policy_version 92762 (0.0008) +[2023-10-11 18:52:01,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 191365120. Throughput: 0: 1669.0, 1: 1708.5. Samples: 47850286. 
Policy #0 lag: (min: 31.0, avg: 36.2, max: 63.0) +[2023-10-11 18:52:01,064][84230] Avg episode reward: [(0, '45.230'), (1, '44.120')] +[2023-10-11 18:52:02,494][85175] Updated weights for policy 1, policy_version 94120 (0.0008) +[2023-10-11 18:52:02,865][85175] Updated weights for policy 1, policy_version 94130 (0.0009) +[2023-10-11 18:52:03,227][85175] Updated weights for policy 1, policy_version 94140 (0.0009) +[2023-10-11 18:52:04,077][85176] Updated weights for policy 0, policy_version 92772 (0.0008) +[2023-10-11 18:52:04,458][85176] Updated weights for policy 0, policy_version 92782 (0.0008) +[2023-10-11 18:52:04,834][85176] Updated weights for policy 0, policy_version 92792 (0.0008) +[2023-10-11 18:52:06,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 191430656. Throughput: 0: 1675.6, 1: 1675.8. Samples: 47860774. Policy #0 lag: (min: 31.0, avg: 36.2, max: 63.0) +[2023-10-11 18:52:06,064][84230] Avg episode reward: [(0, '42.040'), (1, '43.590')] +[2023-10-11 18:52:07,229][85175] Updated weights for policy 1, policy_version 94150 (0.0007) +[2023-10-11 18:52:07,595][85175] Updated weights for policy 1, policy_version 94160 (0.0008) +[2023-10-11 18:52:07,970][85175] Updated weights for policy 1, policy_version 94170 (0.0007) +[2023-10-11 18:52:08,865][85176] Updated weights for policy 0, policy_version 92802 (0.0007) +[2023-10-11 18:52:09,239][85176] Updated weights for policy 0, policy_version 92812 (0.0008) +[2023-10-11 18:52:09,605][85176] Updated weights for policy 0, policy_version 92822 (0.0008) +[2023-10-11 18:52:09,971][85176] Updated weights for policy 0, policy_version 92832 (0.0008) +[2023-10-11 18:52:11,063][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 191496192. Throughput: 0: 1660.9, 1: 1700.2. Samples: 47880798. 
Policy #0 lag: (min: 31.0, avg: 36.2, max: 63.0) +[2023-10-11 18:52:11,063][84230] Avg episode reward: [(0, '46.900'), (1, '42.510')] +[2023-10-11 18:52:11,910][85175] Updated weights for policy 1, policy_version 94180 (0.0007) +[2023-10-11 18:52:12,281][85175] Updated weights for policy 1, policy_version 94190 (0.0007) +[2023-10-11 18:52:12,648][85175] Updated weights for policy 1, policy_version 94200 (0.0008) +[2023-10-11 18:52:13,961][85176] Updated weights for policy 0, policy_version 92842 (0.0009) +[2023-10-11 18:52:14,342][85176] Updated weights for policy 0, policy_version 92852 (0.0010) +[2023-10-11 18:52:14,699][85176] Updated weights for policy 0, policy_version 92862 (0.0009) +[2023-10-11 18:52:16,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 191561728. Throughput: 0: 1676.4, 1: 1709.0. Samples: 47901198. Policy #0 lag: (min: 31.0, avg: 36.2, max: 63.0) +[2023-10-11 18:52:16,064][84230] Avg episode reward: [(0, '42.580'), (1, '45.430')] +[2023-10-11 18:52:16,075][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000092864_95092736.pth... +[2023-10-11 18:52:16,075][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000094208_96468992.pth... 
+[2023-10-11 18:52:16,113][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000092640_94863360.pth +[2023-10-11 18:52:16,117][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000091296_93487104.pth +[2023-10-11 18:52:16,655][85175] Updated weights for policy 1, policy_version 94210 (0.0008) +[2023-10-11 18:52:17,028][85175] Updated weights for policy 1, policy_version 94220 (0.0011) +[2023-10-11 18:52:17,400][85175] Updated weights for policy 1, policy_version 94230 (0.0009) +[2023-10-11 18:52:17,760][85175] Updated weights for policy 1, policy_version 94240 (0.0011) +[2023-10-11 18:52:18,905][85176] Updated weights for policy 0, policy_version 92872 (0.0010) +[2023-10-11 18:52:19,284][85176] Updated weights for policy 0, policy_version 92882 (0.0010) +[2023-10-11 18:52:19,657][85176] Updated weights for policy 0, policy_version 92892 (0.0007) +[2023-10-11 18:52:21,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 191627264. Throughput: 0: 1680.4, 1: 1682.5. Samples: 47911536. Policy #0 lag: (min: 31.0, avg: 36.2, max: 63.0) +[2023-10-11 18:52:21,064][84230] Avg episode reward: [(0, '46.830'), (1, '42.480')] +[2023-10-11 18:52:21,788][85175] Updated weights for policy 1, policy_version 94250 (0.0008) +[2023-10-11 18:52:22,153][85175] Updated weights for policy 1, policy_version 94260 (0.0007) +[2023-10-11 18:52:22,521][85175] Updated weights for policy 1, policy_version 94270 (0.0008) +[2023-10-11 18:52:23,617][85176] Updated weights for policy 0, policy_version 92902 (0.0009) +[2023-10-11 18:52:23,996][85176] Updated weights for policy 0, policy_version 92912 (0.0010) +[2023-10-11 18:52:24,362][85176] Updated weights for policy 0, policy_version 92922 (0.0007) +[2023-10-11 18:52:26,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 191692800. Throughput: 0: 1659.3, 1: 1701.1. Samples: 47931042. 
Policy #0 lag: (min: 31.0, avg: 36.2, max: 63.0) +[2023-10-11 18:52:26,063][84230] Avg episode reward: [(0, '42.850'), (1, '44.780')] +[2023-10-11 18:52:26,561][85175] Updated weights for policy 1, policy_version 94280 (0.0009) +[2023-10-11 18:52:26,920][85175] Updated weights for policy 1, policy_version 94290 (0.0008) +[2023-10-11 18:52:27,286][85175] Updated weights for policy 1, policy_version 94300 (0.0010) +[2023-10-11 18:52:28,495][85176] Updated weights for policy 0, policy_version 92932 (0.0008) +[2023-10-11 18:52:28,885][85176] Updated weights for policy 0, policy_version 92942 (0.0008) +[2023-10-11 18:52:29,268][85176] Updated weights for policy 0, policy_version 92952 (0.0008) +[2023-10-11 18:52:31,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 191758336. Throughput: 0: 1677.0, 1: 1702.6. Samples: 47951738. Policy #0 lag: (min: 31.0, avg: 36.2, max: 63.0) +[2023-10-11 18:52:31,063][84230] Avg episode reward: [(0, '43.460'), (1, '43.690')] +[2023-10-11 18:52:31,453][85175] Updated weights for policy 1, policy_version 94310 (0.0009) +[2023-10-11 18:52:31,819][85175] Updated weights for policy 1, policy_version 94320 (0.0010) +[2023-10-11 18:52:32,191][85175] Updated weights for policy 1, policy_version 94330 (0.0008) +[2023-10-11 18:52:33,302][85176] Updated weights for policy 0, policy_version 92962 (0.0007) +[2023-10-11 18:52:33,677][85176] Updated weights for policy 0, policy_version 92972 (0.0007) +[2023-10-11 18:52:34,040][85176] Updated weights for policy 0, policy_version 92982 (0.0008) +[2023-10-11 18:52:34,418][85176] Updated weights for policy 0, policy_version 92992 (0.0008) +[2023-10-11 18:52:36,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 191823872. Throughput: 0: 1666.1, 1: 1695.5. Samples: 47961836. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:52:36,063][84230] Avg episode reward: [(0, '43.420'), (1, '44.880')] +[2023-10-11 18:52:36,081][85175] Updated weights for policy 1, policy_version 94340 (0.0010) +[2023-10-11 18:52:36,456][85175] Updated weights for policy 1, policy_version 94350 (0.0010) +[2023-10-11 18:52:36,813][85175] Updated weights for policy 1, policy_version 94360 (0.0010) +[2023-10-11 18:52:38,404][85176] Updated weights for policy 0, policy_version 93002 (0.0007) +[2023-10-11 18:52:38,774][85176] Updated weights for policy 0, policy_version 93012 (0.0008) +[2023-10-11 18:52:39,148][85176] Updated weights for policy 0, policy_version 93022 (0.0007) +[2023-10-11 18:52:40,847][85175] Updated weights for policy 1, policy_version 94370 (0.0010) +[2023-10-11 18:52:41,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 191889408. Throughput: 0: 1664.3, 1: 1704.8. Samples: 47981864. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:52:41,063][84230] Avg episode reward: [(0, '44.130'), (1, '43.910')] +[2023-10-11 18:52:41,215][85175] Updated weights for policy 1, policy_version 94380 (0.0011) +[2023-10-11 18:52:41,582][85175] Updated weights for policy 1, policy_version 94390 (0.0010) +[2023-10-11 18:52:41,955][85175] Updated weights for policy 1, policy_version 94400 (0.0010) +[2023-10-11 18:52:43,426][85176] Updated weights for policy 0, policy_version 93032 (0.0009) +[2023-10-11 18:52:43,812][85176] Updated weights for policy 0, policy_version 93042 (0.0008) +[2023-10-11 18:52:44,177][85176] Updated weights for policy 0, policy_version 93052 (0.0010) +[2023-10-11 18:52:46,008][85175] Updated weights for policy 1, policy_version 94410 (0.0008) +[2023-10-11 18:52:46,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 191954944. Throughput: 0: 1676.7, 1: 1707.1. Samples: 48002556. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:52:46,064][84230] Avg episode reward: [(0, '47.420'), (1, '45.300')] +[2023-10-11 18:52:46,383][85175] Updated weights for policy 1, policy_version 94420 (0.0008) +[2023-10-11 18:52:46,752][85175] Updated weights for policy 1, policy_version 94430 (0.0008) +[2023-10-11 18:52:48,179][85176] Updated weights for policy 0, policy_version 93062 (0.0010) +[2023-10-11 18:52:48,556][85176] Updated weights for policy 0, policy_version 93072 (0.0008) +[2023-10-11 18:52:48,924][85176] Updated weights for policy 0, policy_version 93082 (0.0011) +[2023-10-11 18:52:50,797][85175] Updated weights for policy 1, policy_version 94440 (0.0008) +[2023-10-11 18:52:51,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 192020480. Throughput: 0: 1661.6, 1: 1704.1. Samples: 48012232. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:52:51,064][84230] Avg episode reward: [(0, '45.040'), (1, '44.650')] +[2023-10-11 18:52:51,165][85175] Updated weights for policy 1, policy_version 94450 (0.0008) +[2023-10-11 18:52:51,541][85175] Updated weights for policy 1, policy_version 94460 (0.0010) +[2023-10-11 18:52:52,910][85176] Updated weights for policy 0, policy_version 93092 (0.0009) +[2023-10-11 18:52:53,284][85176] Updated weights for policy 0, policy_version 93102 (0.0007) +[2023-10-11 18:52:53,649][85176] Updated weights for policy 0, policy_version 93112 (0.0009) +[2023-10-11 18:52:55,561][85175] Updated weights for policy 1, policy_version 94470 (0.0008) +[2023-10-11 18:52:55,929][85175] Updated weights for policy 1, policy_version 94480 (0.0011) +[2023-10-11 18:52:56,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 192086016. Throughput: 0: 1668.4, 1: 1703.4. Samples: 48032528. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:52:56,063][84230] Avg episode reward: [(0, '47.530'), (1, '43.100')] +[2023-10-11 18:52:56,304][85175] Updated weights for policy 1, policy_version 94490 (0.0009) +[2023-10-11 18:52:57,721][85176] Updated weights for policy 0, policy_version 93122 (0.0011) +[2023-10-11 18:52:58,091][85176] Updated weights for policy 0, policy_version 93132 (0.0007) +[2023-10-11 18:52:58,468][85176] Updated weights for policy 0, policy_version 93142 (0.0008) +[2023-10-11 18:52:58,839][85176] Updated weights for policy 0, policy_version 93152 (0.0008) +[2023-10-11 18:53:00,373][85175] Updated weights for policy 1, policy_version 94500 (0.0008) +[2023-10-11 18:53:00,734][85175] Updated weights for policy 1, policy_version 94510 (0.0009) +[2023-10-11 18:53:01,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 192151552. Throughput: 0: 1677.9, 1: 1693.7. Samples: 48052918. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:53:01,063][84230] Avg episode reward: [(0, '42.760'), (1, '46.020')] +[2023-10-11 18:53:01,105][85175] Updated weights for policy 1, policy_version 94520 (0.0009) +[2023-10-11 18:53:02,865][85176] Updated weights for policy 0, policy_version 93162 (0.0008) +[2023-10-11 18:53:03,243][85176] Updated weights for policy 0, policy_version 93172 (0.0009) +[2023-10-11 18:53:03,618][85176] Updated weights for policy 0, policy_version 93182 (0.0008) +[2023-10-11 18:53:05,052][85175] Updated weights for policy 1, policy_version 94530 (0.0009) +[2023-10-11 18:53:05,421][85175] Updated weights for policy 1, policy_version 94540 (0.0007) +[2023-10-11 18:53:05,784][85175] Updated weights for policy 1, policy_version 94550 (0.0009) +[2023-10-11 18:53:06,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 192217088. Throughput: 0: 1655.8, 1: 1706.3. Samples: 48062828. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:53:06,063][84230] Avg episode reward: [(0, '46.900'), (1, '45.790')] +[2023-10-11 18:53:06,153][85175] Updated weights for policy 1, policy_version 94560 (0.0008) +[2023-10-11 18:53:07,755][85176] Updated weights for policy 0, policy_version 93192 (0.0009) +[2023-10-11 18:53:08,127][85176] Updated weights for policy 0, policy_version 93202 (0.0007) +[2023-10-11 18:53:08,498][85176] Updated weights for policy 0, policy_version 93212 (0.0007) +[2023-10-11 18:53:09,990][85175] Updated weights for policy 1, policy_version 94570 (0.0008) +[2023-10-11 18:53:10,349][85175] Updated weights for policy 1, policy_version 94580 (0.0007) +[2023-10-11 18:53:10,720][85175] Updated weights for policy 1, policy_version 94590 (0.0010) +[2023-10-11 18:53:11,063][84230] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 192315392. Throughput: 0: 1676.8, 1: 1708.9. Samples: 48083400. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:53:11,064][84230] Avg episode reward: [(0, '44.650'), (1, '46.870')] +[2023-10-11 18:53:12,464][85176] Updated weights for policy 0, policy_version 93222 (0.0007) +[2023-10-11 18:53:12,846][85176] Updated weights for policy 0, policy_version 93232 (0.0008) +[2023-10-11 18:53:13,211][85176] Updated weights for policy 0, policy_version 93242 (0.0008) +[2023-10-11 18:53:14,903][85175] Updated weights for policy 1, policy_version 94600 (0.0010) +[2023-10-11 18:53:15,280][85175] Updated weights for policy 1, policy_version 94610 (0.0008) +[2023-10-11 18:53:15,644][85175] Updated weights for policy 1, policy_version 94620 (0.0007) +[2023-10-11 18:53:16,062][84230] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 192380928. Throughput: 0: 1684.5, 1: 1683.7. Samples: 48103308. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:53:16,063][84230] Avg episode reward: [(0, '48.960'), (1, '43.910')] +[2023-10-11 18:53:17,254][85176] Updated weights for policy 0, policy_version 93252 (0.0008) +[2023-10-11 18:53:17,624][85176] Updated weights for policy 0, policy_version 93262 (0.0008) +[2023-10-11 18:53:17,993][85176] Updated weights for policy 0, policy_version 93272 (0.0009) +[2023-10-11 18:53:19,594][85175] Updated weights for policy 1, policy_version 94630 (0.0008) +[2023-10-11 18:53:19,961][85175] Updated weights for policy 1, policy_version 94640 (0.0007) +[2023-10-11 18:53:20,320][85175] Updated weights for policy 1, policy_version 94650 (0.0010) +[2023-10-11 18:53:21,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 192446464. Throughput: 0: 1660.7, 1: 1710.1. Samples: 48113522. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:53:21,063][84230] Avg episode reward: [(0, '44.150'), (1, '44.840')] +[2023-10-11 18:53:22,104][85176] Updated weights for policy 0, policy_version 93282 (0.0008) +[2023-10-11 18:53:22,479][85176] Updated weights for policy 0, policy_version 93292 (0.0008) +[2023-10-11 18:53:22,852][85176] Updated weights for policy 0, policy_version 93302 (0.0007) +[2023-10-11 18:53:23,233][85176] Updated weights for policy 0, policy_version 93312 (0.0008) +[2023-10-11 18:53:24,376][85175] Updated weights for policy 1, policy_version 94660 (0.0008) +[2023-10-11 18:53:24,738][85175] Updated weights for policy 1, policy_version 94670 (0.0010) +[2023-10-11 18:53:25,103][85175] Updated weights for policy 1, policy_version 94680 (0.0008) +[2023-10-11 18:53:26,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 192512000. Throughput: 0: 1672.2, 1: 1703.2. Samples: 48133758. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:53:26,063][84230] Avg episode reward: [(0, '48.450'), (1, '44.010')] +[2023-10-11 18:53:27,305][85176] Updated weights for policy 0, policy_version 93322 (0.0007) +[2023-10-11 18:53:27,667][85176] Updated weights for policy 0, policy_version 93332 (0.0010) +[2023-10-11 18:53:28,038][85176] Updated weights for policy 0, policy_version 93342 (0.0009) +[2023-10-11 18:53:29,028][85175] Updated weights for policy 1, policy_version 94690 (0.0011) +[2023-10-11 18:53:29,396][85175] Updated weights for policy 1, policy_version 94700 (0.0011) +[2023-10-11 18:53:29,768][85175] Updated weights for policy 1, policy_version 94710 (0.0009) +[2023-10-11 18:53:30,139][85175] Updated weights for policy 1, policy_version 94720 (0.0007) +[2023-10-11 18:53:31,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 192577536. Throughput: 0: 1677.1, 1: 1681.5. Samples: 48153694. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) +[2023-10-11 18:53:31,063][84230] Avg episode reward: [(0, '43.740'), (1, '45.570')] +[2023-10-11 18:53:32,309][85176] Updated weights for policy 0, policy_version 93352 (0.0008) +[2023-10-11 18:53:32,680][85176] Updated weights for policy 0, policy_version 93362 (0.0008) +[2023-10-11 18:53:33,061][85176] Updated weights for policy 0, policy_version 93372 (0.0009) +[2023-10-11 18:53:34,176][85175] Updated weights for policy 1, policy_version 94730 (0.0009) +[2023-10-11 18:53:34,550][85175] Updated weights for policy 1, policy_version 94740 (0.0009) +[2023-10-11 18:53:34,924][85175] Updated weights for policy 1, policy_version 94750 (0.0010) +[2023-10-11 18:53:36,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 192643072. Throughput: 0: 1660.3, 1: 1715.9. Samples: 48164158. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 7.0) +[2023-10-11 18:53:36,064][84230] Avg episode reward: [(0, '49.220'), (1, '43.950')] +[2023-10-11 18:53:37,202][85176] Updated weights for policy 0, policy_version 93382 (0.0011) +[2023-10-11 18:53:37,574][85176] Updated weights for policy 0, policy_version 93392 (0.0008) +[2023-10-11 18:53:37,945][85176] Updated weights for policy 0, policy_version 93402 (0.0007) +[2023-10-11 18:53:39,016][85175] Updated weights for policy 1, policy_version 94760 (0.0009) +[2023-10-11 18:53:39,388][85175] Updated weights for policy 1, policy_version 94770 (0.0008) +[2023-10-11 18:53:39,760][85175] Updated weights for policy 1, policy_version 94780 (0.0010) +[2023-10-11 18:53:41,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 192708608. Throughput: 0: 1670.9, 1: 1695.4. Samples: 48184012. Policy #0 lag: (min: 3.0, avg: 3.0, max: 7.0) +[2023-10-11 18:53:41,063][84230] Avg episode reward: [(0, '43.530'), (1, '45.720')] +[2023-10-11 18:53:42,101][85176] Updated weights for policy 0, policy_version 93412 (0.0009) +[2023-10-11 18:53:42,465][85176] Updated weights for policy 0, policy_version 93422 (0.0009) +[2023-10-11 18:53:42,842][85176] Updated weights for policy 0, policy_version 93432 (0.0008) +[2023-10-11 18:53:43,531][85175] Updated weights for policy 1, policy_version 94790 (0.0007) +[2023-10-11 18:53:43,893][85175] Updated weights for policy 1, policy_version 94800 (0.0008) +[2023-10-11 18:53:44,253][85175] Updated weights for policy 1, policy_version 94810 (0.0008) +[2023-10-11 18:53:46,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 192774144. Throughput: 0: 1668.5, 1: 1696.5. Samples: 48204344. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 7.0) +[2023-10-11 18:53:46,064][84230] Avg episode reward: [(0, '46.110'), (1, '42.810')] +[2023-10-11 18:53:46,912][85176] Updated weights for policy 0, policy_version 93442 (0.0007) +[2023-10-11 18:53:47,283][85176] Updated weights for policy 0, policy_version 93452 (0.0008) +[2023-10-11 18:53:47,649][85176] Updated weights for policy 0, policy_version 93462 (0.0011) +[2023-10-11 18:53:48,027][85176] Updated weights for policy 0, policy_version 93472 (0.0009) +[2023-10-11 18:53:48,452][85175] Updated weights for policy 1, policy_version 94820 (0.0008) +[2023-10-11 18:53:48,825][85175] Updated weights for policy 1, policy_version 94830 (0.0009) +[2023-10-11 18:53:49,188][85175] Updated weights for policy 1, policy_version 94840 (0.0012) +[2023-10-11 18:53:51,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 192839680. Throughput: 0: 1661.5, 1: 1709.5. Samples: 48214524. Policy #0 lag: (min: 3.0, avg: 3.0, max: 7.0) +[2023-10-11 18:53:51,063][84230] Avg episode reward: [(0, '40.640'), (1, '44.850')] +[2023-10-11 18:53:52,103][85176] Updated weights for policy 0, policy_version 93482 (0.0011) +[2023-10-11 18:53:52,477][85176] Updated weights for policy 0, policy_version 93492 (0.0008) +[2023-10-11 18:53:52,854][85176] Updated weights for policy 0, policy_version 93502 (0.0007) +[2023-10-11 18:53:53,132][85175] Updated weights for policy 1, policy_version 94850 (0.0011) +[2023-10-11 18:53:53,503][85175] Updated weights for policy 1, policy_version 94860 (0.0009) +[2023-10-11 18:53:53,865][85175] Updated weights for policy 1, policy_version 94870 (0.0009) +[2023-10-11 18:53:54,236][85175] Updated weights for policy 1, policy_version 94880 (0.0008) +[2023-10-11 18:53:56,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 192905216. Throughput: 0: 1669.3, 1: 1684.9. Samples: 48234338. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 7.0) +[2023-10-11 18:53:56,063][84230] Avg episode reward: [(0, '47.010'), (1, '44.960')] +[2023-10-11 18:53:56,954][85176] Updated weights for policy 0, policy_version 93512 (0.0008) +[2023-10-11 18:53:57,325][85176] Updated weights for policy 0, policy_version 93522 (0.0007) +[2023-10-11 18:53:57,706][85176] Updated weights for policy 0, policy_version 93532 (0.0007) +[2023-10-11 18:53:58,437][85175] Updated weights for policy 1, policy_version 94890 (0.0010) +[2023-10-11 18:53:58,803][85175] Updated weights for policy 1, policy_version 94900 (0.0009) +[2023-10-11 18:53:59,179][85175] Updated weights for policy 1, policy_version 94910 (0.0011) +[2023-10-11 18:54:01,063][84230] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 192970752. Throughput: 0: 1669.4, 1: 1700.2. Samples: 48254938. Policy #0 lag: (min: 3.0, avg: 3.0, max: 7.0) +[2023-10-11 18:54:01,064][84230] Avg episode reward: [(0, '42.530'), (1, '44.270')] +[2023-10-11 18:54:01,705][85176] Updated weights for policy 0, policy_version 93542 (0.0008) +[2023-10-11 18:54:02,085][85176] Updated weights for policy 0, policy_version 93552 (0.0007) +[2023-10-11 18:54:02,449][85176] Updated weights for policy 0, policy_version 93562 (0.0007) +[2023-10-11 18:54:03,170][85175] Updated weights for policy 1, policy_version 94920 (0.0009) +[2023-10-11 18:54:03,537][85175] Updated weights for policy 1, policy_version 94930 (0.0011) +[2023-10-11 18:54:03,906][85175] Updated weights for policy 1, policy_version 94940 (0.0011) +[2023-10-11 18:54:06,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.5). Total num frames: 193036288. Throughput: 0: 1670.7, 1: 1686.4. Samples: 48264594. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 7.0) +[2023-10-11 18:54:06,063][84230] Avg episode reward: [(0, '47.480'), (1, '43.920')] +[2023-10-11 18:54:06,503][85176] Updated weights for policy 0, policy_version 93572 (0.0008) +[2023-10-11 18:54:06,877][85176] Updated weights for policy 0, policy_version 93582 (0.0009) +[2023-10-11 18:54:07,261][85176] Updated weights for policy 0, policy_version 93592 (0.0010) +[2023-10-11 18:54:07,919][85175] Updated weights for policy 1, policy_version 94950 (0.0009) +[2023-10-11 18:54:08,284][85175] Updated weights for policy 1, policy_version 94960 (0.0009) +[2023-10-11 18:54:08,652][85175] Updated weights for policy 1, policy_version 94970 (0.0011) +[2023-10-11 18:54:11,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 193101824. Throughput: 0: 1673.7, 1: 1682.6. Samples: 48284792. Policy #0 lag: (min: 3.0, avg: 3.0, max: 7.0) +[2023-10-11 18:54:11,063][84230] Avg episode reward: [(0, '42.140'), (1, '43.450')] +[2023-10-11 18:54:11,394][85176] Updated weights for policy 0, policy_version 93602 (0.0011) +[2023-10-11 18:54:11,772][85176] Updated weights for policy 0, policy_version 93612 (0.0010) +[2023-10-11 18:54:12,149][85176] Updated weights for policy 0, policy_version 93622 (0.0010) +[2023-10-11 18:54:12,524][85176] Updated weights for policy 0, policy_version 93632 (0.0009) +[2023-10-11 18:54:12,647][85175] Updated weights for policy 1, policy_version 94980 (0.0011) +[2023-10-11 18:54:13,023][85175] Updated weights for policy 1, policy_version 94990 (0.0010) +[2023-10-11 18:54:13,385][85175] Updated weights for policy 1, policy_version 95000 (0.0008) +[2023-10-11 18:54:16,063][84230] Fps is (10 sec: 13106.7, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 193167360. Throughput: 0: 1673.4, 1: 1698.9. Samples: 48305446. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 7.0) +[2023-10-11 18:54:16,064][84230] Avg episode reward: [(0, '46.130'), (1, '44.260')] +[2023-10-11 18:54:16,076][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000095008_97288192.pth... +[2023-10-11 18:54:16,076][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000093632_95879168.pth... +[2023-10-11 18:54:16,115][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000092096_94306304.pth +[2023-10-11 18:54:16,116][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000093440_95682560.pth +[2023-10-11 18:54:16,119][84801] Saving a milestone ./train_atari/atari_frostbite_APPO/checkpoint_p0/milestones/checkpoint_000093632_95879168.pth +[2023-10-11 18:54:16,121][85000] Saving a milestone ./train_atari/atari_frostbite_APPO/checkpoint_p1/milestones/checkpoint_000095008_97288192.pth +[2023-10-11 18:54:16,720][85176] Updated weights for policy 0, policy_version 93642 (0.0008) +[2023-10-11 18:54:17,095][85176] Updated weights for policy 0, policy_version 93652 (0.0008) +[2023-10-11 18:54:17,408][85175] Updated weights for policy 1, policy_version 95010 (0.0009) +[2023-10-11 18:54:17,471][85176] Updated weights for policy 0, policy_version 93662 (0.0007) +[2023-10-11 18:54:17,774][85175] Updated weights for policy 1, policy_version 95020 (0.0008) +[2023-10-11 18:54:18,135][85175] Updated weights for policy 1, policy_version 95030 (0.0008) +[2023-10-11 18:54:18,493][85175] Updated weights for policy 1, policy_version 95040 (0.0007) +[2023-10-11 18:54:21,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 193232896. Throughput: 0: 1675.3, 1: 1667.5. Samples: 48314580. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 7.0) +[2023-10-11 18:54:21,063][84230] Avg episode reward: [(0, '44.700'), (1, '45.670')] +[2023-10-11 18:54:21,547][85176] Updated weights for policy 0, policy_version 93672 (0.0008) +[2023-10-11 18:54:21,927][85176] Updated weights for policy 0, policy_version 93682 (0.0007) +[2023-10-11 18:54:22,293][85176] Updated weights for policy 0, policy_version 93692 (0.0008) +[2023-10-11 18:54:22,685][85175] Updated weights for policy 1, policy_version 95050 (0.0008) +[2023-10-11 18:54:23,043][85175] Updated weights for policy 1, policy_version 95060 (0.0009) +[2023-10-11 18:54:23,414][85175] Updated weights for policy 1, policy_version 95070 (0.0007) +[2023-10-11 18:54:26,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 193298432. Throughput: 0: 1675.1, 1: 1684.4. Samples: 48335190. Policy #0 lag: (min: 3.0, avg: 3.0, max: 7.0) +[2023-10-11 18:54:26,064][84230] Avg episode reward: [(0, '48.640'), (1, '45.080')] +[2023-10-11 18:54:26,356][85176] Updated weights for policy 0, policy_version 93702 (0.0007) +[2023-10-11 18:54:26,725][85176] Updated weights for policy 0, policy_version 93712 (0.0008) +[2023-10-11 18:54:27,105][85176] Updated weights for policy 0, policy_version 93722 (0.0008) +[2023-10-11 18:54:27,409][85175] Updated weights for policy 1, policy_version 95080 (0.0010) +[2023-10-11 18:54:27,790][85175] Updated weights for policy 1, policy_version 95090 (0.0010) +[2023-10-11 18:54:28,151][85175] Updated weights for policy 1, policy_version 95100 (0.0010) +[2023-10-11 18:54:31,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 193363968. Throughput: 0: 1676.1, 1: 1688.1. Samples: 48355736. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 7.0) +[2023-10-11 18:54:31,064][84230] Avg episode reward: [(0, '48.960'), (1, '45.250')] +[2023-10-11 18:54:31,088][85176] Updated weights for policy 0, policy_version 93732 (0.0008) +[2023-10-11 18:54:31,469][85176] Updated weights for policy 0, policy_version 93742 (0.0008) +[2023-10-11 18:54:31,833][85176] Updated weights for policy 0, policy_version 93752 (0.0007) +[2023-10-11 18:54:32,172][85175] Updated weights for policy 1, policy_version 95110 (0.0009) +[2023-10-11 18:54:32,545][85175] Updated weights for policy 1, policy_version 95120 (0.0009) +[2023-10-11 18:54:32,914][85175] Updated weights for policy 1, policy_version 95130 (0.0009) +[2023-10-11 18:54:36,051][85176] Updated weights for policy 0, policy_version 93762 (0.0007) +[2023-10-11 18:54:36,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 193429504. Throughput: 0: 1677.1, 1: 1667.4. Samples: 48365026. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 18:54:36,063][84230] Avg episode reward: [(0, '44.540'), (1, '46.680')] +[2023-10-11 18:54:36,415][85176] Updated weights for policy 0, policy_version 93772 (0.0007) +[2023-10-11 18:54:36,797][85176] Updated weights for policy 0, policy_version 93782 (0.0009) +[2023-10-11 18:54:36,925][85175] Updated weights for policy 1, policy_version 95140 (0.0007) +[2023-10-11 18:54:37,158][85176] Updated weights for policy 0, policy_version 93792 (0.0010) +[2023-10-11 18:54:37,293][85175] Updated weights for policy 1, policy_version 95150 (0.0007) +[2023-10-11 18:54:37,671][85175] Updated weights for policy 1, policy_version 95160 (0.0009) +[2023-10-11 18:54:41,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 193495040. Throughput: 0: 1674.2, 1: 1686.6. Samples: 48385572. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 18:54:41,064][84230] Avg episode reward: [(0, '45.050'), (1, '44.760')] +[2023-10-11 18:54:41,166][85176] Updated weights for policy 0, policy_version 93802 (0.0008) +[2023-10-11 18:54:41,539][85176] Updated weights for policy 0, policy_version 93812 (0.0008) +[2023-10-11 18:54:41,702][85175] Updated weights for policy 1, policy_version 95170 (0.0009) +[2023-10-11 18:54:41,913][85176] Updated weights for policy 0, policy_version 93822 (0.0008) +[2023-10-11 18:54:42,071][85175] Updated weights for policy 1, policy_version 95180 (0.0008) +[2023-10-11 18:54:42,435][85175] Updated weights for policy 1, policy_version 95190 (0.0007) +[2023-10-11 18:54:42,800][85175] Updated weights for policy 1, policy_version 95200 (0.0009) +[2023-10-11 18:54:45,993][85176] Updated weights for policy 0, policy_version 93832 (0.0008) +[2023-10-11 18:54:46,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 193560576. Throughput: 0: 1668.7, 1: 1695.9. Samples: 48406344. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 18:54:46,063][84230] Avg episode reward: [(0, '42.510'), (1, '43.650')] +[2023-10-11 18:54:46,372][85176] Updated weights for policy 0, policy_version 93842 (0.0010) +[2023-10-11 18:54:46,741][85176] Updated weights for policy 0, policy_version 93852 (0.0010) +[2023-10-11 18:54:46,766][85175] Updated weights for policy 1, policy_version 95210 (0.0008) +[2023-10-11 18:54:47,134][85175] Updated weights for policy 1, policy_version 95220 (0.0008) +[2023-10-11 18:54:47,494][85175] Updated weights for policy 1, policy_version 95230 (0.0008) +[2023-10-11 18:54:50,915][85176] Updated weights for policy 0, policy_version 93862 (0.0007) +[2023-10-11 18:54:51,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 193626112. Throughput: 0: 1668.0, 1: 1683.4. Samples: 48415406. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 18:54:51,063][84230] Avg episode reward: [(0, '46.090'), (1, '40.780')] +[2023-10-11 18:54:51,293][85176] Updated weights for policy 0, policy_version 93872 (0.0007) +[2023-10-11 18:54:51,666][85176] Updated weights for policy 0, policy_version 93882 (0.0008) +[2023-10-11 18:54:51,746][85175] Updated weights for policy 1, policy_version 95240 (0.0009) +[2023-10-11 18:54:52,120][85175] Updated weights for policy 1, policy_version 95250 (0.0009) +[2023-10-11 18:54:52,477][85175] Updated weights for policy 1, policy_version 95260 (0.0007) +[2023-10-11 18:54:55,638][85176] Updated weights for policy 0, policy_version 93892 (0.0008) +[2023-10-11 18:54:56,012][85176] Updated weights for policy 0, policy_version 93902 (0.0008) +[2023-10-11 18:54:56,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 193691648. Throughput: 0: 1669.7, 1: 1691.8. Samples: 48436058. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 18:54:56,063][84230] Avg episode reward: [(0, '43.630'), (1, '44.400')] +[2023-10-11 18:54:56,379][85176] Updated weights for policy 0, policy_version 93912 (0.0008) +[2023-10-11 18:54:56,511][85175] Updated weights for policy 1, policy_version 95270 (0.0007) +[2023-10-11 18:54:56,880][85175] Updated weights for policy 1, policy_version 95280 (0.0009) +[2023-10-11 18:54:57,255][85175] Updated weights for policy 1, policy_version 95290 (0.0010) +[2023-10-11 18:55:00,383][85176] Updated weights for policy 0, policy_version 93922 (0.0010) +[2023-10-11 18:55:00,762][85176] Updated weights for policy 0, policy_version 93932 (0.0007) +[2023-10-11 18:55:01,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 193757184. Throughput: 0: 1664.3, 1: 1694.5. Samples: 48456592. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 18:55:01,063][84230] Avg episode reward: [(0, '47.640'), (1, '43.520')] +[2023-10-11 18:55:01,133][85176] Updated weights for policy 0, policy_version 93942 (0.0008) +[2023-10-11 18:55:01,244][85175] Updated weights for policy 1, policy_version 95300 (0.0008) +[2023-10-11 18:55:01,511][85176] Updated weights for policy 0, policy_version 93952 (0.0008) +[2023-10-11 18:55:01,614][85175] Updated weights for policy 1, policy_version 95310 (0.0007) +[2023-10-11 18:55:01,972][85175] Updated weights for policy 1, policy_version 95320 (0.0008) +[2023-10-11 18:55:05,516][85176] Updated weights for policy 0, policy_version 93962 (0.0007) +[2023-10-11 18:55:05,878][85176] Updated weights for policy 0, policy_version 93972 (0.0008) +[2023-10-11 18:55:06,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 193822720. Throughput: 0: 1674.3, 1: 1693.0. Samples: 48466106. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 18:55:06,063][84230] Avg episode reward: [(0, '44.630'), (1, '47.180')] +[2023-10-11 18:55:06,070][85175] Updated weights for policy 1, policy_version 95330 (0.0008) +[2023-10-11 18:55:06,258][85176] Updated weights for policy 0, policy_version 93982 (0.0008) +[2023-10-11 18:55:06,437][85175] Updated weights for policy 1, policy_version 95340 (0.0007) +[2023-10-11 18:55:06,803][85175] Updated weights for policy 1, policy_version 95350 (0.0008) +[2023-10-11 18:55:07,171][85175] Updated weights for policy 1, policy_version 95360 (0.0009) +[2023-10-11 18:55:10,492][85176] Updated weights for policy 0, policy_version 93992 (0.0010) +[2023-10-11 18:55:10,873][85176] Updated weights for policy 0, policy_version 94002 (0.0009) +[2023-10-11 18:55:11,049][85175] Updated weights for policy 1, policy_version 95370 (0.0007) +[2023-10-11 18:55:11,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 193888256. 
Throughput: 0: 1676.9, 1: 1693.4. Samples: 48486854. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 18:55:11,063][84230] Avg episode reward: [(0, '46.860'), (1, '44.800')] +[2023-10-11 18:55:11,250][85176] Updated weights for policy 0, policy_version 94012 (0.0008) +[2023-10-11 18:55:11,404][85175] Updated weights for policy 1, policy_version 95380 (0.0009) +[2023-10-11 18:55:11,770][85175] Updated weights for policy 1, policy_version 95390 (0.0009) +[2023-10-11 18:55:15,272][85176] Updated weights for policy 0, policy_version 94022 (0.0008) +[2023-10-11 18:55:15,641][85176] Updated weights for policy 0, policy_version 94032 (0.0008) +[2023-10-11 18:55:15,971][85175] Updated weights for policy 1, policy_version 95400 (0.0007) +[2023-10-11 18:55:16,017][85176] Updated weights for policy 0, policy_version 94042 (0.0007) +[2023-10-11 18:55:16,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 193953792. Throughput: 0: 1665.1, 1: 1698.6. Samples: 48507100. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 18:55:16,064][84230] Avg episode reward: [(0, '44.080'), (1, '46.810')] +[2023-10-11 18:55:16,341][85175] Updated weights for policy 1, policy_version 95410 (0.0007) +[2023-10-11 18:55:16,709][85175] Updated weights for policy 1, policy_version 95420 (0.0010) +[2023-10-11 18:55:20,114][85176] Updated weights for policy 0, policy_version 94052 (0.0008) +[2023-10-11 18:55:20,478][85176] Updated weights for policy 0, policy_version 94062 (0.0008) +[2023-10-11 18:55:20,664][85175] Updated weights for policy 1, policy_version 95430 (0.0008) +[2023-10-11 18:55:20,858][85176] Updated weights for policy 0, policy_version 94072 (0.0008) +[2023-10-11 18:55:21,031][85175] Updated weights for policy 1, policy_version 95440 (0.0007) +[2023-10-11 18:55:21,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 194019328. Throughput: 0: 1674.8, 1: 1694.8. Samples: 48516654. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 18:55:21,064][84230] Avg episode reward: [(0, '46.410'), (1, '45.490')] +[2023-10-11 18:55:21,399][85175] Updated weights for policy 1, policy_version 95450 (0.0008) +[2023-10-11 18:55:25,177][85176] Updated weights for policy 0, policy_version 94082 (0.0010) +[2023-10-11 18:55:25,553][85176] Updated weights for policy 0, policy_version 94092 (0.0009) +[2023-10-11 18:55:25,625][85175] Updated weights for policy 1, policy_version 95460 (0.0009) +[2023-10-11 18:55:25,930][85176] Updated weights for policy 0, policy_version 94102 (0.0008) +[2023-10-11 18:55:25,997][85175] Updated weights for policy 1, policy_version 95470 (0.0007) +[2023-10-11 18:55:26,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 194084864. Throughput: 0: 1672.6, 1: 1693.9. Samples: 48537064. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 18:55:26,063][84230] Avg episode reward: [(0, '44.590'), (1, '45.530')] +[2023-10-11 18:55:26,290][85176] Updated weights for policy 0, policy_version 94112 (0.0008) +[2023-10-11 18:55:26,357][85175] Updated weights for policy 1, policy_version 95480 (0.0008) +[2023-10-11 18:55:30,315][85176] Updated weights for policy 0, policy_version 94122 (0.0008) +[2023-10-11 18:55:30,385][85175] Updated weights for policy 1, policy_version 95490 (0.0008) +[2023-10-11 18:55:30,690][85176] Updated weights for policy 0, policy_version 94132 (0.0008) +[2023-10-11 18:55:30,749][85175] Updated weights for policy 1, policy_version 95500 (0.0008) +[2023-10-11 18:55:31,056][85176] Updated weights for policy 0, policy_version 94142 (0.0009) +[2023-10-11 18:55:31,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 194150400. Throughput: 0: 1659.5, 1: 1687.5. Samples: 48556956. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) +[2023-10-11 18:55:31,063][84230] Avg episode reward: [(0, '44.470'), (1, '44.420')] +[2023-10-11 18:55:31,117][85175] Updated weights for policy 1, policy_version 95510 (0.0008) +[2023-10-11 18:55:31,490][85175] Updated weights for policy 1, policy_version 95520 (0.0007) +[2023-10-11 18:55:35,175][85176] Updated weights for policy 0, policy_version 94152 (0.0008) +[2023-10-11 18:55:35,512][85175] Updated weights for policy 1, policy_version 95530 (0.0009) +[2023-10-11 18:55:35,559][85176] Updated weights for policy 0, policy_version 94162 (0.0007) +[2023-10-11 18:55:35,885][85175] Updated weights for policy 1, policy_version 95540 (0.0009) +[2023-10-11 18:55:35,934][85176] Updated weights for policy 0, policy_version 94172 (0.0009) +[2023-10-11 18:55:36,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 194215936. Throughput: 0: 1672.8, 1: 1692.4. Samples: 48566842. Policy #0 lag: (min: 26.0, avg: 27.0, max: 46.0) +[2023-10-11 18:55:36,064][84230] Avg episode reward: [(0, '45.950'), (1, '42.610')] +[2023-10-11 18:55:36,250][85175] Updated weights for policy 1, policy_version 95550 (0.0008) +[2023-10-11 18:55:40,152][85176] Updated weights for policy 0, policy_version 94182 (0.0008) +[2023-10-11 18:55:40,293][85175] Updated weights for policy 1, policy_version 95560 (0.0010) +[2023-10-11 18:55:40,536][85176] Updated weights for policy 0, policy_version 94192 (0.0008) +[2023-10-11 18:55:40,663][85175] Updated weights for policy 1, policy_version 95570 (0.0009) +[2023-10-11 18:55:40,904][85176] Updated weights for policy 0, policy_version 94202 (0.0008) +[2023-10-11 18:55:41,027][85175] Updated weights for policy 1, policy_version 95580 (0.0008) +[2023-10-11 18:55:41,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 194281472. Throughput: 0: 1666.9, 1: 1692.2. Samples: 48587218. 
Policy #0 lag: (min: 26.0, avg: 27.0, max: 46.0) +[2023-10-11 18:55:41,063][84230] Avg episode reward: [(0, '45.540'), (1, '46.550')] +[2023-10-11 18:55:45,024][85176] Updated weights for policy 0, policy_version 94212 (0.0008) +[2023-10-11 18:55:45,156][85175] Updated weights for policy 1, policy_version 95590 (0.0007) +[2023-10-11 18:55:45,388][85176] Updated weights for policy 0, policy_version 94222 (0.0007) +[2023-10-11 18:55:45,520][85175] Updated weights for policy 1, policy_version 95600 (0.0007) +[2023-10-11 18:55:45,765][85176] Updated weights for policy 0, policy_version 94232 (0.0013) +[2023-10-11 18:55:45,889][85175] Updated weights for policy 1, policy_version 95610 (0.0009) +[2023-10-11 18:55:46,062][84230] Fps is (10 sec: 16384.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 194379776. Throughput: 0: 1651.6, 1: 1673.8. Samples: 48606232. Policy #0 lag: (min: 26.0, avg: 27.0, max: 46.0) +[2023-10-11 18:55:46,063][84230] Avg episode reward: [(0, '45.720'), (1, '41.180')] +[2023-10-11 18:55:49,893][85175] Updated weights for policy 1, policy_version 95620 (0.0009) +[2023-10-11 18:55:50,017][85176] Updated weights for policy 0, policy_version 94242 (0.0008) +[2023-10-11 18:55:50,260][85175] Updated weights for policy 1, policy_version 95630 (0.0007) +[2023-10-11 18:55:50,388][85176] Updated weights for policy 0, policy_version 94252 (0.0009) +[2023-10-11 18:55:50,636][85175] Updated weights for policy 1, policy_version 95640 (0.0007) +[2023-10-11 18:55:50,756][85176] Updated weights for policy 0, policy_version 94262 (0.0007) +[2023-10-11 18:55:51,062][84230] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 13440.5). Total num frames: 194445312. Throughput: 0: 1657.5, 1: 1687.6. Samples: 48616636. 
Policy #0 lag: (min: 26.0, avg: 27.0, max: 46.0) +[2023-10-11 18:55:51,063][84230] Avg episode reward: [(0, '45.300'), (1, '44.780')] +[2023-10-11 18:55:51,134][85176] Updated weights for policy 0, policy_version 94272 (0.0009) +[2023-10-11 18:55:54,687][85175] Updated weights for policy 1, policy_version 95650 (0.0007) +[2023-10-11 18:55:55,064][85175] Updated weights for policy 1, policy_version 95660 (0.0009) +[2023-10-11 18:55:55,104][85176] Updated weights for policy 0, policy_version 94282 (0.0007) +[2023-10-11 18:55:55,441][85175] Updated weights for policy 1, policy_version 95670 (0.0009) +[2023-10-11 18:55:55,475][85176] Updated weights for policy 0, policy_version 94292 (0.0007) +[2023-10-11 18:55:55,801][85175] Updated weights for policy 1, policy_version 95680 (0.0008) +[2023-10-11 18:55:55,851][85176] Updated weights for policy 0, policy_version 94302 (0.0008) +[2023-10-11 18:55:56,063][84230] Fps is (10 sec: 16383.5, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 194543616. Throughput: 0: 1650.2, 1: 1695.9. Samples: 48637428. Policy #0 lag: (min: 26.0, avg: 27.0, max: 46.0) +[2023-10-11 18:55:56,064][84230] Avg episode reward: [(0, '47.710'), (1, '42.730')] +[2023-10-11 18:55:59,748][85175] Updated weights for policy 1, policy_version 95690 (0.0007) +[2023-10-11 18:55:59,868][85176] Updated weights for policy 0, policy_version 94312 (0.0009) +[2023-10-11 18:56:00,108][85175] Updated weights for policy 1, policy_version 95700 (0.0007) +[2023-10-11 18:56:00,235][85176] Updated weights for policy 0, policy_version 94322 (0.0007) +[2023-10-11 18:56:00,477][85175] Updated weights for policy 1, policy_version 95710 (0.0007) +[2023-10-11 18:56:00,600][85176] Updated weights for policy 0, policy_version 94332 (0.0007) +[2023-10-11 18:56:01,063][84230] Fps is (10 sec: 16383.5, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 194609152. Throughput: 0: 1641.5, 1: 1666.7. Samples: 48655968. 
Policy #0 lag: (min: 26.0, avg: 27.0, max: 46.0) +[2023-10-11 18:56:01,064][84230] Avg episode reward: [(0, '43.550'), (1, '44.690')] +[2023-10-11 18:56:04,526][85175] Updated weights for policy 1, policy_version 95720 (0.0008) +[2023-10-11 18:56:04,702][85176] Updated weights for policy 0, policy_version 94342 (0.0008) +[2023-10-11 18:56:04,889][85175] Updated weights for policy 1, policy_version 95730 (0.0007) +[2023-10-11 18:56:05,071][85176] Updated weights for policy 0, policy_version 94352 (0.0008) +[2023-10-11 18:56:05,254][85175] Updated weights for policy 1, policy_version 95740 (0.0007) +[2023-10-11 18:56:05,457][85176] Updated weights for policy 0, policy_version 94362 (0.0010) +[2023-10-11 18:56:06,063][84230] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 194674688. Throughput: 0: 1654.8, 1: 1697.3. Samples: 48667498. Policy #0 lag: (min: 26.0, avg: 27.0, max: 46.0) +[2023-10-11 18:56:06,064][84230] Avg episode reward: [(0, '45.690'), (1, '42.140')] +[2023-10-11 18:56:09,180][85175] Updated weights for policy 1, policy_version 95750 (0.0008) +[2023-10-11 18:56:09,543][85175] Updated weights for policy 1, policy_version 95760 (0.0010) +[2023-10-11 18:56:09,609][85176] Updated weights for policy 0, policy_version 94372 (0.0009) +[2023-10-11 18:56:09,906][85175] Updated weights for policy 1, policy_version 95770 (0.0007) +[2023-10-11 18:56:09,981][85176] Updated weights for policy 0, policy_version 94382 (0.0009) +[2023-10-11 18:56:10,358][85176] Updated weights for policy 0, policy_version 94392 (0.0007) +[2023-10-11 18:56:11,063][84230] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 194740224. Throughput: 0: 1651.4, 1: 1689.2. Samples: 48687390. 
Policy #0 lag: (min: 26.0, avg: 27.0, max: 46.0) +[2023-10-11 18:56:11,064][84230] Avg episode reward: [(0, '40.520'), (1, '42.820')] +[2023-10-11 18:56:13,994][85175] Updated weights for policy 1, policy_version 95780 (0.0008) +[2023-10-11 18:56:14,358][85175] Updated weights for policy 1, policy_version 95790 (0.0009) +[2023-10-11 18:56:14,540][85176] Updated weights for policy 0, policy_version 94402 (0.0008) +[2023-10-11 18:56:14,726][85175] Updated weights for policy 1, policy_version 95800 (0.0007) +[2023-10-11 18:56:14,916][85176] Updated weights for policy 0, policy_version 94412 (0.0008) +[2023-10-11 18:56:15,292][85176] Updated weights for policy 0, policy_version 94422 (0.0008) +[2023-10-11 18:56:15,659][85176] Updated weights for policy 0, policy_version 94432 (0.0007) +[2023-10-11 18:56:16,063][84230] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 194805760. Throughput: 0: 1639.6, 1: 1679.8. Samples: 48706332. Policy #0 lag: (min: 26.0, avg: 27.0, max: 46.0) +[2023-10-11 18:56:16,064][84230] Avg episode reward: [(0, '47.430'), (1, '41.590')] +[2023-10-11 18:56:16,075][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000095808_98107392.pth... +[2023-10-11 18:56:16,076][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000094432_96698368.pth... 
+[2023-10-11 18:56:16,114][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000094208_96468992.pth +[2023-10-11 18:56:16,120][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000092864_95092736.pth +[2023-10-11 18:56:18,617][85175] Updated weights for policy 1, policy_version 95810 (0.0007) +[2023-10-11 18:56:18,980][85175] Updated weights for policy 1, policy_version 95820 (0.0007) +[2023-10-11 18:56:19,353][85175] Updated weights for policy 1, policy_version 95830 (0.0007) +[2023-10-11 18:56:19,713][85175] Updated weights for policy 1, policy_version 95840 (0.0009) +[2023-10-11 18:56:19,755][85176] Updated weights for policy 0, policy_version 94442 (0.0009) +[2023-10-11 18:56:20,134][85176] Updated weights for policy 0, policy_version 94452 (0.0008) +[2023-10-11 18:56:20,498][85176] Updated weights for policy 0, policy_version 94462 (0.0009) +[2023-10-11 18:56:21,062][84230] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 194871296. Throughput: 0: 1649.8, 1: 1708.4. Samples: 48717960. Policy #0 lag: (min: 26.0, avg: 27.0, max: 46.0) +[2023-10-11 18:56:21,063][84230] Avg episode reward: [(0, '42.110'), (1, '46.090')] +[2023-10-11 18:56:23,634][85175] Updated weights for policy 1, policy_version 95850 (0.0008) +[2023-10-11 18:56:23,995][85175] Updated weights for policy 1, policy_version 95860 (0.0007) +[2023-10-11 18:56:24,365][85175] Updated weights for policy 1, policy_version 95870 (0.0009) +[2023-10-11 18:56:24,831][85176] Updated weights for policy 0, policy_version 94472 (0.0009) +[2023-10-11 18:56:25,211][85176] Updated weights for policy 0, policy_version 94482 (0.0008) +[2023-10-11 18:56:25,576][85176] Updated weights for policy 0, policy_version 94492 (0.0007) +[2023-10-11 18:56:26,062][84230] Fps is (10 sec: 13107.7, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 194936832. Throughput: 0: 1653.6, 1: 1685.0. Samples: 48737454. 
Policy #0 lag: (min: 26.0, avg: 27.0, max: 46.0) +[2023-10-11 18:56:26,063][84230] Avg episode reward: [(0, '48.190'), (1, '42.790')] +[2023-10-11 18:56:28,282][85175] Updated weights for policy 1, policy_version 95880 (0.0009) +[2023-10-11 18:56:28,656][85175] Updated weights for policy 1, policy_version 95890 (0.0009) +[2023-10-11 18:56:29,028][85175] Updated weights for policy 1, policy_version 95900 (0.0008) +[2023-10-11 18:56:29,707][85176] Updated weights for policy 0, policy_version 94502 (0.0010) +[2023-10-11 18:56:30,077][85176] Updated weights for policy 0, policy_version 94512 (0.0010) +[2023-10-11 18:56:30,453][85176] Updated weights for policy 0, policy_version 94522 (0.0009) +[2023-10-11 18:56:31,063][84230] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 13440.4). Total num frames: 195002368. Throughput: 0: 1648.3, 1: 1704.6. Samples: 48757110. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-11 18:56:31,064][84230] Avg episode reward: [(0, '43.520'), (1, '47.520')] +[2023-10-11 18:56:33,237][85175] Updated weights for policy 1, policy_version 95910 (0.0010) +[2023-10-11 18:56:33,598][85175] Updated weights for policy 1, policy_version 95920 (0.0009) +[2023-10-11 18:56:33,971][85175] Updated weights for policy 1, policy_version 95930 (0.0009) +[2023-10-11 18:56:34,449][85176] Updated weights for policy 0, policy_version 94532 (0.0008) +[2023-10-11 18:56:34,830][85176] Updated weights for policy 0, policy_version 94542 (0.0008) +[2023-10-11 18:56:35,210][85176] Updated weights for policy 0, policy_version 94552 (0.0009) +[2023-10-11 18:56:36,063][84230] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 195067904. Throughput: 0: 1661.9, 1: 1704.9. Samples: 48768142. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-11 18:56:36,063][84230] Avg episode reward: [(0, '48.410'), (1, '43.730')] +[2023-10-11 18:56:38,001][85175] Updated weights for policy 1, policy_version 95940 (0.0009) +[2023-10-11 18:56:38,376][85175] Updated weights for policy 1, policy_version 95950 (0.0007) +[2023-10-11 18:56:38,747][85175] Updated weights for policy 1, policy_version 95960 (0.0007) +[2023-10-11 18:56:39,327][85176] Updated weights for policy 0, policy_version 94562 (0.0009) +[2023-10-11 18:56:39,702][85176] Updated weights for policy 0, policy_version 94572 (0.0009) +[2023-10-11 18:56:40,078][85176] Updated weights for policy 0, policy_version 94582 (0.0009) +[2023-10-11 18:56:40,444][85176] Updated weights for policy 0, policy_version 94592 (0.0008) +[2023-10-11 18:56:41,062][84230] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13440.4). Total num frames: 195133440. Throughput: 0: 1660.3, 1: 1684.8. Samples: 48787956. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-11 18:56:41,063][84230] Avg episode reward: [(0, '44.430'), (1, '48.010')] +[2023-10-11 18:56:42,730][85175] Updated weights for policy 1, policy_version 95970 (0.0008) +[2023-10-11 18:56:43,102][85175] Updated weights for policy 1, policy_version 95980 (0.0010) +[2023-10-11 18:56:43,467][85175] Updated weights for policy 1, policy_version 95990 (0.0011) +[2023-10-11 18:56:43,842][85175] Updated weights for policy 1, policy_version 96000 (0.0010) +[2023-10-11 18:56:44,573][85176] Updated weights for policy 0, policy_version 94602 (0.0007) +[2023-10-11 18:56:44,952][85176] Updated weights for policy 0, policy_version 94612 (0.0009) +[2023-10-11 18:56:45,329][85176] Updated weights for policy 0, policy_version 94622 (0.0010) +[2023-10-11 18:56:46,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 195198976. Throughput: 0: 1661.4, 1: 1712.7. Samples: 48807802. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-11 18:56:46,063][84230] Avg episode reward: [(0, '46.630'), (1, '43.910')] +[2023-10-11 18:56:47,901][85175] Updated weights for policy 1, policy_version 96010 (0.0008) +[2023-10-11 18:56:48,279][85175] Updated weights for policy 1, policy_version 96020 (0.0008) +[2023-10-11 18:56:48,655][85175] Updated weights for policy 1, policy_version 96030 (0.0008) +[2023-10-11 18:56:49,354][85176] Updated weights for policy 0, policy_version 94632 (0.0010) +[2023-10-11 18:56:49,732][85176] Updated weights for policy 0, policy_version 94642 (0.0010) +[2023-10-11 18:56:50,097][85176] Updated weights for policy 0, policy_version 94652 (0.0009) +[2023-10-11 18:56:51,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 195264512. Throughput: 0: 1665.6, 1: 1688.9. Samples: 48818452. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-11 18:56:51,063][84230] Avg episode reward: [(0, '45.110'), (1, '47.680')] +[2023-10-11 18:56:52,661][85175] Updated weights for policy 1, policy_version 96040 (0.0007) +[2023-10-11 18:56:53,015][85175] Updated weights for policy 1, policy_version 96050 (0.0009) +[2023-10-11 18:56:53,374][85175] Updated weights for policy 1, policy_version 96060 (0.0010) +[2023-10-11 18:56:54,006][85176] Updated weights for policy 0, policy_version 94662 (0.0008) +[2023-10-11 18:56:54,387][85176] Updated weights for policy 0, policy_version 94672 (0.0008) +[2023-10-11 18:56:54,758][85176] Updated weights for policy 0, policy_version 94682 (0.0008) +[2023-10-11 18:56:56,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 195330048. Throughput: 0: 1658.4, 1: 1694.0. Samples: 48838248. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-11 18:56:56,063][84230] Avg episode reward: [(0, '47.550'), (1, '43.300')] +[2023-10-11 18:56:57,576][85175] Updated weights for policy 1, policy_version 96070 (0.0008) +[2023-10-11 18:56:57,945][85175] Updated weights for policy 1, policy_version 96080 (0.0008) +[2023-10-11 18:56:58,315][85175] Updated weights for policy 1, policy_version 96090 (0.0008) +[2023-10-11 18:56:58,899][85176] Updated weights for policy 0, policy_version 94692 (0.0007) +[2023-10-11 18:56:59,268][85176] Updated weights for policy 0, policy_version 94702 (0.0007) +[2023-10-11 18:56:59,642][85176] Updated weights for policy 0, policy_version 94712 (0.0010) +[2023-10-11 18:57:01,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 195395584. Throughput: 0: 1672.5, 1: 1710.9. Samples: 48858584. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-11 18:57:01,063][84230] Avg episode reward: [(0, '45.360'), (1, '45.760')] +[2023-10-11 18:57:02,116][85175] Updated weights for policy 1, policy_version 96100 (0.0008) +[2023-10-11 18:57:02,490][85175] Updated weights for policy 1, policy_version 96110 (0.0009) +[2023-10-11 18:57:02,863][85175] Updated weights for policy 1, policy_version 96120 (0.0010) +[2023-10-11 18:57:03,707][85176] Updated weights for policy 0, policy_version 94722 (0.0009) +[2023-10-11 18:57:04,079][85176] Updated weights for policy 0, policy_version 94732 (0.0008) +[2023-10-11 18:57:04,453][85176] Updated weights for policy 0, policy_version 94742 (0.0008) +[2023-10-11 18:57:04,819][85176] Updated weights for policy 0, policy_version 94752 (0.0007) +[2023-10-11 18:57:06,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 195461120. Throughput: 0: 1678.2, 1: 1676.8. Samples: 48868936. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-11 18:57:06,064][84230] Avg episode reward: [(0, '43.190'), (1, '44.080')] +[2023-10-11 18:57:06,834][85175] Updated weights for policy 1, policy_version 96130 (0.0010) +[2023-10-11 18:57:07,206][85175] Updated weights for policy 1, policy_version 96140 (0.0007) +[2023-10-11 18:57:07,574][85175] Updated weights for policy 1, policy_version 96150 (0.0007) +[2023-10-11 18:57:07,946][85175] Updated weights for policy 1, policy_version 96160 (0.0007) +[2023-10-11 18:57:08,643][85176] Updated weights for policy 0, policy_version 94762 (0.0007) +[2023-10-11 18:57:09,007][85176] Updated weights for policy 0, policy_version 94772 (0.0007) +[2023-10-11 18:57:09,386][85176] Updated weights for policy 0, policy_version 94782 (0.0007) +[2023-10-11 18:57:11,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 195526656. Throughput: 0: 1654.0, 1: 1705.3. Samples: 48888624. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-11 18:57:11,064][84230] Avg episode reward: [(0, '44.110'), (1, '43.750')] +[2023-10-11 18:57:11,939][85175] Updated weights for policy 1, policy_version 96170 (0.0011) +[2023-10-11 18:57:12,306][85175] Updated weights for policy 1, policy_version 96180 (0.0011) +[2023-10-11 18:57:12,659][85175] Updated weights for policy 1, policy_version 96190 (0.0010) +[2023-10-11 18:57:13,464][85176] Updated weights for policy 0, policy_version 94792 (0.0007) +[2023-10-11 18:57:13,833][85176] Updated weights for policy 0, policy_version 94802 (0.0008) +[2023-10-11 18:57:14,208][85176] Updated weights for policy 0, policy_version 94812 (0.0008) +[2023-10-11 18:57:16,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 195592192. Throughput: 0: 1683.5, 1: 1700.4. Samples: 48909384. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-11 18:57:16,064][84230] Avg episode reward: [(0, '46.210'), (1, '44.820')] +[2023-10-11 18:57:16,728][85175] Updated weights for policy 1, policy_version 96200 (0.0008) +[2023-10-11 18:57:17,094][85175] Updated weights for policy 1, policy_version 96210 (0.0008) +[2023-10-11 18:57:17,458][85175] Updated weights for policy 1, policy_version 96220 (0.0009) +[2023-10-11 18:57:18,393][85176] Updated weights for policy 0, policy_version 94822 (0.0009) +[2023-10-11 18:57:18,775][85176] Updated weights for policy 0, policy_version 94832 (0.0008) +[2023-10-11 18:57:19,144][85176] Updated weights for policy 0, policy_version 94842 (0.0009) +[2023-10-11 18:57:21,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 195657728. Throughput: 0: 1672.0, 1: 1685.2. Samples: 48919216. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-11 18:57:21,064][84230] Avg episode reward: [(0, '45.910'), (1, '43.480')] +[2023-10-11 18:57:21,641][85175] Updated weights for policy 1, policy_version 96230 (0.0008) +[2023-10-11 18:57:22,023][85175] Updated weights for policy 1, policy_version 96240 (0.0009) +[2023-10-11 18:57:22,394][85175] Updated weights for policy 1, policy_version 96250 (0.0008) +[2023-10-11 18:57:23,406][85176] Updated weights for policy 0, policy_version 94852 (0.0009) +[2023-10-11 18:57:23,784][85176] Updated weights for policy 0, policy_version 94862 (0.0007) +[2023-10-11 18:57:24,161][85176] Updated weights for policy 0, policy_version 94872 (0.0009) +[2023-10-11 18:57:26,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 195723264. Throughput: 0: 1656.8, 1: 1701.9. Samples: 48939098. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) +[2023-10-11 18:57:26,064][84230] Avg episode reward: [(0, '46.640'), (1, '47.330')] +[2023-10-11 18:57:26,345][85175] Updated weights for policy 1, policy_version 96260 (0.0009) +[2023-10-11 18:57:26,715][85175] Updated weights for policy 1, policy_version 96270 (0.0011) +[2023-10-11 18:57:27,084][85175] Updated weights for policy 1, policy_version 96280 (0.0010) +[2023-10-11 18:57:28,182][85176] Updated weights for policy 0, policy_version 94882 (0.0009) +[2023-10-11 18:57:28,563][85176] Updated weights for policy 0, policy_version 94892 (0.0009) +[2023-10-11 18:57:28,931][85176] Updated weights for policy 0, policy_version 94902 (0.0010) +[2023-10-11 18:57:29,304][85176] Updated weights for policy 0, policy_version 94912 (0.0010) +[2023-10-11 18:57:31,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 195788800. Throughput: 0: 1676.0, 1: 1707.6. Samples: 48960062. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-11 18:57:31,063][84230] Avg episode reward: [(0, '44.390'), (1, '43.820')] +[2023-10-11 18:57:31,152][85175] Updated weights for policy 1, policy_version 96290 (0.0009) +[2023-10-11 18:57:31,531][85175] Updated weights for policy 1, policy_version 96300 (0.0010) +[2023-10-11 18:57:31,897][85175] Updated weights for policy 1, policy_version 96310 (0.0007) +[2023-10-11 18:57:32,269][85175] Updated weights for policy 1, policy_version 96320 (0.0008) +[2023-10-11 18:57:33,421][85176] Updated weights for policy 0, policy_version 94922 (0.0007) +[2023-10-11 18:57:33,803][85176] Updated weights for policy 0, policy_version 94932 (0.0007) +[2023-10-11 18:57:34,174][85176] Updated weights for policy 0, policy_version 94942 (0.0010) +[2023-10-11 18:57:36,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 195854336. Throughput: 0: 1664.0, 1: 1700.7. Samples: 48969860. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-11 18:57:36,064][84230] Avg episode reward: [(0, '45.680'), (1, '45.590')] +[2023-10-11 18:57:36,357][85175] Updated weights for policy 1, policy_version 96330 (0.0009) +[2023-10-11 18:57:36,731][85175] Updated weights for policy 1, policy_version 96340 (0.0008) +[2023-10-11 18:57:37,111][85175] Updated weights for policy 1, policy_version 96350 (0.0008) +[2023-10-11 18:57:38,282][85176] Updated weights for policy 0, policy_version 94952 (0.0008) +[2023-10-11 18:57:38,659][85176] Updated weights for policy 0, policy_version 94962 (0.0008) +[2023-10-11 18:57:39,035][85176] Updated weights for policy 0, policy_version 94972 (0.0007) +[2023-10-11 18:57:40,961][85175] Updated weights for policy 1, policy_version 96360 (0.0009) +[2023-10-11 18:57:41,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 195919872. Throughput: 0: 1662.2, 1: 1707.2. Samples: 48989872. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-11 18:57:41,064][84230] Avg episode reward: [(0, '45.100'), (1, '45.330')] +[2023-10-11 18:57:41,333][85175] Updated weights for policy 1, policy_version 96370 (0.0008) +[2023-10-11 18:57:41,704][85175] Updated weights for policy 1, policy_version 96380 (0.0009) +[2023-10-11 18:57:43,179][85176] Updated weights for policy 0, policy_version 94982 (0.0009) +[2023-10-11 18:57:43,547][85176] Updated weights for policy 0, policy_version 94992 (0.0010) +[2023-10-11 18:57:43,923][85176] Updated weights for policy 0, policy_version 95002 (0.0008) +[2023-10-11 18:57:45,976][85175] Updated weights for policy 1, policy_version 96390 (0.0007) +[2023-10-11 18:57:46,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 195985408. Throughput: 0: 1673.3, 1: 1708.6. Samples: 49010768. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-11 18:57:46,064][84230] Avg episode reward: [(0, '44.630'), (1, '44.650')] +[2023-10-11 18:57:46,376][85175] Updated weights for policy 1, policy_version 96400 (0.0008) +[2023-10-11 18:57:46,743][85175] Updated weights for policy 1, policy_version 96410 (0.0008) +[2023-10-11 18:57:47,988][85176] Updated weights for policy 0, policy_version 95012 (0.0007) +[2023-10-11 18:57:48,358][85176] Updated weights for policy 0, policy_version 95022 (0.0008) +[2023-10-11 18:57:48,738][85176] Updated weights for policy 0, policy_version 95032 (0.0008) +[2023-10-11 18:57:50,790][85175] Updated weights for policy 1, policy_version 96420 (0.0009) +[2023-10-11 18:57:51,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 196050944. Throughput: 0: 1659.5, 1: 1704.2. Samples: 49020304. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-11 18:57:51,063][84230] Avg episode reward: [(0, '44.220'), (1, '44.410')] +[2023-10-11 18:57:51,154][85175] Updated weights for policy 1, policy_version 96430 (0.0008) +[2023-10-11 18:57:51,529][85175] Updated weights for policy 1, policy_version 96440 (0.0007) +[2023-10-11 18:57:52,754][85176] Updated weights for policy 0, policy_version 95042 (0.0008) +[2023-10-11 18:57:53,126][85176] Updated weights for policy 0, policy_version 95052 (0.0009) +[2023-10-11 18:57:53,504][85176] Updated weights for policy 0, policy_version 95062 (0.0008) +[2023-10-11 18:57:53,877][85176] Updated weights for policy 0, policy_version 95072 (0.0008) +[2023-10-11 18:57:55,366][85175] Updated weights for policy 1, policy_version 96450 (0.0008) +[2023-10-11 18:57:55,732][85175] Updated weights for policy 1, policy_version 96460 (0.0008) +[2023-10-11 18:57:56,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 196116480. Throughput: 0: 1673.0, 1: 1700.2. Samples: 49040418. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-11 18:57:56,063][84230] Avg episode reward: [(0, '46.280'), (1, '45.040')] +[2023-10-11 18:57:56,111][85175] Updated weights for policy 1, policy_version 96470 (0.0009) +[2023-10-11 18:57:56,472][85175] Updated weights for policy 1, policy_version 96480 (0.0008) +[2023-10-11 18:57:58,165][85176] Updated weights for policy 0, policy_version 95082 (0.0010) +[2023-10-11 18:57:58,536][85176] Updated weights for policy 0, policy_version 95092 (0.0010) +[2023-10-11 18:57:58,912][85176] Updated weights for policy 0, policy_version 95102 (0.0008) +[2023-10-11 18:58:00,347][85175] Updated weights for policy 1, policy_version 96490 (0.0011) +[2023-10-11 18:58:00,724][85175] Updated weights for policy 1, policy_version 96500 (0.0008) +[2023-10-11 18:58:01,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 196182016. Throughput: 0: 1669.3, 1: 1691.0. Samples: 49060596. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-11 18:58:01,063][84230] Avg episode reward: [(0, '45.410'), (1, '43.010')] +[2023-10-11 18:58:01,098][85175] Updated weights for policy 1, policy_version 96510 (0.0007) +[2023-10-11 18:58:02,870][85176] Updated weights for policy 0, policy_version 95112 (0.0008) +[2023-10-11 18:58:03,250][85176] Updated weights for policy 0, policy_version 95122 (0.0007) +[2023-10-11 18:58:03,627][85176] Updated weights for policy 0, policy_version 95132 (0.0007) +[2023-10-11 18:58:05,364][85175] Updated weights for policy 1, policy_version 96520 (0.0008) +[2023-10-11 18:58:05,725][85175] Updated weights for policy 1, policy_version 96530 (0.0008) +[2023-10-11 18:58:06,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.3, 300 sec: 13329.4). Total num frames: 196247552. Throughput: 0: 1658.2, 1: 1703.6. Samples: 49070496. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-11 18:58:06,063][84230] Avg episode reward: [(0, '48.220'), (1, '43.200')] +[2023-10-11 18:58:06,092][85175] Updated weights for policy 1, policy_version 96540 (0.0009) +[2023-10-11 18:58:07,515][85176] Updated weights for policy 0, policy_version 95142 (0.0008) +[2023-10-11 18:58:07,901][85176] Updated weights for policy 0, policy_version 95152 (0.0008) +[2023-10-11 18:58:08,269][85176] Updated weights for policy 0, policy_version 95162 (0.0008) +[2023-10-11 18:58:10,057][85175] Updated weights for policy 1, policy_version 96550 (0.0011) +[2023-10-11 18:58:10,432][85175] Updated weights for policy 1, policy_version 96560 (0.0008) +[2023-10-11 18:58:10,800][85175] Updated weights for policy 1, policy_version 96570 (0.0010) +[2023-10-11 18:58:11,062][84230] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 196345856. Throughput: 0: 1676.3, 1: 1703.1. Samples: 49091168. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-11 18:58:11,063][84230] Avg episode reward: [(0, '44.160'), (1, '43.710')] +[2023-10-11 18:58:12,469][85176] Updated weights for policy 0, policy_version 95172 (0.0009) +[2023-10-11 18:58:12,847][85176] Updated weights for policy 0, policy_version 95182 (0.0009) +[2023-10-11 18:58:13,207][85176] Updated weights for policy 0, policy_version 95192 (0.0008) +[2023-10-11 18:58:14,768][85175] Updated weights for policy 1, policy_version 96580 (0.0008) +[2023-10-11 18:58:15,133][85175] Updated weights for policy 1, policy_version 96590 (0.0010) +[2023-10-11 18:58:15,497][85175] Updated weights for policy 1, policy_version 96600 (0.0010) +[2023-10-11 18:58:16,063][84230] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 196411392. Throughput: 0: 1681.5, 1: 1674.4. Samples: 49111080. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-11 18:58:16,064][84230] Avg episode reward: [(0, '46.790'), (1, '44.410')] +[2023-10-11 18:58:16,074][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000095200_97484800.pth... +[2023-10-11 18:58:16,075][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000096608_98926592.pth... +[2023-10-11 18:58:16,111][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000095008_97288192.pth +[2023-10-11 18:58:16,114][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000093632_95879168.pth +[2023-10-11 18:58:17,124][85176] Updated weights for policy 0, policy_version 95202 (0.0007) +[2023-10-11 18:58:17,504][85176] Updated weights for policy 0, policy_version 95212 (0.0009) +[2023-10-11 18:58:17,879][85176] Updated weights for policy 0, policy_version 95222 (0.0008) +[2023-10-11 18:58:18,260][85176] Updated weights for policy 0, policy_version 95232 (0.0008) +[2023-10-11 18:58:19,522][85175] Updated weights for policy 1, policy_version 96610 (0.0008) +[2023-10-11 18:58:19,892][85175] Updated weights for policy 1, policy_version 96620 (0.0007) +[2023-10-11 18:58:20,256][85175] Updated weights for policy 1, policy_version 96630 (0.0009) +[2023-10-11 18:58:20,625][85175] Updated weights for policy 1, policy_version 96640 (0.0010) +[2023-10-11 18:58:21,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 196476928. Throughput: 0: 1665.7, 1: 1702.4. Samples: 49121424. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-11 18:58:21,064][84230] Avg episode reward: [(0, '45.810'), (1, '46.960')] +[2023-10-11 18:58:22,324][85176] Updated weights for policy 0, policy_version 95242 (0.0007) +[2023-10-11 18:58:22,697][85176] Updated weights for policy 0, policy_version 95252 (0.0007) +[2023-10-11 18:58:23,058][85176] Updated weights for policy 0, policy_version 95262 (0.0011) +[2023-10-11 18:58:24,586][85175] Updated weights for policy 1, policy_version 96650 (0.0010) +[2023-10-11 18:58:24,940][85175] Updated weights for policy 1, policy_version 96660 (0.0010) +[2023-10-11 18:58:25,318][85175] Updated weights for policy 1, policy_version 96670 (0.0010) +[2023-10-11 18:58:26,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 196542464. Throughput: 0: 1682.2, 1: 1695.7. Samples: 49141876. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) +[2023-10-11 18:58:26,064][84230] Avg episode reward: [(0, '46.850'), (1, '46.740')] +[2023-10-11 18:58:26,887][85176] Updated weights for policy 0, policy_version 95272 (0.0008) +[2023-10-11 18:58:27,252][85176] Updated weights for policy 0, policy_version 95282 (0.0009) +[2023-10-11 18:58:27,621][85176] Updated weights for policy 0, policy_version 95292 (0.0010) +[2023-10-11 18:58:29,427][85175] Updated weights for policy 1, policy_version 96680 (0.0010) +[2023-10-11 18:58:29,796][85175] Updated weights for policy 1, policy_version 96690 (0.0007) +[2023-10-11 18:58:30,164][85175] Updated weights for policy 1, policy_version 96700 (0.0010) +[2023-10-11 18:58:31,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 196608000. Throughput: 0: 1693.3, 1: 1670.2. Samples: 49162124. 
Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) +[2023-10-11 18:58:31,063][84230] Avg episode reward: [(0, '44.370'), (1, '46.130')] +[2023-10-11 18:58:31,703][85176] Updated weights for policy 0, policy_version 95302 (0.0010) +[2023-10-11 18:58:32,080][85176] Updated weights for policy 0, policy_version 95312 (0.0007) +[2023-10-11 18:58:32,447][85176] Updated weights for policy 0, policy_version 95322 (0.0010) +[2023-10-11 18:58:34,362][85175] Updated weights for policy 1, policy_version 96710 (0.0008) +[2023-10-11 18:58:34,752][85175] Updated weights for policy 1, policy_version 96720 (0.0009) +[2023-10-11 18:58:35,116][85175] Updated weights for policy 1, policy_version 96730 (0.0007) +[2023-10-11 18:58:36,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 196673536. Throughput: 0: 1679.1, 1: 1702.3. Samples: 49172466. Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) +[2023-10-11 18:58:36,063][84230] Avg episode reward: [(0, '46.430'), (1, '45.960')] +[2023-10-11 18:58:36,373][85176] Updated weights for policy 0, policy_version 95332 (0.0009) +[2023-10-11 18:58:36,742][85176] Updated weights for policy 0, policy_version 95342 (0.0009) +[2023-10-11 18:58:37,115][85176] Updated weights for policy 0, policy_version 95352 (0.0008) +[2023-10-11 18:58:39,168][85175] Updated weights for policy 1, policy_version 96740 (0.0009) +[2023-10-11 18:58:39,532][85175] Updated weights for policy 1, policy_version 96750 (0.0011) +[2023-10-11 18:58:39,895][85175] Updated weights for policy 1, policy_version 96760 (0.0007) +[2023-10-11 18:58:41,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 196739072. Throughput: 0: 1698.9, 1: 1684.7. Samples: 49192678. 
Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) +[2023-10-11 18:58:41,064][84230] Avg episode reward: [(0, '46.540'), (1, '46.440')] +[2023-10-11 18:58:41,243][85176] Updated weights for policy 0, policy_version 95362 (0.0008) +[2023-10-11 18:58:41,624][85176] Updated weights for policy 0, policy_version 95372 (0.0009) +[2023-10-11 18:58:41,986][85176] Updated weights for policy 0, policy_version 95382 (0.0009) +[2023-10-11 18:58:42,358][85176] Updated weights for policy 0, policy_version 95392 (0.0008) +[2023-10-11 18:58:43,879][85175] Updated weights for policy 1, policy_version 96770 (0.0008) +[2023-10-11 18:58:44,241][85175] Updated weights for policy 1, policy_version 96780 (0.0008) +[2023-10-11 18:58:44,602][85175] Updated weights for policy 1, policy_version 96790 (0.0007) +[2023-10-11 18:58:44,980][85175] Updated weights for policy 1, policy_version 96800 (0.0008) +[2023-10-11 18:58:46,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 196804608. Throughput: 0: 1702.2, 1: 1682.0. Samples: 49212886. Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) +[2023-10-11 18:58:46,064][84230] Avg episode reward: [(0, '46.480'), (1, '48.790')] +[2023-10-11 18:58:46,546][85176] Updated weights for policy 0, policy_version 95402 (0.0007) +[2023-10-11 18:58:46,925][85176] Updated weights for policy 0, policy_version 95412 (0.0008) +[2023-10-11 18:58:47,295][85176] Updated weights for policy 0, policy_version 95422 (0.0009) +[2023-10-11 18:58:48,841][85175] Updated weights for policy 1, policy_version 96810 (0.0010) +[2023-10-11 18:58:49,214][85175] Updated weights for policy 1, policy_version 96820 (0.0009) +[2023-10-11 18:58:49,577][85175] Updated weights for policy 1, policy_version 96830 (0.0010) +[2023-10-11 18:58:51,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 196870144. Throughput: 0: 1692.1, 1: 1702.9. Samples: 49223272. 
Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) +[2023-10-11 18:58:51,063][84230] Avg episode reward: [(0, '46.520'), (1, '44.230')] +[2023-10-11 18:58:51,405][85176] Updated weights for policy 0, policy_version 95432 (0.0010) +[2023-10-11 18:58:51,786][85176] Updated weights for policy 0, policy_version 95442 (0.0009) +[2023-10-11 18:58:52,156][85176] Updated weights for policy 0, policy_version 95452 (0.0009) +[2023-10-11 18:58:53,491][85175] Updated weights for policy 1, policy_version 96840 (0.0009) +[2023-10-11 18:58:53,859][85175] Updated weights for policy 1, policy_version 96850 (0.0008) +[2023-10-11 18:58:54,223][85175] Updated weights for policy 1, policy_version 96860 (0.0010) +[2023-10-11 18:58:56,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 196935680. Throughput: 0: 1694.0, 1: 1677.9. Samples: 49242904. Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) +[2023-10-11 18:58:56,063][84230] Avg episode reward: [(0, '45.220'), (1, '46.650')] +[2023-10-11 18:58:56,206][85176] Updated weights for policy 0, policy_version 95462 (0.0009) +[2023-10-11 18:58:56,568][85176] Updated weights for policy 0, policy_version 95472 (0.0007) +[2023-10-11 18:58:56,951][85176] Updated weights for policy 0, policy_version 95482 (0.0008) +[2023-10-11 18:58:58,177][85175] Updated weights for policy 1, policy_version 96870 (0.0009) +[2023-10-11 18:58:58,553][85175] Updated weights for policy 1, policy_version 96880 (0.0007) +[2023-10-11 18:58:58,909][85175] Updated weights for policy 1, policy_version 96890 (0.0008) +[2023-10-11 18:59:01,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 197001216. Throughput: 0: 1688.9, 1: 1699.5. Samples: 49263556. 
Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) +[2023-10-11 18:59:01,063][84230] Avg episode reward: [(0, '46.140'), (1, '42.580')] +[2023-10-11 18:59:01,065][85176] Updated weights for policy 0, policy_version 95492 (0.0009) +[2023-10-11 18:59:01,440][85176] Updated weights for policy 0, policy_version 95502 (0.0009) +[2023-10-11 18:59:01,806][85176] Updated weights for policy 0, policy_version 95512 (0.0008) +[2023-10-11 18:59:02,830][85175] Updated weights for policy 1, policy_version 96900 (0.0010) +[2023-10-11 18:59:03,209][85175] Updated weights for policy 1, policy_version 96910 (0.0010) +[2023-10-11 18:59:03,583][85175] Updated weights for policy 1, policy_version 96920 (0.0011) +[2023-10-11 18:59:05,802][85176] Updated weights for policy 0, policy_version 95522 (0.0009) +[2023-10-11 18:59:06,062][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 197066752. Throughput: 0: 1689.0, 1: 1683.8. Samples: 49273200. Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) +[2023-10-11 18:59:06,063][84230] Avg episode reward: [(0, '47.640'), (1, '47.700')] +[2023-10-11 18:59:06,182][85176] Updated weights for policy 0, policy_version 95532 (0.0008) +[2023-10-11 18:59:06,546][85176] Updated weights for policy 0, policy_version 95542 (0.0009) +[2023-10-11 18:59:06,910][85176] Updated weights for policy 0, policy_version 95552 (0.0010) +[2023-10-11 18:59:07,662][85175] Updated weights for policy 1, policy_version 96930 (0.0009) +[2023-10-11 18:59:08,036][85175] Updated weights for policy 1, policy_version 96940 (0.0007) +[2023-10-11 18:59:08,405][85175] Updated weights for policy 1, policy_version 96950 (0.0009) +[2023-10-11 18:59:08,776][85175] Updated weights for policy 1, policy_version 96960 (0.0009) +[2023-10-11 18:59:11,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 197132288. Throughput: 0: 1688.6, 1: 1682.0. Samples: 49293552. 
Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) +[2023-10-11 18:59:11,064][84230] Avg episode reward: [(0, '45.090'), (1, '43.520')] +[2023-10-11 18:59:11,072][85176] Updated weights for policy 0, policy_version 95562 (0.0007) +[2023-10-11 18:59:11,445][85176] Updated weights for policy 0, policy_version 95572 (0.0009) +[2023-10-11 18:59:11,822][85176] Updated weights for policy 0, policy_version 95582 (0.0009) +[2023-10-11 18:59:12,697][85175] Updated weights for policy 1, policy_version 96970 (0.0008) +[2023-10-11 18:59:13,063][85175] Updated weights for policy 1, policy_version 96980 (0.0007) +[2023-10-11 18:59:13,428][85175] Updated weights for policy 1, policy_version 96990 (0.0008) +[2023-10-11 18:59:15,791][85176] Updated weights for policy 0, policy_version 95592 (0.0009) +[2023-10-11 18:59:16,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 197197824. Throughput: 0: 1678.9, 1: 1708.7. Samples: 49314566. Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) +[2023-10-11 18:59:16,063][84230] Avg episode reward: [(0, '46.360'), (1, '49.090')] +[2023-10-11 18:59:16,171][85176] Updated weights for policy 0, policy_version 95602 (0.0009) +[2023-10-11 18:59:16,542][85176] Updated weights for policy 0, policy_version 95612 (0.0008) +[2023-10-11 18:59:17,348][85175] Updated weights for policy 1, policy_version 97000 (0.0008) +[2023-10-11 18:59:17,713][85175] Updated weights for policy 1, policy_version 97010 (0.0008) +[2023-10-11 18:59:18,081][85175] Updated weights for policy 1, policy_version 97020 (0.0008) +[2023-10-11 18:59:20,619][85176] Updated weights for policy 0, policy_version 95622 (0.0009) +[2023-10-11 18:59:20,992][85176] Updated weights for policy 0, policy_version 95632 (0.0009) +[2023-10-11 18:59:21,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 197263360. Throughput: 0: 1682.7, 1: 1682.4. Samples: 49323896. 
Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) +[2023-10-11 18:59:21,063][84230] Avg episode reward: [(0, '44.450'), (1, '47.850')] +[2023-10-11 18:59:21,369][85176] Updated weights for policy 0, policy_version 95642 (0.0011) +[2023-10-11 18:59:22,192][85175] Updated weights for policy 1, policy_version 97030 (0.0009) +[2023-10-11 18:59:22,551][85175] Updated weights for policy 1, policy_version 97040 (0.0008) +[2023-10-11 18:59:22,930][85175] Updated weights for policy 1, policy_version 97050 (0.0007) +[2023-10-11 18:59:25,239][85176] Updated weights for policy 0, policy_version 95652 (0.0008) +[2023-10-11 18:59:25,614][85176] Updated weights for policy 0, policy_version 95662 (0.0008) +[2023-10-11 18:59:25,996][85176] Updated weights for policy 0, policy_version 95672 (0.0009) +[2023-10-11 18:59:26,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 197328896. Throughput: 0: 1679.9, 1: 1698.4. Samples: 49344698. Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) +[2023-10-11 18:59:26,064][84230] Avg episode reward: [(0, '45.340'), (1, '45.070')] +[2023-10-11 18:59:27,136][85175] Updated weights for policy 1, policy_version 97060 (0.0008) +[2023-10-11 18:59:27,534][85175] Updated weights for policy 1, policy_version 97070 (0.0009) +[2023-10-11 18:59:27,900][85175] Updated weights for policy 1, policy_version 97080 (0.0010) +[2023-10-11 18:59:30,228][85176] Updated weights for policy 0, policy_version 95682 (0.0010) +[2023-10-11 18:59:30,583][85176] Updated weights for policy 0, policy_version 95692 (0.0010) +[2023-10-11 18:59:30,960][85176] Updated weights for policy 0, policy_version 95702 (0.0009) +[2023-10-11 18:59:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 197394432. Throughput: 0: 1670.6, 1: 1712.6. Samples: 49365130. 
Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) +[2023-10-11 18:59:31,063][84230] Avg episode reward: [(0, '45.560'), (1, '43.580')] +[2023-10-11 18:59:31,342][85176] Updated weights for policy 0, policy_version 95712 (0.0011) +[2023-10-11 18:59:31,830][85175] Updated weights for policy 1, policy_version 97090 (0.0010) +[2023-10-11 18:59:32,197][85175] Updated weights for policy 1, policy_version 97100 (0.0009) +[2023-10-11 18:59:32,567][85175] Updated weights for policy 1, policy_version 97110 (0.0010) +[2023-10-11 18:59:32,937][85175] Updated weights for policy 1, policy_version 97120 (0.0008) +[2023-10-11 18:59:35,441][85176] Updated weights for policy 0, policy_version 95722 (0.0007) +[2023-10-11 18:59:35,817][85176] Updated weights for policy 0, policy_version 95732 (0.0007) +[2023-10-11 18:59:36,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 197459968. Throughput: 0: 1686.0, 1: 1682.6. Samples: 49374860. Policy #0 lag: (min: 31.0, avg: 34.2, max: 63.0) +[2023-10-11 18:59:36,063][84230] Avg episode reward: [(0, '46.870'), (1, '44.590')] +[2023-10-11 18:59:36,196][85176] Updated weights for policy 0, policy_version 95742 (0.0010) +[2023-10-11 18:59:36,947][85175] Updated weights for policy 1, policy_version 97130 (0.0008) +[2023-10-11 18:59:37,320][85175] Updated weights for policy 1, policy_version 97140 (0.0007) +[2023-10-11 18:59:37,691][85175] Updated weights for policy 1, policy_version 97150 (0.0008) +[2023-10-11 18:59:40,119][85176] Updated weights for policy 0, policy_version 95752 (0.0009) +[2023-10-11 18:59:40,490][85176] Updated weights for policy 0, policy_version 95762 (0.0007) +[2023-10-11 18:59:40,866][85176] Updated weights for policy 0, policy_version 95772 (0.0007) +[2023-10-11 18:59:41,063][84230] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 197558272. Throughput: 0: 1684.3, 1: 1711.7. Samples: 49395724. 
Policy #0 lag: (min: 31.0, avg: 34.2, max: 63.0) +[2023-10-11 18:59:41,063][84230] Avg episode reward: [(0, '44.590'), (1, '47.430')] +[2023-10-11 18:59:41,622][85175] Updated weights for policy 1, policy_version 97160 (0.0007) +[2023-10-11 18:59:41,992][85175] Updated weights for policy 1, policy_version 97170 (0.0007) +[2023-10-11 18:59:42,361][85175] Updated weights for policy 1, policy_version 97180 (0.0008) +[2023-10-11 18:59:44,909][85176] Updated weights for policy 0, policy_version 95782 (0.0007) +[2023-10-11 18:59:45,278][85176] Updated weights for policy 0, policy_version 95792 (0.0007) +[2023-10-11 18:59:45,654][85176] Updated weights for policy 0, policy_version 95802 (0.0008) +[2023-10-11 18:59:46,063][84230] Fps is (10 sec: 16383.8, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 197623808. Throughput: 0: 1668.6, 1: 1713.0. Samples: 49415730. Policy #0 lag: (min: 31.0, avg: 34.2, max: 63.0) +[2023-10-11 18:59:46,063][84230] Avg episode reward: [(0, '45.210'), (1, '46.540')] +[2023-10-11 18:59:46,366][85175] Updated weights for policy 1, policy_version 97190 (0.0009) +[2023-10-11 18:59:46,730][85175] Updated weights for policy 1, policy_version 97200 (0.0009) +[2023-10-11 18:59:47,105][85175] Updated weights for policy 1, policy_version 97210 (0.0008) +[2023-10-11 18:59:49,791][85176] Updated weights for policy 0, policy_version 95812 (0.0011) +[2023-10-11 18:59:50,171][85176] Updated weights for policy 0, policy_version 95822 (0.0007) +[2023-10-11 18:59:50,550][85176] Updated weights for policy 0, policy_version 95832 (0.0010) +[2023-10-11 18:59:51,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 197689344. Throughput: 0: 1688.4, 1: 1702.5. Samples: 49425790. 
Policy #0 lag: (min: 31.0, avg: 34.2, max: 63.0) +[2023-10-11 18:59:51,064][84230] Avg episode reward: [(0, '42.990'), (1, '47.220')] +[2023-10-11 18:59:51,135][85175] Updated weights for policy 1, policy_version 97220 (0.0009) +[2023-10-11 18:59:51,506][85175] Updated weights for policy 1, policy_version 97230 (0.0008) +[2023-10-11 18:59:51,876][85175] Updated weights for policy 1, policy_version 97240 (0.0007) +[2023-10-11 18:59:54,531][85176] Updated weights for policy 0, policy_version 95842 (0.0009) +[2023-10-11 18:59:54,898][85176] Updated weights for policy 0, policy_version 95852 (0.0008) +[2023-10-11 18:59:55,263][85176] Updated weights for policy 0, policy_version 95862 (0.0009) +[2023-10-11 18:59:55,630][85176] Updated weights for policy 0, policy_version 95872 (0.0009) +[2023-10-11 18:59:55,970][85175] Updated weights for policy 1, policy_version 97250 (0.0009) +[2023-10-11 18:59:56,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 197754880. Throughput: 0: 1683.1, 1: 1709.1. Samples: 49446200. Policy #0 lag: (min: 31.0, avg: 34.2, max: 63.0) +[2023-10-11 18:59:56,064][84230] Avg episode reward: [(0, '46.050'), (1, '47.030')] +[2023-10-11 18:59:56,344][85175] Updated weights for policy 1, policy_version 97260 (0.0008) +[2023-10-11 18:59:56,711][85175] Updated weights for policy 1, policy_version 97270 (0.0011) +[2023-10-11 18:59:57,087][85175] Updated weights for policy 1, policy_version 97280 (0.0008) +[2023-10-11 18:59:59,658][85176] Updated weights for policy 0, policy_version 95882 (0.0009) +[2023-10-11 19:00:00,030][85176] Updated weights for policy 0, policy_version 95892 (0.0010) +[2023-10-11 19:00:00,395][85176] Updated weights for policy 0, policy_version 95902 (0.0011) +[2023-10-11 19:00:01,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 197820416. Throughput: 0: 1654.8, 1: 1707.9. Samples: 49465888. 
Policy #0 lag: (min: 31.0, avg: 34.2, max: 63.0) +[2023-10-11 19:00:01,063][84230] Avg episode reward: [(0, '48.060'), (1, '46.080')] +[2023-10-11 19:00:01,067][85175] Updated weights for policy 1, policy_version 97290 (0.0007) +[2023-10-11 19:00:01,427][85175] Updated weights for policy 1, policy_version 97300 (0.0009) +[2023-10-11 19:00:01,793][85175] Updated weights for policy 1, policy_version 97310 (0.0008) +[2023-10-11 19:00:04,479][85176] Updated weights for policy 0, policy_version 95912 (0.0008) +[2023-10-11 19:00:04,850][85176] Updated weights for policy 0, policy_version 95922 (0.0007) +[2023-10-11 19:00:05,237][85176] Updated weights for policy 0, policy_version 95932 (0.0010) +[2023-10-11 19:00:05,727][85175] Updated weights for policy 1, policy_version 97320 (0.0007) +[2023-10-11 19:00:06,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 197885952. Throughput: 0: 1683.0, 1: 1708.2. Samples: 49476500. Policy #0 lag: (min: 31.0, avg: 34.2, max: 63.0) +[2023-10-11 19:00:06,063][84230] Avg episode reward: [(0, '47.110'), (1, '45.730')] +[2023-10-11 19:00:06,094][85175] Updated weights for policy 1, policy_version 97330 (0.0007) +[2023-10-11 19:00:06,460][85175] Updated weights for policy 1, policy_version 97340 (0.0007) +[2023-10-11 19:00:09,510][85176] Updated weights for policy 0, policy_version 95942 (0.0008) +[2023-10-11 19:00:09,881][85176] Updated weights for policy 0, policy_version 95952 (0.0007) +[2023-10-11 19:00:10,256][85176] Updated weights for policy 0, policy_version 95962 (0.0009) +[2023-10-11 19:00:10,558][85175] Updated weights for policy 1, policy_version 97350 (0.0008) +[2023-10-11 19:00:10,931][85175] Updated weights for policy 1, policy_version 97360 (0.0009) +[2023-10-11 19:00:11,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13551.5). Total num frames: 197951488. Throughput: 0: 1668.1, 1: 1710.1. Samples: 49496716. 
Policy #0 lag: (min: 31.0, avg: 34.2, max: 63.0) +[2023-10-11 19:00:11,063][84230] Avg episode reward: [(0, '45.710'), (1, '47.780')] +[2023-10-11 19:00:11,302][85175] Updated weights for policy 1, policy_version 97370 (0.0008) +[2023-10-11 19:00:14,077][85176] Updated weights for policy 0, policy_version 95972 (0.0007) +[2023-10-11 19:00:14,446][85176] Updated weights for policy 0, policy_version 95982 (0.0007) +[2023-10-11 19:00:14,829][85176] Updated weights for policy 0, policy_version 95992 (0.0008) +[2023-10-11 19:00:15,457][85175] Updated weights for policy 1, policy_version 97380 (0.0007) +[2023-10-11 19:00:15,847][85175] Updated weights for policy 1, policy_version 97390 (0.0008) +[2023-10-11 19:00:16,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 198017024. Throughput: 0: 1660.5, 1: 1705.1. Samples: 49516582. Policy #0 lag: (min: 31.0, avg: 34.2, max: 63.0) +[2023-10-11 19:00:16,064][84230] Avg episode reward: [(0, '46.550'), (1, '44.860')] +[2023-10-11 19:00:16,074][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000096000_98304000.pth... +[2023-10-11 19:00:16,103][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000094432_96698368.pth +[2023-10-11 19:00:16,225][85175] Updated weights for policy 1, policy_version 97400 (0.0008) +[2023-10-11 19:00:16,513][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000097408_99745792.pth... 
+[2023-10-11 19:00:16,542][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000095808_98107392.pth +[2023-10-11 19:00:18,957][85176] Updated weights for policy 0, policy_version 96002 (0.0009) +[2023-10-11 19:00:19,324][85176] Updated weights for policy 0, policy_version 96012 (0.0008) +[2023-10-11 19:00:19,691][85176] Updated weights for policy 0, policy_version 96022 (0.0008) +[2023-10-11 19:00:20,069][85176] Updated weights for policy 0, policy_version 96032 (0.0008) +[2023-10-11 19:00:20,231][85175] Updated weights for policy 1, policy_version 97410 (0.0008) +[2023-10-11 19:00:20,613][85175] Updated weights for policy 1, policy_version 97420 (0.0010) +[2023-10-11 19:00:20,987][85175] Updated weights for policy 1, policy_version 97430 (0.0009) +[2023-10-11 19:00:21,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 198082560. Throughput: 0: 1676.1, 1: 1707.4. Samples: 49527120. Policy #0 lag: (min: 31.0, avg: 34.2, max: 63.0) +[2023-10-11 19:00:21,063][84230] Avg episode reward: [(0, '48.190'), (1, '42.910')] +[2023-10-11 19:00:21,362][85175] Updated weights for policy 1, policy_version 97440 (0.0009) +[2023-10-11 19:00:24,208][85176] Updated weights for policy 0, policy_version 96042 (0.0009) +[2023-10-11 19:00:24,573][85176] Updated weights for policy 0, policy_version 96052 (0.0010) +[2023-10-11 19:00:24,948][85176] Updated weights for policy 0, policy_version 96062 (0.0007) +[2023-10-11 19:00:25,466][85175] Updated weights for policy 1, policy_version 97450 (0.0009) +[2023-10-11 19:00:25,834][85175] Updated weights for policy 1, policy_version 97460 (0.0010) +[2023-10-11 19:00:26,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 198148096. Throughput: 0: 1664.5, 1: 1696.3. Samples: 49546960. 
Policy #0 lag: (min: 31.0, avg: 34.2, max: 63.0) +[2023-10-11 19:00:26,064][84230] Avg episode reward: [(0, '46.290'), (1, '43.310')] +[2023-10-11 19:00:26,210][85175] Updated weights for policy 1, policy_version 97470 (0.0008) +[2023-10-11 19:00:28,939][85176] Updated weights for policy 0, policy_version 96072 (0.0008) +[2023-10-11 19:00:29,309][85176] Updated weights for policy 0, policy_version 96082 (0.0011) +[2023-10-11 19:00:29,678][85176] Updated weights for policy 0, policy_version 96092 (0.0010) +[2023-10-11 19:00:30,276][85175] Updated weights for policy 1, policy_version 97480 (0.0008) +[2023-10-11 19:00:30,641][85175] Updated weights for policy 1, policy_version 97490 (0.0010) +[2023-10-11 19:00:31,012][85175] Updated weights for policy 1, policy_version 97500 (0.0008) +[2023-10-11 19:00:31,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 198213632. Throughput: 0: 1675.7, 1: 1682.6. Samples: 49566854. Policy #0 lag: (min: 31.0, avg: 34.2, max: 63.0) +[2023-10-11 19:00:31,063][84230] Avg episode reward: [(0, '46.040'), (1, '44.040')] +[2023-10-11 19:00:33,752][85176] Updated weights for policy 0, policy_version 96102 (0.0008) +[2023-10-11 19:00:34,129][85176] Updated weights for policy 0, policy_version 96112 (0.0010) +[2023-10-11 19:00:34,501][85176] Updated weights for policy 0, policy_version 96122 (0.0010) +[2023-10-11 19:00:34,933][85175] Updated weights for policy 1, policy_version 97510 (0.0011) +[2023-10-11 19:00:35,295][85175] Updated weights for policy 1, policy_version 97520 (0.0009) +[2023-10-11 19:00:35,658][85175] Updated weights for policy 1, policy_version 97530 (0.0010) +[2023-10-11 19:00:36,062][84230] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 13662.6). Total num frames: 198311936. Throughput: 0: 1685.4, 1: 1694.3. Samples: 49577874. 
Policy #0 lag: (min: 20.0, avg: 38.0, max: 40.0) +[2023-10-11 19:00:36,063][84230] Avg episode reward: [(0, '44.220'), (1, '46.460')] +[2023-10-11 19:00:38,506][85176] Updated weights for policy 0, policy_version 96132 (0.0008) +[2023-10-11 19:00:38,876][85176] Updated weights for policy 0, policy_version 96142 (0.0008) +[2023-10-11 19:00:39,254][85176] Updated weights for policy 0, policy_version 96152 (0.0008) +[2023-10-11 19:00:39,869][85175] Updated weights for policy 1, policy_version 97540 (0.0011) +[2023-10-11 19:00:40,240][85175] Updated weights for policy 1, policy_version 97550 (0.0009) +[2023-10-11 19:00:40,602][85175] Updated weights for policy 1, policy_version 97560 (0.0011) +[2023-10-11 19:00:41,063][84230] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 198377472. Throughput: 0: 1666.3, 1: 1693.2. Samples: 49597376. Policy #0 lag: (min: 20.0, avg: 38.0, max: 40.0) +[2023-10-11 19:00:41,064][84230] Avg episode reward: [(0, '44.650'), (1, '43.830')] +[2023-10-11 19:00:43,286][85176] Updated weights for policy 0, policy_version 96162 (0.0010) +[2023-10-11 19:00:43,664][85176] Updated weights for policy 0, policy_version 96172 (0.0009) +[2023-10-11 19:00:44,023][85176] Updated weights for policy 0, policy_version 96182 (0.0009) +[2023-10-11 19:00:44,392][85176] Updated weights for policy 0, policy_version 96192 (0.0010) +[2023-10-11 19:00:44,704][85175] Updated weights for policy 1, policy_version 97570 (0.0011) +[2023-10-11 19:00:45,070][85175] Updated weights for policy 1, policy_version 97580 (0.0008) +[2023-10-11 19:00:45,441][85175] Updated weights for policy 1, policy_version 97590 (0.0008) +[2023-10-11 19:00:45,812][85175] Updated weights for policy 1, policy_version 97600 (0.0007) +[2023-10-11 19:00:46,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 198443008. Throughput: 0: 1695.0, 1: 1665.1. Samples: 49617090. 
Policy #0 lag: (min: 20.0, avg: 38.0, max: 40.0) +[2023-10-11 19:00:46,063][84230] Avg episode reward: [(0, '43.150'), (1, '44.550')] +[2023-10-11 19:00:48,504][85176] Updated weights for policy 0, policy_version 96202 (0.0008) +[2023-10-11 19:00:48,877][85176] Updated weights for policy 0, policy_version 96212 (0.0008) +[2023-10-11 19:00:49,248][85176] Updated weights for policy 0, policy_version 96222 (0.0008) +[2023-10-11 19:00:49,716][85175] Updated weights for policy 1, policy_version 97610 (0.0011) +[2023-10-11 19:00:50,085][85175] Updated weights for policy 1, policy_version 97620 (0.0008) +[2023-10-11 19:00:50,462][85175] Updated weights for policy 1, policy_version 97630 (0.0010) +[2023-10-11 19:00:51,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 198508544. Throughput: 0: 1681.5, 1: 1682.3. Samples: 49627872. Policy #0 lag: (min: 20.0, avg: 38.0, max: 40.0) +[2023-10-11 19:00:51,064][84230] Avg episode reward: [(0, '46.130'), (1, '47.150')] +[2023-10-11 19:00:53,291][85176] Updated weights for policy 0, policy_version 96232 (0.0010) +[2023-10-11 19:00:53,664][85176] Updated weights for policy 0, policy_version 96242 (0.0009) +[2023-10-11 19:00:54,039][85176] Updated weights for policy 0, policy_version 96252 (0.0008) +[2023-10-11 19:00:54,460][85175] Updated weights for policy 1, policy_version 97640 (0.0011) +[2023-10-11 19:00:54,832][85175] Updated weights for policy 1, policy_version 97650 (0.0010) +[2023-10-11 19:00:55,191][85175] Updated weights for policy 1, policy_version 97660 (0.0009) +[2023-10-11 19:00:56,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 198574080. Throughput: 0: 1675.8, 1: 1674.9. Samples: 49647496. 
Policy #0 lag: (min: 20.0, avg: 38.0, max: 40.0) +[2023-10-11 19:00:56,064][84230] Avg episode reward: [(0, '44.880'), (1, '46.920')] +[2023-10-11 19:00:58,220][85176] Updated weights for policy 0, policy_version 96262 (0.0008) +[2023-10-11 19:00:58,586][85176] Updated weights for policy 0, policy_version 96272 (0.0009) +[2023-10-11 19:00:58,957][85176] Updated weights for policy 0, policy_version 96282 (0.0010) +[2023-10-11 19:00:59,235][85175] Updated weights for policy 1, policy_version 97670 (0.0009) +[2023-10-11 19:00:59,601][85175] Updated weights for policy 1, policy_version 97680 (0.0009) +[2023-10-11 19:00:59,961][85175] Updated weights for policy 1, policy_version 97690 (0.0008) +[2023-10-11 19:01:01,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 198639616. Throughput: 0: 1692.8, 1: 1657.8. Samples: 49667358. Policy #0 lag: (min: 20.0, avg: 38.0, max: 40.0) +[2023-10-11 19:01:01,063][84230] Avg episode reward: [(0, '46.820'), (1, '42.410')] +[2023-10-11 19:01:03,075][85176] Updated weights for policy 0, policy_version 96292 (0.0009) +[2023-10-11 19:01:03,453][85176] Updated weights for policy 0, policy_version 96302 (0.0008) +[2023-10-11 19:01:03,828][85176] Updated weights for policy 0, policy_version 96312 (0.0008) +[2023-10-11 19:01:04,248][85175] Updated weights for policy 1, policy_version 97700 (0.0009) +[2023-10-11 19:01:04,664][85175] Updated weights for policy 1, policy_version 97710 (0.0010) +[2023-10-11 19:01:05,036][85175] Updated weights for policy 1, policy_version 97720 (0.0009) +[2023-10-11 19:01:06,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 198705152. Throughput: 0: 1676.7, 1: 1677.2. Samples: 49678044. 
Policy #0 lag: (min: 20.0, avg: 38.0, max: 40.0) +[2023-10-11 19:01:06,063][84230] Avg episode reward: [(0, '44.690'), (1, '44.120')] +[2023-10-11 19:01:07,762][85176] Updated weights for policy 0, policy_version 96322 (0.0009) +[2023-10-11 19:01:08,137][85176] Updated weights for policy 0, policy_version 96332 (0.0011) +[2023-10-11 19:01:08,502][85176] Updated weights for policy 0, policy_version 96342 (0.0008) +[2023-10-11 19:01:08,879][85176] Updated weights for policy 0, policy_version 96352 (0.0008) +[2023-10-11 19:01:09,012][85175] Updated weights for policy 1, policy_version 97730 (0.0008) +[2023-10-11 19:01:09,375][85175] Updated weights for policy 1, policy_version 97740 (0.0009) +[2023-10-11 19:01:09,747][85175] Updated weights for policy 1, policy_version 97750 (0.0007) +[2023-10-11 19:01:10,115][85175] Updated weights for policy 1, policy_version 97760 (0.0008) +[2023-10-11 19:01:11,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 198770688. Throughput: 0: 1684.2, 1: 1663.7. Samples: 49697616. Policy #0 lag: (min: 20.0, avg: 38.0, max: 40.0) +[2023-10-11 19:01:11,064][84230] Avg episode reward: [(0, '45.810'), (1, '41.410')] +[2023-10-11 19:01:12,916][85176] Updated weights for policy 0, policy_version 96362 (0.0008) +[2023-10-11 19:01:13,284][85176] Updated weights for policy 0, policy_version 96372 (0.0008) +[2023-10-11 19:01:13,668][85176] Updated weights for policy 0, policy_version 96382 (0.0009) +[2023-10-11 19:01:14,135][85175] Updated weights for policy 1, policy_version 97770 (0.0010) +[2023-10-11 19:01:14,505][85175] Updated weights for policy 1, policy_version 97780 (0.0009) +[2023-10-11 19:01:14,863][85175] Updated weights for policy 1, policy_version 97790 (0.0008) +[2023-10-11 19:01:16,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 198836224. Throughput: 0: 1687.7, 1: 1661.6. Samples: 49717576. 
Policy #0 lag: (min: 20.0, avg: 38.0, max: 40.0) +[2023-10-11 19:01:16,063][84230] Avg episode reward: [(0, '44.940'), (1, '48.410')] +[2023-10-11 19:01:17,781][85176] Updated weights for policy 0, policy_version 96392 (0.0009) +[2023-10-11 19:01:18,161][85176] Updated weights for policy 0, policy_version 96402 (0.0007) +[2023-10-11 19:01:18,528][85176] Updated weights for policy 0, policy_version 96412 (0.0009) +[2023-10-11 19:01:18,883][85175] Updated weights for policy 1, policy_version 97800 (0.0007) +[2023-10-11 19:01:19,251][85175] Updated weights for policy 1, policy_version 97810 (0.0009) +[2023-10-11 19:01:19,621][85175] Updated weights for policy 1, policy_version 97820 (0.0010) +[2023-10-11 19:01:21,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 198901760. Throughput: 0: 1659.9, 1: 1682.3. Samples: 49728274. Policy #0 lag: (min: 20.0, avg: 38.0, max: 40.0) +[2023-10-11 19:01:21,064][84230] Avg episode reward: [(0, '45.560'), (1, '48.120')] +[2023-10-11 19:01:22,548][85176] Updated weights for policy 0, policy_version 96422 (0.0007) +[2023-10-11 19:01:22,917][85176] Updated weights for policy 0, policy_version 96432 (0.0007) +[2023-10-11 19:01:23,292][85176] Updated weights for policy 0, policy_version 96442 (0.0009) +[2023-10-11 19:01:23,856][85175] Updated weights for policy 1, policy_version 97830 (0.0010) +[2023-10-11 19:01:24,223][85175] Updated weights for policy 1, policy_version 97840 (0.0010) +[2023-10-11 19:01:24,596][85175] Updated weights for policy 1, policy_version 97850 (0.0007) +[2023-10-11 19:01:26,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 198967296. Throughput: 0: 1681.4, 1: 1658.4. Samples: 49747666. 
Policy #0 lag: (min: 20.0, avg: 38.0, max: 40.0) +[2023-10-11 19:01:26,064][84230] Avg episode reward: [(0, '42.600'), (1, '45.940')] +[2023-10-11 19:01:27,255][85176] Updated weights for policy 0, policy_version 96452 (0.0009) +[2023-10-11 19:01:27,634][85176] Updated weights for policy 0, policy_version 96462 (0.0008) +[2023-10-11 19:01:28,001][85176] Updated weights for policy 0, policy_version 96472 (0.0008) +[2023-10-11 19:01:28,652][85175] Updated weights for policy 1, policy_version 97860 (0.0008) +[2023-10-11 19:01:29,011][85175] Updated weights for policy 1, policy_version 97870 (0.0007) +[2023-10-11 19:01:29,367][85175] Updated weights for policy 1, policy_version 97880 (0.0010) +[2023-10-11 19:01:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 199032832. Throughput: 0: 1689.0, 1: 1674.1. Samples: 49768428. Policy #0 lag: (min: 20.0, avg: 38.0, max: 40.0) +[2023-10-11 19:01:31,064][84230] Avg episode reward: [(0, '44.930'), (1, '46.340')] +[2023-10-11 19:01:32,043][85176] Updated weights for policy 0, policy_version 96482 (0.0007) +[2023-10-11 19:01:32,419][85176] Updated weights for policy 0, policy_version 96492 (0.0008) +[2023-10-11 19:01:32,789][85176] Updated weights for policy 0, policy_version 96502 (0.0008) +[2023-10-11 19:01:33,160][85176] Updated weights for policy 0, policy_version 96512 (0.0008) +[2023-10-11 19:01:33,386][85175] Updated weights for policy 1, policy_version 97890 (0.0008) +[2023-10-11 19:01:33,756][85175] Updated weights for policy 1, policy_version 97900 (0.0011) +[2023-10-11 19:01:34,133][85175] Updated weights for policy 1, policy_version 97910 (0.0009) +[2023-10-11 19:01:34,494][85175] Updated weights for policy 1, policy_version 97920 (0.0007) +[2023-10-11 19:01:36,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 13440.4). Total num frames: 199098368. Throughput: 0: 1667.7, 1: 1677.6. Samples: 49778412. 
Policy #0 lag: (min: 31.0, avg: 31.6, max: 49.0) +[2023-10-11 19:01:36,064][84230] Avg episode reward: [(0, '41.590'), (1, '45.880')] +[2023-10-11 19:01:37,225][85176] Updated weights for policy 0, policy_version 96522 (0.0008) +[2023-10-11 19:01:37,593][85176] Updated weights for policy 0, policy_version 96532 (0.0007) +[2023-10-11 19:01:37,957][85176] Updated weights for policy 0, policy_version 96542 (0.0010) +[2023-10-11 19:01:38,458][85175] Updated weights for policy 1, policy_version 97930 (0.0009) +[2023-10-11 19:01:38,825][85175] Updated weights for policy 1, policy_version 97940 (0.0008) +[2023-10-11 19:01:39,205][85175] Updated weights for policy 1, policy_version 97950 (0.0010) +[2023-10-11 19:01:41,062][84230] Fps is (10 sec: 13107.6, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 199163904. Throughput: 0: 1685.1, 1: 1666.8. Samples: 49798330. Policy #0 lag: (min: 31.0, avg: 31.6, max: 49.0) +[2023-10-11 19:01:41,063][84230] Avg episode reward: [(0, '47.840'), (1, '48.570')] +[2023-10-11 19:01:41,964][85176] Updated weights for policy 0, policy_version 96552 (0.0008) +[2023-10-11 19:01:42,329][85176] Updated weights for policy 0, policy_version 96562 (0.0008) +[2023-10-11 19:01:42,707][85176] Updated weights for policy 0, policy_version 96572 (0.0009) +[2023-10-11 19:01:43,196][85175] Updated weights for policy 1, policy_version 97960 (0.0008) +[2023-10-11 19:01:43,563][85175] Updated weights for policy 1, policy_version 97970 (0.0007) +[2023-10-11 19:01:43,933][85175] Updated weights for policy 1, policy_version 97980 (0.0008) +[2023-10-11 19:01:46,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 199229440. Throughput: 0: 1682.0, 1: 1692.0. Samples: 49819190. 
Policy #0 lag: (min: 31.0, avg: 31.6, max: 49.0) +[2023-10-11 19:01:46,063][84230] Avg episode reward: [(0, '43.480'), (1, '44.900')] +[2023-10-11 19:01:46,911][85176] Updated weights for policy 0, policy_version 96582 (0.0009) +[2023-10-11 19:01:47,294][85176] Updated weights for policy 0, policy_version 96592 (0.0009) +[2023-10-11 19:01:47,656][85176] Updated weights for policy 0, policy_version 96602 (0.0010) +[2023-10-11 19:01:47,735][85175] Updated weights for policy 1, policy_version 97990 (0.0008) +[2023-10-11 19:01:48,107][85175] Updated weights for policy 1, policy_version 98000 (0.0009) +[2023-10-11 19:01:48,472][85175] Updated weights for policy 1, policy_version 98010 (0.0008) +[2023-10-11 19:01:51,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 199294976. Throughput: 0: 1671.4, 1: 1677.8. Samples: 49828760. Policy #0 lag: (min: 31.0, avg: 31.6, max: 49.0) +[2023-10-11 19:01:51,063][84230] Avg episode reward: [(0, '48.400'), (1, '44.120')] +[2023-10-11 19:01:51,897][85176] Updated weights for policy 0, policy_version 96612 (0.0008) +[2023-10-11 19:01:52,260][85176] Updated weights for policy 0, policy_version 96622 (0.0008) +[2023-10-11 19:01:52,527][85175] Updated weights for policy 1, policy_version 98020 (0.0007) +[2023-10-11 19:01:52,630][85176] Updated weights for policy 0, policy_version 96632 (0.0007) +[2023-10-11 19:01:52,893][85175] Updated weights for policy 1, policy_version 98030 (0.0007) +[2023-10-11 19:01:53,261][85175] Updated weights for policy 1, policy_version 98040 (0.0009) +[2023-10-11 19:01:56,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 199360512. Throughput: 0: 1671.5, 1: 1691.0. Samples: 49848928. 
Policy #0 lag: (min: 31.0, avg: 31.6, max: 49.0) +[2023-10-11 19:01:56,064][84230] Avg episode reward: [(0, '41.220'), (1, '45.820')] +[2023-10-11 19:01:56,707][85176] Updated weights for policy 0, policy_version 96642 (0.0009) +[2023-10-11 19:01:57,080][85176] Updated weights for policy 0, policy_version 96652 (0.0010) +[2023-10-11 19:01:57,434][85175] Updated weights for policy 1, policy_version 98050 (0.0009) +[2023-10-11 19:01:57,457][85176] Updated weights for policy 0, policy_version 96662 (0.0009) +[2023-10-11 19:01:57,820][85176] Updated weights for policy 0, policy_version 96672 (0.0008) +[2023-10-11 19:01:57,852][85175] Updated weights for policy 1, policy_version 98060 (0.0007) +[2023-10-11 19:01:58,212][85175] Updated weights for policy 1, policy_version 98070 (0.0010) +[2023-10-11 19:01:58,583][85175] Updated weights for policy 1, policy_version 98080 (0.0010) +[2023-10-11 19:02:01,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 199426048. Throughput: 0: 1675.4, 1: 1699.7. Samples: 49869458. Policy #0 lag: (min: 31.0, avg: 31.6, max: 49.0) +[2023-10-11 19:02:01,063][84230] Avg episode reward: [(0, '46.850'), (1, '44.000')] +[2023-10-11 19:02:01,929][85176] Updated weights for policy 0, policy_version 96682 (0.0010) +[2023-10-11 19:02:02,309][85176] Updated weights for policy 0, policy_version 96692 (0.0008) +[2023-10-11 19:02:02,489][85175] Updated weights for policy 1, policy_version 98090 (0.0008) +[2023-10-11 19:02:02,674][85176] Updated weights for policy 0, policy_version 96702 (0.0009) +[2023-10-11 19:02:02,861][85175] Updated weights for policy 1, policy_version 98100 (0.0008) +[2023-10-11 19:02:03,236][85175] Updated weights for policy 1, policy_version 98110 (0.0009) +[2023-10-11 19:02:06,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 199491584. Throughput: 0: 1670.1, 1: 1671.5. Samples: 49878646. 
Policy #0 lag: (min: 31.0, avg: 31.6, max: 49.0) +[2023-10-11 19:02:06,064][84230] Avg episode reward: [(0, '44.110'), (1, '46.940')] +[2023-10-11 19:02:06,889][85176] Updated weights for policy 0, policy_version 96712 (0.0008) +[2023-10-11 19:02:07,269][85176] Updated weights for policy 0, policy_version 96722 (0.0008) +[2023-10-11 19:02:07,348][85175] Updated weights for policy 1, policy_version 98120 (0.0008) +[2023-10-11 19:02:07,645][85176] Updated weights for policy 0, policy_version 96732 (0.0009) +[2023-10-11 19:02:07,716][85175] Updated weights for policy 1, policy_version 98130 (0.0009) +[2023-10-11 19:02:08,074][85175] Updated weights for policy 1, policy_version 98140 (0.0010) +[2023-10-11 19:02:11,063][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 199557120. Throughput: 0: 1669.9, 1: 1698.5. Samples: 49899242. Policy #0 lag: (min: 31.0, avg: 31.6, max: 49.0) +[2023-10-11 19:02:11,063][84230] Avg episode reward: [(0, '47.660'), (1, '44.080')] +[2023-10-11 19:02:11,822][85176] Updated weights for policy 0, policy_version 96742 (0.0008) +[2023-10-11 19:02:12,116][85175] Updated weights for policy 1, policy_version 98150 (0.0008) +[2023-10-11 19:02:12,202][85176] Updated weights for policy 0, policy_version 96752 (0.0009) +[2023-10-11 19:02:12,484][85175] Updated weights for policy 1, policy_version 98160 (0.0009) +[2023-10-11 19:02:12,574][85176] Updated weights for policy 0, policy_version 96762 (0.0008) +[2023-10-11 19:02:12,851][85175] Updated weights for policy 1, policy_version 98170 (0.0008) +[2023-10-11 19:02:16,062][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 199622656. Throughput: 0: 1662.7, 1: 1703.6. Samples: 49919908. 
Policy #0 lag: (min: 31.0, avg: 31.6, max: 49.0) +[2023-10-11 19:02:16,063][84230] Avg episode reward: [(0, '44.310'), (1, '46.050')] +[2023-10-11 19:02:16,071][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000098176_100532224.pth... +[2023-10-11 19:02:16,072][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000096768_99090432.pth... +[2023-10-11 19:02:16,108][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000095200_97484800.pth +[2023-10-11 19:02:16,111][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000096608_98926592.pth +[2023-10-11 19:02:16,614][85176] Updated weights for policy 0, policy_version 96772 (0.0008) +[2023-10-11 19:02:16,994][85176] Updated weights for policy 0, policy_version 96782 (0.0009) +[2023-10-11 19:02:17,121][85175] Updated weights for policy 1, policy_version 98180 (0.0007) +[2023-10-11 19:02:17,354][85176] Updated weights for policy 0, policy_version 96792 (0.0008) +[2023-10-11 19:02:17,492][85175] Updated weights for policy 1, policy_version 98190 (0.0008) +[2023-10-11 19:02:17,853][85175] Updated weights for policy 1, policy_version 98200 (0.0008) +[2023-10-11 19:02:21,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 199688192. Throughput: 0: 1670.3, 1: 1680.6. Samples: 49929202. 
Policy #0 lag: (min: 31.0, avg: 31.6, max: 49.0) +[2023-10-11 19:02:21,063][84230] Avg episode reward: [(0, '48.360'), (1, '45.160')] +[2023-10-11 19:02:21,407][85176] Updated weights for policy 0, policy_version 96802 (0.0009) +[2023-10-11 19:02:21,780][85176] Updated weights for policy 0, policy_version 96812 (0.0009) +[2023-10-11 19:02:21,902][85175] Updated weights for policy 1, policy_version 98210 (0.0008) +[2023-10-11 19:02:22,142][85176] Updated weights for policy 0, policy_version 96822 (0.0008) +[2023-10-11 19:02:22,263][85175] Updated weights for policy 1, policy_version 98220 (0.0008) +[2023-10-11 19:02:22,509][85176] Updated weights for policy 0, policy_version 96832 (0.0008) +[2023-10-11 19:02:22,630][85175] Updated weights for policy 1, policy_version 98230 (0.0010) +[2023-10-11 19:02:22,996][85175] Updated weights for policy 1, policy_version 98240 (0.0007) +[2023-10-11 19:02:26,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 199753728. Throughput: 0: 1674.6, 1: 1697.5. Samples: 49950078. Policy #0 lag: (min: 31.0, avg: 31.6, max: 49.0) +[2023-10-11 19:02:26,064][84230] Avg episode reward: [(0, '44.890'), (1, '47.810')] +[2023-10-11 19:02:26,568][85176] Updated weights for policy 0, policy_version 96842 (0.0009) +[2023-10-11 19:02:26,952][85176] Updated weights for policy 0, policy_version 96852 (0.0009) +[2023-10-11 19:02:27,118][85175] Updated weights for policy 1, policy_version 98250 (0.0009) +[2023-10-11 19:02:27,312][85176] Updated weights for policy 0, policy_version 96862 (0.0010) +[2023-10-11 19:02:27,488][85175] Updated weights for policy 1, policy_version 98260 (0.0008) +[2023-10-11 19:02:27,848][85175] Updated weights for policy 1, policy_version 98270 (0.0009) +[2023-10-11 19:02:31,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 199819264. Throughput: 0: 1670.2, 1: 1691.9. Samples: 49970486. 
Policy #0 lag: (min: 31.0, avg: 31.6, max: 49.0) +[2023-10-11 19:02:31,064][84230] Avg episode reward: [(0, '46.830'), (1, '44.680')] +[2023-10-11 19:02:31,338][85176] Updated weights for policy 0, policy_version 96872 (0.0008) +[2023-10-11 19:02:31,700][85176] Updated weights for policy 0, policy_version 96882 (0.0009) +[2023-10-11 19:02:31,842][85175] Updated weights for policy 1, policy_version 98280 (0.0009) +[2023-10-11 19:02:32,069][85176] Updated weights for policy 0, policy_version 96892 (0.0009) +[2023-10-11 19:02:32,205][85175] Updated weights for policy 1, policy_version 98290 (0.0009) +[2023-10-11 19:02:32,577][85175] Updated weights for policy 1, policy_version 98300 (0.0008) +[2023-10-11 19:02:36,062][84230] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13440.4). Total num frames: 199884800. Throughput: 0: 1668.8, 1: 1687.2. Samples: 49979782. Policy #0 lag: (min: 31.0, avg: 31.6, max: 49.0) +[2023-10-11 19:02:36,063][84230] Avg episode reward: [(0, '44.770'), (1, '43.730')] +[2023-10-11 19:02:36,162][85176] Updated weights for policy 0, policy_version 96902 (0.0008) +[2023-10-11 19:02:36,548][85176] Updated weights for policy 0, policy_version 96912 (0.0008) +[2023-10-11 19:02:36,725][85175] Updated weights for policy 1, policy_version 98310 (0.0007) +[2023-10-11 19:02:36,922][85176] Updated weights for policy 0, policy_version 96922 (0.0009) +[2023-10-11 19:02:37,096][85175] Updated weights for policy 1, policy_version 98320 (0.0009) +[2023-10-11 19:02:37,464][85175] Updated weights for policy 1, policy_version 98330 (0.0010) +[2023-10-11 19:02:41,020][85176] Updated weights for policy 0, policy_version 96932 (0.0009) +[2023-10-11 19:02:41,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 199950336. Throughput: 0: 1677.9, 1: 1687.9. Samples: 50000386. 
Policy #0 lag: (min: 14.0, avg: 21.5, max: 46.0) +[2023-10-11 19:02:41,063][84230] Avg episode reward: [(0, '45.070'), (1, '49.480')] +[2023-10-11 19:02:41,395][85176] Updated weights for policy 0, policy_version 96942 (0.0008) +[2023-10-11 19:02:41,448][85175] Updated weights for policy 1, policy_version 98340 (0.0007) +[2023-10-11 19:02:41,755][85176] Updated weights for policy 0, policy_version 96952 (0.0008) +[2023-10-11 19:02:41,816][85175] Updated weights for policy 1, policy_version 98350 (0.0008) +[2023-10-11 19:02:42,183][85175] Updated weights for policy 1, policy_version 98360 (0.0008) +[2023-10-11 19:02:45,703][85176] Updated weights for policy 0, policy_version 96962 (0.0008) +[2023-10-11 19:02:46,063][84230] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 200015872. Throughput: 0: 1679.5, 1: 1694.6. Samples: 50021294. Policy #0 lag: (min: 14.0, avg: 21.5, max: 46.0) +[2023-10-11 19:02:46,064][84230] Avg episode reward: [(0, '47.120'), (1, '42.680')] +[2023-10-11 19:02:46,084][85176] Updated weights for policy 0, policy_version 96972 (0.0007) +[2023-10-11 19:02:46,274][85175] Updated weights for policy 1, policy_version 98370 (0.0009) +[2023-10-11 19:02:46,458][85176] Updated weights for policy 0, policy_version 96982 (0.0007) +[2023-10-11 19:02:46,683][85175] Updated weights for policy 1, policy_version 98380 (0.0007) +[2023-10-11 19:02:46,824][85176] Updated weights for policy 0, policy_version 96992 (0.0008) +[2023-10-11 19:02:47,050][85175] Updated weights for policy 1, policy_version 98390 (0.0007) +[2023-10-11 19:02:47,421][85175] Updated weights for policy 1, policy_version 98400 (0.0009) +[2023-10-11 19:02:50,833][85176] Updated weights for policy 0, policy_version 97002 (0.0007) +[2023-10-11 19:02:51,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 200081408. Throughput: 0: 1680.2, 1: 1687.0. Samples: 50030170. 
Policy #0 lag: (min: 14.0, avg: 21.5, max: 46.0) +[2023-10-11 19:02:51,063][84230] Avg episode reward: [(0, '43.820'), (1, '47.770')] +[2023-10-11 19:02:51,203][85176] Updated weights for policy 0, policy_version 97012 (0.0008) +[2023-10-11 19:02:51,415][85175] Updated weights for policy 1, policy_version 98410 (0.0007) +[2023-10-11 19:02:51,577][85176] Updated weights for policy 0, policy_version 97022 (0.0008) +[2023-10-11 19:02:51,784][85175] Updated weights for policy 1, policy_version 98420 (0.0007) +[2023-10-11 19:02:52,163][85175] Updated weights for policy 1, policy_version 98430 (0.0009) +[2023-10-11 19:02:55,684][85176] Updated weights for policy 0, policy_version 97032 (0.0008) +[2023-10-11 19:02:56,010][85175] Updated weights for policy 1, policy_version 98440 (0.0009) +[2023-10-11 19:02:56,050][85176] Updated weights for policy 0, policy_version 97042 (0.0009) +[2023-10-11 19:02:56,063][84230] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 200146944. Throughput: 0: 1679.3, 1: 1690.5. Samples: 50050882. Policy #0 lag: (min: 14.0, avg: 21.5, max: 46.0) +[2023-10-11 19:02:56,063][84230] Avg episode reward: [(0, '46.430'), (1, '44.970')] +[2023-10-11 19:02:56,382][85175] Updated weights for policy 1, policy_version 98450 (0.0009) +[2023-10-11 19:02:56,414][85176] Updated weights for policy 0, policy_version 97052 (0.0009) +[2023-10-11 19:02:56,741][85175] Updated weights for policy 1, policy_version 98460 (0.0010) +[2023-10-11 19:03:00,505][85176] Updated weights for policy 0, policy_version 97062 (0.0008) +[2023-10-11 19:03:00,688][85175] Updated weights for policy 1, policy_version 98470 (0.0008) +[2023-10-11 19:03:00,878][85176] Updated weights for policy 0, policy_version 97072 (0.0007) +[2023-10-11 19:03:01,048][85175] Updated weights for policy 1, policy_version 98480 (0.0007) +[2023-10-11 19:03:01,062][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13440.4). Total num frames: 200212480. 
Throughput: 0: 1669.1, 1: 1698.4. Samples: 50071448. Policy #0 lag: (min: 14.0, avg: 21.5, max: 46.0) +[2023-10-11 19:03:01,063][84230] Avg episode reward: [(0, '44.570'), (1, '48.080')] +[2023-10-11 19:03:01,249][85176] Updated weights for policy 0, policy_version 97082 (0.0008) +[2023-10-11 19:03:01,418][85175] Updated weights for policy 1, policy_version 98490 (0.0007) +[2023-10-11 19:03:05,426][85176] Updated weights for policy 0, policy_version 97092 (0.0009) +[2023-10-11 19:03:05,533][85175] Updated weights for policy 1, policy_version 98500 (0.0008) +[2023-10-11 19:03:05,790][85176] Updated weights for policy 0, policy_version 97102 (0.0008) +[2023-10-11 19:03:05,899][85175] Updated weights for policy 1, policy_version 98510 (0.0008) +[2023-10-11 19:03:06,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 200278016. Throughput: 0: 1671.6, 1: 1700.8. Samples: 50080958. Policy #0 lag: (min: 14.0, avg: 21.5, max: 46.0) +[2023-10-11 19:03:06,063][84230] Avg episode reward: [(0, '47.030'), (1, '47.140')] +[2023-10-11 19:03:06,167][85176] Updated weights for policy 0, policy_version 97112 (0.0007) +[2023-10-11 19:03:06,274][85175] Updated weights for policy 1, policy_version 98520 (0.0008) +[2023-10-11 19:03:10,332][85176] Updated weights for policy 0, policy_version 97122 (0.0007) +[2023-10-11 19:03:10,425][85175] Updated weights for policy 1, policy_version 98530 (0.0007) +[2023-10-11 19:03:10,704][85176] Updated weights for policy 0, policy_version 97132 (0.0007) +[2023-10-11 19:03:10,797][85175] Updated weights for policy 1, policy_version 98540 (0.0007) +[2023-10-11 19:03:11,063][84230] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 200343552. Throughput: 0: 1668.7, 1: 1701.0. Samples: 50101712. 
Policy #0 lag: (min: 14.0, avg: 21.5, max: 46.0) +[2023-10-11 19:03:11,064][84230] Avg episode reward: [(0, '43.280'), (1, '47.510')] +[2023-10-11 19:03:11,082][85176] Updated weights for policy 0, policy_version 97142 (0.0009) +[2023-10-11 19:03:11,162][85175] Updated weights for policy 1, policy_version 98550 (0.0008) +[2023-10-11 19:03:11,449][85176] Updated weights for policy 0, policy_version 97152 (0.0010) +[2023-10-11 19:03:11,525][85175] Updated weights for policy 1, policy_version 98560 (0.0007) +[2023-10-11 19:03:15,362][85175] Updated weights for policy 1, policy_version 98570 (0.0007) +[2023-10-11 19:03:15,405][85176] Updated weights for policy 0, policy_version 97162 (0.0008) +[2023-10-11 19:03:15,722][85175] Updated weights for policy 1, policy_version 98580 (0.0008) +[2023-10-11 19:03:15,792][85176] Updated weights for policy 0, policy_version 97172 (0.0009) +[2023-10-11 19:03:16,063][84230] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13329.4). Total num frames: 200409088. Throughput: 0: 1659.8, 1: 1694.5. Samples: 50121432. 
Policy #0 lag: (min: 14.0, avg: 21.5, max: 46.0) +[2023-10-11 19:03:16,063][84230] Avg episode reward: [(0, '43.520'), (1, '45.200')] +[2023-10-11 19:03:16,090][85175] Updated weights for policy 1, policy_version 98590 (0.0008) +[2023-10-11 19:03:16,160][85176] Updated weights for policy 0, policy_version 97182 (0.0009) +[2023-10-11 19:03:20,130][85175] Updated weights for policy 1, policy_version 98600 (0.0008) +[2023-10-11 19:03:20,294][85176] Updated weights for policy 0, policy_version 97192 (0.0009) +[2023-10-11 19:03:20,501][85175] Updated weights for policy 1, policy_version 98610 (0.0008) +[2023-10-11 19:03:20,670][85176] Updated weights for policy 0, policy_version 97202 (0.0009) +[2023-10-11 19:03:20,858][85175] Updated weights for policy 1, policy_version 98620 (0.0009) +[2023-10-11 19:03:21,043][85176] Updated weights for policy 0, policy_version 97212 (0.0009) +[2023-10-11 19:03:21,062][84230] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 13440.4). Total num frames: 200507392. Throughput: 0: 1671.3, 1: 1700.1. Samples: 50131498. Policy #0 lag: (min: 14.0, avg: 21.5, max: 46.0) +[2023-10-11 19:03:21,063][84230] Avg episode reward: [(0, '44.290'), (1, '43.630')] +[2023-10-11 19:03:24,857][85175] Updated weights for policy 1, policy_version 98630 (0.0010) +[2023-10-11 19:03:25,110][85176] Updated weights for policy 0, policy_version 97222 (0.0008) +[2023-10-11 19:03:25,217][85175] Updated weights for policy 1, policy_version 98640 (0.0009) +[2023-10-11 19:03:25,471][85176] Updated weights for policy 0, policy_version 97232 (0.0007) +[2023-10-11 19:03:25,590][85175] Updated weights for policy 1, policy_version 98650 (0.0007) +[2023-10-11 19:03:25,849][85176] Updated weights for policy 0, policy_version 97242 (0.0007) +[2023-10-11 19:03:26,063][84230] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 13440.4). Total num frames: 200572928. Throughput: 0: 1667.4, 1: 1706.4. Samples: 50152204. 
Policy #0 lag: (min: 14.0, avg: 21.5, max: 46.0) +[2023-10-11 19:03:26,063][84230] Avg episode reward: [(0, '44.060'), (1, '48.900')] +[2023-10-11 19:03:29,808][85175] Updated weights for policy 1, policy_version 98660 (0.0008) +[2023-10-11 19:03:30,007][85176] Updated weights for policy 0, policy_version 97252 (0.0008) +[2023-10-11 19:03:30,175][85175] Updated weights for policy 1, policy_version 98670 (0.0008) +[2023-10-11 19:03:30,374][85176] Updated weights for policy 0, policy_version 97262 (0.0007) +[2023-10-11 19:03:30,547][85175] Updated weights for policy 1, policy_version 98680 (0.0008) +[2023-10-11 19:03:30,743][85176] Updated weights for policy 0, policy_version 97272 (0.0007) +[2023-10-11 19:03:31,063][84230] Fps is (10 sec: 16383.5, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 200671232. Throughput: 0: 1649.1, 1: 1680.0. Samples: 50171104. Policy #0 lag: (min: 14.0, avg: 21.5, max: 46.0) +[2023-10-11 19:03:31,064][84230] Avg episode reward: [(0, '45.280'), (1, '42.420')] +[2023-10-11 19:03:34,573][85175] Updated weights for policy 1, policy_version 98690 (0.0009) +[2023-10-11 19:03:34,864][85176] Updated weights for policy 0, policy_version 97282 (0.0008) +[2023-10-11 19:03:34,983][85175] Updated weights for policy 1, policy_version 98700 (0.0009) +[2023-10-11 19:03:35,238][85176] Updated weights for policy 0, policy_version 97292 (0.0008) +[2023-10-11 19:03:35,347][85175] Updated weights for policy 1, policy_version 98710 (0.0008) +[2023-10-11 19:03:35,602][85176] Updated weights for policy 0, policy_version 97302 (0.0009) +[2023-10-11 19:03:35,712][85175] Updated weights for policy 1, policy_version 98720 (0.0007) +[2023-10-11 19:03:35,979][85176] Updated weights for policy 0, policy_version 97312 (0.0009) +[2023-10-11 19:03:36,062][84230] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 200736768. Throughput: 0: 1667.6, 1: 1707.8. Samples: 50182064. 
Policy #0 lag: (min: 14.0, avg: 21.5, max: 46.0) +[2023-10-11 19:03:36,063][84230] Avg episode reward: [(0, '42.980'), (1, '46.260')] +[2023-10-11 19:03:39,618][85175] Updated weights for policy 1, policy_version 98730 (0.0008) +[2023-10-11 19:03:39,985][85175] Updated weights for policy 1, policy_version 98740 (0.0008) +[2023-10-11 19:03:40,036][85176] Updated weights for policy 0, policy_version 97322 (0.0008) +[2023-10-11 19:03:40,357][85175] Updated weights for policy 1, policy_version 98750 (0.0007) +[2023-10-11 19:03:40,414][85176] Updated weights for policy 0, policy_version 97332 (0.0008) +[2023-10-11 19:03:40,790][85176] Updated weights for policy 0, policy_version 97342 (0.0011) +[2023-10-11 19:03:41,063][84230] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 200802304. Throughput: 0: 1678.0, 1: 1695.1. Samples: 50202674. Policy #0 lag: (min: 1.0, avg: 12.5, max: 33.0) +[2023-10-11 19:03:41,063][84230] Avg episode reward: [(0, '43.520'), (1, '43.790')] +[2023-10-11 19:03:44,550][85175] Updated weights for policy 1, policy_version 98760 (0.0008) +[2023-10-11 19:03:44,844][85176] Updated weights for policy 0, policy_version 97352 (0.0007) +[2023-10-11 19:03:44,911][85175] Updated weights for policy 1, policy_version 98770 (0.0007) +[2023-10-11 19:03:45,214][85176] Updated weights for policy 0, policy_version 97362 (0.0008) +[2023-10-11 19:03:45,277][85175] Updated weights for policy 1, policy_version 98780 (0.0007) +[2023-10-11 19:03:45,597][85176] Updated weights for policy 0, policy_version 97372 (0.0008) +[2023-10-11 19:03:46,063][84230] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 200867840. Throughput: 0: 1664.8, 1: 1669.9. Samples: 50221510. 
Policy #0 lag: (min: 1.0, avg: 12.5, max: 33.0) +[2023-10-11 19:03:46,063][84230] Avg episode reward: [(0, '43.800'), (1, '43.640')] +[2023-10-11 19:03:49,313][85175] Updated weights for policy 1, policy_version 98790 (0.0008) +[2023-10-11 19:03:49,679][85175] Updated weights for policy 1, policy_version 98800 (0.0008) +[2023-10-11 19:03:49,808][85176] Updated weights for policy 0, policy_version 97382 (0.0007) +[2023-10-11 19:03:50,045][85175] Updated weights for policy 1, policy_version 98810 (0.0009) +[2023-10-11 19:03:50,170][85176] Updated weights for policy 0, policy_version 97392 (0.0007) +[2023-10-11 19:03:50,537][85176] Updated weights for policy 0, policy_version 97402 (0.0007) +[2023-10-11 19:03:51,062][84230] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 200933376. Throughput: 0: 1678.8, 1: 1695.4. Samples: 50232794. Policy #0 lag: (min: 1.0, avg: 12.5, max: 33.0) +[2023-10-11 19:03:51,063][84230] Avg episode reward: [(0, '44.290'), (1, '45.810')] +[2023-10-11 19:03:54,156][85175] Updated weights for policy 1, policy_version 98820 (0.0009) +[2023-10-11 19:03:54,528][85175] Updated weights for policy 1, policy_version 98830 (0.0008) +[2023-10-11 19:03:54,621][85176] Updated weights for policy 0, policy_version 97412 (0.0007) +[2023-10-11 19:03:54,898][85175] Updated weights for policy 1, policy_version 98840 (0.0007) +[2023-10-11 19:03:54,989][85176] Updated weights for policy 0, policy_version 97422 (0.0008) +[2023-10-11 19:03:55,364][85176] Updated weights for policy 0, policy_version 97432 (0.0008) +[2023-10-11 19:03:56,063][84230] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13551.5). Total num frames: 200998912. Throughput: 0: 1674.8, 1: 1684.7. Samples: 50252888. 
Policy #0 lag: (min: 1.0, avg: 12.5, max: 33.0) +[2023-10-11 19:03:56,064][84230] Avg episode reward: [(0, '44.160'), (1, '43.670')] +[2023-10-11 19:03:58,847][85175] Updated weights for policy 1, policy_version 98850 (0.0008) +[2023-10-11 19:03:59,213][85175] Updated weights for policy 1, policy_version 98860 (0.0009) +[2023-10-11 19:03:59,422][85176] Updated weights for policy 0, policy_version 97442 (0.0007) +[2023-10-11 19:03:59,582][85175] Updated weights for policy 1, policy_version 98870 (0.0007) +[2023-10-11 19:03:59,809][85176] Updated weights for policy 0, policy_version 97452 (0.0008) +[2023-10-11 19:03:59,948][85175] Updated weights for policy 1, policy_version 98880 (0.0007) +[2023-10-11 19:04:00,178][85176] Updated weights for policy 0, policy_version 97462 (0.0010) +[2023-10-11 19:04:00,553][85176] Updated weights for policy 0, policy_version 97472 (0.0009) +[2023-10-11 19:04:01,063][84230] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 201064448. Throughput: 0: 1663.6, 1: 1678.9. Samples: 50271842. Policy #0 lag: (min: 1.0, avg: 12.5, max: 33.0) +[2023-10-11 19:04:01,063][84230] Avg episode reward: [(0, '46.340'), (1, '49.330')] +[2023-10-11 19:04:03,789][85175] Updated weights for policy 1, policy_version 98890 (0.0008) +[2023-10-11 19:04:04,159][85175] Updated weights for policy 1, policy_version 98900 (0.0010) +[2023-10-11 19:04:04,520][85175] Updated weights for policy 1, policy_version 98910 (0.0008) +[2023-10-11 19:04:04,535][85176] Updated weights for policy 0, policy_version 97482 (0.0008) +[2023-10-11 19:04:04,911][85176] Updated weights for policy 0, policy_version 97492 (0.0009) +[2023-10-11 19:04:05,293][85176] Updated weights for policy 0, policy_version 97502 (0.0008) +[2023-10-11 19:04:06,062][84230] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 201129984. Throughput: 0: 1679.8, 1: 1700.8. Samples: 50283624. 
Policy #0 lag: (min: 1.0, avg: 12.5, max: 33.0) +[2023-10-11 19:04:06,063][84230] Avg episode reward: [(0, '43.990'), (1, '42.930')] +[2023-10-11 19:04:08,596][85175] Updated weights for policy 1, policy_version 98920 (0.0008) +[2023-10-11 19:04:08,966][85175] Updated weights for policy 1, policy_version 98930 (0.0009) +[2023-10-11 19:04:09,229][85176] Updated weights for policy 0, policy_version 97512 (0.0008) +[2023-10-11 19:04:09,330][85175] Updated weights for policy 1, policy_version 98940 (0.0008) +[2023-10-11 19:04:09,601][85176] Updated weights for policy 0, policy_version 97522 (0.0007) +[2023-10-11 19:04:09,973][85176] Updated weights for policy 0, policy_version 97532 (0.0007) +[2023-10-11 19:04:11,063][84230] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 201195520. Throughput: 0: 1670.4, 1: 1671.5. Samples: 50302592. Policy #0 lag: (min: 1.0, avg: 12.5, max: 33.0) +[2023-10-11 19:04:11,063][84230] Avg episode reward: [(0, '46.520'), (1, '46.320')] +[2023-10-11 19:04:13,461][85175] Updated weights for policy 1, policy_version 98950 (0.0008) +[2023-10-11 19:04:13,826][85175] Updated weights for policy 1, policy_version 98960 (0.0009) +[2023-10-11 19:04:14,097][85176] Updated weights for policy 0, policy_version 97542 (0.0008) +[2023-10-11 19:04:14,200][85175] Updated weights for policy 1, policy_version 98970 (0.0009) +[2023-10-11 19:04:14,459][85176] Updated weights for policy 0, policy_version 97552 (0.0009) +[2023-10-11 19:04:14,837][85176] Updated weights for policy 0, policy_version 97562 (0.0009) +[2023-10-11 19:04:16,062][84230] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13551.5). Total num frames: 201261056. Throughput: 0: 1671.5, 1: 1697.7. Samples: 50322716. 
Policy #0 lag: (min: 1.0, avg: 12.5, max: 33.0) +[2023-10-11 19:04:16,063][84230] Avg episode reward: [(0, '42.710'), (1, '45.100')] +[2023-10-11 19:04:16,070][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000098976_101351424.pth... +[2023-10-11 19:04:16,071][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000097568_99909632.pth... +[2023-10-11 19:04:16,100][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000097408_99745792.pth +[2023-10-11 19:04:16,104][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000096000_98304000.pth +[2023-10-11 19:04:18,113][85175] Updated weights for policy 1, policy_version 98980 (0.0008) +[2023-10-11 19:04:18,485][85175] Updated weights for policy 1, policy_version 98990 (0.0007) +[2023-10-11 19:04:18,859][85175] Updated weights for policy 1, policy_version 99000 (0.0009) +[2023-10-11 19:04:18,977][85176] Updated weights for policy 0, policy_version 97572 (0.0009) +[2023-10-11 19:04:19,353][85176] Updated weights for policy 0, policy_version 97582 (0.0008) +[2023-10-11 19:04:19,726][85176] Updated weights for policy 0, policy_version 97592 (0.0010) +[2023-10-11 19:04:21,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 201326592. Throughput: 0: 1681.9, 1: 1693.0. Samples: 50333932. 
Policy #0 lag: (min: 1.0, avg: 12.5, max: 33.0) +[2023-10-11 19:04:21,063][84230] Avg episode reward: [(0, '45.560'), (1, '47.110')] +[2023-10-11 19:04:22,752][85175] Updated weights for policy 1, policy_version 99010 (0.0009) +[2023-10-11 19:04:23,124][85175] Updated weights for policy 1, policy_version 99020 (0.0009) +[2023-10-11 19:04:23,488][85175] Updated weights for policy 1, policy_version 99030 (0.0010) +[2023-10-11 19:04:23,727][85176] Updated weights for policy 0, policy_version 97602 (0.0008) +[2023-10-11 19:04:23,857][85175] Updated weights for policy 1, policy_version 99040 (0.0010) +[2023-10-11 19:04:24,124][85176] Updated weights for policy 0, policy_version 97612 (0.0010) +[2023-10-11 19:04:24,499][85176] Updated weights for policy 0, policy_version 97622 (0.0007) +[2023-10-11 19:04:24,867][85176] Updated weights for policy 0, policy_version 97632 (0.0007) +[2023-10-11 19:04:26,062][84230] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13551.5). Total num frames: 201392128. Throughput: 0: 1656.1, 1: 1689.5. Samples: 50353228. Policy #0 lag: (min: 1.0, avg: 12.5, max: 33.0) +[2023-10-11 19:04:26,063][84230] Avg episode reward: [(0, '42.290'), (1, '45.650')] +[2023-10-11 19:04:27,936][85175] Updated weights for policy 1, policy_version 99050 (0.0010) +[2023-10-11 19:04:28,306][85175] Updated weights for policy 1, policy_version 99060 (0.0007) +[2023-10-11 19:04:28,669][85175] Updated weights for policy 1, policy_version 99070 (0.0009) +[2023-10-11 19:04:28,879][85176] Updated weights for policy 0, policy_version 97642 (0.0008) +[2023-10-11 19:04:29,246][85176] Updated weights for policy 0, policy_version 97652 (0.0009) +[2023-10-11 19:04:29,616][85176] Updated weights for policy 0, policy_version 97662 (0.0010) +[2023-10-11 19:04:31,063][84230] Fps is (10 sec: 13107.1, 60 sec: 13107.3, 300 sec: 13551.5). Total num frames: 201457664. Throughput: 0: 1672.0, 1: 1710.1. Samples: 50373708. 
Policy #0 lag: (min: 1.0, avg: 12.5, max: 33.0) +[2023-10-11 19:04:31,063][84230] Avg episode reward: [(0, '44.510'), (1, '41.480')] +[2023-10-11 19:04:32,604][85175] Updated weights for policy 1, policy_version 99080 (0.0009) +[2023-10-11 19:04:32,978][85175] Updated weights for policy 1, policy_version 99090 (0.0010) +[2023-10-11 19:04:33,355][85175] Updated weights for policy 1, policy_version 99100 (0.0009) +[2023-10-11 19:04:33,498][85217] Stopping RolloutWorker_w8... +[2023-10-11 19:04:33,498][85870] Stopping RolloutWorker_w14... +[2023-10-11 19:04:33,498][85210] Stopping RolloutWorker_w2... +[2023-10-11 19:04:33,498][85219] Stopping RolloutWorker_w10... +[2023-10-11 19:04:33,498][85217] Loop rollout_proc8_evt_loop terminating... +[2023-10-11 19:04:33,498][85870] Loop rollout_proc14_evt_loop terminating... +[2023-10-11 19:04:33,498][85209] Stopping RolloutWorker_w0... +[2023-10-11 19:04:33,497][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000099104_101482496.pth... +[2023-10-11 19:04:33,498][85222] Stopping RolloutWorker_w12... +[2023-10-11 19:04:33,498][84801] Stopping Batcher_0... +[2023-10-11 19:04:33,498][85219] Loop rollout_proc10_evt_loop terminating... +[2023-10-11 19:04:33,498][85210] Loop rollout_proc2_evt_loop terminating... +[2023-10-11 19:04:33,498][85209] Loop rollout_proc0_evt_loop terminating... +[2023-10-11 19:04:33,498][85222] Loop rollout_proc12_evt_loop terminating... +[2023-10-11 19:04:33,499][84801] Loop batcher_evt_loop terminating... +[2023-10-11 19:04:33,499][85214] Stopping RolloutWorker_w6... +[2023-10-11 19:04:33,498][84230] Component RolloutWorker_w14 stopped! +[2023-10-11 19:04:33,499][85214] Loop rollout_proc6_evt_loop terminating... +[2023-10-11 19:04:33,500][85215] Stopping RolloutWorker_w4... +[2023-10-11 19:04:33,500][84230] Component RolloutWorker_w8 stopped! +[2023-10-11 19:04:33,500][85215] Loop rollout_proc4_evt_loop terminating... 
+[2023-10-11 19:04:33,500][84230] Component RolloutWorker_w2 stopped! +[2023-10-11 19:04:33,501][85902] Stopping RolloutWorker_w15... +[2023-10-11 19:04:33,501][85221] Stopping RolloutWorker_w11... +[2023-10-11 19:04:33,501][85902] Loop rollout_proc15_evt_loop terminating... +[2023-10-11 19:04:33,501][84230] Component RolloutWorker_w10 stopped! +[2023-10-11 19:04:33,501][85220] Stopping RolloutWorker_w7... +[2023-10-11 19:04:33,501][85221] Loop rollout_proc11_evt_loop terminating... +[2023-10-11 19:04:33,502][85220] Loop rollout_proc7_evt_loop terminating... +[2023-10-11 19:04:33,502][85212] Stopping RolloutWorker_w3... +[2023-10-11 19:04:33,502][85212] Loop rollout_proc3_evt_loop terminating... +[2023-10-11 19:04:33,502][84230] Component RolloutWorker_w0 stopped! +[2023-10-11 19:04:33,502][85213] Stopping RolloutWorker_w1... +[2023-10-11 19:04:33,503][85213] Loop rollout_proc1_evt_loop terminating... +[2023-10-11 19:04:33,503][84230] Component RolloutWorker_w12 stopped! +[2023-10-11 19:04:33,503][84230] Component Batcher_1 stopped! +[2023-10-11 19:04:33,504][84230] Component Batcher_0 stopped! +[2023-10-11 19:04:33,498][85000] Stopping Batcher_1... +[2023-10-11 19:04:33,504][85218] Stopping RolloutWorker_w9... +[2023-10-11 19:04:33,504][84230] Component RolloutWorker_w6 stopped! +[2023-10-11 19:04:33,504][85218] Loop rollout_proc9_evt_loop terminating... +[2023-10-11 19:04:33,504][85216] Stopping RolloutWorker_w5... +[2023-10-11 19:04:33,504][85223] Stopping RolloutWorker_w13... +[2023-10-11 19:04:33,505][84230] Component RolloutWorker_w4 stopped! +[2023-10-11 19:04:33,505][85216] Loop rollout_proc5_evt_loop terminating... +[2023-10-11 19:04:33,505][85223] Loop rollout_proc13_evt_loop terminating... +[2023-10-11 19:04:33,505][84230] Component RolloutWorker_w15 stopped! +[2023-10-11 19:04:33,505][84230] Component RolloutWorker_w11 stopped! +[2023-10-11 19:04:33,506][84230] Component RolloutWorker_w7 stopped! 
+[2023-10-11 19:04:33,506][84230] Component RolloutWorker_w3 stopped! +[2023-10-11 19:04:33,507][84230] Component RolloutWorker_w1 stopped! +[2023-10-11 19:04:33,508][84230] Component RolloutWorker_w9 stopped! +[2023-10-11 19:04:33,508][84230] Component RolloutWorker_w5 stopped! +[2023-10-11 19:04:33,508][84230] Component RolloutWorker_w13 stopped! +[2023-10-11 19:04:33,532][85175] Weights refcount: 2 0 +[2023-10-11 19:04:33,532][85176] Weights refcount: 2 0 +[2023-10-11 19:04:33,519][85000] Loop batcher_evt_loop terminating... +[2023-10-11 19:04:33,534][85176] Stopping InferenceWorker_p0-w0... +[2023-10-11 19:04:33,534][85175] Stopping InferenceWorker_p1-w0... +[2023-10-11 19:04:33,535][85175] Loop inference_proc1-0_evt_loop terminating... +[2023-10-11 19:04:33,535][85176] Loop inference_proc0-0_evt_loop terminating... +[2023-10-11 19:04:33,534][84230] Component InferenceWorker_p0-w0 stopped! +[2023-10-11 19:04:33,535][84230] Component InferenceWorker_p1-w0 stopped! +[2023-10-11 19:04:33,547][85000] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000098176_100532224.pth +[2023-10-11 19:04:33,553][85000] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p1/checkpoint_000099104_101482496.pth... +[2023-10-11 19:04:33,610][85000] Stopping LearnerWorker_p1... +[2023-10-11 19:04:33,611][85000] Loop learner_proc1_evt_loop terminating... +[2023-10-11 19:04:33,611][84230] Component LearnerWorker_p1 stopped! +[2023-10-11 19:04:34,499][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000097696_100040704.pth... +[2023-10-11 19:04:34,525][84801] Removing ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000096768_99090432.pth +[2023-10-11 19:04:34,528][84801] Saving ./train_atari/atari_frostbite_APPO/checkpoint_p0/checkpoint_000097696_100040704.pth... +[2023-10-11 19:04:34,561][84801] Stopping LearnerWorker_p0... +[2023-10-11 19:04:34,561][84801] Loop learner_proc0_evt_loop terminating... 
+[2023-10-11 19:04:34,561][84230] Component LearnerWorker_p0 stopped! +[2023-10-11 19:04:34,562][84230] Waiting for process learner_proc0 to stop... +[2023-10-11 19:04:35,110][84230] Waiting for process learner_proc1 to stop... +[2023-10-11 19:04:35,111][84230] Waiting for process inference_proc0-0 to join... +[2023-10-11 19:04:35,112][84230] Waiting for process inference_proc1-0 to join... +[2023-10-11 19:04:35,112][84230] Waiting for process rollout_proc0 to join... +[2023-10-11 19:04:35,113][84230] Waiting for process rollout_proc1 to join... +[2023-10-11 19:04:35,114][84230] Waiting for process rollout_proc2 to join... +[2023-10-11 19:04:35,114][84230] Waiting for process rollout_proc3 to join... +[2023-10-11 19:04:35,115][84230] Waiting for process rollout_proc4 to join... +[2023-10-11 19:04:35,116][84230] Waiting for process rollout_proc5 to join... +[2023-10-11 19:04:35,116][84230] Waiting for process rollout_proc6 to join... +[2023-10-11 19:04:35,117][84230] Waiting for process rollout_proc7 to join... +[2023-10-11 19:04:35,117][84230] Waiting for process rollout_proc8 to join... +[2023-10-11 19:04:35,118][84230] Waiting for process rollout_proc9 to join... +[2023-10-11 19:04:35,118][84230] Waiting for process rollout_proc10 to join... +[2023-10-11 19:04:35,119][84230] Waiting for process rollout_proc11 to join... +[2023-10-11 19:04:35,119][84230] Waiting for process rollout_proc12 to join... +[2023-10-11 19:04:35,120][84230] Waiting for process rollout_proc13 to join... +[2023-10-11 19:04:35,120][84230] Waiting for process rollout_proc14 to join... +[2023-10-11 19:04:35,120][84230] Waiting for process rollout_proc15 to join... 
+[2023-10-11 19:04:35,121][84230] Batcher 0 profile tree view: +batching: 169.4280, releasing_batches: 0.0882 +[2023-10-11 19:04:35,121][84230] Batcher 1 profile tree view: +batching: 171.1171, releasing_batches: 0.0916 +[2023-10-11 19:04:35,122][84230] InferenceWorker_p0-w0 profile tree view: +wait_policy: 0.0001 + wait_policy_total: 2719.8592 +update_model: 203.8977 + weight_update: 0.0009 +one_step: 0.0036 + handle_policy_step: 11387.2732 + deserialize: 67.8708, stack: 191.3545, obs_to_device_normalize: 2551.6316, forward: 5130.2110, prepare_outputs: 2472.3548, send_messages: 470.4710 +[2023-10-11 19:04:35,122][84230] InferenceWorker_p1-w0 profile tree view: +wait_policy: 0.0001 + wait_policy_total: 2651.0380 +update_model: 207.6231 + weight_update: 0.0010 +one_step: 0.0033 + handle_policy_step: 11454.8767 + deserialize: 66.3402, stack: 196.2973, obs_to_device_normalize: 2545.6217, forward: 5196.0787, prepare_outputs: 2478.5822, send_messages: 472.7199 +[2023-10-11 19:04:35,122][84230] Learner 0 profile tree view: +misc: 0.0181, prepare_batch: 269.8839 +train: 3655.9835 + epoch_init: 0.1934, minibatch_init: 13.1094, losses_postprocess: 900.9874, kl_divergence: 32.3313, update: 390.2512, after_optimizer: 2132.3176 + calculate_losses: 170.0940 + losses_init: 0.4121, forward_head: 59.3155, bptt_initial: 1.4107, bptt: 1.7758, tail: 38.2225, advantages_returns: 11.2744, losses: 44.1380 +[2023-10-11 19:04:35,123][84230] Learner 1 profile tree view: +misc: 0.0185, prepare_batch: 273.5297 +train: 3665.6169 + epoch_init: 0.1982, minibatch_init: 13.3124, losses_postprocess: 904.4971, kl_divergence: 31.9800, update: 386.6066, after_optimizer: 2143.9310 + calculate_losses: 168.1524 + losses_init: 0.3726, forward_head: 56.5476, bptt_initial: 1.5022, bptt: 2.1474, tail: 38.4990, advantages_returns: 11.3008, losses: 44.0333 +[2023-10-11 19:04:35,123][84230] RolloutWorker_w0 profile tree view: +wait_for_trajectories: 1.2418, enqueue_policy_requests: 410.3453, 
process_policy_outputs: 190.9864, env_step: 7812.1516, finalize_trajectories: 3.6309, complete_rollouts: 2.9402 +post_env_step: 378.0216 + process_env_step: 86.1159 +[2023-10-11 19:04:35,124][84230] RolloutWorker_w15 profile tree view: +wait_for_trajectories: 1.2152, enqueue_policy_requests: 410.2754, process_policy_outputs: 192.1594, env_step: 7670.0288, finalize_trajectories: 3.5882, complete_rollouts: 2.9059 +post_env_step: 379.9931 + process_env_step: 86.1388 +[2023-10-11 19:04:35,124][84230] Loop Runner_EvtLoop terminating... +[2023-10-11 19:04:35,125][84230] Runner profile tree view: +main_loop: 15014.4369 +[2023-10-11 19:04:35,125][84230] Collected {0: 100040704, 1: 101482496}, FPS: 13422.0